Reinforcement learning with musculoskeletal models

Support from Google Cloud Platform

nips

27 Jul 2018

Google Cloud Platform generously offered to support our NIPS 2018 AI for prosthetics challenge with cloud credits. Submit a solution with a positive (>0) score and get $250 in credits! (Read more details on the crowdAI platform.)

Join our OpenSim webinar on August 7th 10 am PST to hear more on reinforcement learning for musculoskeletal models. Learn how to get started with the challenge!

Here is the current top solution. See more in the leaderboard.

Can you do better than that?

Version 2.0 released

release

17 Jun 2018

kidzik

This release includes changes for the NIPS 2018 AI for prosthetics challenge:

ProstheticsEnv with 4 modes: 2D/3D and with/without a prosthesis
Dictionaries as observations instead of vectors
Refactoring of the classes
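Controllers written for vector observations may need the dictionary flattened before use. The snippet below is a minimal sketch of one way to do that, not the library's own conversion; the function name `flatten_observation` and the keys in `example` are hypothetical, and the real observation layout may differ.

```python
def flatten_observation(obs):
    """Recursively flatten a nested dict/list observation into a flat list of floats."""
    if isinstance(obs, dict):
        values = []
        for key in sorted(obs):  # sorted keys give a stable, reproducible ordering
            values.extend(flatten_observation(obs[key]))
        return values
    if isinstance(obs, (list, tuple)):
        values = []
        for item in obs:
            values.extend(flatten_observation(item))
        return values
    return [float(obs)]  # leaf value


# Hypothetical observation for illustration only; real keys and shapes may differ.
example = {
    "joint_pos": {"knee_r": [0.1], "hip_r": [0.2, 0.3]},
    "muscles": {"psoas_l": {"activation": 0.05}},
}
vector = flatten_observation(example)
```

Sorting the keys keeps the order of the vector stable between steps, which matters when the vector is fed to a trained model.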

Version 1.5 released

release

20 Dec 2017

kidzik

The grader now accepts only this version. To switch to the new environment, update the osim-rl scripts with the following command:

pip install git+https://github.com/stanfordnmbl/osim-rl.git -U

This release includes the following bug fixes:

Fixed first observation (previously it wasn’t showing the first obstacle correctly). (issue #53)
Fixed geometries for the right leg. (issue #75)
Activations outside [0,1] are now clipped to [0,1]. (issue #64)
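To illustrate the effect of the clipping fix, here is a minimal sketch of what clipping an action vector to [0,1] means. It is not the environment's actual implementation; the helper name `clip_activations` is hypothetical.

```python
def clip_activations(action, low=0.0, high=1.0):
    """Return a copy of the action with every activation limited to [low, high]."""
    return [min(high, max(low, a)) for a in action]


# Values below 0 become 0, values above 1 become 1, the rest are unchanged.
clipped = clip_activations([-0.2, 0.5, 1.3])
```

In practice this means a controller that outputs values slightly outside the range is still accepted, but it gains nothing from exceeding the limits.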

Version 1.4.1 released

release

02 Oct 2017

kidzik

After discussing the way the reward function is computed (issue #43), we decided to further update the environment. Up to version 1.3, the reward received at every step was the total distance travelled from the starting point minus the ligament forces. As a result, the total reward was the cumulative sum of total distances over all steps (a discrete integral of position over time) minus the total sum of ligament forces.

Since this reward is unconventional in reinforcement learning, we changed the reward at each step to the distance increment between two consecutive steps minus the ligament forces. As a result, the total reward is now the total distance travelled minus the total ligament forces.

In order to switch to the new environment you need to update the osim-rl scripts with the following command:

pip install git+https://github.com/stanfordnmbl/osim-rl.git -U

Note that this will change the order of magnitude of the total reward from ~1000 to ~10 (now measured in meters travelled). The change does not affect the API of observations and actions. Moreover, the two measures are strongly correlated, so a model that performed well in the old version should also perform well in the current one.
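The difference between the two reward definitions can be sketched as follows. This is an illustration under simplifying assumptions, not the environment's code: the function names are hypothetical, `positions` is a made-up list of forward positions (one per time step, starting position first), and the ligament penalty is treated as one plain number per step.

```python
def old_total_reward(positions, ligament_forces):
    """Old scheme: each step rewards the total distance from the start."""
    start = positions[0]
    return sum((x - start) - f for x, f in zip(positions[1:], ligament_forces))


def new_total_reward(positions, ligament_forces):
    """New scheme: each step rewards only the distance gained since the last step."""
    return sum(
        (x - prev) - f
        for prev, x, f in zip(positions, positions[1:], ligament_forces)
    )


# Hypothetical trajectory: 0.5 m of progress per step, no ligament penalty.
positions = [0.0, 0.5, 1.0, 1.5, 2.0]
forces = [0.0, 0.0, 0.0, 0.0]

old_total = old_total_reward(positions, forces)  # 0.5 + 1.0 + 1.5 + 2.0
new_total = new_total_reward(positions, forces)  # 2.0 - 0.0
```

With the new scheme the increments telescope, so the total equals the final distance from the start. The old total grows with both distance and the number of steps, which is why its magnitude was so much larger.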

Version 1.3 released

release

15 May 2017

kidzik

Due to the errors described in issue #31, we updated the environment with the following changes:

added velocities of joints (previously we had a duplicate of positions of joints)
added the left psoas in muscles (previously there was only right psoas twice)
moved obstacles closer to the starting point (so that it’s easier to train on them)
extended the trials to 1000 iterations (unless the model trips or falls earlier, as before)

In order to switch to the new environment you need to update the osim-rl scripts with the following command:

pip install git+https://github.com/stanfordnmbl/osim-rl.git -U

Since the observation vector changed, you may need to retrain your existing model to account for these changes. However, the old observation is in fact a subset of the new observation, so if you want to submit the old model, you can convert each new observation to the old format before passing it to your controller:

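# Overwrite the new joint velocities (indices 12-17) with the joint
# positions (indices 6-11), which the old observation duplicated.
for j in range(6, 12):
    observation[j + 6] = observation[j]
# Restore the duplicated psoas entry of the old observation.
observation[36] = observation[37]
observation, reward, done, info = env.step(my_controller(observation))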

Yet, with the new information, your controller should be able to perform better, so we definitely advise retraining your model.
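The conversion above can also be wrapped in a small helper that leaves the original observation untouched. This is only a sketch: the name `to_old_observation` is hypothetical, the observation is assumed to be a flat sequence with at least 38 entries, and the length of 41 used in the example is a placeholder.

```python
def to_old_observation(observation):
    """Return a copy of a new-style observation laid out like the old one."""
    old = list(observation)  # copy, so the caller's observation is not modified
    old[12:18] = old[6:12]   # joint velocities -> duplicate of joint positions
    old[36] = old[37]        # restore the duplicated psoas entry
    return old


# Placeholder observation: entry i simply holds the value i.
new_obs = [float(i) for i in range(41)]
old_obs = to_old_observation(new_obs)
```

Working on a copy keeps the full new observation available, for example for logging or for training a new model alongside the old one.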