Regarding the model https://github.com/openai/gym/blob/master/gym/envs/mujoco/assets/reacher.xml: using the code from https://github.com/joschu/modular_rl, I trained an RL algorithm (TRPO) that effectively learns the task https://gym.openai.com/envs/Reacher-v1. The task consists of moving the arm tip to the marked point while minimizing torque; rewards are given per step, and the precise reward definition is in https://github.com/openai/gym/blob/master/gym/envs/mujoco/reacher.py#L14.

Is it realistic to phrase the task as a constraint for the solver inside MuJoCo and obtain a similar solution not via the RL algorithm, but directly from the solver? What would be the right starting point? Overall, I would be glad to see a non-RL solution to this simple task (if one exists). Sorry for the naive question.
Yes, this is a trivial control problem, normally solved with Jacobian methods in robotics. You can also trick MuJoCo into solving the problem for you: add a soft equality constraint between the hand and the target location. This generates constraint forces (mjData.qfrc_constraint) that propel the arm to the target and make it stop there. You can then take these forces and use them as controls in the original model, which does not have the equality constraint.
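A minimal sketch of that soft equality constraint, assuming the body names `fingertip` and `target` from the reacher.xml linked above (the solref values are illustrative; a larger time constant makes the constraint softer):

```xml
<!-- Add inside the <mujoco> element of a copy of reacher.xml -->
<equality>
  <!-- Soft "connect" constraint pulling the fingertip to the target body.
       solref = (timeconst, dampratio): softer settings keep the resulting
       qfrc_constraint forces within realistic actuator limits. -->
  <connect body1="fingertip" body2="target" anchor="0 0 0" solref="0.05 1"/>
</equality>
```

Stepping this modified model and recording mjData.qfrc_constraint at each step gives a force trajectory you can replay as controls in the unmodified model.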
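To illustrate the Jacobian approach mentioned above, here is a minimal NumPy sketch of Jacobian-transpose control for a planar 2-link arm (the reacher has two 0.1-length links; the gain, iteration count, and kinematic-iteration setup are illustrative assumptions, not the gym environment's dynamics):

```python
import numpy as np

# Planar 2-link arm; the gym reacher's two links are each 0.1 long.
l1, l2 = 0.1, 0.1

def fk(q):
    """Forward kinematics: fingertip (x, y) for joint angles q."""
    x = l1 * np.cos(q[0]) + l2 * np.cos(q[0] + q[1])
    y = l1 * np.sin(q[0]) + l2 * np.sin(q[0] + q[1])
    return np.array([x, y])

def jacobian(q):
    """2x2 Jacobian of fingertip position w.r.t. joint angles."""
    s1, c1 = np.sin(q[0]), np.cos(q[0])
    s12, c12 = np.sin(q[0] + q[1]), np.cos(q[0] + q[1])
    return np.array([[-l1 * s1 - l2 * s12, -l2 * s12],
                     [ l1 * c1 + l2 * c12,  l2 * c12]])

# Jacobian-transpose rule: tau = J^T * k * (target - fingertip).
# Here the update is applied kinematically (gradient descent on the
# squared distance) just to show convergence; in a dynamic simulation
# tau would be applied as joint torques instead.
q = np.array([0.3, 0.5])            # initial joint angles (arbitrary)
target = np.array([0.12, 0.08])     # a reachable target (|target| < l1+l2)
k = 5.0                             # illustrative gain
for _ in range(2000):
    err = target - fk(q)
    q = q + k * jacobian(q).T @ err  # step along J^T * error

print(np.linalg.norm(target - fk(q)))  # remaining distance to target
```

The same J^T * error signal is what MuJoCo's soft equality constraint effectively computes for you through the solver.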