Hello everyone, hope all is well. I'm trying to apply Q-Learning to the Humanoid environment. How do you get started with a Box(17,) environment? In many examples I see, Q-Learning is only applied to Discrete environments...
Improving the question: how do you get the sizes of the spaces (observation and action) in a Box environment? For Humanoid, the observation space is Box(376,) and the action space is Box(17,). Example of what I see for Discrete environments: action_space_size = env.action_space.n; state_space_size = env.observation_space.n; q_table = np.zeros((state_space_size, action_space_size))
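For reference, a minimal sketch of reading the sizes from Box spaces. It builds stand-in spaces with the shapes quoted in the question instead of loading Humanoid, so MuJoCo is not needed; the bounds used here are assumptions for illustration only. A Box space has no `.n` attribute, so the dimensionality comes from `.shape`:

```python
import numpy as np

try:
    from gym import spaces
except ImportError:
    # Assumption: gymnasium, the maintained fork, exposes the same spaces API.
    from gymnasium import spaces

# Stand-ins for env.observation_space and env.action_space of Humanoid.
# Shapes come from the question; the bounds are illustrative assumptions.
observation_space = spaces.Box(low=-np.inf, high=np.inf, shape=(376,), dtype=np.float32)
action_space = spaces.Box(low=-0.4, high=0.4, shape=(17,), dtype=np.float32)

# Box spaces expose their dimensionality through .shape, not .n.
obs_dim = observation_space.shape[0]
act_dim = action_space.shape[0]
box_has_n = hasattr(action_space, "n")

# For comparison: a Discrete space does expose .n (the number of choices).
discrete_space = spaces.Discrete(4)

print("observation dim:", obs_dim)
print("action dim:", act_dim)
print("Box has .n:", box_has_n)
print("Discrete(4).n:", discrete_space.n)
```

Note that these numbers are vector lengths, not counts of states or actions, so they cannot be used directly as the dimensions of a Q-table.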
Hi. First of all, your question is not related to MuJoCo but to OpenAI Gym, so it would be more appropriate to post it here: https://github.com/openai/gym/issues. To answer your question anyway: Q-learning works with discrete action spaces. Box describes a continuous space, for which you have to use other algorithms, such as DDPG.
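To illustrate why a Q-table does not carry over to Box spaces, here is a rough sketch that counts the table entries you would need if every dimension were cut into a few bins. The helper name `discretized_table_size` and the choice of 3 bins per dimension are hypothetical, picked only for this illustration:

```python
def discretized_table_size(obs_dim, act_dim, bins):
    """Count discrete states and actions when each continuous
    dimension is split into `bins` intervals (illustrative only)."""
    n_states = bins ** obs_dim
    n_actions = bins ** act_dim
    return n_states, n_actions


# Humanoid shapes from the question: 376 observation dims, 17 action dims.
n_states, n_actions = discretized_table_size(obs_dim=376, act_dim=17, bins=3)

print("discrete actions:", n_actions)
print("digits in the number of discrete states:", len(str(n_states)))
```

Even with this very coarse grid there are more than a hundred million discrete actions and a number of states with well over 100 digits, which is why algorithms designed for continuous actions are used instead.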
Hi, thanks very much for the answer!!! It's true, Q-Learning is not a good fit for more complex environments and input data. I'm studying TRPO now. I'm sorry for the delay; I had already given up on getting an answer here on the forum. Thanks!!!