A standard approach in reinforcement learning is to train multiple agents in parallel. A neural network can produce actions for a whole batch of agents in a single forward pass, but stepping the environments in parallel currently requires multiprocessing. I am wondering whether it is possible to design a patch for MuJoCo that allows multiple simulations to run in parallel without multiprocessing. In principle, it should be possible to partition each vector in mjData into n chunks, one per simulation, and apply the simulation update to each chunk. I would like to work on this as a final project for a parallel architectures class that I am taking. I was hoping you might suggest a feasible approach and grant my team access to the subset of the source code that we would need to work on. Thanks!
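
P.S. To make the chunking idea concrete, here is a rough sketch in plain Python. It is not MuJoCo code and does not reflect how mjData is actually laid out. `BatchedData`, `make_batch`, `step_chunk`, and `step_all` are hypothetical names, and the update is a toy Euler step for unit point masses standing in for the real simulation step. The only point is the layout: simulation i owns the slice [i*nq, (i+1)*nq) of each flat vector.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class BatchedData:
    """Toy stand-in for mjData that stores n simulations in flat vectors."""
    n: int              # number of independent simulations
    nq: int             # coordinates per simulation
    qpos: List[float]   # length n * nq; simulation i owns [i*nq, (i+1)*nq)
    qvel: List[float]   # same chunked layout as qpos


def make_batch(n: int, nq: int) -> BatchedData:
    """Allocate zero-initialized state for n simulations."""
    return BatchedData(n=n, nq=nq, qpos=[0.0] * (n * nq), qvel=[0.0] * (n * nq))


def step_chunk(d: BatchedData, i: int, force: List[float], dt: float) -> None:
    """Advance only simulation i by one toy semi-implicit Euler step (unit mass)."""
    lo = i * d.nq
    for k in range(d.nq):
        d.qvel[lo + k] += dt * force[k]        # velocity update from the applied force
        d.qpos[lo + k] += dt * d.qvel[lo + k]  # position update from the new velocity


def step_all(d: BatchedData, forces: List[List[float]], dt: float) -> None:
    """Step every chunk. Chunks never touch each other's slices, so this loop
    is the part that could be vectorized or spread across threads."""
    for i in range(d.n):
        step_chunk(d, i, forces[i], dt)


batch = make_batch(n=2, nq=2)
step_all(batch, forces=[[1.0, 0.0], [0.0, 2.0]], dt=0.5)
```

Because each call to `step_chunk` reads and writes only its own slice, the chunks are independent. What I am unsure about is how much of the real step function shares scratch memory or model-level state across what would become separate chunks.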