Design and train reinforcement learning and imitation learning policies for movement and control tasks
Run experiments on physical hardware and close the sim-to-real gap through systematic debugging and domain adaptation
Build and maintain simulation environments and data pipelines that support fast policy iteration
Instrument deployments and analyse failure modes, feeding what you learn back into training
Work closely with hardware and firmware engineers to understand physical constraints and improve policy robustness
Requirements
Around 2 to 3 years of relevant experience; exceptional recent graduates with a strong portfolio and internship experience will also be considered
Strong foundations in reinforcement learning or imitation learning, with hands-on experience training policies that run on real physical systems, not only in simulation