Ideal for someone with deep scientific judgement, strong applied ML skills, and a practical bias toward methods that work in real customer and product contexts
PhD or MSc in Computer Science, Mathematics, Statistics, Machine Learning, or a related field
3+ years of applied ML, AI research, or data science experience with demonstrated real-world impact
Experience with human-in-the-loop AI systems, including RLHF, annotation pipelines, data quality modelling, judgement aggregation, benchmarks, or AI evaluation
Fluency with modern LLM and agentic techniques, such as Retrieval-Augmented Generation (RAG), LLM-as-judge, multi-agent workflows, synthetic data generation, and automated quality review
Strong Python skills and the ability to quickly build, test, and iterate on working prototypes
Good judgement about when to use simple statistical methods, classical ML, LLMs, or agentic approaches