🍁 SearchCanadaJobs.com

Senior Site Reliability Engineer (GPU & ML Infrastructure)

Company

WomenTech Network

Location

Paris, Île-de-France

Type

Full-time

What You'll Do:

At Criteo, the Platform Core group builds the foundational infrastructure powering our global advertising platform. We design and operate large-scale, resilient systems supporting real-time decision-making and data processing across thousands of services.


As we expand our distributed computing and ML infrastructure capabilities, we are building a new team focused on GPU platforms and high-performance model serving technologies.


As a Site Reliability Engineer in the GPU team, you will help design, operate, and scale the infrastructure powering machine learning training and inference workloads.


You will work on technologies such as:

Ray on Kubernetes

  • Build and operate scalable Ray clusters running on Kubernetes.
  • Develop reliable self-service distributed computing platforms for ML workloads.
  • Improve provisioning, observability, reliability,...
