Senior HPC and LSF Operations Engineer

Company: NVIDIA

Location: Santa Clara, CA

Type: Full-time

As a member of the Hardware Infrastructure EDA Compute team, you will optimize, scale, and support workload scheduling systems that directly impact design velocity and infrastructure efficiency. Success in this role requires both operational precision and the ability to develop and support forward-looking resource management solutions that address evolving compute demands. Beyond day-to-day operations, the role drives improvements in observability, service reliability, and automation, ensuring the EDA compute environment remains resilient, measurable, and aligned with long-term engineering demands.

What you'll be doing:
+ Manage, scale, and optimize job scheduling systems (LSF, Slurm, etc.) in a large-scale, multi-site environment supporting EDA and other compute-intensive workloads
+ Analyze scheduler and infrastructure performance data to identify systemic bottlenecks and drive measurable improvements in utilization, throughput, and turnaround time
+ Lead problem solv...
