At NVIDIA, we aren't just powering the AI revolution; we're accelerating it. We are speeding up LLM inference across the stack and across open source LLM frameworks such as TensorRT LLM, vLLM, and SGLang. With demand for AI exploding, particularly for large language models (LLMs), vision language models (VLMs), and vision-language-action models (VLAs), we are significantly expanding our team.
We're seeking a highly skilled and driven Engineering Manager to take the lead in accelerating the next generation of LLM/VLM/VLA inference software technologies that will define the future of AI. This is a high-impact, hands-on leadership role at the intersection of deep technical expertise and world-class management. You won't just manage; you'll architect and guide a brilliant team of engineers who are pushing the limits of LLM inference performance. Your work will be highly collaborative, interfacing directly with NVIDIA Researchers, GPU Architects, and o...