We are looking for an Infrastructure Engineer responsible for building, operating, and improving reliable, scalable, and high‑performing production systems. This role combines infrastructure engineering expertise with SRE practices, focusing on system availability, automation, incident management, and continuous improvement.
Key Responsibilities
- Manage high-severity, high-customer-impact incidents, focusing on fast recovery
- Champion production resilience and availability, focusing on a superior client experience, by working with the operations team and technology development teams
- Drive the implementation of Site Reliability Engineering (SRE) and Chaos Engineering design for all strategic systems
- Drive effective communication between business and technology teams regarding production service reliability and performance
- Drive continuous improvements in processes or systems leveraging Site Reliability Engineering method...