We are looking for a highly motivated Site Reliability Engineer (SRE) to improve the reliability, scalability, and performance of mission-critical applications. The ideal candidate has strong experience in cloud platforms, automation, monitoring, and incident management.

Key Responsibilities:
Design, build, and maintain highly available and scalable production systems.
Monitor system performance, availability, and reliability.
Implement automation to reduce manual effort, for example through Infrastructure as Code (IaC).
Manage CI/CD pipelines and deployment processes.
Troubleshoot production issues and perform root cause analysis (RCA).
Improve system observability using logging, monitoring, and tracing tools.
Collaborate with development teams to enhance application...