🍁 SearchCanadaJobs.com

DevOps Engineer / Site-Reliability Engineer

Company

Manus AI

Location

Singapore, Singapore

Type

Full-time

Key Responsibilities

Cluster Operations & Management
Manage and maintain container clusters (Kubernetes, Docker) and open-source component clusters (Kafka, Redis, Elasticsearch) across multiple business units
Ensure optimal performance, scalability, and reliability of distributed systems

Infrastructure Platform Development
Design, build, and enhance infrastructure operation platforms
Develop and maintain systems for infrastructure management, CI/CD pipelines, monitoring/alerting, and centralized logging
Drive platform standardization and automation initiatives

High Availability & Reliability
Ensure maximum uptime for production services through proactive monitoring and incident response
Continuously optimize service architecture, deployment strategies, and operational processes
Implement and maintain SLA/SLO frameworks and reliability engineering practices

Automation & Process Improvement
Lead the development of automated operations and maintenance systems
Create self-service tools ...
