At Nebula Tech Solutions, we’re expanding our global reliability engineering team to support mission-critical systems for our US-based enterprise clients. This role covers night shifts only.
We’re looking for experienced DevOps/SRE professionals (5+ years) who bring hands-on depth in Kubernetes, monitoring/metrics, and coding, not just infrastructure management.
This is a role for engineers who thrive on troubleshooting, automation, and continuous improvement in high-availability environments.
   
What You’ll Do
✅ Build, optimize, and maintain Kubernetes clusters (EKS/GKE/AKS) for scalability and resilience
✅ Design and improve CI/CD pipelines (Jenkins, ArgoCD, FluxCD, Harness, GitHub Actions)
✅ Implement and extend observability using Prometheus, Grafana, OpenTelemetry, and custom metrics
✅ Develop and maintain internal tools and automation using Python, Go, or similar programming languages
✅ Drive incident response, root cause analysis (RCA), and reliability improvements across services
✅ Collaborate with global teams to ensure continuous uptime and performance