Job Overview
Category: Computer Occupations
Job Description

Site Reliability Engineer (Multi-Cloud Deployments)
Location: Bangalore / Remote
Experience: 4–10 years
Type: Full-time (6-month probation)

About CodeKarma
CodeKarma is redefining how engineering teams understand and evolve complex systems by bringing production context directly into the developer’s workflow.
Our platform runs both as SaaS and as sub-account / on-prem deployments within our customers’ cloud environments.
We’re looking for engineers who can take ownership of these deployments end-to-end, from setup to monitoring, upgrades, and ongoing reliability.

What You’ll Do
You’ll be responsible for managing CodeKarma’s distributed deployments across client environments, ensuring reliability, security, and performance at scale.
Deploy and manage CodeKarma clusters across AWS, GCP, and Azure customer sub-accounts.
Monitor, upgrade, and maintain Kubernetes clusters and related infrastructure.
Implement observability, alerting, and disaster recovery for each deployment.
Handle CI/CD automation for platform releases, patches, and version upgrades.
Work closely with client engineering teams to adapt deployments to their environments, policies, and security constraints.
Diagnose and resolve environment-specific issues across networking, storage, and configuration layers.
Build and maintain infrastructure playbooks, Helm charts, and Terraform modules for standardized deployment.

What We’re Looking For
Strong experience managing Kubernetes clusters (EKS, GKE, AKS, or on-prem equivalents).
Deep understanding of Kubernetes internals, Helm, ingress controllers, networking, and storage classes.
Hands-on experience with CI/CD tools (GitHub Actions, ArgoCD, or similar).
Familiarity with monitoring and alerting stacks (Prometheus, Grafana, Loki, ELK, etc.).
Working knowledge of cloud infrastructure across AWS / GCP / Azure.
Ability to work directly with client engineering and DevOps teams, understanding their constraints and helping them integrate CodeKarma.
Strong debugging and communication skills, since you’ll often be the bridge between CodeKarma and client infrastructure.

Why Join Us
Manage real, large-scale production environments across multiple enterprises.
Work directly with founders and senior engineers to shape how CodeKarma scales across clients.
High ownership, fast-moving environment, and exposure to deep-tech systems.

How to Apply
Please share:
A short summary of your Kubernetes experience (cluster management, scaling, debugging, etc.).
Any automation or deployment tooling you’ve built or maintained.
Links to your GitHub / GitLab / blog posts (if available).
            
         
  
  
  
        
        
        
        
        