Job Title: SRE / DevOps Engineer – Cloud (AWS & Azure)
Location: Remote
Department: Engineering / DevOps
Experience: 9+ years total, including 6+ years of relevant experience
About the Role
We are looking for a skilled and experienced SRE/DevOps Engineer to join our cloud infrastructure team.
The ideal candidate will have deep expertise in AWS and Azure cloud platforms, strong networking knowledge, and hands-on experience with modern CI/CD pipelines, Git workflows, and GitOps tools.
You will play a key role in designing, implementing, and maintaining reliable, scalable, and observable infrastructure to support our development and operations teams.
Key Responsibilities
- Design, deploy, and manage scalable cloud infrastructure on AWS and Azure, focusing on compute, networking, and storage components.
- Implement and maintain networking solutions such as VPCs, VNets, subnets, routing, VPNs, security groups, network security groups (NSGs), and load balancers across AWS and Azure.
- Develop and manage CI/CD pipelines using GitHub Actions, integrating automated build, test, and deployment workflows.
- Implement Git workflows to enable smooth collaboration, including branching strategies, pull requests, and code reviews.
- Manage Kubernetes deployments using Argo CD for GitOps-based continuous deployment on AWS EKS and Azure AKS.
- Build and maintain observability and monitoring solutions using Datadog, Prometheus, and Fluent Bit to ensure system health, performance, and availability.
- Collaborate closely with software engineering, infrastructure, and security teams to ensure reliability, scalability, and security best practices.
- Automate infrastructure provisioning and configuration using Infrastructure as Code (Terraform preferred).
- Troubleshoot incidents and provide root cause analysis to improve system resilience and reduce downtime.
- Optimize cloud resource utilization and cost management on AWS and Azure.
Qualifications
- Bachelor's degree in Computer Science, Engineering, or a related field, or equivalent experience.
Required Skills:
- 6+ years of experience as an SRE or DevOps Engineer working with AWS and Azure cloud environments.
- Experience building and maintaining observability and monitoring solutions using Datadog, Prometheus, AppDynamics, and other market-leading tools to ensure system health, performance, and availability.
- Strong hands-on experience with AWS components: VPC, EC2, ELB, Route 53, IAM, CloudWatch, Lambda, S3, RDS, EKS.
- Strong hands-on experience with Azure components: Virtual Network (VNet), Azure Compute (VMs, App Services), Azure Load Balancer, Azure DNS, Azure AD, Azure Monitor, AKS.
- Deep networking knowledge including TCP/IP, DNS, VPN, Load Balancers, Security Groups, NSGs, routing, and firewall configurations.
- Expertise in CI/CD pipeline creation and management using GitHub Actions.
- Proven experience with Git workflows including feature branching, pull requests, merges, and conflict resolution.
- Experience with Kubernetes and GitOps deployment using Argo CD.
- Proficient with monitoring and observability tools: Datadog, Prometheus, Fluent Bit, and Grafana.
- Skilled in scripting and automation (Python, Bash, PowerShell).
- Experience with Infrastructure as Code tools such as Terraform or ARM templates.
- Experience with container orchestration beyond EKS and AKS.
- Strong problem-solving skills and experience in incident management and root cause analysis.
- Excellent communication skills and ability to collaborate across multiple teams.
Preferred Qualifications:
- Certifications such as AWS Certified Solutions Architect, Azure Administrator Associate, or Certified Kubernetes Administrator (CKA).
- Knowledge of security best practices for cloud-native environments.
- Familiarity with additional DevOps tools such as Jenkins, Ansible, or Helm.
- Prior knowledge of mainframe systems is a plus.