Job Overview
Category
Computer Occupations
Job Description
Key Responsibilities:

- Manage and maintain Kubernetes clusters (EKS) and ensure high system reliability and scalability.
- Implement and manage AWS services including IAM, EC2, EKS, CloudWatch, and S3.
- Build automation tools to enable self-healing and self-monitoring systems.
- Develop and maintain monitoring solutions that track system performance and raise alerts for low-latency applications.
- Troubleshoot application-specific, network, system, and performance issues in real time.
- Perform Linux debugging, performance tuning, and optimization for production systems.
- Apply SRE principles: monitoring, alerting, error budgets, fault analysis, capacity planning, and toil reduction.
- Collaborate with cross-functional teams to improve reliability, performance, and deployment processes.

Must-Have Qualifications:

- Bachelor's degree in Computer Science or a related field.
- 5+ years of experience in DevOps / Site Reliability Engineering roles.
- Strong hands-on experience with Kubernetes and container orchestration.
- In-depth knowledge of AWS services (IAM, EC2, EKS, CloudWatch, S3).
- Proficiency in at least one programming/scripting language (Python or Shell).
- Excellent understanding of Linux systems, debugging tools, and performance tuning.
- Strong problem-solving, troubleshooting, and analytical skills.
- Ability to work collaboratively in a fast-paced, evolving technology environment.

Preferred Skills:

- Experience with CI/CD pipelines and automation frameworks.
- Familiarity with Infrastructure as Code (IaC) tools such as Terraform or CloudFormation.
- Understanding of networking concepts, system architecture, and distributed systems.

Key Traits:

- Strong ownership and accountability.
- Excellent communication and collaboration skills.
- Willingness to continuously learn and adapt to new technologies.

(ref:hirist.tech)
D2KSS is actively hiring for this Site Reliability Engineer - Elastic Kubernetes Service position