Job Description
<p><p><b>Who You Are :</b></p><p><br/> A passionate and experienced engineer with a proven track record of identifying and resolving reliability and scalability challenges in large-scale, containerized applications.<br/><br/> A curious and collaborative team player who thrives in a fast-paced environment, eager to explore, learn, and improve processes, particularly around Kubernetes deployments and management.<br/><br/> An efficiency enthusiast, skilled at automating solutions and continuously innovating in container orchestration and management.<br/><br/> A nimble learner, capable of grasping complex Kubernetes concepts, and an excellent communicator who can advocate for best practices in Kubernetes<br/><br/><b>Cloud Infrastructure Engineering Team :</b></p><p><br/> Our team consists of passionate Platform Engineers and SREs dedicated to reliability, automation, and scalability.</p><br/> We leverage an agile framework to prioritize business needs and use both private and public cloud solutions based on data-driven decisions.<br/><br/> We are relentless in automating repetitive tasks, freeing ourselves for impactful projects that propel the business forward.<br/><br/> Our primary focus is leveraging Kubernetes for efficient and reliable deployment of containerized<br/><br/><b>Responsibilities :</b><br/><br/><b>Kubernetes Deployment & Automation :</b><br/><br/> - Design, deploy, and manage highly available and scalable Kubernetes clusters on AWS EKS using Terraform and/or Crossplane.<br/><br/> - Implement Infrastructure-as-Code (IaC) best practices for managing EKS clusters and related infrastructure.<br/><br/><b>Kubernetes Operations & GitOps :</b><br/><br/> - Configure and maintain Kubernetes deployments, services, ingresses, and other resources using YAML manifests or GitOps workflows.<br/><br/> - Implement GitOps practices with FluxCD for automated deployments and configuration management of containerized applications.<br/><br/><b>Reliability, Security & Scalability :</b><br/><br/> - 
Proactively ensure the reliability, security, and scalability of AWS production systems, with a particular focus on Kubernetes clusters and containerized applications.<br/><br/> - Resolve complex problems across multiple platforms and application domains, using advanced system troubleshooting techniques.<br/><br/><b>Operational Support & Monitoring :</b><br/><br/> - Provide primary operational support and engineering expertise for all cloud and enterprise deployments, with a focus on Kubernetes.<br/><br/> - Monitor system performance, identify downtime incidents, and diagnose underlying causes, particularly those related to Kubernetes cluster and container health.<br/><br/><b>Cost Optimization :</b><br/><br/> - Design and develop cost-effective Kubernetes solutions within allocated budgets, ensuring efficient resource<br/><br/><b>Responsibilities :</b></p><p><br/><b>Collaboration & Process Improvement :</b></p><br/> - Work closely with developers, testers, and system administrators to ensure smooth deployment and operation of containerized applications.<br/><br/> - Champion the implementation of new processes, tools, and methodologies to enhance efficiency throughout the software development lifecycle (SDLC) and pipeline management.<br/><br/><b>Security Integration :</b><br/><br/> - Integrate robust security measures into the development lifecycle, considering the specific security requirements of containerized<br/><br/><b>Qualifications :</b><br/><br/> - 5 to 9 years of experience building, scaling, and supporting highly available systems and services.<br/><br/> - At least 3 years of experience managing and operating Kubernetes clusters in production.<br/><br/> - Proven experience in building and managing AWS platforms, with a strong focus on Amazon EKS (Elastic Kubernetes Service).<br/><br/> - Deep knowledge of Kubernetes architecture, core concepts, best practices, and security considerations.<br/><br/> - Expertise in Infrastructure-as-Code (IaC) tools like Terraform and Crossplane.<br/><br/> 
- Familiarity with GitOps principles and experience with FluxCD (a plus).<br/><br/> - Proficiency in at least one scripting/programming language (Python, Go, Ruby, Shell).<br/><br/> - Experience in Site Reliability Engineering (SRE) and DevOps principles, including CI/CD and version control (Bitbucket, GitHub, etc.).<br/><br/> - Familiarity with telemetry, observability, and modern monitoring tools (Prometheus, Alertmanager, Grafana, etc.), particularly for Kubernetes monitoring.<br/><br/> - Strong expertise in system visibility to facilitate rapid detection and resolution of issues within Kubernetes<br/><br/><b>Behaviors & Abilities Required :</b></p><p><br/> - A strong ability to learn and adapt in a fast-paced environment, especially as Kubernetes and container orchestration technologies evolve.</p><br/> - Excellent teamwork skills, collaborating effectively across cross-functional teams including developers, testers, and system administrators.<br/><br/> - Strong prioritization and problem-solving skills, adept at troubleshooting complex Kubernetes-related issues.<br/><br/> - Ability to manage multiple projects simultaneously, ensuring projects stay on track with clear progress updates.<br/><br/> - Ability to handle unexpected challenges while effectively context-switching between tasks.<br/><br/> - Willingness to participate in rotational on-call duties to ensure continuous monitoring and support of Kubernetes clusters.<br/><br/> - A strong work ethic and commitment to continuous learning and improvement in Kubernetes and container orchestration<br/><br/><b>Athenahealth :</b><br/><br/> Our vision: In an industry that becomes more complex by the day, we stand for simplicity.<br/><br/> We offer IT solutions and expert services that eliminate the daily hurdles preventing healthcare providers from focusing entirely on their patients, powered by our vision to create a thriving ecosystem that delivers accessible, high-quality, and sustainable healthcare for all.<br/><br/> Our company culture: 
Our talented employees, or athenistas, as we call ourselves, spark the innovation and passion needed to accomplish our vision.<br/><br/> We are a diverse group of dreamers and do-ers with unique knowledge, expertise, backgrounds, and perspectives.<br/><br/> We unite as mission-driven problem-solvers with a deep desire to achieve our vision and make our time here count.<br/><br/> Our award-winning culture is built around shared values of inclusiveness, accountability, and support.<br/><br/> Our DEI commitment: Our vision of accessible, high-quality, and sustainable healthcare for all requires addressing the inequities that stand in the way.<br/><br/> That's one reason we prioritize diversity, equity, and inclusion in every aspect of our business, from attracting and sustaining a diverse workforce to maintaining an inclusive environment for athenistas, our partners, customers, and the communities where we work and serve.</p><br/></p> (ref:hirist.tech)