Job Description
<p><p><b>As a DevOps Engineer,</b> you'll partner with development and IT teams to build reliable, scalable platforms and smooth, automated delivery from commit to production.<br/><br/>You'll focus on automation, system reliability, and a strong CI/CD culture, anchored in GitHub Actions, Terraform, and Kubernetes, with must-have experience configuring ELK or the Grafana/Prometheus/Loki observability stack.<br/><br/><b>Key Responsibilities:</b><br/><br/>- Own CI/CD with GitHub Actions: Design, implement, and maintain reusable workflows for build, test, security scanning, and multi-environment deployments.</p><p><br/>- Kubernetes at scale: Operate and harden K8s clusters (managed or self-managed), including workload orchestration, autoscaling, ingress, secrets, and RBAC; streamline app delivery (e.g., Helm/Kustomize).<br/><br/></p><p>- Infrastructure as Code with Terraform: Model and provision cloud infrastructure (AWS/Azure/GCP) using Terraform modules, workspaces, and policy guardrails.<br/><br/></p><p>- Cloud platform engineering: Build secure, scalable, cost-efficient environments; implement networking, identity, and storage patterns.<br/><br/></p><p>- Observability you can trust: Configure and maintain either ELK (Elasticsearch, Logstash, Kibana) or Grafana/Prometheus/Loki stacks for metrics, logs, dashboards, and alerting.<br/><br/></p><p>- Reliability & incident response: Monitor SLOs/SLIs, tune performance, run post-mortems, and drive resiliency (autoscaling, HPA, PDBs, chaos testing where applicable).</p><p><br/>- Security & compliance: Bake in security (secrets management, image scanning, policy as code, least privilege) across pipelines and platforms.<br/><br/></p><p>- Business continuity: Implement backups, disaster recovery, and restore testing for critical systems.<br/><br/><b>Must-Have Skills:</b><br/><br/>- GitHub Actions workflow design (composite actions, reusable workflows, environments, OIDC to clouds).<br/><br/></p><p>- Terraform (strong): Modules,
state management/backends, CI-driven plans/applies, policy controls (e.g., Sentinel/OPA).<br/><br/></p><p>- Kubernetes (strong): Production ops, networking (CNI/ingress), security (RBAC/PSA), packaging (Helm/Kustomize), and troubleshooting.<br/><br/></p><p>- Observability stack configuration: ELK or Grafana + Prometheus + Loki (end-to-end setup, dashboards, alerting, retention).<br/><br/><b>Additional Requirements:</b><br/><br/>- Bachelor's degree in CS/Engineering or equivalent experience.<br/><br/></p><p>- 2+ years in DevOps/SRE or a similar role.<br/><br/></p><p>- Proficiency with at least one major cloud (AWS, Azure, or GCP); multi-cloud a plus.<br/><br/></p><p>- Experience with config management/automation (Ansible/Puppet/Chef; Ansible preferred).</p><p><br/></p><p>- Strong Git fundamentals and trunk-based/GitOps practices.<br/><br/></p><p>- Solid grounding in networking, Linux system administration, and security principles.<br/><br/></p><p>- Containerization expertise (Docker) and image lifecycle (build, scan, sign, SBOM).<br/><br/><b>Nice to Have:</b><br/><br/>- GitOps tooling (Argo CD/Flux).<br/><br/></p><p>- Policy as code; secrets management (Vault/SM/Key Vault).<br/><br/></p><p>- Cost optimization and FinOps awareness.</p><br/></p> (ref:hirist.tech)