Job Description
<p>Role : Network Automation Engineer<br/><br/>Location : Bangalore Only<br/><br/>Years of Experience : 7 to 15 Years<br/><br/>Primary : Ansible / Python<br/><br/>Mandatory : Private Cloud & Networking skills<br/><br/>Job Description : <br/><br/>We are looking for an experienced Network Automation Engineer to design, implement, and optimize automation solutions for our Private Cloud datacenter network, which underpins large-scale AI/ML GPU and TPU workloads.
</p><p><br/></p><p>This role focuses on automating configuration, provisioning, and monitoring of high-performance networking devices to ensure low latency, high throughput, and reliability in a mission-critical environment.
This role involves automating network device management as well as OS-level network configurations on servers.
Expertise in Ansible and Python is essential, and experience with GoLang is a strong plus.<br/><br/>Key Responsibilities : <br/><br/>- Develop and maintain network automation frameworks for large-scale datacenter environments supporting AI/ML workloads.<br/><br/>- Build Ansible playbooks, roles, and modules to automate device configurations, software upgrades, and compliance checks across multi-vendor environments.<br/><br/>- Design and implement Python-based automation scripts and tools to integrate with APIs, orchestration platforms, and monitoring systems.<br/><br/>- Automate OS core networking configurations on servers (Linux / Windows / Hypervisor) including bonding, VLANs, routing tables, kernel network parameters, MTU tuning, and NIC performance optimization.<br/><br/>- Collaborate with cloud infrastructure, network engineering, and DevOps teams to deliver seamless provisioning and scaling of GPU/TPU clusters.<br/><br/>- Ensure network automation solutions meet high-performance computing (HPC) requirements such as low latency, high throughput, and fault tolerance.<br/><br/>- Participate in network architecture reviews to provide automation insights and recommendations.<br/><br/>- Document automation processes, workflows, and operational guidelines for the datacenter network.<br/><br/>- Stay updated on emerging technologies in network automation, SDN, and private cloud networking.<br/><br/>Required Skills & Experience : <br/><br/>- Expertise in Ansible (playbook development, dynamic inventory, custom modules) for large-scale network automation.<br/><br/>- Strong proficiency in Python for scripting, API integrations (REST, NETCONF, gNMI), and device interaction (e.g., NAPALM, Netmiko, Paramiko).<br/><br/>- Hands-on experience with high-performance datacenter networking devices (Cisco Nexus, Arista, Juniper, Mellanox/NVIDIA Networking).<br/><br/>Knowledge of Linux / Windows / Hypervisor OS core networking, including : <br/><br/>- Network stack configuration (sysctl tuning, TCP/UDP parameters).<br/><br/>- NIC bonding, SR-IOV, DPDK, and kernel bypass techniques.<br/><br/>- VLANs, routing tables, MTU adjustments, jumbo frames.<br/><br/>- Performance tuning for HPC/AI workloads.<br/><br/>- Deep understanding of networking concepts including BGP, EVPN-VXLAN, MPLS, QoS, and leaf-spine architectures.<br/><br/>- Experience in Private Cloud environments with a focus on supporting HPC/AI workloads.<br/><br/>- Familiarity with CI/CD pipelines (GitLab, Jenkins) for deploying automation at scale.<br/><br/>- Knowledge of network observability, telemetry, and streaming protocols (gRPC, sFlow, SNMP, InfluxDB, Prometheus).<br/><br/>- Strong problem-solving skills and ability to operate in a high-availability, mission-critical datacenter environment.<br/><br/>Good to Have : <br/><br/>- Golang experience for building scalable and high-performance automation tools.<br/><br/>- Familiarity with Infrastructure-as-Code (IaC) tools like Terraform or Pulumi.<br/><br/>- Exposure to Kubernetes networking (CNI plugins) and containerized workloads.<br/><br/>- Understanding of AI/ML workload characteristics and their impact on network design and performance.<br/><br/>- Experience with SDN solutions (e.g., Cisco ACI, VMware NSX, NVIDIA Cumulus).</p> (ref:hirist.tech)