Urgent! Principal Software Engineer - Network Reliability Engineering - AI/ML Job Opening In India, India – Now Hiring Oracle

Principal Software Engineer Network Reliability Engineering AI/ML



Job description

Oracle Cloud Infrastructure (OCI) provides mission-critical cloud services to enterprises worldwide.

The Network Reliability Engineering (NRE) Automation, Reporting, and Tooling team builds innovative solutions that boost the productivity and efficiency of the Global Network Operations Center (GNOC).

Our tooling empowers the GNOC and Network Reliability Engineering (NRE) teams with observability, automation, and actionable insights at hyperscale.

As a Principal Software Developer, you will design, build, and deliver scalable automation frameworks and advanced platforms leveraging AI/ML to drive operational excellence across OCI’s global network.

This includes working with network event-driven data (such as failures), hybrid classification, and both model training and inference.

You are passionate about developing software that solves real-world operational challenges, thrive in a fast-paced team, and are comfortable working with complex distributed systems.

You value simplicity, scalability, and collaboration.

Responsibilities:

  • Architect, build, and support distributed systems for process control and execution based on Product Requirement Documents (PRDs).

  • Develop and sustain DevOps tooling, new product process integrations, and automated testing.

  • Develop ML in Python 3; build backend services in Go (Golang); create command-line interface (CLI) tools in Rust or Python 3; and integrate with other services as needed using Go, Python 3, or C.

  • Build and maintain schemas/models to ensure every platform and service write is captured for monitoring, debugging, and compliance.
  • Build and maintain dashboards that monitor the quality and effectiveness of service execution for the process as code your team delivers.

  • Build automated systems that route code failures to the appropriate oncall engineers and service owners.

  • Ensure high availability, reliability, and performance of developed solutions in production environments.

  • Support serverless workflow development for workflows that call and utilize the above-mentioned services to support our GNOC, GNRE, and onsite operations and hardware support teams.

  • Participate in code reviews, mentor peers, and help build a culture of engineering excellence.

  • Operate in an Extreme Programming (XP) asynchronous environment (chat/tasks) without daily standups, and keep work visible by continuously updating task and ticket states in Jira.

Required Qualifications:

  • 8-10 years of experience in process as code, software engineering, automation development, or similar roles
  • Bachelor's degree in Computer Science and Engineering or a related engineering field
  • Strong coding skills in Go and Python 3
  • Experience with distributed systems, micro-services, and cloud-native technologies
  • Proficiency in Linux environments and scripting languages
  • Proficiency with database creation, maintenance, and coding using SQL and Go or Python 3 libraries
  • Understanding of network operations or large-scale IT infrastructure
  • Excellent problem-solving, organizational, and communication skills
  • Experience using AI coding assistants or AI-powered tools to help accelerate software development, including code generation, code review, or debugging.

Preferred Qualifications:

  • Process engineering experience (control systems, proportional-integral-derivative (PID) control, statistical process control (SPC))
  • Proficiency with data modeling, data analysis, and reporting frameworks (e.g., SQL, Spark, Prometheus, Grafana)
  • Experience with C, C++, Java, or Rust
  • Experience developing automation and tools for network or large-scale cloud operations
  • Background in creating dashboards, alerts, and real-time reporting platforms
  • Familiarity with workflow automation (e.g., Apache Airflow), CI/CD pipelines, or infrastructure as code
  • Previous experience supporting or building tools for (any) hyperscale or large-scale cloud network, compute, or storage operations

  • Knowledge of REST APIs, remote procedure calls (RPCs), and service oriented architectures (SOA)
  • Familiarity with Extreme Programming (XP), Agile, and DevOps processes
  • Experience with ticketing and version control systems (e.g., Jira, Git)

Career Level - IC4


    Required Skill Profession

    Other General


