Job Description
We are seeking a highly skilled AI/ML Engineer with strong expertise in Python programming, API development, and real-time deployment of ML models. The ideal candidate has experience designing, building, and optimizing machine learning pipelines and integrating models into production environments.

Key Responsibilities:

- Design, develop, and optimize machine learning models using Python and popular ML frameworks (TensorFlow, PyTorch, Scikit-learn, etc.).
- Implement end-to-end ML pipelines, including data preprocessing, feature engineering, training, evaluation, and deployment.
- Build and manage RESTful APIs / gRPC services to expose ML models for real-time inference (an illustrative sketch appears at the end of this description).
- Deploy and scale models in production environments (Docker, Kubernetes, and cloud platforms such as AWS, GCP, or Azure).
- Ensure high availability, low latency, and fault tolerance in real-time ML systems.
- Collaborate with data engineers and software teams to integrate ML solutions into existing applications.
- Conduct performance monitoring, optimization, and retraining of models as needed.
- Apply MLOps best practices for CI/CD pipelines, model versioning, and automated deployment workflows.
- Write clean, efficient, production-grade Python code following software engineering best practices.

Required Skills & Experience:

- 3-8 years of hands-on experience in Python programming (advanced knowledge of data structures, OOP, multiprocessing, and async programming).
- Strong expertise in machine learning algorithms, model training, and evaluation techniques.
- Experience with API development (FastAPI, Flask, Django, or similar).
- Proven experience in real-time model deployment and serving (TensorFlow Serving, TorchServe, MLflow, or custom solutions).
- Solid understanding of cloud-native deployments (AWS SageMaker, GCP Vertex AI, Azure ML) and containerization (Docker, Kubernetes).
- Knowledge of streaming data frameworks (Kafka, Spark Streaming, Flink) is a plus.
- Familiarity with CI/CD pipelines for ML (GitHub Actions, Jenkins, or similar).
- Strong grasp of data engineering concepts (data ingestion, transformation, and storage).
- Experience with monitoring and logging (Prometheus, Grafana, ELK, or equivalent).

Nice to Have:

- Experience with Generative AI (LLMs, diffusion models, transformers).
- Exposure to GPU optimization and distributed training.
- Familiarity with feature stores and advanced MLOps frameworks (Kubeflow, TFX, MLflow).

(ref:hirist.tech)
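For illustration only (not a requirement of the role), the sketch below shows one way a trained model might be exposed for real-time inference behind a FastAPI endpoint, as mentioned in the responsibilities above. The model path, feature schema, and route name are hypothetical assumptions, not part of this posting.

```python
# Minimal sketch: serving a trained scikit-learn model for real-time inference
# with FastAPI. Model path, feature schema, and route are hypothetical.
import joblib
import numpy as np
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI(title="ML inference service")

# Assumed artifact produced earlier in the training pipeline.
model = joblib.load("artifacts/model.joblib")

class PredictRequest(BaseModel):
    features: list[float]  # flat feature vector expected by the model

class PredictResponse(BaseModel):
    prediction: float

@app.post("/predict", response_model=PredictResponse)
def predict(req: PredictRequest) -> PredictResponse:
    # Reshape to a single-row batch and run inference.
    x = np.asarray(req.features, dtype=float).reshape(1, -1)
    y = model.predict(x)
    return PredictResponse(prediction=float(y[0]))

# Run locally with: uvicorn main:app --host 0.0.0.0 --port 8000
```

In practice, a service like this would be containerized with Docker and deployed behind Kubernetes or a managed cloud endpoint, with monitoring and model versioning handled through the MLOps tooling listed above.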