Job Description
<p>AI Data Engineer - 4-5 years' experience - Immediate joiners.<br/><br/> The AI Data Engineer designs, develops, and maintains the data pipelines and infrastructure essential for AI and machine learning projects.<br/><br/> The role bridges traditional data engineering with the specific requirements of AI, ensuring models are trained on high-quality, well-prepared data and that data flows efficiently from diverse sources into AI and GenAI applications.<br/><br/> This is a full-time, on-site job based at our office in Infopark, Kochi.</p><p><br/><b>Key Responsibilities:</b><br/><br/></p><p>- Build, test, and maintain scalable data pipelines for AI and machine learning workflows.<br/><br/></p><p>- Develop and manage data architectures (warehouses, lakes, streaming platforms) to support generative and predictive AI use cases.<br/><br/></p><p>- Automate data acquisition, transformation, integration, cleansing, and validation from structured and unstructured sources.<br/><br/></p><p>- Collaborate with data scientists, AI/ML engineers, and business teams to understand requirements, provision data assets, and ensure model readiness.<br/><br/></p><p>- Optimise ETL/ELT processes for scalability, reliability, and performance.<br/><br/></p><p>- Manage data quality frameworks, monitor pipelines, and address data drift, schema changes, and pipeline failures.<br/><br/></p><p>- Deploy and track real-time and batch pipelines supporting AI model inference and training.<br/><br/></p><p>- Implement security, privacy, and compliance procedures for all AI data operations.<br/><br/></p><p>- Document infrastructure, data flows, and operational playbooks related to AI solutions.</p><p><br/><b>Required Skills and Qualifications:</b><br/><br/></p><p>- Bachelor's or Master's degree in Computer Science, Data Engineering, or a related field.<br/><br/></p><p>- Strong expertise with data pipeline orchestration tools (e.g., Apache Airflow, Luigi, Prefect).<br/><br/></p><p>- Proficiency in SQL and Python, and experience working with big data frameworks (Spark, Hadoop).<br/><br/></p><p>- Familiarity with ML/AI frameworks (TensorFlow, PyTorch, Scikit-learn) and MLOps practices.<br/><br/></p><p>- Experience with cloud data platforms (AWS, GCP, Azure).<br/><br/></p><p>- In-depth understanding of data modelling, storage, governance, and performance optimisation.<br/><br/></p><p>- Ability to manage both batch and streaming data processes and work with unstructured data (images, text, etc.).<br/><br/></p><p>- Excellent troubleshooting, analytical, and communication skills.</p> (ref:hirist.tech)