Job Description
            
                <p><p><b>Description : </b><br/><br/><b>Job Title : </b> Senior Data Engineer - AI Enablement (Azure)<br/><br/><b>EXP : </b> 5+ Years<br/><br/><b>Location : </b> Remote<br/><br/><b>Notice : </b> Immediate to 15 days<br/><br/><b>Employment Type : </b> Full-Time<br/><br/><b>Department : </b> AI Strategy & Engineering<br/><br/><b>Mandatory Skills : </b><br/>- Azure Data Services; Python, SQL, and PySpark; ETL/ELT pipelines; AI/ML pipelines (NLP, embeddings, and GenAI); embedding databases and vector stores such as Pinecone, FAISS, or Azure Cognitive Search; CI/CD for data workflows; collaboration with AI Engineers, Cloud Architects, and DevOps<br/><br/><b>ABOUT THE ROLE : </b><br/><br/>We are seeking a Senior Data Engineer to support and enable the deployment of enterprise-grade AI solutions across the organization.
This role will work in close coordination with the Senior AI/ML Engineer, AI strategy, and Cloud teams to ensure that the data infrastructure, pipelines, and governance mechanisms are in place for scalable, secure, and reliable AI deployments, particularly in the Azure OpenAI environment.<br/><br/>The ideal candidate will be highly proficient in cloud-native data engineering, experienced with modern data platforms, and able to translate business and AI requirements into robust, production-grade data solutions.<br/><br/><b>Key Responsibilities : </b><br/><br/>The Senior Data Engineer will : <br/><br/>- Collaborate with the AI Engineering team to understand data needs for each use case (structured, unstructured, real-time, batch).<br/><br/>- Ingest, clean, and transform datasets from various enterprise systems into AI-ready formats.<br/><br/>- Build robust ETL/ELT pipelines using Azure-native tools, and prepare and maintain embedding databases for RAG (Retrieval-Augmented Generation) models using tools like Pinecone, FAISS, or Azure Cognitive Search.<br/><br/>- Support data ingestion from diverse sources (APIs, databases, SharePoint, etc.).<br/><br/>- Work with Cloud and DevOps teams to operationalize AI data pipelines, integrating with ML pipelines and APIs.<br/><br/>- Ensure scalable data infrastructure to handle new and growing AI use cases across the organization.<br/><br/>- Optimize storage and compute costs while maintaining high availability and throughput for AI applications.<br/><br/>- Implement and enforce data governance best practices, including access controls, anonymization, and compliance (e.g., GDPR).<br/><br/><b>WHAT WE'RE LOOKING FOR : </b><br/><br/><b>Essential Criteria : </b><br/><br/>- 5+ years of experience in data engineering, preferably within enterprise environments.<br/><br/>- Deep knowledge of Azure data services : Data Factory, Synapse, Azure Storage, Azure Data Lake, Event Hubs, Databricks.<br/><br/>- Proficient in Python, SQL, PySpark, etc.<br/><br/>
- Experience building pipelines for AI/ML applications, especially around NLP, embeddings, or unstructured data.<br/><br/>- Strong understanding of data modelling, pipeline orchestration, and CI/CD for data workflows.<br/><br/>- Familiarity with embedding databases and vector stores used in GenAI applications.<br/><br/><b>Desirable skills : </b><br/><br/>- Azure Data Engineer Associate or equivalent certification.<br/><br/>- Experience working with data for RAG-based AI architectures and prompt-based systems.<br/><br/>- Bachelor's or Master's degree in Computer Science, Data Engineering, or a related field.</p><br/></p> (ref:hirist.tech)