Job Description
<p><p>We're seeking a passionate and experienced Senior Python & Machine Learning Engineer to join our Data domain team.<br/><br/> You'll work on one-of-a-kind problem statements using advanced GenAI, large language models (LLMs), and modern data engineering frameworks.<br/><br/> You will help conceptualize and deliver impactful solutions that push the boundaries of data science and machine learning in finance.</p><br/><p><b>Responsibilities :</b><br/><br/></p><p>- Design, develop, and deploy sophisticated machine learning and GenAI models to solve complex data problems at scale.<br/><br/></p><p>- Implement, optimize, and scale ML solutions using Databricks, Spark, and cloud-native data ecosystems (AWS/Azure/GCP).<br/><br/></p><p>- Collaborate with other engineers, product managers, and UX teams to build robust, high-performance Python-based analytics pipelines.<br/><br/></p><p>- Develop and fine-tune LLMs and generative AI applications for structured and unstructured financial data.<br/><br/></p><p>- Architect data processing workflows leveraging Delta Lake, Feature Stores, and MLOps best practices.<br/><br/></p><p>- Translate cutting-edge research (papers, new ML techniques) into production solutions.<br/><br/></p><p>- Mentor junior data scientists and engineers on ML, GenAI, and best practices.<br/><br/></p><p>- Work on one-of-a-kind data challenges, including entity disambiguation, real-time risk analytics, NLP, graph data modeling, and anomaly detection.<br/><br/></p><p>- Keep up to date with the latest in ML tooling, GenAI, Databricks, and cloud data.</p><br/><p><b>Requirements :</b><br/><br/></p><p>- Bachelor's/Master's degree in Computer Science, Data Science, Mathematics, or a related field.<br/><br/></p><p>- 5-6 years of professional experience in ML, Python programming, and data engineering.<br/><br/></p><p>- Deep expertise in Python (NumPy, Pandas, PySpark, FastAPI, etc.) 
and ML frameworks (TensorFlow, PyTorch, Transformers).<br/><br/></p><p>- Practical experience with GenAI: training/fine-tuning LLMs (OpenAI, HuggingFace, Google Gemini, etc.), prompt engineering, and retrieval-augmented generation (RAG).<br/><br/></p><p>- Hands-on experience with Databricks (Workspace, MLflow, Delta Lake, Notebooks).<br/><br/></p><p>- Strong knowledge of cloud data platforms (AWS, Azure, GCP) and containerization (Docker, Kubernetes).<br/><br/></p><p>- Applied experience with ETL/ELT, data lakes, and real-time streaming (Kafka, Spark Streaming).<br/><br/></p><p>- Proven track record of tackling cutting-edge data problems at scale; published research or open-source contributions are a plus.<br/><br/></p><p>- Familiarity with modern MLOps toolchains (MLflow, Airflow, Feature Store, CI/CD).<br/><br/></p><p>- Effective communicator with excellent collaboration skills.</p><br/><p><b>Tech Stack :</b><br/><br/></p><p>- Python, PySpark, FastAPI, Flask.<br/><br/></p><p>- TensorFlow, PyTorch, HuggingFace Transformers.<br/><br/></p><p>- Databricks, Delta Lake, MLflow.<br/><br/></p><p>- AWS/Azure/GCP: S3, Blob Storage, EC2, Lambda, Step Functions.<br/><br/></p><p>- SQL, NoSQL.</p><br/></p> (ref:hirist.tech)