Job Description
Description:

- Experience level: 10+ years of experience in data engineering, including at least 3-5 years providing architectural guidance, leading teams, and standardizing enterprise data solutions.

- Deep expertise in Databricks, GCP, and modern data architecture patterns is required.

Key Responsibilities:

- Provide architectural guidance and define standards for data engineering implementations.

- Lead and mentor a team of data engineers, fostering best practices in design, development, and operations.

- Own and drive improvements in the performance, scalability, and reliability of data pipelines and platforms.

- Standardize data architecture patterns and reusable frameworks across multiple projects.

- Collaborate with cross-functional stakeholders (Product, Analytics, Business) to align data solutions with organizational goals.

- Design data models, schemas, and data flows for efficient storage, querying, and analytics.

- Establish and enforce strong data governance practices, ensuring security, compliance, and data quality.

- Work closely with governance teams to implement lineage, cataloging, and access control in compliance with standards.

- Design and optimize ETL pipelines using Databricks, PySpark, and SQL.

- Ensure robust CI/CD practices for data workflows, leveraging Terraform and modern DevOps practices.

- Leverage GCP services such as Cloud Functions, Cloud Run, BigQuery, Pub/Sub, and Dataflow to build scalable solutions.

- Evaluate and adopt emerging technologies, with exposure to Gen AI and advanced analytics capabilities.

Qualifications & Skills:

- Bachelor's or Master's degree in Computer Science, Data Engineering, or a related field.

- Extensive hands-on experience with Databricks (Autoloader, DLT, Delta Lake, CDF) and PySpark.

- Expertise in SQL and advanced query optimization.

- Proficiency in Python for data engineering and automation tasks.

- Strong expertise with GCP services: Cloud Functions, Cloud Run, BigQuery, Pub/Sub, Dataflow, GCS.

- Deep understanding of CI/CD pipelines, infrastructure as code (Terraform), and DevOps practices.

- Proven ability to provide architectural guidance and lead technical teams.

- Experience designing data models, schemas, and governance frameworks.

- Knowledge of Gen AI concepts and the ability to evaluate practical applications.

- Excellent communication, leadership, and stakeholder management skills.

Skills:

- Google Cloud Platform (GCP), Databricks, Architecture, BigQuery, Google Cloud Storage, Generative AI, Dataflow.