Job Description
Position Summary:

We are seeking an experienced and forward-thinking Databricks Architect to lead the design and implementation of scalable data solutions on the Databricks Lakehouse Platform.
This role requires strong technical leadership, a deep understanding of big data and analytics, and the ability to architect solutions that empower enterprise data initiatives across data engineering, advanced analytics, and machine learning workloads.

The ideal candidate will have extensive experience with Apache Spark, Delta Lake, PySpark/Scala, and cloud platforms (Azure, AWS, or GCP), along with a proven ability to define best practices for architecture, governance, security, and performance optimization on the Databricks platform.

Responsibilities:

- Design and implement end-to-end modern data architectures leveraging Databricks Lakehouse, Delta Lake, and cloud-native technologies.
- Define scalable architectures for data ingestion, ETL/ELT pipelines, data processing, analytics, and data science workflows.
- Develop reference architectures and solution blueprints for various business and technical use cases.
- Lead the development of robust data pipelines and ETL frameworks using PySpark/Scala and Databricks notebooks.
- Enable streaming and batch data processing using Apache Spark on Databricks (see the first sketch after the qualifications list).
- Collaborate with DevOps teams to implement CI/CD pipelines for Databricks workloads using tools such as GitHub, Azure DevOps, or Jenkins.
- Optimize Databricks clusters, Spark jobs, and data workflows for performance, scalability, and cost efficiency.
- Implement caching, partitioning, Z-Ordering, and data compaction strategies on Delta Lake (see the second sketch after the qualifications list).
- Define and implement data governance standards using Unity Catalog, role-based access control (RBAC), and data lineage tracking (see the third sketch after the qualifications list).
- Ensure data compliance and security policies are enforced across data pipelines and storage layers.
- Maintain metadata catalogs and ensure data quality and observability across the pipeline.
- Engage with business analysts, data scientists, product owners, and solution architects to gather requirements and translate them into technical solutions.
- Present architectural solutions and recommendations to senior leadership and cross-functional teams.
- Provide technical guidance and mentorship to data engineers and junior architects.
- Conduct code reviews, enforce coding standards, and foster a culture of engineering excellence.

Qualifications and Skills:

- Expert-level knowledge of Databricks, including Delta Lake, Unity Catalog, MLflow, and Workflows.
- Strong hands-on experience with Apache Spark, especially using PySpark or Scala.
- Proficient in building and maintaining ETL/ELT pipelines in a large-scale distributed environment.
- In-depth understanding of cloud platforms: AWS (with S3, Glue, EMR), Azure (with ADLS, Synapse), or GCP (with BigQuery, Dataflow).
- Familiarity with SQL and data modeling techniques for both OLAP and OLTP systems.
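As a rough, non-authoritative illustration of the streaming and batch processing responsibility above, a pipeline of this kind might ingest landed files into a Delta table with Databricks Auto Loader and Structured Streaming. All paths, table names, and the JSON format below are illustrative assumptions, not requirements of this role.

```python
# Minimal sketch only: streaming ingestion into a Delta table with Auto Loader.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()  # provided as `spark` in Databricks notebooks

raw_events = (
    spark.readStream
    .format("cloudFiles")                                   # Databricks Auto Loader source
    .option("cloudFiles.format", "json")
    .option("cloudFiles.schemaLocation", "/mnt/landing/_schemas/events")
    .load("/mnt/landing/events/")
)

query = (
    raw_events.writeStream
    .option("checkpointLocation", "/mnt/bronze/_checkpoints/events")
    .trigger(availableNow=True)                             # incremental, batch-style run
    .toTable("bronze.events")                               # append into a Delta table
)
```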
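For the Z-Ordering and data compaction item, Delta Lake maintenance could be scheduled along the following lines. The table and columns (sales.transactions, customer_id, event_date) are hypothetical placeholders.

```python
# Hypothetical Delta Lake maintenance sketch; names are placeholders.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()  # provided as `spark` in Databricks notebooks

# Compact small files and co-locate related records to speed up selective reads.
spark.sql("""
    OPTIMIZE sales.transactions
    ZORDER BY (customer_id, event_date)
""")

# Remove data files no longer referenced by the Delta transaction log
# (the default 7-day retention protects in-flight readers).
spark.sql("VACUUM sales.transactions")
```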
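For the Unity Catalog governance item, role-based access might be granted to a workspace group roughly as sketched below. The catalog, schema, table, and group names are placeholders, and the statements assume a Unity Catalog-enabled workspace.

```python
# Hypothetical Unity Catalog RBAC sketch; all object and group names are illustrative.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()  # provided as `spark` in Databricks notebooks

spark.sql("GRANT USE CATALOG ON CATALOG main TO `data_engineers`")
spark.sql("GRANT USE SCHEMA ON SCHEMA main.finance TO `data_engineers`")
spark.sql("GRANT SELECT, MODIFY ON TABLE main.finance.transactions TO `data_engineers`")
```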