Job Description
We are looking for an experienced GCP Data Engineer with at least 5 years of professional experience in data engineering, cloud-based data solutions, and large-scale distributed systems.
This role is fully remote and requires a hands-on professional who can design, build, and optimize data pipelines and solutions on Google Cloud Platform (GCP).

Key Responsibilities:

- Architect, design, and implement highly scalable data pipelines and ETL workflows leveraging GCP services.
- Develop and optimize data ingestion, transformation, and storage frameworks to support analytical and operational workloads.
- Work extensively with BigQuery, Dataflow, Pub/Sub, Dataproc, Data Fusion, Cloud Composer, and Cloud Storage to design robust data solutions.
- Create and maintain efficient data models and schemas for analytical reporting, machine learning pipelines, and real-time processing.
- Collaborate closely with data scientists, analysts, and business stakeholders to understand requirements and convert them into technical data solutions.
- Implement best practices for data governance, security, privacy, and compliance across the entire data lifecycle.
- Monitor, debug, and optimize pipeline performance to ensure minimal latency and high throughput.
- Design and maintain APIs and microservices for data integration across platforms.
- Perform advanced data quality checks, anomaly detection, and validation to ensure data accuracy and consistency.
- Implement CI/CD for data engineering projects using GCP-native DevOps tools.
- Stay up to date with emerging GCP services and industry trends to continuously improve existing solutions.
- Create detailed documentation for data processes, workflows, and standards to enable smooth knowledge transfer.
- Support the migration of on-premise data systems to GCP, ensuring zero downtime and an efficient cutover.
- Automate repetitive workflows, deployment processes, and monitoring systems using Python, shell scripting, or Terraform.
- Provide mentoring and technical guidance to junior data engineers on the team.

Required Skills & Experience:

- 5+ years of experience in data engineering with a strong focus on cloud-based data solutions.
- Hands-on expertise with Google Cloud Platform (GCP) services including BigQuery, Dataflow, Pub/Sub, Dataproc, Data Fusion, Cloud Composer, and Cloud Storage.
- Strong proficiency in SQL, including query optimization, performance tuning, and working with large datasets.
- Advanced programming skills in Python, Java, or Scala for building data pipelines.
- Experience with real-time data streaming frameworks such as Apache Kafka or Google Pub/Sub.
- Solid knowledge of ETL/ELT processes, data modeling (star/snowflake), and schema design for both batch and streaming use cases.
- Proven track record of building data lakes, warehouses, and pipelines that scale to enterprise-level workloads.
- Experience integrating diverse data sources including APIs, relational databases, flat files, and unstructured data.
- Knowledge of Terraform, Infrastructure as Code (IaC), and automation practices in cloud environments.
- Understanding of CI/CD pipelines for data engineering workflows and integration with Git, Jenkins, or Cloud Build.
- Strong background in data governance, lineage, and cataloging tools.
- Familiarity with machine learning workflows and enabling ML pipelines using GCP services is an advantage.
- Good grasp of Linux/Unix environments and shell scripting.
- Exposure to DevOps practices and monitoring tools such as Stackdriver (Cloud Monitoring).
- Excellent problem-solving, debugging, and analytical skills with the ability to handle complex technical challenges.
- Strong communication skills with the ability to work independently in a remote-first team.

Preferred Skills:

- Experience with multi-cloud or hybrid environments (AWS/Azure alongside GCP).
- Familiarity with data visualization platforms such as Looker, Tableau, or Power BI.
- Exposure to containerization technologies such as Docker and Kubernetes.
- Understanding of big data processing frameworks like Spark, Hadoop, or Flink.
- Prior experience in industries with high data volume such as finance, retail, or healthcare.

Educational Background:

- Bachelor's or Master's degree in Computer Science, Information Technology, Data Engineering, or a related field.
- Relevant GCP certifications (e.g., Professional Data Engineer, Professional Cloud Architect) are highly preferred.

Why Join Us?

- Opportunity to work on cutting-edge cloud data projects at scale.
- Fully remote working environment with flexible schedules.
- Exposure to innovative data engineering practices and advanced GCP tools.
- Collaborative team culture that values continuous learning, innovation, and career growth.

(ref:hirist.tech)