Job Description
Important Note (Please Read Before Applying)
Do NOT apply if:
- You have less than 10 years of experience
- You do not have 5+ years of hands-on GCP experience
- You are on a notice period longer than 15 days
- You are looking for remote only
- You are a fresher or come from an unrelated background (e.g., support, testing only, non-Java roles)
✅ Apply ONLY if you meet ALL criteria above.
Random / irrelevant applications will not be processed.
Job Title: GCP and Python Data Engineer
Location: Bengaluru
Experience: 10+ years
Employment Type: Permanent
Notice Period: Immediate to 15 days only.
About the Company
Our client is a trusted global innovator of IT and business services, present in 10+ countries.
They specialize in digital & IT modernization, consulting, managed services, and industry-specific solutions.
With a commitment to long-term success, they empower clients and society to move confidently into the digital future.
Job description
Job Overview:
We are looking for a skilled and motivated Lead Data Engineer with strong
experience in Python programming and Google Cloud Platform (GCP) to join our data
engineering team.
The ideal candidate will be responsible for gathering requirements and for
designing, architecting, developing, and maintaining robust and scalable
ETL (Extract, Transform, Load) and ELT data pipelines.
The role involves working directly with customers through requirements gathering
and the discovery phase; designing and architecting solutions with various GCP
services; implementing data ingestion and transformations; ensuring data quality
and consistency across systems; and providing post-delivery support.
Experience Level:
10 to 12 years of relevant IT experience
Key Responsibilities:
● Design, develop, test, and maintain scalable ETL data pipelines using Python.
● Architect enterprise solutions using technologies such as Kafka,
multi-cloud services, auto-scaling on GKE, load balancers, Apigee API proxy
management, dbt, LLMs where the solution calls for them, redaction of sensitive
information, and DLP (Data Loss Prevention).
● Work extensively on Google Cloud Platform (GCP) services such as:
○ Dataflow for real-time and batch data processing
○ Cloud Functions for lightweight serverless compute
○ BigQuery for data warehousing and analytics
○ Cloud Composer for orchestration of data workflows (built on Apache Airflow)
○ Google Cloud Storage (GCS) for managing data at scale
○ IAM for access control and security
○ Cloud Run for containerized applications
● Should have experience in the following areas:
○ API framework: Python FastAPI
○ Processing engine: Apache Spark
○ Messaging and streaming data processing: Kafka
○ Storage: MongoDB, Redis/Bigtable
○ Orchestration: Airflow
○ Deployment: GKE, Cloud Run
● Perform data ingestion from various sources and apply transformation and
cleansing logic to ensure high-quality data delivery.
● Implement and enforce data quality checks, validation rules, and monitoring.
● Collaborate with data scientists, analysts, and other engineering teams to
understand data needs and deliver efficient data solutions.
● Manage version control using GitHub and participate in CI/CD pipeline
deployments for data projects.
● Write complex SQL queries for data extraction and validation from relational
databases such as SQL Server, Oracle, or PostgreSQL.
● Document pipeline designs, data flow diagrams, and operational support
procedures.
Required Skills:
● 10 to 12 years of hands-on experience in Python for backend or data engineering
projects.
● Strong understanding and working experience with GCP cloud services
(especially Dataflow, BigQuery, Cloud Functions, Cloud Composer, etc.).
● Solid understanding of data pipeline architecture, data integration, and
transformation techniques.
● Experience in working with version control systems like GitHub and knowledge of
CI/CD practices.
● Experience with Apache Spark, Kafka, Redis, FastAPI, Airflow, and Cloud Composer DAGs.
● Strong experience in SQL with at least one enterprise database (SQL Server,
Oracle, PostgreSQL, etc.).
● Experience in data migrations from on-premise data sources to Cloud platforms.
Good to Have (Optional Skills):
● Experience working with the Snowflake cloud data platform.
● Hands-on knowledge of Databricks for big data processing and analytics.
● Familiarity with Azure Data Factory (ADF) and other Azure data engineering tools.
Additional Details:
● Excellent problem-solving and analytical skills.
● Strong communication skills and ability to collaborate in a team environment.
Education:
● Bachelor's degree in Computer Science or a related field, or equivalent experience.