Responsibilities:
- GCP Solution Architecture & Implementation: Architect and implement data solutions on Google Cloud Platform (GCP), leveraging the full range of its services.
- End-to-End Data Pipeline Development: Design and build end-to-end data pipelines using technologies such as Apache Beam, Google Cloud Dataflow, or Apache Spark (see the sketch following this list).
- Data Ingestion & Transformation: Implement data pipelines that automate the ingestion, transformation, and augmentation of data sources, and establish best practices for pipeline operations.
- Data Technologies Proficiency: Work with Python, Hadoop, Spark, SQL, BigQuery, Bigtable, Cloud Storage, Datastore, Spanner, Cloud SQL, and machine learning services.
- Database Expertise: Demonstrate expertise in at least two of the following: relational databases, analytical databases, or NoSQL databases.
- SQL Development & Data Mining: Possess expert knowledge in SQL development and experience in data mining (SQL, ETL, data warehouse, etc.) using complex datasets in a business environment.
- Data Integration & Preparation: Build data integration and preparation tools using cloud technologies such as SnapLogic, Google Cloud Dataflow, Cloud Dataprep, and Python.
- Data Quality & Regulatory Compliance: Identify downstream implications of data loads/migration, considering aspects like data quality and regulatory compliance.
- Scalable Data Solutions: Develop scalable data solutions that simplify user access to massive datasets and adapt to a rapidly changing business environment.
- Programming: Demonstrate proficiency in programming languages such as Java and Python.
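
To illustrate the kind of pipeline work described above, the sketch below shows a minimal Apache Beam batch pipeline that reads raw JSON from Cloud Storage, applies a simple cleaning transform, and appends the results to BigQuery. It is only a sketch: the project, bucket, dataset, table, and field names are hypothetical placeholders rather than part of any actual environment.

```python
import json

import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions


def parse_and_clean(line):
    """Parse one JSON line and keep only the fields the warehouse needs."""
    record = json.loads(line)
    return {
        "order_id": record["order_id"],
        "amount": float(record.get("amount", 0)),
        "country": record.get("country", "unknown"),
    }


def run():
    # Hypothetical project, bucket, and table names -- replace with real ones.
    options = PipelineOptions(
        runner="DataflowRunner",      # use "DirectRunner" for local testing
        project="my-gcp-project",
        region="us-central1",
        temp_location="gs://my-bucket/tmp",
    )
    with beam.Pipeline(options=options) as pipeline:
        (
            pipeline
            | "ReadFromGCS" >> beam.io.ReadFromText("gs://my-bucket/raw/orders-*.json")
            | "ParseAndClean" >> beam.Map(parse_and_clean)
            | "WriteToBigQuery" >> beam.io.WriteToBigQuery(
                "my-gcp-project:analytics.orders",
                write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND,
                create_disposition=beam.io.BigQueryDisposition.CREATE_NEVER,
            )
        )


if __name__ == "__main__":
    run()
```

The same pipeline definition can run on Dataflow or locally; only the runner option changes.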
Required Skills:
- GCP Data Engineering Expertise: Strong experience with GCP data engineering, including BigQuery, SQL, Cloud Composer with Python, Cloud Functions, Dataproc with PySpark, Python-based data ingestion, and Dataflow with Pub/Sub (a brief BigQuery example follows this list).
- Expert knowledge of Google Cloud Platform; other cloud platforms are a plus.
- Expert knowledge in SQL development.
- Expertise in building data integration and preparation tools using cloud technologies such as SnapLogic, Google Cloud Dataflow, Cloud Dataprep, and Python.
- Proficiency with Apache Beam, Google Cloud Dataflow, or Apache Spark for building end-to-end data pipelines.
- Experience with several of the following: Python, Hadoop, Spark, SQL, BigQuery, Bigtable, Cloud Storage, Datastore, Spanner, Cloud SQL, and machine learning.
- Proficiency in programming languages such as Java and Python.
- Expertise in at least two of the following: relational databases, analytical databases, or NoSQL databases.
- Strong analytical and problem-solving skills.
- Capability to work in a rapidly changing business environment.
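
As a small illustration of the BigQuery and SQL development skills listed above, the snippet below runs a parameterized aggregation query with the google-cloud-bigquery Python client. It is a sketch only: the project, dataset, table, and column names are hypothetical.

```python
from google.cloud import bigquery

client = bigquery.Client(project="my-gcp-project")  # hypothetical project id

# Aggregate order volume per country since a given date (hypothetical schema).
query = """
    SELECT country,
           COUNT(*)              AS order_count,
           ROUND(SUM(amount), 2) AS total_amount
    FROM `my-gcp-project.analytics.orders`
    WHERE order_date >= @start_date
    GROUP BY country
    ORDER BY total_amount DESC
"""

job_config = bigquery.QueryJobConfig(
    query_parameters=[
        bigquery.ScalarQueryParameter("start_date", "DATE", "2024-01-01"),
    ]
)

for row in client.query(query, job_config=job_config).result():
    print(row.country, row.order_count, row.total_amount)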
Certifications (Major Advantage):
- Google Cloud Professional Data Engineer or Professional Cloud Architect certification.
Skills Required:
Google Cloud Platform, SQL Development, Python, Hadoop, Spark, Relational Databases