Key Responsibilities
· Data Pipeline Development: Design, develop, and maintain scalable data pipelines to support various data analytics and reporting needs.
· Data Integration: Integrate data from multiple sources and ensure data quality and consistency.
· Data Modelling: Build and optimize data models using Spark to support complex data transformations and analysis.
· SQL Development: Write efficient SQL queries to extract, transform, and load data into data warehouses and other storage solutions.
· Big Data Processing: Utilize big data technologies, especially Apache Spark, to process and analyze large datasets.
· Cloud Services: Implement and manage data solutions on cloud platforms such as Azure, AWS, or GCP.
· Problem Solving: Analyze and resolve data-related issues, ensuring data accuracy and reliability.
· Collaboration: Work closely with data scientists, analysts, and other stakeholders to understand data requirements and deliver solutions.
Required Skills and Qualifications
· Education: Bachelor’s degree in Computer Science, Information Technology, or a related field.
· Experience: 3 to 5 years of experience in data engineering or a related role.
· Programming: Proficient in Python, with experience in data manipulation and automation scripts.
· Big Data: Strong knowledge of big data technologies, especially Apache Spark.
· SQL: Excellent skills in writing and optimizing SQL queries.
· Data Modelling: Experience in data modelling and data architecture, particularly using Spark.
· Cloud Platforms: Hands-on experience with at least one cloud platform (Azure, AWS, or GCP).
· Data Analytics Concepts: Solid understanding of data engineering and analytics concepts.
· Problem Solving: Strong problem-solving skills with the ability to troubleshoot data issues effectively.
· Communication: Good communication skills, with the ability to work collaboratively in a team environment.
Preferred Qualifications
· Databricks: Experience with Databricks is a plus but not mandatory.
· Certifications: Certifications in Databricks or Azure (or any relevant certifications) are highly