We are seeking a PySpark/Python Developer with strong design and development skills for building data pipelines.
The ideal candidate will have experience working with AWS and the AWS CLI; AWS Glue experience is highly desirable.
You should possess hands-on SQL experience and be well versed in Big Data concepts.
Familiarity with DevOps tools and advanced Unix shell scripting is also required.
Additionally, excellent communication skills are a must for this role.
Key Responsibilities
- Data Pipeline Development: Design and develop robust, scalable data pipelines using PySpark/Python.
- Cloud Data Engineering: Work with AWS and the AWS CLI, with a strong preference for AWS Glue experience for ETL workloads.
- Database Interaction: Utilize hands-on SQL experience for data extraction, manipulation, and analysis within data pipelines.
- Big Data Concepts: Apply a solid understanding of Big Data concepts to build efficient and high-performing data solutions.
- DevOps Integration: Work effectively with DevOps tools to integrate data pipelines into CI/CD processes.
- Scripting & Automation: Develop and maintain automation scripts using advanced Unix shell scripting.
- Collaboration: Work closely with data scientists, analysts, and other engineers to understand data requirements and deliver reliable data solutions.
- Communication: Clearly articulate technical designs, challenges, and solutions to stakeholders at all levels.
Required Skills and Experience
- Strong PySpark/Python design and development skills for building data pipelines.
- Experience working with AWS and the AWS CLI (AWS Glue highly desirable).
- Hands-on SQL experience.
- Well versed in Big Data concepts.
- Familiarity with DevOps tools.
- Advanced Unix shell scripting.
- Excellent communication skills.
Skills Required
PySpark, Python Programming, SQL, AWS, Big Data, DevOps