Open Locations - Indore, Noida, Gurgaon, Bangalore, Hyderabad, Pune
Immediate Joiners are preferred.
Qualifications
- 4 years of strong hands-on experience with Big Data technologies – PySpark (DataFrame API and Spark SQL), Hadoop, and Hive
- Hands-on experience with Python and Bash scripting
- Good understanding of SQL and data warehouse concepts
- Strong analytical, problem-solving, data analysis, and research skills
- Demonstrable ability to think outside the box without depending on readily available tools
- Excellent communication, presentation, and interpersonal skills are a must
- Hands-on experience with cloud-platform Big Data services
- Experience with orchestration using Airflow or another job scheduler
- Experience migrating workloads from on-premises to cloud, and between clouds
Roles & Responsibilities
- Develop efficient ETL pipelines per business requirements, following development standards and best practices.
- Perform integration testing of the pipelines built in cloud environments.
- Provide estimates for development, testing, and deployment across environments.
- Participate in peer code reviews to ensure our applications comply with best practices.
- Develop and maintain scalable data pipelines to support continuing increases in data volume and complexity.
- Collaborate with analytics and business teams to improve the data models that feed business intelligence tools, increasing data accessibility and fostering data-driven decision making across the organization.
- Write unit/integration tests, contribute to the engineering wiki, and document work.
- Perform the data analysis required to troubleshoot data-related issues and assist in their resolution.
- Work closely with a team of frontend and backend engineers, product managers, and analysts.
- Define company data assets (data models) and the Spark, Spark SQL, and Hive SQL jobs that populate them.
- Design data integrations and the data quality framework.
Mandatory Skills - Any Cloud Platform, Python Programming, SQL, PySpark