JOB DESCRIPTION
Skill: PySpark QA
Role / Tier: Lead Software Engineer/Tier 2
Experience: 6 – 9 years
Job Description:
 Primary Skills
 
 Big Data technologies: Hadoop / Big Data (HDFS, Python, Spark SQL, MapReduce) with PySpark.
 Building CI/CD pipelines
 Using Spark APIs to cleanse, explore, aggregate, transform, store and analyse data
 Installing, configuring, debugging and troubleshooting Hadoop clusters
 
 Secondary Skills
 
 Cloud-based infrastructure and AWS services: EC2, EMR, S3, Lambda, EBS, IAM, Redshift, RDS
 Deploying and managing ETL pipelines; RDBMS technologies (PostgreSQL, MySQL, Oracle, etc.)
 Knowledge of data frames, Pandas, data visualization tools and data mining
 Knowledge of JIRA, Bitbucket, GitHub
 
 JD
 Seeking a developer for a PySpark QA role covering both new development and enhancement work.
 Responsibilities
 Experience with the following Big Data technologies: Hadoop / Big Data (HDFS, Python, Spark SQL, MapReduce) with PySpark.
 Hands-on experience in building CI/CD pipelines is required.
 Outstanding coding, debugging and analytical skills.
 Core problem-solving skills: the ability to eliminate candidate solutions and select an optimal one.
 Experience in installing, configuring, debugging and troubleshooting Hadoop clusters is a desired skill.