Job Description
Job Title: Senior Data Engineer / DevOps, Enterprise Big Data Platform

Job Description:

In this role, you will be part of a growing, global team of data engineers who collaborate in DevOps mode to enable the business with state-of-the-art technology to leverage data as an asset and to make better-informed decisions.

The Enabling Functions Data Office Team is responsible for designing, developing, testing, and supporting automated end-to-end data pipelines and applications on the Enabling Functions data management and analytics platform (Palantir Foundry, AWS, and other components).

The Foundry platform comprises multiple technology stacks, which are hosted on Amazon Web Services (AWS) infrastructure or in our own data centers.
Developing pipelines and applications on Foundry requires:

- Proficiency in SQL / Scala / Python (Python is required; all three are not necessary)
- Proficiency in PySpark for distributed computation
- Proficiency in Ontology and Slate; familiarity with the Workshop app, including basic design/visual competency
- Familiarity with common databases (e.g. Oracle, MySQL, Microsoft SQL Server); not all types are required

This position is project-based and may work across multiple smaller projects or a single large project, using an agile project methodology.

Roles & Responsibilities:

- B.Tech / B.Sc. / M.Sc. in Computer Science or a related field, with 6+ years of overall industry experience
- Strong experience in Big Data and Data Analytics
- Experience in building robust ETL pipelines for batch as well as streaming ingestion
- Experience with Palantir Foundry. Most important Foundry apps: Code Repository, Data Lineage and Scheduling, Ontology Manager, Contour, Object View Editor, Object Explorer, Quiver, Workshop, Vertex
- Experience with Data Connection, external transforms, Foundry APIs, SDK and Webhooks is a plus
- Interacting with RESTful APIs, including authentication via SAML and OAuth2
- Experience with test-driven development and CI/CD workflows
- Knowledge of Git for source control management
- Agile experience in Scrum environments, with tools such as Jira
- Experience with visualization tools like Tableau or Qlik is a plus
- Experience with Palantir Foundry, AWS or Snowflake is an advantage
- Basic knowledge of statistics and machine learning is a plus
- Problem-solving abilities
- Proficient in English, with strong written and verbal communication

Primary Responsibilities:

- Design, develop, test and support data pipelines and applications
- Industrialize data pipelines
- Establish a continuous quality improvement process to systematically optimize data quality
- Collaborate with various stakeholders, including business and IT

Education:

- Bachelor's (or higher) degree in Computer Science, Engineering, Mathematics, Physical Sciences or related fields

Professional Experience:

- 6+ years of experience in system engineering or software development
- 4+ years of engineering experience, including ETL-type work with databases and Hadoop platforms

Skills:

- Hadoop (general): deep knowledge of distributed file system concepts, MapReduce principles and distributed computing
- Knowledge of Spark and the differences between Spark and MapReduce
- Familiarity with encryption and security in a Hadoop cluster
- Proficiency in technical data management tasks, i.e. writing code to read, transform and store data
- XML/JSON knowledge
- Experience working with REST APIs
- Spark: experience launching Spark jobs in client mode and cluster mode
- Familiarity with the property settings of Spark jobs and their implications for performance
- Experience developing ELT/ETL processes, including loading data from enterprise-sized RDBMS systems such as Oracle, DB2, MySQL, etc.
- Authorization: basic understanding of user authorization (Apache Ranger preferred)
- Programming: must be able to code in Python, or be an expert in at least one high-level language such as Java, C or Scala
- Must have experience using REST APIs
- SQL: must be an expert in manipulating database data using SQL
- Familiarity with views, functions, stored procedures and exception handling
- AWS: general knowledge of the AWS stack (EC2, S3, EBS, etc.)

Specific Information Related to the Position:

- Physical presence at the primary work location (Bangalore)
- Flexibility to work CEST and US EST time zones (according to the team rotation plan)
- Willingness to travel to Germany, the US and potentially other locations (as per project demand)

At YASH, you are empowered to create a career that will take you where you want to go while working in an inclusive team environment.
We leverage career-oriented skilling models and optimize our collective intelligence, aided by technology, for continuous learning, unlearning, and relearning at a rapid pace and scale.

(ref:hirist.tech)