Job Description
You will get to :

- Design, build, and maintain high-performance data pipelines that integrate large-scale transactional data from our payments platform, ensuring data quality, reliability, and compliance with regulatory requirements.

- Develop and manage distributed data processing pipelines for both high-volume data streams and batch processing workflows in a cloud-native AWS environment.

- Implement observability and monitoring tools to ensure the reliability and scalability of the data platform, enabling stakeholders to make confident, data-driven decisions.

- Collaborate with cross-functional teams to gather requirements and deliver business-critical data solutions, including automation of the payment transaction lifecycle, regulatory reporting, and compliance.

- Design and implement data models across various storage paradigms to support payment transactions at scale while ensuring efficient data ingestion, transformation, and storage.

- Maintain data integrity by implementing robust validation, testing, and error-handling mechanisms within data workflows.

- Ensure that the data platform adheres to the highest standards for security, privacy, and governance.

- Provide mentorship and guidance to junior engineers, driving innovation, best practices, and continuous improvement across the team.

You should have :

- 4-6 years of experience in backend development and/or data platform engineering.

- Proficiency in Python, with hands-on experience using data-focused libraries such as NumPy, Pandas, SQLAlchemy, and Pandera to build high-quality data pipelines.

- Strong expertise in AWS services (S3, Redshift, Lambda, Glue, Kinesis, etc.) for cloud-based data infrastructure and processing.

- Experience with multiple data storage models, including relational, columnar, and time-series databases.

- Proven ability to design and implement scalable, reliable, and high-performance data workflows, ensuring data integrity, performance, and availability.

- Experience with workflow orchestrators such as Apache Airflow or Argo Workflows for scheduling and automating data pipelines.

- Familiarity with Python-based data stack tools such as DBT, Dask, Ray, Modin, and Pandas for distributed data processing.

- Hands-on experience with data ingestion, cataloging, and change-data-capture (CDC) tools.

- Understanding of DataOps and DevSecOps practices to ensure secure and efficient data pipeline development and deployment.

- Strong collaboration, communication, and problem-solving skills, with the ability to work effectively across multiple teams and geographies.

- Experience in payments or fintech platforms is a strong plus, particularly in processing high volumes of transactional data.

(ref:hirist.tech)