Job Description
At Ennoventure, we are redefining the fight against counterfeit goods with our groundbreaking technology. Backed by key investors like Fenice Investment Group and Tanglin Venture Partners, we are ready to embark on the next phase of our journey. Our aim?
To build a world where authenticity reigns, ensuring every product and experience is genuine. Here, innovation moves fast, collaboration fuels success, and your growth isn't just encouraged; it's inevitable.

As a Data Engineer, you'll take the lead in designing, building, and optimizing data pipelines, storage solutions, and infrastructure for scalable data applications. You will collaborate closely with cross-functional teams to ensure data quality, integration, and performance in modern cloud-based architectures, helping to create impactful, next-gen products. Your role will also involve transforming ideas into action, optimizing systems, and pushing technical boundaries. If you are ready to break new ground and revolutionize the tech/research landscape, this is the opportunity for you.

What will you do:

- Apply a good understanding of modern data architectures: Data Lake, Data Warehouse, and Data Lakehouse, as well as Data Fabric and Data Mesh concepts.
- Bring in-depth expertise in cloud platforms (AWS, Azure), including IaaS/PaaS/SaaS service models.
- Demonstrate proficiency in multi-cloud and hybrid-cloud platforms.
- Apply a good understanding of data storage, application integration, open file formats, and data processing.
- Orchestrate end-to-end data engineering infrastructure for intricate, large-scale applications.
- Collaborate with data scientists to translate model requirements into optimized data pipelines, ensuring data quality, processing, and integration.
- Define and refine performance benchmarks, and optimize data infrastructure for peak correctness, availability, cost efficiency, scalability, and robustness.
- Apply expertise in data engineering architectures and frameworks, including batch and stream processing with the Hadoop ecosystem, data warehouse/data lake platforms, and Python/PySpark programming.
- Data Pipelines & Infrastructure: Take full ownership of building and maintaining ETL/ELT pipelines to ensure data is collected, transformed, and available for real-time analytics.
- Design and implement systems that power customer-facing analytics with strong performance and user experience.
- Build and manage large, complex data sets that meet functional and non-functional business needs.
- Analyze and integrate disparate systems to provide timely, accurate information to visualization teams and business stakeholders.
- Optimize and troubleshoot data pipelines for performance, reliability, and scalability.
- Ensure data integrity and quality by implementing validation and testing strategies.
- Create and implement internal process improvements, such as redesigning infrastructure for scalability, improving data delivery, and automating manual processes.
- Data Observability: Track and manage the lifecycle of data paths across systems using data observability tools to measure and ensure quality, reliability, and lineage.

What do we look for at Ennoventure?

- Bachelor's degree in Computer Science, Engineering, Information Technology, or a related field.
- Strong proficiency in Python, SQL, and PySpark for data manipulation and transformation.
- In-depth knowledge of Big Data technologies such as Hadoop, MapReduce, and Spark.
- Experience with cloud platforms like AWS and Azure (familiarity with services like Redshift).
- Solid understanding of ETL/ELT processes, data pipelines, and data integration tools.
- Experience with modern data warehousing technologies and architectures such as Redshift, Snowflake, BigQuery, and ClickHouse.
- Strong knowledge of data modelling, including dimensional modelling, Hive, and star schemas.
- Familiarity with data visualization tools such as Tableau, Power BI, Superset, and Metabase.
- Hands-on experience with data processing and orchestration frameworks: Spark/PySpark, Temporal, Kafka, and Presto.
- Experience with StarRocks and NoSQL databases like MongoDB, Cassandra, or HBase is a plus.
- Familiarity with modern data lake concepts and tools (e.g., AWS Lake Formation, Azure Data Lake).
- Familiarity with containerization (e.g., Docker, Kubernetes) and CI/CD pipelines.
- Excellent problem-solving skills and analytical thinking.
- Ability to work in a fast-paced, constantly evolving environment.