Job Description
            
Senior Software Engineer, Remote.

The Software Engineer, Data Ingestion will be a critical individual contributor responsible for designing collection strategies and for developing and maintaining robust, scalable data pipelines. This role sits at the heart of our data ecosystem, delivering new analytical software solutions that provide timely, accurate, and complete data for insights, products, and operational efficiency.

Key Responsibilities:

- Design, develop, and maintain high-performance, fault-tolerant data ingestion pipelines using Python.
- Integrate with diverse data sources (databases, APIs, streaming platforms, cloud storage, etc.).
- Implement data transformation and cleansing logic during ingestion to ensure data quality.
- Monitor and troubleshoot data ingestion pipelines, identifying and resolving issues promptly.
- Collaborate with database engineers to optimize data models for fast consumption.
- Evaluate and propose new technologies or frameworks to improve ingestion efficiency and reliability.
- Develop and implement self-healing mechanisms for data pipelines to ensure continuity.
- Define and uphold SLAs and SLOs for data freshness, completeness, and availability.
- Participate in the on-call rotation as needed for critical data pipeline issues.

Key Skills:

- 4+ years of experience, ideally with a background in Computer Science and time spent at software product companies.
- Extensive Python Expertise: Extensive experience developing robust, production-grade applications in Python.
- Data Collection & Integration: Proven experience collecting data from various sources (REST APIs, OAuth, GraphQL, Kafka, S3, SFTP, etc.).
- Distributed Systems & Scalability: Strong understanding of distributed systems concepts, designing for scale, performance optimization, and fault tolerance.
- Cloud Platforms: Experience with major cloud providers (AWS or GCP) and their data-related services (e.g., S3, EC2, Lambda, SQS, Kafka, Cloud Storage, GKE).
- Database Fundamentals: Solid understanding of relational databases (SQL, schema design, indexing, query optimization).
- OLAP database experience is a plus (e.g., Hadoop).
- Monitoring & Alerting: Experience with monitoring tools (e.g., Prometheus, Grafana) and setting up effective alerts.
- Version Control: Proficiency with Git.
- Containerization (Plus): Experience with Docker and Kubernetes.
- Streaming Technologies (Plus): Experience with real-time data processing using Kafka, Flink, or Spark Streaming.

(ref:hirist.tech)