RESPONSIBILITIES:
- Design and build optimized data pipelines in a cloud environment to drive analytical insights.
- Construct infrastructure for efficient ETL processes from various sources and storage systems.
- Lead the implementation of algorithms and prototypes to transform raw data into useful information.
- Design and maintain database pipeline architectures, ensuring they are ready for AI/ML transformations.
- Create innovative data validation methods and data analysis tools.
- Ensure compliance with data governance and security policies.
- Interpret data trends and patterns to establish operational alerts.
- Develop analytical tools, programs, and reporting mechanisms.
- Conduct complex data analysis and present results effectively.
- Prepare data for prescriptive and predictive modeling.
- Continuously explore opportunities to enhance data quality and reliability.
- Apply strong programming and problem-solving skills to develop scalable solutions.
SKILLS REQUIRED:
Apache Spark, Hive, Hadoop, Scala, Java, Data Lake, ETL