RESPONSIBILITIES:
- Designing and building optimized data pipelines using cutting-edge technologies in a cloud environment to drive analytical insights.
- Constructing infrastructure for efficient ETL of data from various sources and storage systems.
- Leading the implementation of algorithms and prototypes to transform raw data into useful information.
- Designing and maintaining data pipeline architectures, ensuring readiness for AI/ML transformations.
- Creating innovative data validation methods and data analysis tools.
- Ensuring compliance with data governance and security policies.
- Interpreting data trends and patterns to establish operational alerts.
- Developing analytical tools, programs, and reporting mechanisms.
- Conducting complex data analysis and presenting results effectively.
- Preparing data for prescriptive and predictive modeling.
- Continuously exploring opportunities to enhance data quality and reliability.
- Applying strong programming and problem-solving skills to develop scalable solutions.
SKILLS REQUIRED:
Apache Spark, Hive, Hadoop, Scala, Databricks, ETL, Java