Salary: Competitive, paid in Indian Rupees (INR) per annum.
- Job Title: Data Engineer (Java)
- Location: All EXL Locations
Job Summary:
We are seeking a skilled Data Engineer with strong expertise in Java and big data technologies to design, develop, and maintain scalable batch data pipelines.
The ideal candidate will have hands-on experience working with modern data lakehouse architectures, cloud-native data platforms, and automation tools to support high-performance analytics and data processing workloads.
Key Responsibilities:
- Design, develop, and optimize scalable batch data pipelines using Java and Apache Spark to handle large volumes of structured and semi-structured data.
- Utilize Apache Iceberg to manage data lakehouse environments, supporting advanced features such as schema evolution and time travel for data versioning and auditing.
- Build and maintain reliable data ingestion and transformation workflows using AWS Glue, EMR, and Lambda services to ensure seamless data flow and integration.
- Integrate with Snowflake as the cloud data warehouse to enable efficient data storage, querying, and analytics workloads.
- Collaborate closely with DevOps and infrastructure teams to automate deployment, testing, and monitoring of data workflows using CI/CD tools like Jenkins.
- Develop and manage CI/CD pipelines for Spark/Java applications, ensuring automated testing and smooth releases in a cloud environment.
- Monitor and continuously optimize the performance, reliability, and cost-efficiency of data pipelines running on cloud-native platforms.
- Implement and enforce data security, compliance, and governance policies in line with organizational standards.
- Troubleshoot and resolve complex issues related to distributed data processing and integration.
- Work collaboratively within Agile teams to deliver high-quality data engineering solutions aligned with business requirements.
Required Skills and Qualifications:
- Bachelor's or Master's degree in Computer Science, Engineering, or a related technical field.
- Strong proficiency in Java programming with solid understanding of object-oriented design principles.
- Proven experience designing and building ETL/ELT pipelines and frameworks.
- Excellent command of SQL and familiarity with relational database management systems.
- Hands-on experience with big data technologies such as Apache Spark, Hadoop, and Kafka, or equivalent batch and streaming processing frameworks.
- Knowledge of cloud data platforms, preferably AWS services (Glue, EMR, Lambda) and Snowflake.
- Experience with data modeling, schema design, and data warehousing concepts.
- Understanding of distributed computing, parallel processing, and performance tuning in big data environments.
- Strong analytical, problem-solving, and debugging skills.
- Excellent communication and teamwork skills with experience working in Agile environments.
Preferred Qualifications:
- Experience with containerization and orchestration technologies such as Docker and Kubernetes.
- Familiarity with workflow orchestration tools like Apache Airflow.
- Basic scripting skills in languages like Python or Bash for automation tasks.
- Exposure to DevOps best practices and building robust CI/CD pipelines.
- Prior experience managing data security, governance, and compliance in cloud environments.