Job Description
TCS Hiring for AWS Data Lake Administrator
Role - AWS Data Lake Administrator
Experience - 10 to 14 Years
Location - Bengaluru, Hyderabad
Roles & Responsibilities
- Administer and manage the AWS Data Lake infrastructure, ensuring high availability, security, and performance.
- Configure and manage AWS S3, Lake Formation, and Glue Data Catalog to organize, secure, and catalog data within the data lake.
- Set up and manage Redshift Spectrum so that data stored in S3 can be queried and analyzed with standard SQL from Redshift.
- Implement and manage pipelines that ingest structured and unstructured data from various sources into the data lake, using AWS services such as Glue ETL, Lambda, and other orchestration tools.
- Define and enforce data governance policies, access control, and security measures using AWS Lake Formation, ensuring compliance with organizational and regulatory requirements (a minimal boto3 sketch follows this list).
- Optimize data storage in S3 through partitioning, compression, and appropriate data formats like Parquet, Avro, or ORC to improve query performance.
- Monitor and manage the Glue Data Catalog to maintain metadata, table definitions, and data lineage within the data lake.
- Collaborate with data engineers, architects, and analysts to ensure seamless access to data for analytics and machine learning.
- Troubleshoot and optimize query performance in Redshift Spectrum, and manage workload performance in the data lake environment.
- Implement backup, recovery, and disaster recovery strategies for data lake assets, ensuring data integrity and availability.
- Maintain detailed documentation of data lake architecture, security policies, and data governance procedures.
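As a concrete illustration of the cataloging and governance duties above, here is a minimal boto3 sketch that registers a partitioned Parquet table in the Glue Data Catalog and grants SELECT on it through Lake Formation. All names in it (region, bucket, database, table, IAM role ARN) are illustrative placeholders, not details from this posting.

```python
"""Minimal sketch: register a partitioned Parquet table in the Glue Data
Catalog and grant governed access through Lake Formation. All resource
names below are placeholders, not values from this job description."""
import boto3

glue = boto3.client("glue", region_name="ap-south-1")          # assumed region
lakeformation = boto3.client("lakeformation", region_name="ap-south-1")

# Register an external table over partitioned Parquet data in S3.
glue.create_table(
    DatabaseName="sales_lake",                                 # assumed database
    TableInput={
        "Name": "orders",
        "TableType": "EXTERNAL_TABLE",
        "Parameters": {"classification": "parquet"},
        "PartitionKeys": [{"Name": "order_date", "Type": "date"}],
        "StorageDescriptor": {
            "Columns": [
                {"Name": "order_id", "Type": "string"},
                {"Name": "customer_id", "Type": "string"},
                {"Name": "amount", "Type": "double"},
            ],
            "Location": "s3://example-data-lake/orders/",      # assumed bucket
            "InputFormat": "org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat",
            "OutputFormat": "org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat",
            "SerdeInfo": {
                "SerializationLibrary": "org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe"
            },
        },
    },
)

# Grant SELECT to an analyst role via Lake Formation, so access is
# enforced centrally rather than through raw S3 bucket policies.
lakeformation.grant_permissions(
    Principal={"DataLakePrincipalIdentifier": "arn:aws:iam::123456789012:role/analyst"},
    Resource={"Table": {"DatabaseName": "sales_lake", "Name": "orders"}},
    Permissions=["SELECT"],
)
```

Granting access through Lake Formation rather than S3 bucket policies keeps permissions auditable in one place and applies uniformly to Athena, Glue, and Redshift Spectrum consumers.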
Qualifications:
- 4-8 years of hands-on experience managing data lake environments (within the 10-14 years of overall experience above), with a focus on AWS data services.
- Strong expertise in AWS services like S3, Lake Formation, Glue, Glue Data Catalog, Redshift Spectrum, and Lambda.
- Good knowledge of SQL and query optimization techniques, particularly for querying data in S3 with Redshift Spectrum and Athena (see the Spectrum sketch at the end of this description).
- Experience building data ingestion pipelines with AWS Glue ETL, Lambda, and other AWS orchestration tools.
- Strong understanding of data governance, access control, and security best practices in AWS environments.
- Experience with data formats (Parquet, ORC, Avro, JSON) and optimizing data storage for cost and performance.
- Knowledge of data partitioning, compression, and indexing strategies in S3.
- Familiarity with monitoring, troubleshooting, and optimizing AWS Data Lake performance.
- Excellent communication and collaboration skills to work across teams and support business requirements.
- AWS certifications, such as AWS Certified Data Analytics – Specialty (formerly AWS Certified Big Data – Specialty) or AWS Certified Solutions Architect, are a plus.
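To ground the Spectrum and SQL expectations above, the sketch below (reusing the placeholder names from the earlier example, and assuming a provisioned Redshift cluster) maps the Glue database into Redshift as an external schema and runs a partition-pruned aggregate through the Redshift Data API.

```python
"""Minimal sketch: expose a Glue database to Redshift Spectrum and run a
partition-pruned query via the Redshift Data API. Cluster, database, role,
and table names are illustrative placeholders."""
import boto3

rsd = boto3.client("redshift-data", region_name="ap-south-1")  # assumed region

def run(sql: str) -> str:
    """Submit a statement to the cluster; returns the statement Id."""
    resp = rsd.execute_statement(
        ClusterIdentifier="analytics-cluster",                 # assumed cluster
        Database="dev",
        DbUser="admin",
        Sql=sql,
    )
    return resp["Id"]

# Map the Glue Data Catalog database into Redshift as an external schema.
run("""
CREATE EXTERNAL SCHEMA IF NOT EXISTS spectrum_sales
FROM DATA CATALOG DATABASE 'sales_lake'
IAM_ROLE 'arn:aws:iam::123456789012:role/spectrum-role'
""")

# Filtering on the partition column lets Spectrum prune S3 objects
# before scanning, instead of reading the whole table.
run("""
SELECT customer_id, SUM(amount) AS total
FROM spectrum_sales.orders
WHERE order_date BETWEEN '2024-01-01' AND '2024-01-31'
GROUP BY customer_id
""")
```

Restricting the WHERE clause to the partition column (order_date here) lets Spectrum skip non-matching S3 prefixes entirely, which is the main lever for both query latency and per-byte scan cost.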