Location : Gurgaon/Mumbai
Job ID : DE2010
Job Type : Full time
Experience : 4-6 years
Why would you like to join us?
TransOrg Analytics, an award-winning Big Data and Predictive Analytics company, offers advanced analytics solutions to industry leaders and Fortune 500 companies across India, the US, the UK, Singapore, and the Middle East.
Our product Clonizo (customer cloning) has yielded significant incremental benefits for our clients.
We have been recognized by CIO Review magazine as the "Predictive Analytics Company of the Year" and by TiE for excellence in entrepreneurship.
What do we expect from you?
- Develop a best-in-class capability for deploying self-running, self-learning ML/DL model frameworks across cloud platforms
- Provide strategic direction and technical expertise to meet data architecture needs
- Contribute to the development of data policies, governance and implementation plans, and conceptual and logical data models
- Assess data architectures to determine overall effectiveness and compliance with the data strategy, enterprise requirements, and objectives
- Review and comment on data solutions designed by IT project teams, and support requirements development
- Research the implementation of data standards for interoperability, quality, and entity and data attributes
- Play a key role in the cloud migration strategy, focusing on single- vs. multi-cloud choices, the best available tools and services on the cloud, and the design of the architecture
- Leverage DevOps tooling such as GitHub, Docker, and Kubernetes to deploy and orchestrate models (a minimal deployment sketch follows this list)
- Establish rails for continuous data flow and for transmitting insights to the business, especially in real-time environments
- Help define a cost-optimization framework for running analytics workloads on the cloud, based on analytics job execution
- Be instrumental in designing the DevOps capability of the analytics group, helping make data a "first-class citizen" in the organization
- Interact regularly with technology, analytics, and business stakeholders to ensure the analytics use-case pipeline is prioritized according to organizational priority and impact
- Enforce data architecture standards, procedures, and policies to ensure consistency across different program and project implementations
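As a rough illustration of the model deployment point above, here is a minimal sketch using the official Kubernetes Python client to roll out a containerized scoring service. The namespace, image, and label names are hypothetical placeholders, not details of this role's actual stack.

```python
# Minimal sketch: rolling out a containerized ML model as a Kubernetes
# Deployment via the official Python client. Namespace, image, and label
# names are hypothetical placeholders.
from kubernetes import client, config

config.load_kube_config()  # reads ~/.kube/config; use load_incluster_config() inside a pod
apps = client.AppsV1Api()

deployment = client.V1Deployment(
    metadata=client.V1ObjectMeta(name="churn-model"),
    spec=client.V1DeploymentSpec(
        replicas=2,
        selector=client.V1LabelSelector(match_labels={"app": "churn-model"}),
        template=client.V1PodTemplateSpec(
            metadata=client.V1ObjectMeta(labels={"app": "churn-model"}),
            spec=client.V1PodSpec(
                containers=[
                    client.V1Container(
                        name="scorer",
                        image="registry.example.com/churn-model:latest",
                        ports=[client.V1ContainerPort(container_port=8080)],
                    )
                ]
            ),
        ),
    ),
)

apps.create_namespaced_deployment(namespace="analytics", body=deployment)
```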
What are we looking for?
- Bachelor's degree in Computer Science/Engineering, Statistics, Mathematics, or a related quantitative field
- 6-9 years of professional experience as a Data Architect, Data Engineer, or in a related role, developing conceptual, logical, and preliminary physical models for an enterprise
- 4+ years of experience in Industry 4.0 initiatives and/or other aspects of corporate data management; e.g. Data Governance, Data Security, AI/ML, Data DevOps
- Strong knowledge of BI and ETL
- Well versed with at least one cloud platform (AWS/Azure/GCP); AWS is a plus
- Experience with CRM projects, Master Data Management projects, data cleansing, and data de-duplication
- In-depth hands-on experience with AWS Glue workflows for building ETL pipelines and automating scheduling (see the Glue sketch after this list)
- In-depth experience with Spark through the Scala, Python, and Spark SQL interfaces, including familiarity with the parameters required for tuning and logging (see the Spark sketch after this list)
- In-depth hands-on experience with Amazon S3 via the Python API (boto3)
- In-depth hands-on experience with Athena SQL as well as its Python API (boto3); a combined S3/Athena sketch follows this list
- Hands-on experience with AWS Lambda
- Deep experience with the Amazon Redshift database and data mining via Python
- Knowledge of developing and using data standards comprising the format, representation, definition, structuring, manipulation, tagging, transmission, use, and management of data
- Experience developing and using metadata guidelines to tag data and use the tags as part of a data management strategy
- Experience assessing data solutions for compliance and conformance with guidelines and policies
- Understanding of the use and maintenance of data dictionaries, data models, and data maps in an enterprise business software environment.
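As a rough illustration of the AWS Glue requirement above, the sketch below starts a Glue workflow and attaches a scheduled trigger via boto3; the workflow, job, trigger names, and region are hypothetical placeholders.

```python
# Minimal sketch: automating an AWS Glue ETL workflow with boto3.
# Workflow, job, trigger names, and region are hypothetical placeholders.
import boto3

glue = boto3.client("glue", region_name="ap-south-1")

# Kick off an existing Glue workflow (its jobs/crawlers run in order).
run = glue.start_workflow_run(Name="daily-etl-workflow")
print("Workflow run id:", run["RunId"])

# Attach a scheduled trigger so the first job fires daily at 02:00 UTC.
glue.create_trigger(
    Name="daily-etl-schedule",
    Type="SCHEDULED",
    Schedule="cron(0 2 * * ? *)",
    Actions=[{"JobName": "extract-raw-data"}],
    WorkflowName="daily-etl-workflow",
    StartOnCreation=True,
)
```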
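Similarly, the S3 and Athena bullets above describe day-to-day boto3 work. A minimal sketch, assuming hypothetical bucket, database, and table names:

```python
# Minimal sketch: querying S3-resident data through Athena with boto3.
# Bucket, database, and table names are hypothetical placeholders.
import time
import boto3

s3 = boto3.client("s3")
athena = boto3.client("athena")

# List raw files under a prefix (e.g., to verify a partition landed).
resp = s3.list_objects_v2(Bucket="analytics-raw", Prefix="events/dt=2024-01-01/")
for obj in resp.get("Contents", []):
    print(obj["Key"], obj["Size"])

# Run an Athena query and poll until it finishes.
qid = athena.start_query_execution(
    QueryString="SELECT channel, COUNT(*) AS n FROM events GROUP BY channel",
    QueryExecutionContext={"Database": "analytics"},
    ResultConfiguration={"OutputLocation": "s3://analytics-athena-results/"},
)["QueryExecutionId"]

while True:
    state = athena.get_query_execution(QueryExecutionId=qid)["QueryExecution"]["Status"]["State"]
    if state in ("SUCCEEDED", "FAILED", "CANCELLED"):
        break
    time.sleep(1)

if state == "SUCCEEDED":
    rows = athena.get_query_results(QueryExecutionId=qid)["ResultSet"]["Rows"]
    for row in rows[1:]:  # first row is the header
        print([col.get("VarCharValue") for col in row["Data"]])
```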
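Finally, for the Spark bullet, a minimal PySpark sketch mixing the DataFrame and Spark SQL interfaces, with two of the common tuning parameters the posting alludes to; the paths and settings are hypothetical placeholders.

```python
# Minimal sketch: Spark via the Python and Spark SQL interfaces, with two
# common tuning parameters. Paths and values are hypothetical placeholders.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("transactions-rollup")
    .config("spark.sql.shuffle.partitions", "200")   # shuffle parallelism
    .config("spark.executor.memory", "4g")           # per-executor memory
    .getOrCreate()
)

df = spark.read.parquet("s3a://analytics-curated/transactions/")
df.createOrReplaceTempView("transactions")

rollup = spark.sql("""
    SELECT customer_id, SUM(amount) AS total_spend
    FROM transactions
    GROUP BY customer_id
""")
rollup.write.mode("overwrite").parquet("s3a://analytics-marts/spend_rollup/")
```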