Who You Are
You're an ML Research Engineer with 2+ years of experience who bridges the gap between
cutting-edge research and production systems. You're passionate about training models that
perform exceptionally well not just on benchmarks but in real-world applications. You enjoy
diving deep into model architectures, experimenting with training techniques, and building
robust evaluation frameworks that ensure model reliability in critical applications.
Responsibilities
● Train and fine-tune models for speech recognition and natural language processing in
multilingual healthcare contexts
● Develop specialized models through fine-tuning and optimization techniques for
domain-specific tasks
● Design and implement comprehensive evaluation frameworks to measure model
performance across critical metrics
● Build data pipelines for collecting, annotating, and augmenting training datasets
● Research and implement state-of-the-art techniques from academic papers to improve
model performance
● Collaborate with AI engineers to deploy optimized models into production systems
● Create synthetic data generation pipelines to address data scarcity challenges
Qualifications
Required
● 2+ years of experience in ML/DL with a focus on training and fine-tuning production
models
● Deep expertise in automatic speech recognition (ASR) systems or natural language
processing (NLP), including transformer architectures
● Proven experience with model training frameworks (PyTorch, TensorFlow) and
distributed training
● Strong understanding of evaluation metrics and ability to design domain-specific
benchmarks
● Experience with modern speech models (Whisper, Wav2Vec2, Conformer) or LLM
fine-tuning techniques (LoRA, QLoRA, full fine-tuning)
● Proficiency in handling multilingual datasets and cross-lingual transfer learning
● Track record of improving model performance through data engineering and
augmentation strategies
Nice to Have
● Published research or significant contributions to open-source ML projects
● Experience with model optimization techniques (quantization, distillation, pruning)
● Background in low-resource language modeling
● Experience building evaluation frameworks for production ML systems