Job Overview
Company
Stealth Mode Startup - AI Product Based Company
Category
Computer Occupations
Job Description
Key Responsibilities:

- Model Porting & Deployment: Port and deploy complex deep learning models from various frameworks (e.g., PyTorch, TensorFlow) to proprietary or commercial ML accelerator hardware platforms (e.g., TPUs, NPUs, GPUs).
- Performance Optimization: Analyse and optimize the performance of ML models for target hardware, focusing on latency, throughput, and power consumption.
- Quantization: Lead model quantization efforts (e.g., INT8) to reduce model size and accelerate inference while preserving model accuracy.
- Profiling & Debugging: Use profiling tools to identify performance bottlenecks and debug issues in the ML inference pipeline on the accelerator.
- Collaboration: Work closely with the hardware and software teams to understand model requirements and hardware capabilities, providing feedback to improve both.
- Tooling & Automation: Develop and maintain tools and scripts to automate the model porting, quantization, and performance testing workflows.
- Research & Innovation: Stay current with the latest trends and research in ML hardware, model compression, and optimization.

Qualifications:

- Experience: 8 to 10 years of professional experience in software engineering, with a focus on model deployment and optimization.
- Technical Skills:
  - Deep expertise in deep learning frameworks such as PyTorch and TensorFlow.
  - Proven experience in optimizing models for inference on NPUs, TPUs, or other specialized accelerators.
  - Extensive hands-on experience with model quantization (e.g., Post-Training Quantization, Quantization-Aware Training).
  - Strong proficiency in C++ and Python, with experience writing high-performance, low-level code.
  - Experience with GPU programming models like CUDA/cuDNN.
  - Familiarity with ML inference engines and runtimes (e.g., TensorRT, OpenVINO, TensorFlow Lite).
  - Strong understanding of computer architecture principles, including memory hierarchies, SIMD/vectorization, and cache optimization.
- Version Control: Proficient with Git and collaborative development workflows.
- Education: Bachelor's or Master's degree in Computer Science, Electrical Engineering, or a related field.

Preferred Qualifications:

- Experience with hardware-aware model design and co-design.
- Knowledge of compiler technologies for deep learning.
- Contributions to open-source ML optimization projects.
- Experience with real-time or embedded systems.
- Knowledge of cloud platforms (AWS, GCP, Azure) and MLOps best practices.
- Familiarity with CI/CD pipelines and automated testing for ML models.
- Domain knowledge in areas like computer vision, natural language processing, or speech recognition.

(ref:hirist.tech)
Stealth Mode Startup - AI Product Based Company is actively hiring for this Deep Learning Engineer - Machine Learning Models position