We are building next-generation AI Voice Agents for recruiters.
You’ll design and implement real-time, low-latency conversational systems that combine LLMs, STT, TTS, RAG, and streaming pipelines.
The backend runs in Python, with a React/Node.js frontend.
If you have built real-time voice systems that can handle human interruptions and dynamic context, this role fits you.
Key Responsibilities
- Build and optimize real-time AI Voice Agents in Python.
- Integrate TTS (e.g., ElevenLabs) and STT (e.g., Deepgram) systems for natural, fast voice interaction.
- Implement LLM integration with a focus on latency reduction and interruption handling.
- Develop caching layers for prompts and audio streams to reduce cost and delay.
- Design and maintain SIP, WebRTC, and WebSocket pipelines for call handling.
- Handle automatic answering machine detection (AMD) and dynamic call routing.
- Work with React/Node.js frontend teams for end-to-end real-time communication.
 
 
Required Skills
- Python: Strong backend development experience.
- Real-Time AI Voice Systems: Proven working experience is mandatory.
- LLM Integration: Deep understanding of latency, interruption handling, and RAG pipelines.
- Networking Protocols: Hands-on experience with SIP, WebRTC, and WebSockets.
- Frontend Collaboration: Working knowledge of React and Node.js.
 
 
Preferred Skills
- Experience with ElevenLabs, Deepgram, or other leading TTS/STT APIs.
- Knowledge of audio stream caching, buffering, and GPU optimization.
- Familiarity with FreeSWITCH, Asterisk, or similar VoIP platforms.
- Background in answering machine detection (AMD) and call classification.
 
 
Why Join Us
You’ll work on cutting-edge voice-first AI infrastructure used by staffing companies worldwide.
We are an early-stage team with fast decisions, a deep tech stack, and clear impact.
Salary is in INR.