We are building next-generation AI Voice Agents for recruiters.
You’ll design and implement real-time, low-latency conversational systems that combine LLMs, STT, TTS, RAG, and streaming pipelines.
The backend runs in Python, with a React/Node.js frontend.
If you have built real-time voice systems that can handle human interruptions and dynamic context, this role fits you.
 
Key Responsibilities
- Build and optimize real-time AI Voice Agents in Python.
- Integrate TTS (e.g., ElevenLabs) and STT (e.g., Deepgram) systems for natural, fast voice interaction.
- Implement LLM integration with a focus on latency reduction and interruption handling.
- Develop caching layers for prompts and audio streams to reduce cost and delay.
- Design and maintain SIP, WebRTC, and WebSocket pipelines for call handling.
- Handle answering machine detection (AMD) and dynamic call routing.
- Work with React/Node.js frontend teams on end-to-end real-time communication.
 
Required Skills
- Python: Strong backend development experience.
- Real-Time AI Voice Systems: Proven hands-on experience is mandatory.
- LLM Integration: Deep understanding of latency, interruption handling, and RAG pipelines.
- Networking Protocols: Hands-on experience with SIP, WebRTC, and WebSockets.
- Frontend Collaboration: Working knowledge of React and Node.js.
 
Preferred Skills
- Experience with ElevenLabs, Deepgram, or other leading TTS/STT APIs.
- Knowledge of audio stream caching, buffering, and GPU optimization.
- Familiarity with FreeSWITCH, Asterisk, or similar VoIP platforms.
- Background in answering machine detection and call classification.
 
Why Join Us
You’ll work on cutting-edge voice-first AI infrastructure used by staffing companies worldwide.
Early-stage team, fast decisions, deep tech stack, and clear impact.

Salary is in INR.