We are building next-generation AI Voice Agents for recruiters.
You’ll design and implement real-time, low-latency conversational systems that combine LLMs, STT, TTS, RAG, and streaming pipelines.
The backend runs in Python, with a React/Node.js frontend.
If you have built real-time voice systems that can handle human interruptions and dynamic context, this role fits you.
Key Responsibilities
- Build and optimize real-time AI Voice Agents in Python.
- Integrate TTS (e.g., ElevenLabs) and STT (e.g., Deepgram) systems for natural, fast voice interaction.
- Implement LLM integration with a focus on latency reduction and interruption handling.
- Develop caching layers for prompts and audio streams to reduce cost and delay.
- Design and maintain SIP, WebRTC, and WebSocket pipelines for call handling.
- Handle automatic answering machine detection (AMD) and dynamic call routing.
- Work with React/Node.js frontend teams for end-to-end real-time communication.
Required Skills
- Python: Strong backend development experience.
- Real-Time AI Voice Systems: Proven working experience is mandatory.
- LLM Integration: Deep understanding of latency, interruption handling, and RAG pipelines.
- Networking Protocols: Hands-on experience with SIP, WebRTC, and WebSockets.
- Frontend Collaboration: Working knowledge of React and Node.js.
Preferred Skills
- Experience with ElevenLabs, Deepgram, or other leading TTS/STT APIs.
- Knowledge of audio stream caching, buffering, and GPU optimization.
- Familiarity with FreeSWITCH, Asterisk, or similar VoIP platforms.
- Background in answering machine detection (AMD) and call classification.
Why Join Us
You’ll work on cutting-edge voice-first AI infrastructure used by staffing companies worldwide.
Early-stage team, fast decisions, deep tech stack, and clear impact.
Salary is in INR.