Job Description
<p><b>Key Responsibilities :</b><br/><br/> <b>Python Engineering :</b><br/><br/> - Build clean, modular, and scalable Python codebases using FastAPI/Django.<br/><br/> - Implement APIs, microservices, and data pipelines to support AI use cases.<br/><br/> <b>Generative AI & RAG :</b><br/><br/> - Implement RAG pipelines: text preprocessing, embeddings, chunking strategies, retrieval, re-ranking, and evaluation.<br/><br/> - Integrate with LLM APIs (OpenAI, Anthropic, Gemini, Mistral) and open-source models (Llama, MPT, Falcon).<br/><br/> - Handle context-window optimization and fallback strategies for production workloads.<br/><br/> <b>MCP (Model Context Protocol) :</b><br/><br/> - Develop MCP servers to expose tools, resources, and APIs to LLMs.<br/><br/> - Work with the FastMCP SDK and design proper tool/resource decorators.<br/><br/> - Ensure MCP servers follow best practices for discoverability, schema compliance, and security.<br/><br/> <b>Agent-to-Agent (A2A) Workflows :</b><br/><br/> - Design and implement multi-agent orchestration (e.g., AutoGen, CrewAI, LangGraph).<br/><br/> - Build pipelines for agents to delegate tasks, exchange structured context, and collaborate.<br/><br/> - Add observability, replay, and guardrails to A2A interactions.<br/><br/> <b>Production & Observability :</b><br/><br/> - Deploy solutions using Docker, Kubernetes, and cloud platforms (AWS/GCP/Azure).<br/><br/> - Add tracing, logging, and evaluation metrics (PromptFoo, LangSmith, Ragas).<br/><br/> - Optimize for latency, cost, and accuracy in real-world deployments.<br/><br/> <b>Required Skills & Qualifications :</b><br/><br/> - 3-7 years of professional experience with Python (3.9+).<br/><br/> - Strong knowledge of OOP, async programming, and REST API design.<br/><br/> - Proven hands-on experience with RAG implementations and vector databases (Pinecone, Weaviate, FAISS, Qdrant, Milvus).<br/><br/> - Familiarity with MCP (Model Context Protocol) concepts and hands-on experience with MCP server implementations.<br/><br/> - Understanding of multi-agent workflows and orchestration libraries (LangGraph, AutoGen, CrewAI).<br/><br/> - Proficiency with FastAPI/Django for backend development.<br/><br/> - Comfort with Docker, GitHub Actions, and CI/CD pipelines.<br/><br/> - Practical experience with cloud infrastructure (AWS/GCP/Azure).<br/><br/> <b>Nice-to-Have :</b><br/><br/> - Exposure to AI observability & evaluation tooling (LangSmith, PromptFoo, Ragas).<br/><br/> - Contributions to open-source AI/ML or MCP projects.<br/><br/> - Understanding of compliance/security frameworks (SOC-2, GDPR, HIPAA).<br/><br/> - Prior work with custom embeddings, fine-tuning, or LLMOps stacks.<br/><br/> <b>What We Offer :</b><br/><br/> - Opportunity to own core AI modules (MCP servers, RAG frameworks, A2A orchestration).<br/><br/> - End-to-end involvement from architecture to MVP to production rollout.<br/><br/> - A fast-moving, engineering-first culture where experimentation is encouraged.<br/><br/> - Competitive compensation, flexible work setup, and strong career growth.<br/><br/> <b>Location :</b><br/><br/>- Bangalore (Hybrid) / Remote.<br/><br/> <b>Experience Level :</b><br/><br/>- 3-7 years.<br/><br/> <b>Compensation :</b><br/><br/>- Competitive, based on expertise.</p> (ref:hirist.tech)
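The RAG responsibilities mention chunking strategies. As a minimal, dependency-free sketch of one common approach, fixed-size chunking with overlap, the following could be shown to candidates; the function name and parameter defaults are illustrative, not part of the posting:

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into overlapping character chunks for embedding.

    Overlap preserves context across chunk boundaries so that a sentence
    cut at a boundary is still fully present in the neighboring chunk.
    """
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    chunks = []
    step = chunk_size - overlap  # how far the window advances each iteration
    for start in range(0, len(text), step):
        chunk = text[start:start + chunk_size]
        if chunk:
            chunks.append(chunk)
        if start + chunk_size >= len(text):
            break  # final chunk reached the end of the text
    return chunks
```

In production pipelines, character windows are typically replaced by token- or sentence-aware splitters, but the overlap idea is the same.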
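The MCP duties center on exposing tools with discoverable schemas. As a hedged, stdlib-only stand-in for what an SDK decorator such as FastMCP's tool registration does (the `tool` decorator, `TOOLS` registry, and `search_docs` function below are all hypothetical, not a real MCP API), the pattern looks like this:

```python
import inspect

# Illustrative registry: maps tool names to machine-readable descriptions,
# analogous to what an MCP server advertises to a connected LLM client.
TOOLS: dict[str, dict] = {}

def tool(fn):
    """Register a function and derive a simple schema from its signature."""
    sig = inspect.signature(fn)
    TOOLS[fn.__name__] = {
        "description": fn.__doc__,
        "params": list(sig.parameters),
    }
    return fn

@tool
def search_docs(query: str, top_k: int = 3):
    """Retrieve the top_k documents matching query."""
    return [f"doc-{i} for {query}" for i in range(top_k)]
```

A real MCP server would additionally serialize these schemas as JSON, handle transport, and enforce the security constraints the posting mentions; this sketch only shows the discoverability idea.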
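The A2A responsibilities describe agents delegating tasks and exchanging structured context with a replayable trail. A pure-Python sketch of that pattern, standing in for what frameworks like LangGraph, AutoGen, or CrewAI provide (all class and function names here are illustrative):

```python
from dataclasses import dataclass, field
from typing import Callable, Optional

@dataclass
class TaskContext:
    """Structured context handed from agent to agent (illustrative schema)."""
    task: str
    history: list = field(default_factory=list)  # replay/observability trail
    result: Optional[str] = None

class Agent:
    def __init__(self, name: str, handler: Callable[[TaskContext], TaskContext]):
        self.name = name
        self.handler = handler

    def run(self, ctx: TaskContext) -> TaskContext:
        ctx.history.append(self.name)  # record who touched the context
        return self.handler(ctx)

def researcher(ctx: TaskContext) -> TaskContext:
    ctx.result = f"notes on {ctx.task}"
    return ctx

def writer(ctx: TaskContext) -> TaskContext:
    ctx.result = f"summary of {ctx.result}"
    return ctx

def pipeline(ctx: TaskContext, agents: list[Agent]) -> TaskContext:
    """Sequentially delegate the context through each agent."""
    for agent in agents:
        ctx = agent.run(ctx)
    return ctx
```

Real orchestration libraries add branching, retries, and guardrails on top of this hand-off loop; the `history` list is the seed of the replay capability the role calls for.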