Job Description
<p><p><b>Description :</b><br/><br/><b>Role Summary :</b><br/><br/>We are seeking an experienced Generative AI Developer / Architect with a strong track record of building and deploying LLM-powered applications in real-world environments.
The ideal candidate will bring deep technical expertise in LLMs, prompt engineering, RAG architectures, and cloud-based AI services (Azure OpenAI, AWS Bedrock).<br/><br/>This is a high-impact role for someone who thrives at the intersection of innovation, scale, and responsible AI.
You'll be driving the end-to-end design, development, and deployment of production-grade GenAI solutions for diverse enterprise use cases.<br/><br/><b>Key Responsibilities :</b><br/><br/></p><p>- Design and develop GenAI applications using state-of-the-art LLMs such as GPT (OpenAI/Azure), Claude (Anthropic), LLaMA (Meta), etc.<br/><br/>- Build prompt engineering pipelines, including few-shot, chain-of-thought, and role-based prompts for enhanced context understanding.<br/><br/>- Develop RAG (Retrieval-Augmented Generation) pipelines using vector databases for dynamic knowledge retrieval from enterprise data.<br/><br/>- Architect and implement agent-based GenAI workflows for multi-step tasks, automation, and decision-making systems.<br/><br/>- Deploy scalable GenAI applications on Azure OpenAI (AI Studio) or AWS Bedrock, leveraging managed services and serverless infrastructure.<br/><br/>- Utilize Python, FastAPI, and frameworks like LangChain, LlamaIndex, or Haystack for backend orchestration.<br/><br/>- Integrate vector databases such as FAISS, Pinecone, Weaviate, or Chroma for efficient embedding storage and similarity search.<br/><br/>- Implement cost optimization strategies by identifying high-impact GenAI use cases and minimizing token consumption.<br/><br/>- Enforce Responsible AI practices, including fairness, explainability, and privacy compliance across all GenAI deployments.<br/><br/>- Design controls for prompt injection prevention, jailbreaking mitigation, and output filtering using input/output sanitization.<br/><br/>- Build enterprise Q&A systems using embedded knowledge bases, PDFs, internal wikis, and structured databases.<br/><br/>- Implement Human-in-the-Loop (HITL) mechanisms for validation, continuous learning, and human oversight.<br/><br/>- Design multimodal pipelines (text + image + voice) and handle real-time parsing, transcription, chunking, and token management.<br/><br/><b>Required Skills & Experience :</b><br/><br/>- 9 to 15 years of 
total experience, with a minimum of 2 to 3 years in LLM/GenAI development and real-world solution delivery.<br/><br/>- Hands-on experience with Azure OpenAI, AWS Bedrock, or other cloud-based GenAI offerings.<br/><br/>- Proficiency in Python, with strong backend development skills using FastAPI, Flask, or Django.<br/><br/>- Deep understanding of LLM operations, prompt engineering, and optimization for latency and cost.<br/><br/>- Experience with LangChain, LlamaIndex, Transformers (Hugging Face), and embedding models (e.g., OpenAI, Cohere, Azure Embeddings).<br/><br/>- Hands-on experience with vector databases like Pinecone, FAISS, Weaviate, or Qdrant.<br/><br/>- Familiarity with embedding techniques, document chunking, and similarity scoring algorithms.<br/><br/>- Exposure to prompt safety, Responsible AI frameworks, and data privacy regulations (e.g., GDPR, HIPAA).<br/><br/>- Experience deploying GenAI apps in production environments, with a strong understanding of MLOps and CI/CD pipelines for LLM applications.<br/><br/></p><br/></p> (ref:hirist.tech)