<p><b>Job Description</b></p>
<p><b>Role Overview :</b></p>
<p>We are seeking a highly skilled LLM Engineer with strong expertise in building, deploying, and optimizing Large Language Model (LLM)-driven applications. The ideal candidate will have a solid software engineering background, proven experience in AI/ML systems, and hands-on exposure to modern cloud-native, containerized, and vector database technologies. This role demands a professional who can integrate LLM capabilities into scalable backend systems while also collaborating on frontend integration.</p>
<p><b>Key Responsibilities :</b></p>
<p><b>LLM Development & Integration :</b></p>
<p>- Design, fine-tune, and deploy LLM-based solutions using frameworks such as AWS Bedrock, Gemini, Hugging Face, or OpenAI APIs.</p>
<p>- Build custom pipelines for prompt engineering, retrieval-augmented generation (RAG), and model orchestration.</p>
<p>- Optimize model inference performance and reduce latency in real-world applications.</p>
<p><b>Backend Engineering :</b></p>
<p>- Develop and maintain backend services using Python (FastAPI, Flask, Django) with a strong focus on scalability, modularity, and performance.</p>
<p>- Implement APIs and microservices to integrate AI-driven features with enterprise-grade applications.</p>
<p>- Work with EKS, Docker, and Kubernetes for containerized deployments.</p>
<p><b>Vector Database & Retrieval Systems :</b></p>
<p>- Build and optimize semantic search systems using vector databases such as Pinecone, FAISS, or Weaviate.</p>
<p>- Design and implement embeddings pipelines to support RAG use cases.</p>
<p>- Ensure efficient storage, indexing, and retrieval of unstructured data for LLM applications.</p>
<p><b>Cloud & Infrastructure :</b></p>
<p>- Leverage AWS (Bedrock, Lambda, S3, SageMaker) and GCP AI/ML services for scalable model deployment.</p>
<p>- Implement cloud-native best practices for security, cost optimization, and monitoring.</p>
<p>- Automate deployment pipelines (CI/CD) for ML-powered applications.</p>
<p><b>Frontend Integration :</b></p>
<p>- Collaborate with frontend engineers to integrate AI-powered features into applications using React, Vue.js, or Lovable.dev.</p>
<p>- Ensure seamless user experiences for AI-driven interfaces, including conversational UIs and intelligent assistants.</p>
<p><b>Team Collaboration :</b></p>
<p>- Partner with data scientists, ML engineers, and product managers to design end-to-end LLM-powered solutions.</p>
<p>- Contribute to system architecture, scalability discussions, and performance reviews.</p>
<p>- Stay updated with the latest advancements in LLMs, vector search, and AI infrastructure.</p>
<p><b>Key Skills & Qualifications :</b></p>
<p>- Software Engineering Experience : 6+ years in backend or full-stack development.</p>
<p>- LLM Expertise : Proven hands-on work with AWS Bedrock, Gemini, Hugging Face, or similar LLM ecosystems.</p>
<p>- Programming Skills : Strong proficiency in Python for backend development, API design, and AI integration.</p>
<p>- Frameworks & Tools : Experience with FastAPI, Flask, and containerized deployments (EKS, Docker, Kubernetes).</p>
<p>- Vector Databases : Practical experience with Pinecone, FAISS, Weaviate, or similar technologies.</p>
<p>- Cloud Proficiency : Hands-on exposure to AWS and GCP services for deploying and scaling LLM applications.</p>
<p>- Frontend Knowledge : Working experience with React, Vue.js, or Lovable.dev for AI feature integration.</p>
<p>- System Design : Ability to design scalable, distributed, and fault-tolerant AI-driven architectures.</p>
<p>- Problem-Solving : Strong debugging, optimization, and performance tuning skills.</p>
<p>(ref:hirist.tech)</p>