
Azure Cloud AI ML Python Backend Engineer job opening in Pune. Now hiring: Sereno

Azure Cloud AI ML Python Backend Engineer



Job description

Who you are

You're someone who’s already shipped GenAI stuff—even if it was small: a chatbot, a RAG tool, or an agent prototype.

You live in Python, LangChain, LlamaIndex, Hugging Face, and vector DBs like FAISS or Milvus.

You know your way around prompt pipelines: noisy chains, rerankers, and retrieval.

You've deployed models or services on Azure/AWS/GCP, wrapped them into FastAPI endpoints, and maybe even wired up a bit of Terraform/ARM.

You’re not building from spreadsheets; you're iterating with real data, debugging hallucinations, and swapping out embeddings in production.

You can read blog posts and paper intros, follow new methods like QLoRA, and build on them.

You're fine with ambiguity and startup chaos—no strict specs, no roadmap, just a mission.

You work in async Slack, ask quick questions, push code that works, and help teammates stay afloat.

You're not satisfied with just getting things done—you want GenAI to feel reliable, usable, and maybe even fun.


What you’ll actually do

You’ll build real GenAI features: agentic chatbots for document lookup, conversation assistants, or knowledge workflows.

You’ll design and implement RAG systems: data ingestion, embeddings, vector indexing, retrieval, and prompt pipelines.

You’ll write inference APIs in FastAPI that work with vector stores and cloud LLM endpoints.

You’ll containerize services with Docker, push to Azure/AWS/GCP, wire basic CI/CD, monitor latency and faulty responses, and iterate fast.

You’ll experiment with LoRA/QLoRA fine-tuning on small LLMs, test prompt variants, and measure output quality.

You’ll collaborate with DevOps to ensure deployment reliability, QA to make tests more robust, and frontend folks to shape UX.

You’ll share your work in quick “demo & dish” sessions: what's working, what's broken, what you're trying next.

You’ll tweak embeddings, watch logs, and improve pipelines one experiment at a time.

You’ll help write internal docs or “how-tos” so others can reuse your work.
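
To make the RAG stages named above concrete (ingestion, embeddings, indexing, retrieval, prompt assembly), here is a minimal sketch in plain Python. It is an illustration only, not the team's actual stack: the bag-of-words `embed` function stands in for a real embedding model, the in-memory list stands in for a vector store such as FAISS or Milvus, and every function name here is hypothetical.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding (assumption): lowercase bag-of-words counts, not a real model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def build_index(docs: list[str]) -> list[tuple[str, Counter]]:
    """Ingestion + indexing: store each document next to its embedding."""
    return [(doc, embed(doc)) for doc in docs]


def retrieve(index: list[tuple[str, Counter]], query: str, k: int = 2) -> list[str]:
    """Retrieval: return the k documents most similar to the query."""
    q = embed(query)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [doc for doc, _ in ranked[:k]]


def build_prompt(query: str, passages: list[str]) -> str:
    """Prompt assembly: put the retrieved passages in front of the question."""
    context = "\n".join(f"- {p}" for p in passages)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {query}"
    )


docs = [
    "FAISS is a library for vector similarity search.",
    "FastAPI is a Python web framework for building APIs.",
    "Terraform describes cloud infrastructure as code.",
]
index = build_index(docs)
query = "Which library does vector similarity search?"
top = retrieve(index, query, k=1)
prompt = build_prompt(query, top)
print(prompt)
```

In a real pipeline the prompt would then be sent to an LLM endpoint; swapping `embed` for a learned embedding model and the list for a vector database changes the quality of retrieval, not the shape of the flow.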


Skills and knowledge

Solid experience in Python backend development (FastAPI/Django)

Experienced with LLM frameworks: LangChain, LlamaIndex, CrewAI, or similar

Comfortable with vector databases: FAISS, Pinecone, Milvus

Able to fine-tune models using PEFT/LoRA/QLoRA

Knowledge of embeddings, retrieval systems, RAG pipelines, and prompt engineering

Familiar with cloud deployment and infra-as-code (Azure, AWS, GCP with Docker/K8s, Terraform/ARM)

Good understanding of monitoring and observability—tracking response latency, hallucinations, and costs

Able to read current research, try prototypes, and apply them pragmatically

Works well in minimal-structure startups; self-driven, team-minded, proactive communicator




