Job description
About the Role
We are looking for a highly skilled Backend Engineer with a strong background in Python, system design, and infrastructure to join our team.
You will be responsible for designing, building, and maintaining scalable backend systems, while collaborating with cross-functional teams to deliver robust and efficient solutions.
This role requires someone who can think end-to-end, from designing high-level architecture and implementing core services to ensuring production-grade reliability and performance.
Key Responsibilities
Develop and maintain backend services and APIs using Python and Node.js.
Design scalable, resilient, and maintainable systems, focusing on system architecture and distributed systems.
Integrate AI and large language models (LLMs) into applications, ensuring performance, scalability, and cost-efficiency.
Collaborate with AI/ML teams to deploy models into production pipelines.
Optimize infrastructure for AI workloads (GPU usage, caching, batch processing).
Build and maintain monitoring, logging, and observability for AI-powered systems.
Troubleshoot and resolve issues in production systems while maintaining high reliability.
Participate in design and code reviews, and drive engineering best practices across the team.
Automate deployment pipelines for backend and AI services (CI/CD, IaC).
Required Skills & Qualifications
Strong experience in Python (FastAPI most preferred; Flask, Django, or similar) or Node.js (Express most preferred; Fastify or similar).
Solid understanding of system design principles: scalability, fault tolerance, distributed systems.
Experience with infrastructure and DevOps: Docker, Kubernetes, Terraform, CI/CD pipelines.
Hands-on experience with cloud platforms (AWS, Azure, GCP), especially for AI workloads.
Knowledge of databases (SQL and NoSQL) and caching systems (Redis, Memcached).
Experience integrating LLMs or AI APIs (OpenAI, Hugging Face, LangChain, etc.) into production systems.
Familiarity with messaging/streaming systems (Kafka, RabbitMQ).
Monitoring and observability experience (Prometheus, Grafana, ELK).
Strong problem-solving, debugging, and analytical skills.
Excellent communication and collaboration skills.
Nice to Have
Experience with generative AI pipelines, vector databases, and embeddings.
Familiarity with MLOps tools (MLflow, BentoML, Ray Serve, etc.).
Knowledge of event-driven architectures and microservices.
Prior experience in AI/LLM-focused startups or high-scale AI systems.
What We Offer
Opportunity to work on challenging, large-scale systems with real-world impact.
Collaborative team culture with a focus on learning and innovation.
Competitive compensation and growth opportunities.