AI Engineer — Image‑to‑Video (Mid‑Level)
Location: Mumbai (on‑site/hybrid)
Contract: 6 months, extendable
Start: ASAP
What you’ll do
- Build, fine‑tune, and ship image‑to‑video generation pipelines (prompt‑to‑video, storyboard‑to‑video, identity‑preserving headshots).
- Integrate and iterate on SOTA components (Stable Video Diffusion, AnimateDiff, LTX‑Video/13B variants, CogVideoX, ControlNet‑style conditioning).
- Optimize inference for throughput and latency (TorchScript/ONNX, TensorRT, CUDA kernels, xFormers/Flash‑Attention, mixed precision).
- Handle multi‑GPU training/inference (DDP, gradient checkpointing, sharded weights, efficient sampling).
- Own dataset curation and augmentation for faces/motion; enforce consent, licensing, and privacy.
- Build evaluation loops and dashboards (FVD, CLIP/ID‑similarity, temporal consistency, face‑ID retention).
- Productionize with Docker and CI/CD; wire up tracking (W&B/ClearML) and experiment reproducibility.
- Collaborate with design and product to convert creative briefs into deployable features and A/B tests.
Must‑have
- 3–5 years of total software/ML experience, including at least 1–2 years in generative video or diffusion work.
- Strong Python + PyTorch, Diffusers, and CV fundamentals (spatiotemporal models, sampling).
- Proven experience with multi‑GPU (DDP/NCCL) and performance profiling on Linux.
- Solid grasp of FFmpeg, video codecs/bitrates, and post‑processing pipelines.
- Portfolio: repo(s), demo links, or a short reel showing your image‑to‑video work.
Nice‑to‑have
- Experience with ComfyUI nodes/graphs, LoRA/ControlNet training, face‑ID preservation, or lip‑sync.
- Triton kernels, custom schedulers/samplers, quantization (INT8/FP8) for fast inference.
- MLOps on AWS/GCP/Azure, Kubernetes, vector stores, prompt orchestration.
Tools you’ll touch
- PyTorch, Diffusers, CUDA, TensorRT/ONNX, xFormers/Flash‑Attention, FFmpeg, Docker, W&B/ClearML, ComfyUI, GitHub Actions.
What we offer
- Competitive contract compensation (INR, market‑aligned) with extension potential.
- High‑impact ownership on production creative pipelines.
- Modern GPU stack and a fast path from prototype → production.
How to apply
- Email hr@mugshotstudios.com with subject “AI Engineer — Image‑to‑Video (Mumbai)” and include:
  - Resume/CV, plus links to GitHub and any demos/reels.
  - 3–5 bullet points on your most relevant image→video work.
  - Earliest start date and work authorization status for India.