Job Description
<p><p><b>Job Overview :</b><br/><br/> We're hiring a Senior AI Engineer to build production-grade components for an AI-first, data-centric platform.<br/><br/> You will implement agentic capabilities (intent, planner, router/composer), integrate knowledge-graph reasoning alongside a strong RAG baseline, and instrument robust evaluation and observability.<br/><br/> The ideal candidate writes clean, reliable code, understands LLM systems and data retrieval trade-offs, and can optimize for latency, quality, and cost.<br/><br/><b>Key Responsibilities :</b><br/><br/> - Agent Implementation: Build and harden Intent, Planner, and Router/Composer agents with typed JSON I/O, retries/timeouts, and idempotency; emit call-graph traces and correlation IDs.<br/><br/> - Knowledge-Graph Reasoning: Generate correct graph queries (SPARQL/Gremlin/PGQL) from planner outputs; perform subgraph extraction; encode rationale and references in responses.<br/><br/> - RAG Baseline & Retrieval: Implement document prep, chunking/embeddings, hybrid retrieval and (where available) reranking; maintain a high-quality baseline path for side-by-side comparisons.<br/><br/> - Prompt/Config Tuning: Version and tune prompts, routing policies (small-to-large model escalation), temperature/top-p settings, and caching; document routing outcomes and cost/latency budgets.<br/><br/> - Evaluation Hooks: Integrate test sets and scoring (faithfulness/correctness, precision/recall, multi-hop coverage, latency); enable automated re-evaluation on any change (model/agent/prompt/data).<br/><br/> - Observability & Cost Controls: Instrument traces/metrics/logs (token usage, latency P50/P95, error codes); surface cost-per-answer dashboards; implement backpressure and graceful degradation.<br/><br/> - Security & Guardrails: Enforce policy-as-code and entitlement checks (role/row/column), PII/PHI handling, content moderation, and HITL approval prompts for state-changing actions.<br/><br/> - Quality & CI/CD: Write 
unit/integration/contract tests; participate in PR reviews; ship via CI/CD with feature flags and environment promotion; maintain API/connector schemas and docs.<br/><br/><b>Required Skills :</b><br/><br/> - Applied LLM Engineering: 1-2+ years building production services; hands-on with LLM tool/function-calling, agent frameworks, and prompt/version management.<br/><br/> - Knowledge & Retrieval: Practical experience with Knowledge Graphs (RDF/SPARQL or property graph/Gremlin) and RAG pipelines (chunking, embeddings, retrieval/reranking).<br/><br/> - Data/Model Ecosystem: One or more vector DBs (pgvector, Pinecone, Weaviate, Milvus) and search (OpenSearch/Elasticsearch); familiarity with major model platforms (Azure OpenAI, Vertex, Anthropic, open-weights).<br/><br/> - Backend Skills: Proficiency in Python and/or TypeScript/Node.js; strong REST/gRPC API design, JSON Schema/OpenAPI, retries/backoff/idempotency, and error taxonomies.<br/><br/> - Observability & Reliability: OpenTelemetry (traces/metrics/logs), performance profiling, resiliency patterns (circuit breakers, bulkheads, DLQ/queues).<br/><br/> - Security by Design: OIDC/SSO, secrets management, least-privilege access, audit logging, and secure coding for AI/data services.<br/><br/> - CI/CD & Testing: Git-based workflows, automated pipelines, unit/integration/contract tests, and environment promotion practices.<br/><br/><b>Good To Have Skills :</b><br/><br/> - Ontology & Data Quality: SHACL/OWL basics, ontology stewardship, lineage/provenance capture, and data quality checks for KG/RAG pipelines.<br/><br/> - Evaluation Engineering: Judge-model setups, A/B testing, rubric design, and regression dashboards.<br/><br/> - Performance & FinOps: Async I/O, caching strategies, connection pooling, and token/runtime budget enforcement.<br/><br/> - Runtime & Platform: Containers/Kubernetes, service mesh/API gateways, feature flags, blue/green or canary releases.<br/><br/> - UX for Explainability: Collaborating on 
rationale/explanations (source lists, subgraph summaries) and clear HITL approval prompts.<br/></p><br/></p> (ref:hirist.tech)