Data Engineer - Quant Trading
About Deeter Investments
Deeter Investments is a founder‑led proprietary trading firm built around real‑time, data‑driven decision‑making.
We prize curiosity, collaboration, and a bias for action.
After years of success in discretionary trading, we’re launching a dedicated algorithmic division.
Role Description
As our first Data Engineer, you’ll own critical datasets end-to-end, from ingestion and system architecture to reliability and access.
You’ll design, build, and run the data backbone for our algorithmic team.
You’ll work directly with traders and researchers to turn messy external feeds into high-performance, well-structured datasets that guide decisions in research and production.
Key Responsibilities
- Architect cloud-native batch and streaming ELT for diverse sources; standardize, de-duplicate, document; define schemas and redundancy.
- Stand up core platform: storage/lakehouse, orchestration, metadata/catalog, CI/CD, IaC, observability; keep it simple and cost-aware.
- Implement data quality checks and anomaly detection; maintain survivorship-bias-free histories and handle corporate actions/entitlements.
- Expose clean data via APIs/query layers and shared libs; produce “research-ready” datasets for fast backtests and production.
- Partner with quants/DS/SWE to scope, prototype, and productionize new datasets quickly; own incident response and runbooks.
- Uphold security and access hygiene (IAM/least-privilege, secrets, audit).
Qualifications & Experience
- 5+ years building and operating production data pipelines/platforms (or equivalent).
- Strong Python and SQL; familiarity with distributed, time-series, or NoSQL databases is ideal.
- Comfortable on at least one major cloud (AWS/GCP/Azure).
- Docker and Terraform (or similar).
- Orchestration (e.g., Airflow/Prefect/Dagster), distributed/batch compute (e.g., Spark/Dask/Beam), warehouses/lakes, columnar formats (e.g., Parquet/Delta/Iceberg).
- Monitoring/observability (logs/metrics/traces) and cost management.
- Proven delivery for quantitative users or ML/DS teams; clear thinking, clean design, pragmatic trade-offs.
Nice to Have
- Financial/time-series data (corporate actions, vendor entitlements/licensing), alternative data ingestion.
- Multimodal ETL (NLP/embeddings, transcription, basic image/video processing).
- Dataset/version control and reproducibility (e.g., LakeFS/DVC) and research workflow tooling.
Location: Remote
Language: English required
Employment: Full-time