PPM Data Platform
Multi-OEM vehicle data platform built with Dagster. Shared logic lives in packages/ai_core; each OEM gets its own Dagster project under projects/.
- Getting Started — Installation, running Dagster locally, materializing assets, monitoring runs, running tests
- Dagster Execution Model — How the ECS executor works, run task vs step tasks, resource sizing, container_context.yaml configuration
- Monitoring — Sentry alerting for asset check failures and run crashes
- Debugging ECS Containers — Using ECS Exec / SSM to shell into Fargate containers (Nessie, code locations)
Ingestion
- Design — Three-tier pipeline, asset key conventions, checks, partitioning
- Code Structure — Package layout, component classes, YAML config, adding a new OEM
- Component Reference — Advanced component attributes, raw tier schema, utilities
- ConsolidatedComponent — Source groups, gap fill, merge spec, foreign keys
- Operations — Running Dagster, materializing assets, monitoring runs
Derived
Downstream of consolidated ingestion, the platform produces a graph of OEM-specific assets that turn raw entity records into KPIs, predictions, and smoothed dealer-level take rates. Each asset is partitioned by day and emitted by a reusable component class.
Asset keys always start with <oem>/ and normally use the two-segment <oem>/<name> pattern. Dealer classifications are the exception: they land under the three-segment <oem>/dealers/<name> (e.g. mb/dealers/neighborhoods). Every asset uses DailyPartitionsDefinition with end_offset=1, matching the consolidated tier.
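The key and partition conventions above can be sketched in plain Python. This is an illustration only, not the platform's code: the helpers asset_key and daily_partition_keys and the asset name take_rates are hypothetical, and the partition helper only mimics the effect of end_offset=1 (the current day is included in addition to the completed days) rather than calling Dagster.

```python
from datetime import date, timedelta


def asset_key(oem: str, name: str, *, dealer_classification: bool = False) -> list[str]:
    """Build key segments following the <oem>/<name> convention (hypothetical helper)."""
    for part in (oem, name):
        if not part or "/" in part:
            raise ValueError(f"invalid key segment: {part!r}")
    # Dealer classifications get the extra "dealers" segment.
    return [oem, "dealers", name] if dealer_classification else [oem, name]


def daily_partition_keys(start: date, today: date, end_offset: int = 1) -> list[str]:
    """List daily partition keys from start (hypothetical helper, no Dagster calls).

    With end_offset=0 the last key is yesterday, the last completed day.
    With end_offset=1 the range is extended by one partition, so today is included.
    """
    last = today - timedelta(days=1) + timedelta(days=end_offset)
    count = max((last - start).days + 1, 0)
    return [(start + timedelta(days=i)).isoformat() for i in range(count)]


if __name__ == "__main__":
    print("/".join(asset_key("mb", "take_rates")))  # asset name is an assumption
    print("/".join(asset_key("mb", "neighborhoods", dealer_classification=True)))
    print(daily_partition_keys(date(2024, 1, 1), today=date(2024, 1, 3)))
```

In a real Dagster project the segments would typically be passed to AssetKey and the partitioning to DailyPartitionsDefinition(start_date=..., end_offset=1); the start date here is a placeholder.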
- Enrichment — Spatial dealer clustering, plus computed sold-date, days-on-lot, and half-life weights on consolidated inventory
- Vehicle Features — Option classification into structured attribute types and per-vehicle feature assembly
- Inventory Statistics — Current inventory count, rolling sales counts, days-supply, average days-on-lot per grouping
- Days on Lot — Trains a regression model and scores every active vehicle with predicted days on lot
- Take Rates — Per-attribute take rates and metric KPIs, blended across geographic layers to dealer granularity
Warehouse
- Nessie — Iceberg catalog: deployment, VPC vs public connectivity, Auth0 authentication, utilities