
Getting Started

Prerequisites — direnv

The repo uses direnv to inject shell environment variables automatically when you cd into the project. Install it once:

# macOS
brew install direnv

# Ubuntu / Debian
sudo apt install direnv

Then add the hook to your shell profile and restart your shell:

# ~/.zshrc  (or ~/.bashrc for bash)
eval "$(direnv hook zsh)" # replace zsh with bash if needed

Finally, allow the repo's .envrc from the project root:

direnv allow

direnv will load the variables automatically every time you enter the directory. The .envrc file is gitignored — never commit it.

Note: direnv allow must be re-run whenever .envrc changes.

Prerequisites — Private Package Index (Nexus)

Some OEM projects (ai_mb, ai_stellantis) depend on cosy-encryption, which is hosted on a private Nexus index at nexus.sulzer-us.com. uv will fail with a 401 error during uv sync if credentials are not configured.

Get the Nexus username and password from a team member, then configure credentials using one of the approaches below.

macOS / Linux — ~/.netrc

Add an entry to ~/.netrc (create the file if it doesn't exist):

machine nexus.sulzer-us.com
login <username>
password <password>

Then restrict permissions:

chmod 600 ~/.netrc

uv reads ~/.netrc automatically — no additional configuration is needed.
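To confirm the entry parses before running uv sync, a small check with Python's standard-library netrc module can help. This is a minimal sketch, not part of the repo: the function name has_nexus_entry is hypothetical, and Python's parser may not accept exactly the same files as uv's.

```python
import netrc
from pathlib import Path


def has_nexus_entry(netrc_path, host="nexus.sulzer-us.com"):
    """Return True if netrc_path holds a login and password for host."""
    try:
        auth = netrc.netrc(str(netrc_path)).authenticators(host)
    except (OSError, netrc.NetrcParseError):
        # Missing, unreadable, or malformed file: treat as "not configured".
        return False
    # authenticators() returns (login, account, password) or None.
    return auth is not None and bool(auth[0]) and bool(auth[2])


if __name__ == "__main__":
    print(has_nexus_entry(Path.home() / ".netrc"))
```

A False result means the file is missing, malformed, or has no usable entry for the Nexus host.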

Windows — %USERPROFILE%\.netrc

uv reads %USERPROFILE%\.netrc on Windows. Windows Explorer won't create dot-prefixed files, so use PowerShell:

@"
machine nexus.sulzer-us.com
login <username>
password <password>
"@ | Out-File -FilePath "$env:USERPROFILE\.netrc" -Encoding ascii

Alternative — environment variables

Set UV_INDEX_NEXUS_USERNAME and UV_INDEX_NEXUS_PASSWORD instead of using a netrc file. On macOS/Linux add them to your shell profile or .envrc; on Windows set them as user environment variables via System Properties → Advanced → Environment Variables. This is how CI authenticates.
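uv builds these variable names from the index name: uppercased, with non-alphanumeric characters replaced by underscores. The sketch below assumes the index is declared under the name nexus in pyproject.toml, which the variable names above imply. The helper names are hypothetical.

```python
import os
import re


def uv_index_credential_vars(index_name):
    """Return the (username, password) variable names uv reads for an index."""
    key = re.sub(r"[^A-Z0-9]", "_", index_name.upper())
    return (f"UV_INDEX_{key}_USERNAME", f"UV_INDEX_{key}_PASSWORD")


def missing_credentials(index_name, environ=os.environ):
    """List the credential variables that are unset or empty."""
    return [name for name in uv_index_credential_vars(index_name)
            if not environ.get(name)]


if __name__ == "__main__":
    # Assumption: the private index is named "nexus".
    print(missing_credentials("nexus"))
```

An empty list means both variables are present in the current environment.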

Installation via uv sync

Each package manages its own virtual environment. Run uv sync --directory <path> once per package:

uv sync --directory packages/ai_core
uv sync --directory projects/ai_<oem> # e.g. projects/ai_audi
uv sync --directory deployments/local # Dagster UI tooling
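If you sync all three environments often, the commands above can be wrapped in a short script. This is a sketch only: sync_commands and sync_all are hypothetical helpers, and the script assumes it is run from the repo root with uv on the PATH.

```python
import subprocess


def sync_commands(oem):
    """Build one `uv sync` command per environment listed above."""
    directories = [
        "packages/ai_core",
        f"projects/ai_{oem}",
        "deployments/local",  # Dagster UI tooling
    ]
    return [["uv", "sync", "--directory", d] for d in directories]


def sync_all(oem):
    """Run each sync in turn, stopping at the first failure."""
    for cmd in sync_commands(oem):
        subprocess.run(cmd, check=True)


if __name__ == "__main__":
    for cmd in sync_commands("audi"):
        print(" ".join(cmd))
```

The main block only prints the commands. Call sync_all("audi") to execute them.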

Environment Setup

Each OEM project needs a .env file in its project root (e.g. projects/ai_audi/.env). This file is gitignored — never commit it.

Create the file with the following variables, replacing <REPO> with the absolute path to your local clone of the repo:

| Variable | Purpose | Example value |
| --- | --- | --- |
| ICEBERG_CATALOG_URI | SQLite catalog for local Iceberg tables | sqlite:////<REPO>/temp/iceberg/catalog.db (macOS/Linux) or sqlite:///<REPO>/temp/iceberg/catalog.db (Windows) |
| ICEBERG_WAREHOUSE | Directory for Iceberg data files | file:///<REPO>/temp/iceberg/warehouse (macOS/Linux) or file://<REPO>/temp/iceberg/warehouse (Windows) |
| RAW_STORAGE_URI | Directory for raw JSON responses | temp/raw |
| DUCKDB_PATH | Path for OEM DuckDB file | temp/audi.duckdb |
| PPM_SOURCE_OVERRIDES | Path to source override YAML (optional) | temp/source_overrides.yaml |

Note: The Iceberg URI formats differ by platform. macOS/Linux uses four slashes after sqlite: (sqlite:////abs/path), while Windows uses three (sqlite:///C:/path). The same applies to ICEBERG_WAREHOUSE: macOS/Linux uses file:///, Windows uses file://.

The Iceberg directories are created automatically on first run if they don't exist, so you don't need to create them manually.
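One way to read the slash rule for the two Iceberg values: the prefix is the same on both platforms, and the extra slash on macOS/Linux is simply the leading slash of the absolute path. The sketch below builds both values under that reading. iceberg_env is a hypothetical helper and the example paths are placeholders.

```python
from pathlib import PurePosixPath, PureWindowsPath


def iceberg_env(repo, windows=False):
    """Build the Iceberg .env values from an absolute repo path."""
    path_type = PureWindowsPath if windows else PurePosixPath
    # as_posix() gives forward slashes: /home/me/repo or C:/work/repo
    root = path_type(repo).as_posix()
    return {
        "ICEBERG_CATALOG_URI": f"sqlite:///{root}/temp/iceberg/catalog.db",
        "ICEBERG_WAREHOUSE": f"file://{root}/temp/iceberg/warehouse",
    }


if __name__ == "__main__":
    for key, value in iceberg_env("/home/me/repo").items():
        print(f"{key}={value}")
    for key, value in iceberg_env(r"C:\work\repo", windows=True).items():
        print(f"{key}={value}")
```

Compare the printed values with the slash counts described for each platform before copying them into .env.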

Connecting to the production Nessie catalog

Set NESSIE_URI=https://nessie.app.autointel.ai/ and NESSIE_READ_ONLY=1 to read from the production catalog while writing outputs to the local SQLite catalog. The public endpoint requires Auth0 credentials — see Nessie for connectivity options, authentication setup, and the NESSIE_AUTH0_* env vars.

Individual OEM projects may define additional environment variables — check the project's README for details.

Running Dagster Locally

Start the Dagster UI from the repo root:

deployments/local/.venv/bin/dg dev

The UI is available at http://localhost:3000.

Note: The deployments/local venv is only for the dg dev UI server. Do not use it to materialize assets, because it doesn't have OEM project modules installed.

Rematerializing After Code Changes

After changing entity classes, component logic, or YAML config, you only need to rematerialize the affected tiers — not re-collect raw data. Raw assets represent point-in-time API fetches that Dagster reads from Iceberg; they do not need to be re-run unless you specifically need fresh source data.

# Rematerialize only transformed + downstream assets for a specific partition
# (stellantis is the example OEM here; adjust the asset keys for your project)
cd projects/ai_<oem>
uv run dg launch \
--assets 'key:"stellantis/transformed/...",key:"stellantis/consolidated/..."' \
--partition YYYY-MM-DD

Dagster resolves upstream dependencies from whatever is already in Iceberg.

Clean Slate (Nuclear Reset)

Warning: reset_iceberg_local.sh deletes all local Iceberg data, including raw. Only use it when a breaking schema change requires dropping all tables. For normal development, use dg launch --assets to rematerialize only what changed.

To surgically drop a single table without losing everything (shown here for the stellantis warehouse directory):

sqlite3 temp/iceberg/catalog.db \
"DELETE FROM iceberg_tables WHERE table_name='<table>';"
rm -rf temp/iceberg/warehouse/stellantis/<table>/
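The same single-table drop can be scripted with Python's standard library, which avoids needing the sqlite3 CLI on Windows. This is a sketch under the assumptions visible in the commands above: the catalog has an iceberg_tables table with a table_name column, and table data lives under warehouse/<namespace>/<table>. drop_local_table is a hypothetical helper.

```python
import shutil
import sqlite3
from pathlib import Path


def drop_local_table(catalog_db, warehouse_dir, namespace, table):
    """Delete one table's catalog row and data directory; return rows deleted."""
    conn = sqlite3.connect(catalog_db)
    try:
        with conn:  # commits on success, rolls back on error
            cur = conn.execute(
                "DELETE FROM iceberg_tables WHERE table_name = ?", (table,)
            )
            removed = cur.rowcount
    finally:
        conn.close()
    # Remove the data files; a missing directory is not an error.
    shutil.rmtree(Path(warehouse_dir) / namespace / table, ignore_errors=True)
    return removed
```

Like the SQL above, this matches on table_name only, so it removes same-named tables in every namespace of the catalog. Stop Dagster before running it.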

To reset everything instead, follow these steps:

1. Stop Dagster if it's running (Ctrl+C in the terminal).

2. Reset Iceberg warehouse and catalog:

# macOS / Linux
./scripts/reset_iceberg_local.sh

# Windows
scripts\reset_iceberg_local.bat

3. Remove OEM-specific DuckDB files (if any):

rm -f temp/<oem>.duckdb temp/iceberg_views.duckdb     # macOS / Linux
del temp\<oem>.duckdb temp\iceberg_views.duckdb 2>nul # Windows

4. Rebuild the OEM venv (picks up any ai_core changes):

uv sync --directory projects/ai_<oem> --reinstall-package ai-core

5. Start Dagster:

deployments/local/.venv/bin/dg dev      # macOS / Linux
deployments\local\.venv\Scripts\dg.exe dev # Windows
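Step 3 is the only step whose command differs in form between shells. A cross-platform alternative is a few lines of Python, shown here as a sketch. remove_duckdb_files is a hypothetical helper, and the file names are the ones listed in step 3.

```python
from pathlib import Path


def remove_duckdb_files(temp_dir, oem):
    """Delete the OEM and view DuckDB files if present; return names removed."""
    removed = []
    for name in (f"{oem}.duckdb", "iceberg_views.duckdb"):
        path = Path(temp_dir) / name
        if path.exists():
            path.unlink()
            removed.append(name)
    return removed


if __name__ == "__main__":
    # Run from the repo root; "audi" is an example OEM.
    print(remove_duckdb_files("temp", "audi"))
```

Files that do not exist are skipped silently, matching the behavior of rm -f.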

Materializing Assets

There are two ways to materialize assets depending on whether a Dagster webserver is running.

Direct launch (no webserver needed)

Use the OEM project's own venv to invoke dg launch directly. The CLI creates a local instance without requiring a running webserver:

macOS / Linux

cd projects/ai_audi
source .venv/bin/activate
dg launch --assets 'key:"audi/*"' --partition 2026-01-27

Windows (PowerShell)

cd projects/ai_audi
.venv\Scripts\dg.exe launch --assets 'key:"audi/*"' --partition 2026-01-27

Replace the partition date with today's date in YYYY-MM-DD format.
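To avoid typing the date by hand, the command can be assembled in Python. This is a sketch: launch_command is a hypothetical helper, it assumes the OEM venv is activated so that dg is on the PATH, and it uses the local calendar date as the partition.

```python
import subprocess
from datetime import date


def launch_command(oem, partition=None):
    """Build the dg launch command for one OEM and one daily partition."""
    day = (partition or date.today()).isoformat()  # YYYY-MM-DD
    return ["dg", "launch", "--assets", f'key:"{oem}/*"', "--partition", day]


if __name__ == "__main__":
    cmd = launch_command("audi")
    print(" ".join(cmd))
    # Uncomment to run it from projects/ai_audi:
    # subprocess.run(cmd, check=True)
```

Passing the arguments as a list sidesteps the quoting differences between bash and PowerShell for the asset selection string.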

Webserver submission (requires running dg dev)

If dg dev is already running, use the /dagster-run skill to submit and monitor a run:

/dagster-run

Describe what you want to materialize and the skill will submit the run, track step progress, and surface any errors — all from Claude Code.

Monitoring a Run

Use the /dagster-run skill to monitor progress or investigate a failure on any run, whether it was launched via the UI, dg launch, or the skill itself:

/dagster-run

The skill can poll a run until it reaches a terminal state and collect structured error output and tracebacks if it fails. The run ID appears in the Dagster UI URL (/runs/<run-id>) or in dg launch output.

Running Tests

Tests use pytest and live alongside each package. Run them from the repo root:

# Shared library tests
uv run --directory packages/ai_core pytest

# OEM project tests
uv run --directory projects/ai_<oem> pytest

Add -v for verbose output showing individual test names:

uv run --directory packages/ai_core pytest -v