dataproduct-builder-dbt
May 28, 2026 · View on GitHub
Skills for your favorite coding agent that helps you implement data products with dbt, compliant to your organization's conventions, and fully integrated with Entropy Data.
It also supports a contract-driven approach: specify your requirements as a data contract, and the builder implements the data product in minutes.
Skills
The plugin ships seven skills:
- dataproduct-bootstrap scaffolds a new dbt data product from scratch:
dbt_project.yml, model layout, README,profiles.yml.example, and more. - dataproduct-implement analyzes the data product input and output data contract, and implements the dbt models.
- dataproduct-exampledata extracts sample rows, drops PII columns flagged in the contract (and obvious name-based PII), and uploads the scrubbed sample to Entropy Data.
- datacontract-edit edits an output-port
models/output_ports/v<N>/*.odcs.yamlusing natural language. - datacontract-test runs
datacontract testto verify the live data still matches the schema and quality rules. - entropy-data-sync integrate existing dbt project to the Entropy Data reference layout, set up GitHub Actions workflow, and synchronize all metadata.
- entropy-data-teams lists the teams configured in Entropy Data so the user can pick an owner.
Install
The skills are plain markdown, any coding agent that can read instruction files can run them.
For major coding agents, those can be installed as a plugin:
Claude Code
In your terminal:
claude plugin marketplace add https://github.com/entropy-data/dataproduct-builder-dbt
claude plugin install dataproduct-builder-dbt@dataproduct-builder-dbt -s project
OpenAI Codex
In your terminal:
codex plugin marketplace add https://github.com/entropy-data/dataproduct-builder-dbt
codex plugin add dataproduct-builder-dbt@dataproduct-builder-dbt
GitHub Copilot CLI
In your terminal:
copilot plugin marketplace add https://github.com/entropy-data/dataproduct-builder-dbt
copilot plugin install dataproduct-builder-dbt@dataproduct-builder-dbt
Other agents (Cursor, Aider, etc.)
Any agent that reads AGENTS.md picks up the routing manifest.
Alternatively, copy the skills to the directory that your coding agent expects.
Connect
The skills authenticate against Entropy Data through a connection registered with the entropy-data CLI (requires uv).
The skills use a per-project venv for both entropy-data and datacontract. After dataproduct-bootstrap scaffolds a project, run uv sync from the project root to install both CLIs at the pinned versions, then invoke them as uv run entropy-data … / uv run datacontract ….
The one exception is the first call to dataproduct-bootstrap itself, which runs against an empty directory (no pyproject.toml, no venv yet) and needs entropy-data available globally for its lookup step. Install once per machine:
uv tool install --upgrade entropy-data
entropy-data connection add default --api-key <your-api-key> --host <your-entropy-data-host>
Create a user-scoped key in the Entropy Data web UI (Organization Settings → API Keys → Create new API key, scope User (personal token)). For CI workflows, add a connection with a team-scoped or organization-scoped API key.
Use
Ask the agent:
Implement the data product url or id or new name.
Or:
Add the property name to the data contract and find data products that we could use as input ports. Request access and implement the dbt pipeline.
Customization
Organizations with their own data-product stack and naming conventions are encouraged to fork or copy this repository and adapt it to their environment.
Common extension points:
- Templates under
skills/dataproduct-bootstrap/templates/andskills/entropy-data-sync/templates/ship the ODPS, ODCS, OpenLineage transport, GitHub Actions workflow, and dbt project skeleton that the bootstrap and sync skills install. Replace any of them to match your conventions (e.g. swap GitHub Actions for Airflow, change the model layer naming, embed company-specific tags). - Skills: add your own
skills/<name>/SKILL.mdfor stack-specific flows: specific skills for your data platform, internal data-quality checks, governance approvals, downstream sync to your data catalog, etc. UpdateAGENTS.mdand.github/copilot-instructions.mdso the routing tables surface them. - Hooks: extend
hooks/hooks.jsonwith additionalPostToolUsevalidators (e.g. an internal lint onmodels/**/*.sql, or a check that team names match your IdP). - Subagents: add subagents under
agents/for specialist roles (e.g. a PII scanner tuned to your classification taxonomy, a contract-review specialist for your terms-of-use boilerplate).
After customizing, rename the plugin in .claude-plugin/plugin.json and .codex-plugin/plugin.json, then publish under your own GitHub organization or GitLab repository.
Update the installation instructions and usage instructions in Entropy Data.
If a change you've made is broadly useful, open an issue or PR upstream, generic improvements are very welcome.
License
MIT