Fluid Forge

Forge Data Model

fluid forge data-model forges a reviewable data-model contract from a business intent file, raw DDL, or a configured metadata source. It writes a Fluid contract plus a logical model sidecar that downstream generation commands consume.

When to use it

Use fluid forge data-model when you want the CLI to create the semantic data-model layer before you generate dbt or other transformation artifacts.

For the full set of user journeys, including AI provider setup, strict hosted-provider smoke tests, Ollama, discovery, memory, DDL, source catalogs, dbt, and scheduling, see AI Forge And Data-Model Journeys.

Input | Command | Best for
YAML/JSON intent | fluid forge data-model from-intent | A business-first description of the data product you want
SQL DDL | fluid forge data-model from-ddl | Reverse-engineering existing warehouse tables
Catalog metadata | fluid forge data-model from-source | Forging from Snowflake, Unity, BigQuery, Dataplex, Glue, DataHub, or DMM metadata

Don't know where to start? Just run it bare.

Typing fluid forge data-model with no subcommand renders an interactive panel listing every input shape with a one-line description and a quick-start example. The guide is cwd-aware: if it sees intent.yaml it recommends from-intent, if it sees *.sql files it recommends from-ddl, and if you have a metadata source configured in ~/.fluid/sources.yaml it recommends from-source. The same pattern is also wired into fluid memory, fluid auth, fluid mcp, fluid policy, and fluid config.
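
The cwd-aware recommendation described above can be pictured as a small decision function. This is an illustrative Python sketch, not the CLI's implementation: recommend_subcommand is a hypothetical name, the check order is an assumption (the page does not say which signal wins when several are present), and testing only that the sources file exists is a simplification of "a metadata source configured".

```python
from pathlib import Path
from typing import Optional


def recommend_subcommand(cwd: Path, sources_file: Path) -> Optional[str]:
    """Suggest a data-model subcommand from what is on disk.

    Hypothetical helper. The order of the checks is an assumption.
    """
    # An intent.yaml in the working directory points at from-intent.
    if (cwd / "intent.yaml").is_file():
        return "from-intent"
    # Any *.sql file in the working directory points at from-ddl.
    if any(cwd.glob("*.sql")):
        return "from-ddl"
    # Simplification: the mere presence of the sources file counts as
    # "a metadata source is configured".
    if sources_file.is_file():
        return "from-source"
    return None
```

To mirror the paths named on this page, call it with Path.cwd() and Path.home() / ".fluid" / "sources.yaml".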

From an intent file

An intent file is a YAML or JSON description of the data product you want. The minimum useful shape is:

data_product:
  name: customer_orders
  domain: retail
grain:
  entity: order_line
  time_dimension: order_date
dimensions:
  entities: [customer, product, store]
metrics:
  - name: total_revenue
    description: Sum of order line revenue

Forge the model:

fluid forge data-model from-intent intent.yaml \
  --technique dimensional \
  --output customer_orders.fluid.yaml

Required minimum:

  • data_product.name
  • data_product.domain
  • At least one grain, dimensions.entities, metrics, or data_sources entry
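
The required minimum above can be expressed as a quick pre-check on an already-parsed intent. This is a rough Python sketch under stated assumptions: missing_minimum is a hypothetical helper, the intent is assumed to be loaded into a plain dict (for example from a JSON intent file), and the real rules are whatever fluid forge data-model from-intent --validate enforces.

```python
from typing import Any, Dict, List


def missing_minimum(intent: Dict[str, Any]) -> List[str]:
    """Return human-readable problems; an empty list means the minimum is met.

    Hypothetical helper that mirrors the three bullets above, nothing more.
    """
    problems: List[str] = []

    # data_product.name and data_product.domain must both be present.
    product = intent.get("data_product") or {}
    for key in ("name", "domain"):
        if not product.get(key):
            problems.append(f"data_product.{key} is required")

    # At least one of grain, dimensions.entities, metrics, data_sources.
    dimensions = intent.get("dimensions") or {}
    has_content = any([
        intent.get("grain"),
        dimensions.get("entities"),
        intent.get("metrics"),
        intent.get("data_sources"),
    ])
    if not has_content:
        problems.append(
            "need at least one of grain, dimensions.entities, "
            "metrics, data_sources"
        )
    return problems
```

A check like this is only useful as a fast guard in scripts; the CLI's own --validate remains the authoritative answer.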

Useful optional fields:

  • business_context
  • grain
  • dimensions
  • metrics
  • data_sources
  • business_rules
  • modeling.technique

Discover the format

The CLI can print example intents, emit the schema, and validate a file without forging anything:

fluid forge data-model from-intent --example
fluid forge data-model from-intent --example retail
fluid forge data-model from-intent --example telco
fluid forge data-model from-intent --example finance
fluid forge data-model from-intent --schema
fluid forge data-model from-intent --validate intent.yaml

--example prints parseable YAML to stdout. --schema prints the BusinessIntent JSON Schema for editors and automation. --validate checks the input file only and does not write contract artifacts.

Bundled examples live in the CLI repo under examples/intents/:

  • customer_orders.intent.yaml
  • telco_service.intent.yaml
  • finance_risk.intent.yaml

Field mapping

Intent field | Forged model meaning
data_product | Contract identity, domain, owner, and description
grain | Fact grain for dimensional models or the central entity for Data Vault 2.0
dimensions.entities | Dimensions in a star model or hubs in Data Vault 2.0
metrics | Semantic measures and metrics
data_sources | Source hints used in model docs and transformation generation
business_rules | Assumptions and logic notes preserved for reviewers
modeling.technique | Default modeling technique unless the CLI --technique flag overrides it
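
The precedence between modeling.technique and the --technique flag can be sketched in a few lines of Python. resolve_technique is a hypothetical name, the intent is assumed to be a parsed dict, and what the CLI does when neither value is set is not stated on this page, so the sketch returns None in that case.

```python
from typing import Any, Dict, Optional


def resolve_technique(
    intent: Dict[str, Any], cli_technique: Optional[str] = None
) -> Optional[str]:
    """Pick the modeling technique: the CLI flag wins over the intent file.

    Hypothetical helper; returns None when neither source sets a value.
    """
    if cli_technique:
        return cli_technique
    modeling = intent.get("modeling") or {}
    return modeling.get("technique")
```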

Output artifacts

A successful forge writes:

customer_orders.fluid.yaml
customer_orders.fluid.yaml.model.json
customer_orders.fluid.yaml.model.md
customer_orders.fluid.yaml.semantics.osi.yaml

Artifact | Purpose
.fluid.yaml | Fluid contract with OSI semantic metadata
.model.json | Canonical logical model sidecar used by downstream generators
.model.md | Human review document with Mermaid diagram, inventory, grain, metrics, dimensions, source hints, and assumptions
.semantics.osi.yaml | Standalone OSI semantic sidecar for BI and semantic tooling

The Markdown model document is written by default. To suppress only that human-readable document, pass --no-emit-model-doc:

fluid forge data-model from-intent intent.yaml \
  -o customer_orders.fluid.yaml \
  --no-emit-model-doc

The .model.json machine sidecar is still written.
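
The artifact names follow directly from the --output path, which makes them easy to compute in a script that checks a forge run. This is a minimal Python sketch: expected_artifacts is a hypothetical helper, and it assumes the naming shown in the listing above holds for any output path, and that --no-emit-model-doc removes only the .model.md file.

```python
from pathlib import Path
from typing import List


def expected_artifacts(output: str, emit_model_doc: bool = True) -> List[Path]:
    """List the files a successful forge is expected to write.

    Hypothetical helper; suffixes are copied from the listing above.
    """
    contract = Path(output)
    suffixes = [".model.json"]
    if emit_model_doc:
        # Skipped when the run used --no-emit-model-doc.
        suffixes.append(".model.md")
    suffixes.append(".semantics.osi.yaml")
    # Sidecars append their suffix to the full contract file name.
    return [contract] + [Path(str(contract) + s) for s in suffixes]
```

After a run, something like [p for p in expected_artifacts("customer_orders.fluid.yaml") if not p.exists()] gives the list of files that are missing.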

From DDL

fluid forge data-model from-ddl \
  --ddl legacy/orders.sql legacy/customers.sql \
  --source-type snowflake \
  --technique data-vault-2 \
  --output customer_orders.fluid.yaml

For live Snowflake schemas, first dump the DDL:

fluid forge data-model dump-ddl \
  --database BIZ_LAB \
  --schema SEEDED \
  -o /tmp/biz_lab.sql

Then feed the dump to from-ddl. The current DDL parser handles Snowflake GET_DDL output, including statements such as create or replace TABLE.
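
The two-step flow (dump, then forge) can be scripted by building both command lines from the same dump path. The Python sketch below only assembles argument lists using flags that appear on this page; dump_then_forge is a hypothetical helper, and it assumes --ddl accepts a single dump file the same way it accepts the file list in the earlier example.

```python
from typing import List


def dump_then_forge(
    database: str,
    schema: str,
    dump_path: str,
    output: str,
    technique: str = "data-vault-2",
) -> List[List[str]]:
    """Build the dump-ddl and from-ddl command lines, in run order.

    Hypothetical helper; it builds the commands and does not execute them.
    """
    dump = [
        "fluid", "forge", "data-model", "dump-ddl",
        "--database", database,
        "--schema", schema,
        "-o", dump_path,
    ]
    forge = [
        "fluid", "forge", "data-model", "from-ddl",
        "--ddl", dump_path,  # the file written by the first command
        "--source-type", "snowflake",
        "--technique", technique,
        "--output", output,
    ]
    return [dump, forge]
```

To execute them, pass each list to subprocess.run(argv, check=True) so that a failed dump stops the script before from-ddl runs.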

From a metadata source

fluid ai setup --source snowflake --name snowflake-prod

fluid forge data-model from-source \
  --source snowflake \
  --credential-id snowflake-prod \
  --database BIZ_LAB \
  --schema SEEDED \
  --technique data-vault-2 \
  -o biz_lab.fluid.yaml

See Source catalogs for the full Snowflake, Unity, BigQuery, Dataplex, Glue, DataHub, and Data Mesh Manager setup.

Generate dbt after forging

The transformation generator auto-loads the logical sidecar referenced by labels.modelSidecar and writes deterministic dbt SQL from the forged model:

fluid generate transformation customer_orders.fluid.yaml \
  -o ./dbt_customer_orders \
  --dbt-validate

For dbt output, the generator fails clearly if it produces zero models/**/*.sql files. A normal dbt project includes dbt_project.yml, profiles.yml, models/sources.yml when applicable, and non-empty SQL model files under models/.
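
The shape of a normal dbt project described above can be checked from a script, for example as a CI guard after generation. This is a minimal Python sketch, not the generator's own check: check_dbt_project is a hypothetical helper, and it ignores models/sources.yml because that file is only present when applicable.

```python
from pathlib import Path
from typing import List


def check_dbt_project(project_dir: Path) -> List[str]:
    """Return a list of problems; an empty list means the layout looks normal.

    Hypothetical helper based on the files named in the paragraph above.
    """
    problems: List[str] = []

    # Top-level project files.
    for name in ("dbt_project.yml", "profiles.yml"):
        if not (project_dir / name).is_file():
            problems.append(f"missing {name}")

    # At least one SQL model, at any depth under models/.
    models_dir = project_dir / "models"
    sql_files = sorted(models_dir.glob("**/*.sql")) if models_dir.is_dir() else []
    if not sql_files:
        problems.append("no SQL files under models/")

    # Every model file must contain something other than whitespace.
    for path in sql_files:
        if not path.read_text().strip():
            rel = path.relative_to(project_dir).as_posix()
            problems.append(f"empty model: {rel}")
    return problems
```

With the example above, check_dbt_project(Path("./dbt_customer_orders")) should return an empty list after a successful run.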

fluid generate speed-transformation and fluid generate dbt remain available as aliases; the docs use fluid generate transformation as the primary name.

AI provider runs

Provider keys are never part of the intent file or contract. Export one provider key for the shell session or run fluid ai setup:

export GOOGLE_API_KEY="<your-gemini-key>"

fluid forge data-model from-intent intent.yaml \
  -o customer_orders.gemini.fluid.yaml \
  --llm-provider gemini \
  --tiered \
  --require-llm

Use --require-llm when you need to prove a hosted provider actually ran. For CI and repeatable production pipelines, commit the forged artifacts and use deterministic validate, generate, plan, and apply steps rather than live LLM calls.

Deterministic and strict modes

fluid forge data-model from-intent intent.yaml \
  -o customer_orders.fluid.yaml \
  --deterministic

--deterministic disables cache and tiering for byte-stable replay. For provider validation, use --require-llm; it fails loudly if the configured LLM cannot run instead of falling back to heuristics.

fluid forge data-model from-intent intent.yaml \
  -o customer_orders.fluid.yaml \
  --llm-provider ollama \
  --llm-model gemma4:latest \
  --require-llm
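
The byte-stable replay that --deterministic promises can be spot-checked by comparing digests of two runs. The Python sketch below is one possible check, not a feature of the CLI: sha256_of and is_byte_stable are hypothetical helpers, and the suggested workflow (forge, keep a copy of the artifact, forge again to the same path, compare) is an assumption about how you would collect the two files.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hex SHA-256 digest of a file's raw bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def is_byte_stable(first: Path, second: Path) -> bool:
    """True when both files are byte-for-byte identical.

    Hypothetical helper; compare a saved copy from one run against the
    artifact written by a later run with the same inputs.
    """
    return sha256_of(first) == sha256_of(second)
```

Re-forging to the same output path keeps any file names embedded in the contract identical across runs, which avoids false mismatches.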

Review and iteration

Use --review to open the logical sidecar in $EDITOR before the contract is finalized:

fluid forge data-model from-intent intent.yaml \
  -o customer_orders.fluid.yaml \
  --review

Compare sidecars after edits:

fluid forge data-model diff old.model.json new.model.json

Validate a forged contract or sidecar:

fluid forge data-model validate customer_orders.fluid.yaml

For agent-driven edits, use the MCP server. It exposes read_logical_model, update_entity, add_relationship, and regenerate_physical with path and namespace policy controls.

Last Updated: 4/26/26, 10:42 PM
Contributors: fas89, Claude Opus 4.7