Fluid Forge

Builds, Exposes, Bindings

Every contract maps three questions onto three YAML blocks:

Question | Block | Example
How is the data produced? | builds[] | Embedded SQL, a dbt project, a Python script, a Spark job.
What does the product expose to consumers? | exposes[] | A table, a view, a file, a Kafka topic.
Where does it physically land? | binding (inside each expose) | gcp/bigquery_table, aws/s3_file, local/parquet, etc.

A contract can have many builds and many exposes. Every expose must declare exactly one binding.

builds[] — production logic

builds:
  - id: bitcoin_price_ingestion
    pattern: embedded-logic            # or hybrid-reference (dbt/python repo)
    engine: sql                        # or python, dbt, spark
    properties:
      sql: |
        SELECT CURRENT_TIMESTAMP AS price_timestamp,
               price AS price_usd
        FROM raw_btc_feed

Patterns supported in v0.7.3 (verified against fluid-schema-0.7.3.json):

  • embedded-logic — SQL/code inline in the contract. language enum: sql, flink_sql, pyspark, scala, python, r.
  • hybrid-reference — dbt-style: point at an external repo with a model: field and optional vars:.
  • multi-stage — orchestration pattern with a stages[] array of named build steps. Schema description: "Multi-stage orchestration pattern" (introduced in v0.5.5).
  • acquisition — source-aligned ingestion pattern (added in v0.7.3) for landing raw external data.

exposes[] — the consumer-facing API

exposes:
  - exposeId: bitcoin_prices
    title: Bitcoin Hourly Prices
    kind: table                        # see expose.kind enum below
    binding:
      platform: local
      format: parquet
      location:
        path: ./runtime/out/bitcoin_prices.parquet
    contract:
      schema:
        - name: price_timestamp
          type: TIMESTAMP
          required: true
        - name: price_usd
          type: NUMERIC
          required: true

The schema lives at exposes[].contract.schema. Quality rules live one level deeper at exposes[].contract.dq.rules (see Quality, SLAs & Lineage).

expose.kind enum (verified against fluid-schema-0.7.3.json): table · view · api · file · stream · topic · feature_store · model · vector · graph · time_series · other

binding — the physical landing target

binding.platform enum (v0.7.3): gcp · aws · azure · snowflake · databricks · kafka · local · kubernetes · other

binding.format enum (v0.7.3): bigquery_table · snowflake_table · gcs_file · s3_file · http_api · grpc_api · pubsub_topic · kafka_topic · delta_table · iceberg · parquet · csv · json · other

binding.location shape varies per format:

Format | Required location keys
bigquery_table | project, dataset, table (region optional)
snowflake_table | database, schema, table
s3_file | bucket, prefix (region optional)
parquet / csv | path (relative or absolute)
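
As a sketch of how the table reads in practice, here are two bindings that use only the location keys listed above. The key names come from the table; every value (bucket, prefix, database, schema, table) is a placeholder, not a name from a real project.

```yaml
# Hypothetical values; only the key names are taken from the table.
binding:
  platform: aws
  format: s3_file
  location:
    bucket: my-data-bucket        # placeholder
    prefix: bitcoin/prices/       # placeholder
    region: us-east-1             # optional for s3_file
---
binding:
  platform: snowflake
  format: snowflake_table
  location:
    database: ANALYTICS           # placeholder
    schema: CRYPTO                # placeholder
    table: BITCOIN_PRICES         # placeholder
```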

The "swap one line" trick

The point of bindings is that redeploying the same product to BigQuery touches only the binding block. Change platform: local to platform: gcp, then update the format and location keys to match the new platform's vocabulary. Everything else (schema, DQ rules, governance) stays identical.
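
To make this concrete, here is a sketch of the bitcoin_prices expose from earlier on this page, rebound to BigQuery. The location keys follow the bigquery_table row of the location table; the project, dataset, and table values are placeholders.

```yaml
exposes:
  - exposeId: bitcoin_prices
    title: Bitcoin Hourly Prices
    kind: table
    binding:
      platform: gcp                  # was: local
      format: bigquery_table         # was: parquet
      location:                      # was: path: ./runtime/out/bitcoin_prices.parquet
        project: my-gcp-project      # placeholder
        dataset: crypto              # placeholder
        table: bitcoin_prices        # placeholder
    contract:
      schema:                        # unchanged from the local version
        - name: price_timestamp
          type: TIMESTAMP
          required: true
        - name: price_usd
          type: NUMERIC
          required: true
```

Only the lines under binding differ from the local version; the contract block is copied verbatim.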

Multi-expose products: one product, many surfaces

Most data products produce one output. Some produce several: a Gold table for analysts, a feature_store view for the ML team, a streaming topic for downstream consumers. To do this, add multiple exposes[] entries:

exposes:
  - exposeId: customer_360_table          # for analysts
    kind: table
    binding:
      platform: gcp
      format: bigquery_table
      location: { project: prod, dataset: analytics, table: customer_360 }
    policy:
      authz:
        readers: [group:analysts@company.com]

  - exposeId: customer_360_features       # for ML
    kind: feature_store
    binding:
      platform: gcp
      format: bigquery_table
      location: { project: prod, dataset: features, table: customer_360_v1 }
    policy:
      authz:
        readers: [group:ml-team@company.com, serviceAccount:training@…]

  - exposeId: customer_changes            # for downstream
    kind: stream
    binding:
      platform: gcp
      format: pubsub_topic
      location: { project: prod, topic: customer-changes }

The builds[] are shared: the compute happens once, and the surfaces are independent. Each surface gets its own audience via policy.authz.

consumes[] — declaring dependencies

When your product depends on another product (Silver consuming Bronze, Gold consuming Silver), declare it in consumes[]:

consumes:
  - consumeId: bronze_orders
    productId: bronze.retail.orders_v1
    contract: { exposeId: orders_table }

consumes[] references compile to:

  • Lineage edges in fluid generate artifacts (OPDS / ODCS / DataMesh Manager output)
  • Read grants in fluid policy apply (the consumer's service principal gets read on the producer's expose)
  • Build-time validation — fluid validate confirms the upstream product exists and the cited exposeId matches

You don't usually wire this by hand. fluid forge infers it from your SQL/dbt refs. Override only when crossing system boundaries.

Build execution: where SQL/Python actually runs

builds[].execution controls the runtime environment:

builds:
  - id: bitcoin_price_ingestion
    pattern: embedded-logic
    engine: python
    properties:
      script: ./ingest.py
    execution:
      runtime:
        image: python:3.11-slim
        environment:                   # map of NAME: value, injected into the container
          AWS_REGION: us-east-1
          S3_BUCKET: my-ingest-bucket
      retries:                         # → retryPolicy schema
        maxAttempts: 3
        backoffStrategy: exponential   # fixed | exponential | linear
      trigger:
        type: scheduled
        cron: "0 * * * *"             # hourly

For SQL builds (engine: sql), the runtime is the warehouse itself (BigQuery, Snowflake, DuckDB) — execution.runtime is unused. For Python and Spark, the runtime is a container image; the chosen orchestrator (Airflow, Dagster, etc.) provisions it.
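
For contrast, here is a sketch of the SQL build from the top of this page with an execution block that omits runtime. It assumes retries and trigger apply to SQL builds the same way they do in the Python example above; this page does not confirm that, so check it against the schema before relying on it.

```yaml
builds:
  - id: bitcoin_price_ingestion
    pattern: embedded-logic
    engine: sql
    properties:
      sql: |
        SELECT CURRENT_TIMESTAMP AS price_timestamp,
               price AS price_usd
        FROM raw_btc_feed
    execution:
      # No runtime block: the warehouse itself executes the SQL.
      retries:                       # assumed to apply to SQL builds as well
        maxAttempts: 3
        backoffStrategy: exponential
      trigger:
        type: scheduled
        cron: "0 * * * *"            # hourly
```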

Where to look next

  • Providers vs platforms — how binding.platform resolves to actual cloud SDKs
  • Quality, SLAs & Lineage — the dq.rules, qos, and lineage blocks
  • Governance & Policy — the accessPolicy and agentPolicy blocks
  • fluid plan walkthrough — what the planner emits per binding
Last Updated: 5/17/26, 6:10 PM
Contributors: fas89, Claude Opus 4.7 (1M context)