Fluid Forge vs alternatives

If you already use dbt, Dagster, Terraform, OPA, or Snowpark, you might be wondering what problem Fluid Forge solves that those tools don't. Honest answer: taken one at a time, none. Forge's value is unifying the four contracts every data product has (schema, infrastructure, orchestration, policy) into one declarative file so they can't drift from each other.

This page is the comparison page we'd want to read if we were evaluating Forge cold. It includes honest losses (dbt has a bigger ecosystem, Dagster has better asset lineage, Snowpark has tighter Snowflake integration) so you can decide whether the unification trade is worth it for your team.

The 30-second answer

| You have… | Forge fits when… | Forge is overkill when… |
|---|---|---|
| One warehouse, one team, dbt + a CI runner | You'll need governance, agent boundaries, or to add a second cloud later | You'll never leave that warehouse, and policy/agent boundaries aren't on the roadmap |
| Multi-cloud already (BQ + Snowflake + Athena) | You're rewriting the same contract three times in three formats | You have dedicated platform engineers per cloud and they don't mind the duplication |
| Compliance pressure (SOX / GDPR / HIPAA) | You want governance, sovereignty, and AI access boundaries in the same file as the schema | You've already centralised governance on a separate plane (Immuta, Privacera) |
| Building agentic data products | You want declarative agentPolicy gating LLM access at read-time | Your agents query through a separate runtime layer that already enforces this |
| Prototyping, don't know what you need yet | Start local DuckDB, graduate when you do | You're at a code-only POC stage with no contract requirements |

The unification table

| Tool | What it owns | What it doesn't own | What you wire by hand today |
|---|---|---|---|
| dbt Core | SQL transformations, lineage, refs/sources, dbt tests | Provisioning, IAM, multi-cloud, agent governance, sovereignty | Terraform for IAM + Airflow for orchestration + OPA for policy + custom JSON for AI access |
| Dagster | Asset orchestration, asset checks, sensors, schedules | Schema-as-contract, native cloud IAM emission, multi-cloud abstraction | Terraform/Pulumi for infra + dbt for SQL + your own RBAC layer + your own AI gating |
| Airflow / Composer / MWAA | DAG scheduling, task retries, sensors | Schema, IAM, multi-cloud abstraction, contract validation | Resources + provider-specific operators + dbt + IAM + governance code |
| Terraform | Cloud infrastructure (any provider) | Schema, quality rules, transformations, orchestration | Tables + SQL + DAGs + dbt project + SLA checks + lineage emitters |
| OPA / Rego | Policy evaluation engine | Schema, transformations, cloud-native IAM emission, AI/agent boundaries | Compiling Rego → BigQuery row-level security / Snowflake masking / AWS IAM bindings |
| Snowpark / dbt-Snowflake | Snowflake-native data plane (UDFs, stored procs, dbt-cloud features) | Multi-cloud portability, agent governance | Rewriting everything if you ever leave Snowflake; a separate AI gate layer |
| Fluid Forge | All four contracts (schema, infra, orchestration, policy), plus AI gating, in one contract.fluid.yaml | Bespoke per-vendor extreme tuning (e.g. Snowflake search optimization, BigQuery BI Engine reservations) | Anything genuinely vendor-specific that doesn't have a cross-cloud abstraction |

Forge vs dbt

The most common comparison. dbt is the dominant SQL transformation framework; Forge is sometimes mistaken for a dbt competitor. It isn't.

Where dbt wins

  • Ecosystem maturity — 1000s of dbt packages, dbt-utils, dbt-expectations, dbt-snowflake, dbt-bigquery. Forge has none of these; it uses dbt for the SQL layer when engine: dbt is selected.
  • SQL-only teams — if your data product is a SQL transformation and nothing more, dbt is simpler. Forge's contract surface (schema, dq.rules, accessPolicy, agentPolicy, sovereignty) is overhead you don't need.
  • Community + hiring — dbt has been around since 2016. There are dbt analysts on the market. Forge engineers are still rare.
  • dbt Cloud / dbt Mesh — if you're committed to the dbt ecosystem and willing to pay for the cloud product, you get IDE, CI, lineage, and discovery without leaving dbt-land.

Where Forge wins

  • Multi-cloud portability — change binding.platform: snowflake to binding.platform: bigquery, redeploy. dbt's adapter layer handles SQL dialect differences but not the surrounding infrastructure (datasets, IAM, regions).
  • Governance as part of the contract — accessPolicy.grants compiles to native IAM (BigQuery IAM_BINDINGS, Snowflake GRANT, AWS resource policies). dbt has no equivalent — you wire IAM separately.
  • Agent governance — agentPolicy declares which LLMs can read which fields, with audit logging. dbt has no concept of this.
  • Sovereignty / regulatory framework — sovereignty.regulatoryFramework: ['SOX', 'GDPR'] is enforced before deploy. dbt does not validate compliance.
  • Local-first development — pipx install "data-product-forge[local]" and fluid apply work entirely on DuckDB with no cloud account. dbt-core works locally too, but the typical dbt onboarding assumes a warehouse.
  • The contract is the source of truth — dbt models describe transformations; Forge contracts describe the entire data product (schema, transformation, exposure, governance). Different scope.
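As a rough sketch of how the Forge-only fields above might sit together in one contract: the names binding.platform, accessPolicy.grants, agentPolicy, and sovereignty.regulatoryFramework are taken from this page, but where they sit in the file and every nested key and value below are assumptions made for illustration, not the documented schema.

```yaml
# Illustrative sketch only. Field names come from this page; the nesting and
# all keys marked "hypothetical" are placeholders, not the real schema.
binding:
  platform: snowflake            # swap to `bigquery` and redeploy
accessPolicy:
  grants:
    - principal: group:analysts  # hypothetical key and value
      permissions: [read]        # hypothetical key and value
agentPolicy:
  allowedModels: [example-llm]   # hypothetical: which LLMs may read
  deniedFields: [email]          # hypothetical: fields hidden from agents
  audit: true                    # hypothetical: audit logging switch
sovereignty:
  regulatoryFramework: ['SOX', 'GDPR']  # value as written on this page
```

The point of the sketch is scope rather than syntax: everything a dbt project would leave to Terraform, OPA, or custom code lives next to the platform binding, so a platform swap is a one-line change in the same file.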

How they fit together

Forge does not replace dbt. The recommended pattern is to set engine: dbt inside a Forge contract. Forge then handles provisioning, IAM, policy, and AI gating, while dbt handles the SQL transformation. You get both, with no overlap in responsibilities.

builds:
  - id: customer_metrics
    engine: dbt              # ← dbt does the SQL
    repository: ./dbt
    properties:
      project: customer_360
      target: prod

Forge vs Dagster

A sharper philosophical comparison. Dagster's asset-oriented orchestration is the closest thing in the OSS world to Forge's contract-first model.

Where Dagster wins

  • Asset orchestration depth — software-defined assets, partitioned assets, asset checks, asset sensors. Forge has builds + exposes; Dagster has a richer asset graph model with native lineage.
  • Python-first — write your business logic in Python and let Dagster orchestrate. Forge's primary interface is YAML; Python is for builds via engine: python.
  • Dagster Cloud / Plus — hosted control plane, branch deployments, concurrency controls. Forge has no hosted offering.
  • Op-level retries, backfills, and sensors — operationally rich. Forge defers orchestration to the chosen scheduler (fluid generate schedule --scheduler dagster | airflow | prefect).

Where Forge wins

  • Contract-first vs pipeline-first — Forge starts with "what is this data product" (schema, SLAs, governance). Dagster starts with "how is it computed" (assets, ops). Different first principle.
  • Native cloud IAM emission — same as the dbt comparison: accessPolicy.grants → bindings.json → policy-apply. Dagster has IO managers and resources, but no equivalent IAM compilation.
  • Multi-cloud abstraction at the contract layer — Dagster's resources are typed to a specific platform per pipeline. Forge's binding.platform is a swap.
  • Agent governance as a first-class contract field — Dagster doesn't model this.
  • Smaller surface for read-only data product producers — if your team's job is to publish a data product (not to operate a complex pipeline), Forge's mental model is lighter than Dagster's.

How they fit together

fluid generate schedule --scheduler dagster emits a Dagster job from the Forge contract. You get Forge's contract + governance + multi-cloud, plus Dagster's runtime. Forge owns the what; Dagster owns the how.


Forge vs Terraform

The infrastructure-as-code comparison. Terraform is universal; Forge is data-specific.

Where Terraform wins

  • Universality — Terraform manages everything: VPCs, Kubernetes clusters, IAM roles, S3 buckets, Stripe products, Cloudflare DNS. Forge only manages data products.
  • Provider ecosystem — 3000+ Terraform providers. Forge has 4 primary (local, gcp, aws, snowflake) plus a Custom Provider SDK.
  • State management — Terraform's state is a battle-tested model with locking, partial apply, drift detection. Forge has lighter state semantics tuned for data products.
  • Mature blast-radius controls — terraform plan, terraform import, workspaces, modules. Forge has fluid plan but the surrounding tooling is younger.

Where Forge wins

  • Data-product-specific abstractions: exposes, dq.rules, agentPolicy, sovereignty, and lineage have no native Terraform equivalent. You can only express them as ad-hoc resource configurations that drift.
  • Schema-as-contract: Forge validates the schema against the actual deployed table. Terraform treats a schema, at most, as one more resource attribute, not as a contract.
  • One contract, three clouds: Terraform needs three separate sets of resource definitions to deploy "the same" table on BigQuery, Snowflake, and Athena. Forge does it with one binding swap.
  • Compiles to Terraform: fluid generate artifacts --target terraform emits Terraform HCL when you want to keep using your existing Terraform pipeline downstream.

How they fit together

Forge sits on top of Terraform conceptually. A common split is to use Forge for the data-product layer and keep the existing Terraform pipeline for the surrounding infra (VPCs, KMS keys, etc.). Forge's policy-apply can either apply IAM directly or emit Terraform for human review.


Forge vs Snowpark / dbt-Snowflake / dbt Cloud

The single-vendor stack. If you've gone all-in on Snowflake, this is the stack you're really comparing against.

Where Snowflake-stack wins

  • Vendor-specific feature depth — Snowpark UDFs, stored procs, search optimization, query acceleration, time-travel, zero-copy clones. Forge can drive Snowflake but doesn't surface every Snowflake-specific tuning knob.
  • Single bill, single support contract — one vendor relationship, one billing system, one support team.
  • Snowflake Cortex / native LLM — if your strategy is "Snowflake will be the AI plane too", Cortex is integrated. Forge supports Snowflake Cortex via providers but isn't tied to it.
  • dbt Cloud — IDE, CI, lineage, jobs all hosted. Forge has none of the hosted UX yet.

Where Forge wins

  • Optionality — the day Snowflake pricing changes or a faster engine emerges (DuckDB, RisingWave, Materialize), you can move. With Snowpark you cannot.
  • Local development — Forge runs end-to-end on DuckDB with no Snowflake account. Snowpark needs a Snowflake account from day one.
  • Cross-warehouse — if part of your portfolio is on BigQuery and part on Snowflake, Forge unifies the contract. Snowpark + BigQuery is two parallel stacks.
  • Open-source, Apache-2.0 — no vendor lock at the orchestration layer.

How they fit together

Use binding.platform: snowflake for your Snowflake-resident data products and inherit Snowflake's vendor-specific features via binding.properties.snowflake.*. The contract stays portable; the Snowflake-specific knobs are a property pass-through.
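A minimal sketch of what that pass-through could look like. Only binding.platform and the binding.properties.snowflake.* path come from this page; the individual keys under snowflake are hypothetical and stand in for whichever Snowflake-specific knobs the provider actually accepts.

```yaml
# Illustrative sketch only: keys under `snowflake:` are hypothetical.
binding:
  platform: snowflake
  properties:
    snowflake:                     # vendor-specific pass-through namespace
      warehouse: ANALYTICS_WH      # hypothetical key and value
      searchOptimization: true     # hypothetical key and value
```

Keeping vendor knobs under one namespaced block means the rest of the contract does not change if the binding later moves to another platform; only this block would need to be dropped or replaced.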


When NOT to use Fluid Forge

The honest list. Adoption decisions are easier when you know where the tool is the wrong fit.

You're a one-warehouse, one-team analytics shop

If you have one Snowflake account, one dbt project, and no governance/compliance pressure, the Forge contract is overhead. Stay on dbt. You can still adopt fluid validate standalone if you ever want schema-as-contract testing.

You need real-time streaming with sub-second SLA

Forge's batch-and-mini-batch model fits 5-minute to 24-hour latency. For sub-second streaming (CDC → live materialized view), look at Materialize, RisingWave, or a Kafka + Flink stack. Forge's engine: kafka-connect and engine: debezium cover ingestion; the streaming compute layer is out of scope.
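For orientation, an ingestion-only build might look roughly like the following. The builds entry shape (id, engine, properties) mirrors the dbt example earlier on this page and the engine name debezium is taken from the paragraph above; the id and every key under properties are hypothetical.

```yaml
# Illustrative sketch only: covers ingestion, not streaming compute.
builds:
  - id: orders_cdc               # hypothetical build id
    engine: debezium             # engine name from this page
    properties:
      sourceDatabase: postgres   # hypothetical key and value
      tables: [public.orders]    # hypothetical key and value
```

Anything downstream of this build that needs sub-second freshness would still run in a dedicated streaming engine outside the contract.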

You're doing pure ML feature engineering

Forge's contract surface is general-purpose. For ML feature stores specifically, Feast, Tecton, or Hopsworks have richer concepts (feature views, point-in-time joins, online/offline parity). Forge can express the output feature table as a contract but doesn't have feature-store-specific abstractions.

Your team already has a working four-tool stack

If Terraform + dbt + Airflow + OPA is humming and the team is happy, the migration cost to a unified contract may not pay off. Consider adopting Forge incrementally — start with fluid validate for contract testing, expand only if/when the cross-tool drift starts to bite.
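One low-risk way to try that first step is a validate-only CI job that leaves the existing stack untouched. The sketch below uses GitHub Actions workflow syntax; the install command and the fluid validate command come from this page, while the workflow name, the contract path, and the argument form passed to fluid validate are assumptions. It also assumes pipx is available on the runner image.

```yaml
# Illustrative sketch only: a contract check that runs on every pull request.
name: contract-check                  # hypothetical workflow name
on: [pull_request]
jobs:
  validate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      # Install command as given on this page
      - run: pipx install "data-product-forge[local]"
      # Argument form is assumed; check the CLI reference for the exact usage
      - run: fluid validate contract.fluid.yaml
```

Because the job only validates, it can be added or removed without touching the Terraform, dbt, Airflow, or OPA pipelines already in place.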

You need a hosted control plane today

Forge currently ships as a CLI that runs in any CI system (GitHub Actions, GitLab CI, Jenkins, Tekton). There is no hosted Forge Cloud. If you need a SaaS UI for your data team to onboard non-engineers, Dagster Cloud / dbt Cloud are mature options today; the equivalent Forge offering is on the roadmap but not shipped.


Bottom line

Pick Fluid Forge when:

  • You're shipping data products across 2+ clouds (or expect to within 12 months)
  • Governance / compliance is part of the contract, not an after-the-fact audit
  • You're building agentic data products and need declarative LLM access boundaries
  • You want a single source of truth for schema, infra, orchestration, policy, and AI gating

Pick something else when:

  • You're committed to one warehouse forever and have no governance pressure → use dbt
  • You need rich orchestration with pipeline-first semantics → use Dagster
  • You need arbitrary cloud infrastructure (not just data products) → use Terraform
  • You need sub-second streaming → use Materialize / RisingWave
  • You need a feature store → use Feast / Tecton

Use them together when:

  • You want Forge's contract + governance with dbt's SQL: engine: dbt
  • You want Forge's contract with Dagster's runtime: fluid generate schedule --scheduler dagster
  • You want Forge's contract layered on top of Terraform: fluid generate artifacts --target terraform

The unification value is highest when at least two of (multi-cloud, governance, AI gating) apply to your data products. If only one applies, the dedicated tool for that one thing is usually a better fit.

See also

  • Concepts overview — the rest of the conceptual reference
  • Get Started — install and ship a data product in 30 seconds
  • Demos library — the proof, in 30-second SVG casts
  • Universal pipeline — the canonical 11-stage CI flow
Last Updated: 5/17/26, 6:51 PM
Contributors: fas89, Claude Opus 4.7 (1M context)