Fluid Forge

Provider Architecture

Providers are the execution layer of Fluid Forge. They translate your declarative YAML contract into concrete platform operations — creating tables in DuckDB locally, provisioning BigQuery datasets on GCP, or deploying schemas in Snowflake.

This page explains how the provider system works under the hood. If you want to build your own provider, see Creating Custom Providers.

How It Works

Every Fluid Forge command follows the same flow: contract → provider → plan → apply → result.

┌─────────────────────────────────────────────────────┐
│              FLUID Contract (YAML)                  │
│  id, name, version, consumes[], builds[], exposes[] │
└──────────────────────┬──────────────────────────────┘
                       │
                 ┌─────▼─────┐
                 │  fluid    │  CLI parses the contract
                 │  plan     │  and resolves the provider
                 └─────┬─────┘
                       │
           ┌───────────▼───────────┐
           │   Provider Registry   │  Discovers all available
           │   (auto-discovery)    │  providers at startup
           └───────────┬───────────┘
                       │
     ┌─────────────────┼─────────────────┐
     ▼                 ▼                 ▼
┌─────────┐     ┌──────────┐     ┌───────────┐
│  Local  │     │   GCP    │     │ Snowflake │  ...
│(DuckDB) │     │(BigQuery)│     │           │
└────┬────┘     └────┬─────┘     └─────┬─────┘
     │               │                 │
  plan() → actions   plan() → actions  plan() → actions
  apply() → result   apply() → result  apply() → result

This design gives you:

  • One contract, multiple targets: the same YAML runs locally for development, then deploys to any cloud in production
  • Deterministic planning: plan() is pure and has no side effects, so the same contract always produces the same actions
  • Idempotent apply: apply() is safe to re-run because it converges toward the desired state
  • Extensibility: add a new provider without changing contracts or the CLI

The Two Required Methods

Every provider must implement two methods, plan() and apply():

plan(contract) → actions

Reads the contract and returns a list of actions — plain Python dicts describing what needs to happen:

actions = provider.plan(contract)
# [
#   {"op": "load_data", "path": "data/customers.csv", "table_name": "customers"},
#   {"op": "execute_sql", "sql": "SELECT * FROM customers WHERE active", ...},
#   {"op": "materialize", "source_table": "result", "path": "out/active.csv"}
# ]

Planning makes no network calls and has no side effects. It's just data transformation: contract in, action list out.
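
To make "contract in, action list out" concrete, here is a minimal sketch of a pure planner. The function name plan_sketch and the simplified contract shape (the keys inside consumes, builds, and exposes) are assumptions made for this example, not Fluid Forge's actual planner or schema. Only the op names and action fields are taken from the example above.

```python
from typing import Any


def plan_sketch(contract: dict[str, Any]) -> list[dict[str, Any]]:
    """Hypothetical planner: derive actions from a simplified contract dict.

    It performs no I/O and does not mutate its input, so calling it twice
    with the same contract yields the same action list.
    """
    actions: list[dict[str, Any]] = []
    # Assumed shape: each consumed input names a file and a target table.
    for source in contract.get("consumes", []):
        actions.append({"op": "load_data", "path": source["path"],
                        "table_name": source["table_name"]})
    # Assumed shape: each build carries SQL and the table it produces.
    for build in contract.get("builds", []):
        actions.append({"op": "execute_sql", "sql": build["sql"],
                        "output_table": build["output_table"]})
    # Assumed shape: each exposed output names a source table and a file.
    for expose in contract.get("exposes", []):
        actions.append({"op": "materialize", "source_table": expose["source_table"],
                        "path": expose["path"]})
    return actions


contract = {
    "id": "active-customers",
    "consumes": [{"path": "data/customers.csv", "table_name": "customers"}],
    "builds": [{"sql": "SELECT * FROM customers WHERE active", "output_table": "result"}],
    "exposes": [{"source_table": "result", "path": "out/active.csv"}],
}
first = plan_sketch(contract)
second = plan_sketch(contract)
print([action["op"] for action in first])
```

Because the sketch only reads its argument, comparing first and second is a cheap way to check determinism in a unit test.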

apply(actions) → ApplyResult

Executes each action against the target platform and returns a structured result:

result = provider.apply(actions)
# ApplyResult(
#   provider="local",
#   applied=3, failed=0,
#   duration_sec=0.142,
#   timestamp="2026-03-05T10:30:00Z",
#   results=[
#     {"i": 0, "status": "ok", "op": "load_data"},
#     {"i": 1, "status": "ok", "op": "execute_sql"},
#     {"i": 2, "status": "ok", "op": "materialize"}
#   ]
# )
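
The result structure above can be approximated with a small dataclass and a dispatch loop. This is a sketch under stated assumptions: ApplyResultSketch, apply_sketch, and the handlers mapping are hypothetical names, and the "error" status for a failed action is an assumption, since the example above only shows "ok".

```python
import time
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Any, Callable


@dataclass
class ApplyResultSketch:
    """Hypothetical stand-in mirroring the fields shown in the example above."""
    provider: str
    applied: int
    failed: int
    duration_sec: float
    timestamp: str
    results: list[dict[str, Any]] = field(default_factory=list)


def apply_sketch(provider: str,
                 actions: list[dict[str, Any]],
                 handlers: dict[str, Callable[[dict[str, Any]], None]]) -> ApplyResultSketch:
    """Run each action through the handler registered for its op."""
    started = time.perf_counter()
    results: list[dict[str, Any]] = []
    for i, action in enumerate(actions):
        op = action["op"]
        try:
            handlers[op](action)  # an unknown op raises KeyError and is recorded as failed
            results.append({"i": i, "status": "ok", "op": op})
        except Exception as exc:
            # Assumption: failures are recorded per action rather than aborting the run.
            results.append({"i": i, "status": "error", "op": op, "error": repr(exc)})
    applied = sum(1 for r in results if r["status"] == "ok")
    return ApplyResultSketch(
        provider=provider,
        applied=applied,
        failed=len(results) - applied,
        duration_sec=round(time.perf_counter() - started, 3),
        timestamp=datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        results=results,
    )


handlers = {"load_data": lambda action: None, "noop": lambda action: None}
result = apply_sketch("local", [{"op": "load_data"}, {"op": "noop"}, {"op": "unknown_op"}], handlers)
print(result.applied, result.failed)
```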

Provider Discovery

When you run any fluid command, the CLI automatically discovers all available providers. No configuration is required.

How Discovery Finds Providers

Discovery runs a 4-layer pipeline, in order:

| Layer | What it does | Use case |
| --- | --- | --- |
| 1. Entry points | Scans pip-installed packages for fluid_build.providers entry points | Third-party providers installed via pip install |
| 2. Built-in modules | Imports the curated defaults: local, gcp, aws, snowflake, odps | The providers that ship with Fluid Forge |
| 3. Subpackage scan | Scans fluid_build/providers/* for any remaining modules | Catches providers added to the package tree |
| 4. Fallback | Re-attempts imports if registry is still empty | Recovers from import ordering issues |

Discovery is lazy (runs on first access), idempotent (subsequent calls are no-ops), and thread-safe.

Selecting a Provider

The CLI resolves which provider to use in this order:

  1. The --provider flag: fluid --provider gcp plan contract.yaml
  2. The FLUID_PROVIDER environment variable: export FLUID_PROVIDER=gcp

# List all discovered providers
fluid providers

# Restrict discovery to specific providers (advanced)
FLUID_PROVIDERS="local,gcp" fluid providers
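
The resolution order and the FLUID_PROVIDERS filter can be sketched as two small functions. The names resolve_provider and allowed_providers are hypothetical, and returning None when neither the flag nor the variable is set is an assumption, since this page does not describe the fallback.

```python
import os
from typing import Mapping, Optional


def resolve_provider(flag: Optional[str],
                     env: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Return the provider name: the --provider flag wins over FLUID_PROVIDER."""
    env = os.environ if env is None else env
    if flag:
        return flag
    # Assumption: None signals "not specified"; the real CLI may behave differently.
    return env.get("FLUID_PROVIDER") or None


def allowed_providers(discovered: list[str],
                      env: Optional[Mapping[str, str]] = None) -> list[str]:
    """Keep only providers named in the comma-separated FLUID_PROVIDERS, if set."""
    env = os.environ if env is None else env
    wanted = [name.strip() for name in env.get("FLUID_PROVIDERS", "").split(",") if name.strip()]
    if not wanted:
        return list(discovered)
    return [name for name in discovered if name in wanted]


print(resolve_provider("gcp", {"FLUID_PROVIDER": "local"}))
print(allowed_providers(["local", "gcp", "aws"], {"FLUID_PROVIDERS": "local, gcp"}))
```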

Built-in Providers

Fluid Forge ships with these providers:

| Provider | Runtime | Best for |
| --- | --- | --- |
| Local | DuckDB | Development, testing, CSV/Parquet workflows |
| GCP | BigQuery + GCS | Google Cloud production deployments |
| AWS | S3 + Athena + Glue | Amazon Web Services deployments |
| Snowflake | Snowflake | Enterprise data warehouse deployments |
| ODPS | Standards export | Data product interoperability (ODPS v4.1) |

# Local development
fluid --provider local apply contract.yaml --yes

# Deploy to GCP
fluid --provider gcp apply contract.yaml --project my-gcp-project

# Deploy to Snowflake
fluid --provider snowflake apply contract.yaml

# Deploy to AWS
fluid --provider aws apply contract.yaml --region us-east-1

The Action System

Actions are the intermediate representation between planning and execution. Each action is a plain dict with an op field that identifies the operation.

Standard Action Types

| Op | Purpose | Key fields |
| --- | --- | --- |
| load_data | Import a file into the query engine | path, table_name, format |
| execute_sql | Run a SQL transformation | sql, output_table, resource_id |
| materialize | Write results to an output file | source_table, path, format |
| copy | Copy or export data | source, destination, format |
| noop | Placeholder (no operation) | (none) |

Cloud providers define their own ops (e.g., ensure_dataset, ensure_table, create_view, grant_role).
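
Ops with an ensure_ prefix suggest a check-then-create pattern, which is one common way to make apply converge on re-runs. The sketch below models that pattern against an in-memory dict. The function signatures, the state layout, and the "created"/"unchanged" return values are assumptions for illustration, not the behavior of any real Fluid Forge provider.

```python
def ensure_dataset(state: dict[str, set[str]], name: str) -> str:
    """Create the dataset only if it is missing, and report what happened."""
    if name in state:
        return "unchanged"
    state[name] = set()
    return "created"


def ensure_table(state: dict[str, set[str]], dataset: str, table: str) -> str:
    """Create the table (and its dataset) only if missing."""
    ensure_dataset(state, dataset)
    if table in state[dataset]:
        return "unchanged"
    state[dataset].add(table)
    return "created"


# Hypothetical in-memory stand-in for a cloud platform's catalog.
state: dict[str, set[str]] = {}
first_run = ensure_table(state, "analytics", "customers")
second_run = ensure_table(state, "analytics", "customers")
print(first_run, second_run)
```

Running the same op twice leaves the state identical after the first call, which is the convergence property described for apply().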

Dependency Resolution

The planner builds a dependency graph and uses topological sorting to determine execution order. Data must be loaded before transformations run, and transformations must complete before materialization:

load_data(customers.csv)  ──┐
                             ├──▶  execute_sql(transform)  ──▶  materialize(output.csv)
load_data(orders.csv)     ──┘
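
The ordering in the diagram can be reproduced with the standard library's graphlib module (Python 3.9 or later). This is a sketch of topological sorting in general, not the planner's actual implementation. The step labels are copied from the diagram, and the dependency mapping is written by hand for the example.

```python
from graphlib import TopologicalSorter

# Map each step to the set of steps it depends on.
dependencies = {
    "load_data(customers.csv)": set(),
    "load_data(orders.csv)": set(),
    "execute_sql(transform)": {"load_data(customers.csv)", "load_data(orders.csv)"},
    "materialize(output.csv)": {"execute_sql(transform)"},
}

# static_order() yields every step after all of its dependencies.
order = list(TopologicalSorter(dependencies).static_order())
for step in order:
    print(step)
```

The relative order of the two load_data steps is not fixed by the graph, so code that consumes the order should rely only on the dependency guarantees.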

Capabilities

Providers advertise what they support through a capabilities object. The CLI uses this to enable or disable features dynamically:

def capabilities(self):
    return ProviderCapabilities(
        planning=True,       # Can generate execution plans
        apply=True,          # Can execute actions
        render=False,        # Can export to external formats
        graph=False,         # Can generate lineage graphs
        auth=False,          # Requires authentication
    )

Check capabilities from the CLI:

fluid providers         # Shows capabilities for all providers
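
One way a caller might gate features on these flags is sketched below. CapabilitiesSketch is a hypothetical stand-in that copies the five field names from the example above, and supports is an illustrative helper. Neither is the real ProviderCapabilities API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CapabilitiesSketch:
    """Hypothetical stand-in with the same flags as the example above."""
    planning: bool = True
    apply: bool = True
    render: bool = False
    graph: bool = False
    auth: bool = False


def supports(caps: CapabilitiesSketch, feature: str) -> bool:
    """Return True only if the flag exists and is enabled."""
    return bool(getattr(caps, feature, False))


caps = CapabilitiesSketch()
for feature in ("planning", "apply", "render", "graph"):
    state = "enabled" if supports(caps, feature) else "disabled"
    print(f"{feature}: {state}")
```

Treating an unknown flag as unsupported keeps older providers working when a caller asks about a capability they have never heard of.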

Error Handling

Providers use a two-tier error model:

| Error type | When to use | User experience |
| --- | --- | --- |
| ProviderError | User-fixable problems (bad contract, missing resource) | Friendly error message |
| ProviderInternalError | Bugs or environment failures (API outage) | Full traceback in debug mode |

from fluid_provider_sdk import ProviderError, ProviderInternalError

# User error — they can fix this
raise ProviderError("Dataset 'analytics' not found in project 'my-project'")

# Internal error — something unexpected broke
raise ProviderInternalError(f"BigQuery API returned unexpected status: {status}")
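
The two user experiences in the table could be produced by a reporting function like the one sketched here. To keep the example self-contained, it defines local stand-in exception classes instead of importing fluid_provider_sdk. The report function, its debug parameter, and the message wording are assumptions, not the CLI's actual output.

```python
import traceback


class ProviderError(Exception):
    """Local stand-in for the SDK class: a user-fixable problem."""


class ProviderInternalError(Exception):
    """Local stand-in for the SDK class: a bug or environment failure."""


def report(exc: Exception, debug: bool = False) -> str:
    """Format an error the way the two-tier model describes."""
    if isinstance(exc, ProviderError):
        return f"Error: {exc}"  # friendly message, no traceback
    if debug:
        # Full traceback only when debugging.
        return "".join(traceback.format_exception(type(exc), exc, exc.__traceback__))
    return "Internal provider error. Re-run in debug mode for the full traceback."


friendly = report(ProviderError("Dataset 'analytics' not found in project 'my-project'"))

try:
    raise ProviderInternalError("BigQuery API returned unexpected status: 503")
except ProviderInternalError as exc:
    quiet = report(exc, debug=False)
    verbose = report(exc, debug=True)

print(friendly)
print(quiet)
```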

Environment Variables

| Variable | Purpose | Example |
| --- | --- | --- |
| FLUID_PROVIDER | Default provider | local, gcp, snowflake |
| FLUID_PROJECT | Cloud project/account | my-gcp-project |
| FLUID_REGION | Deployment region | us-central1 |
| FLUID_PROVIDERS | Restrict which providers to discover | local,gcp |

Next Steps

  • Build your own provider: Creating Custom Providers
  • Use a specific provider: GCP · AWS · Snowflake · Local
  • See what's coming: Provider Roadmap
Last Updated: 4/16/26, 7:55 AM
Contributors: Jeff Watson