Quality, SLAs & Lineage

Three pillars of "is this data product trustworthy?" — all declarative, all enforced by fluid validate + fluid test + fluid verify.

Data quality rules — dq.rules

Rules live at exposes[].contract.dq.rules. Each rule has an id, a type, a severity, and (usually) a selector, a threshold, and an operator.

exposes:
  - exposeId: bitcoin_prices
    contract:
      schema:
        - name: price_usd
          type: NUMERIC
          required: true
      dq:
        rules:
          - id: price_not_null
            type: completeness          # ← one of the 8 allowed types
            selector: price_usd
            threshold: 1.0
            operator: ">="
            severity: error             # info | warn | error | critical

          - id: data_freshness
            type: freshness
            window: PT1H                # ISO 8601 duration
            severity: warn

Allowed type values (v0.7.3 schema): freshness · completeness · uniqueness · valid_values · accuracy · schema · anomaly_detection · drift_detection

Severity enum (verified against fluid-schema-0.7.3.json): info · warn · error · critical.
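
As a rough illustration of the two enums above, the following Python sketch pre-checks a rule dictionary before it goes into a contract. The function name check_rule and its messages are hypothetical; fluid validate against the JSON schema remains the authority.

```python
# Hypothetical pre-check for one dq rule; not part of the Fluid Forge CLI.
ALLOWED_TYPES = {
    "freshness", "completeness", "uniqueness", "valid_values",
    "accuracy", "schema", "anomaly_detection", "drift_detection",
}
ALLOWED_SEVERITIES = {"info", "warn", "error", "critical"}


def check_rule(rule: dict) -> list[str]:
    """Return the problems found in one dq rule (an empty list means OK)."""
    problems = []
    # Every rule carries an id, a type, and a severity.
    for field in ("id", "type", "severity"):
        if field not in rule:
            problems.append(f"missing field: {field}")
    if "type" in rule and rule["type"] not in ALLOWED_TYPES:
        problems.append(f"unknown type: {rule['type']}")
    if "severity" in rule and rule["severity"] not in ALLOWED_SEVERITIES:
        problems.append(f"unknown severity: {rule['severity']}")
    return problems


good = {"id": "price_not_null", "type": "completeness", "severity": "error"}
bad = {"id": "x", "type": "not_null", "severity": "fatal"}
print(check_rule(good))
print(check_rule(bad))
```

A check like this is only a convenience for catching typos early, for example in an editor hook.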

Conventional behavior (confirm specifics with fluid apply -h / fluid test -h for your CLI version):

  • error / critical — block the deploy. Used for hard guarantees.
  • warn — deploy proceeds; warning emitted to stdout + the test report.
  • info — recorded only.
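
The severity behavior above can be sketched as a small gate function. This is an illustration under assumptions, not the CLI's implementation: the (rule_id, severity, passed) tuple shape and the summarize name are made up for the example.

```python
# Severities that block a deploy, per the conventional behavior described above.
BLOCKING = {"error", "critical"}


def summarize(results):
    """Summarize rule outcomes.

    results: iterable of (rule_id, severity, passed) tuples. This tuple
    shape is an assumption for the sketch, not the CLI's report format.
    """
    failed = [(rid, sev) for rid, sev, passed in results if not passed]
    blocked_by = [rid for rid, sev in failed if sev in BLOCKING]
    warnings = [rid for rid, sev in failed if sev == "warn"]
    # Failed info-level rules are recorded elsewhere and never affect the gate.
    return {"deploy": not blocked_by, "blocked_by": blocked_by, "warnings": warnings}


outcome = summarize([
    ("price_not_null", "error", True),   # passed
    ("data_freshness", "warn", False),   # failed, but only a warning
])
print(outcome)
```

With these inputs the deploy proceeds and data_freshness is reported as a warning.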

SLAs — qos

Service-level targets at exposes[].qos:

exposes:
  - exposeId: bitcoin_prices
    qos:
      availability: "99.5%"
      latencyP95: PT5S

These targets are currently consumed by catalog publishing (ODCS) and Data Mesh Manager. Active monitoring against them is on the roadmap.
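
To make an availability target concrete, it helps to convert it into allowed downtime. The sketch below does that arithmetic in Python; the helper name and the 30-day period are assumptions chosen for illustration, not something the schema defines.

```python
def allowed_downtime_hours(availability: str, period_days: int = 30) -> float:
    """Convert a target such as "99.5%" into allowed downtime hours per period."""
    pct = float(availability.strip().rstrip("%"))
    if not 0 <= pct <= 100:
        raise ValueError(f"availability out of range: {availability!r}")
    return period_days * 24 * (1 - pct / 100)


# 99.5% over 30 days: 720 h * 0.005 = 3.6 h of tolerated downtime.
print(round(allowed_downtime_hours("99.5%"), 2))
```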

Lineage — auto-derived

You don't write lineage yourself. The schema captures upstream relationships through:

  1. consumes[] — explicit upstream-product references at the contract root.
  2. builds[].properties.sql — column-level lineage parsed from SQL.
  3. builds[].repository — for hybrid-reference builds, the dbt manifest is read for graph data.

The exact output paths and viewer formats depend on the CLI version + provider. Run fluid plan --html and check the generated artifact directory; document what you see in your team's runbook rather than relying on this page.
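
To give a feel for what "parsed from SQL" means, here is a deliberately naive Python sketch that pulls table-level upstream references out of a query with a regular expression. It is not how the CLI derives lineage (which is described above as column-level); the function name and regex are assumptions for illustration only.

```python
import re

# Matches the identifier that follows FROM or JOIN. Naive on purpose:
# it ignores subqueries and can misfire on constructs like EXTRACT(x FROM col).
TABLE_REF = re.compile(r"\b(?:FROM|JOIN)\s+([A-Za-z_][\w.]*)", re.IGNORECASE)


def upstream_tables(sql: str) -> list[str]:
    """Return referenced table names in first-seen order, without duplicates."""
    seen = []
    for name in TABLE_REF.findall(sql):
        if name not in seen:
            seen.append(name)
    return seen


sql = """
SELECT c.customer_id, t.amount
FROM raw.customers c
LEFT JOIN raw.transactions t USING (customer_id)
"""
print(upstream_tables(sql))
```

A real implementation would use a SQL parser rather than a regex; the point is only that upstream relationships are recoverable from the build's SQL without hand-written lineage.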

Common rule patterns

These are the rule shapes most production data products end up with. Copy them as a starting point. Each example uses only fields defined in fluid-schema-0.7.3.json.

Conditional completeness via the build, not the rule

The schema's dqRule shape (id, type, selector, threshold, operator, window, severity, description, tags, labels) intentionally doesn't carry a where: clause. The recommended pattern when a column is required for some rows but not others is to handle it in the SQL build, then check completeness on the fully-populated column:

builds:
  - id: customer_metrics
    pattern: embedded-logic
    engine: sql
    properties:
      sql: |
        SELECT
          customer_id,
          customer_age_days,
          -- arpu is non-null only when 30 days of history exists
          CASE
            WHEN customer_age_days >= 30 THEN COALESCE(arpu_30d_eur_raw, 0)
            ELSE NULL
          END AS arpu_30d_eur
        FROM raw.customers c
        LEFT JOIN raw.transactions t USING (customer_id)

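The rule is then a plain completeness check on the output column. Because the rule itself cannot filter rows, either evaluate it on a downstream output that already excludes the rows you don't care about, or tolerate the expected NULLs by setting the threshold below 1.0: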

dq:
  rules:
    - id: arpu_30d_completeness
      type: completeness
      selector: arpu_30d_eur
      threshold: 0.85         # 85% of all rows have non-null arpu (the 15% are < 30 days)
      operator: ">="
      severity: error

This pattern keeps the rule schema clean and pushes the lifecycle logic into SQL where it belongs.
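
The following Python sketch shows how a completeness score relates to threshold and operator. It is a minimal model under assumptions: the operator table, the helper names, and the treatment of an empty table are illustrative and not taken from the CLI.

```python
import operator

# Assumed operator set; the schema may allow more or fewer comparison operators.
OPS = {">=": operator.ge, ">": operator.gt, "<=": operator.le, "<": operator.lt}


def completeness(rows, column):
    """Fraction of rows whose column is not None."""
    if not rows:
        return 0.0  # assumption: an empty table counts as incomplete
    return sum(1 for row in rows if row.get(column) is not None) / len(rows)


def rule_passes(rows, rule):
    score = completeness(rows, rule["selector"])
    return OPS[rule["operator"]](score, rule["threshold"])


rule = {"selector": "arpu_30d_eur", "threshold": 0.85, "operator": ">="}
# 18 customers with 30+ days of history, 2 younger customers with NULL arpu.
rows = [{"arpu_30d_eur": 12.5}] * 18 + [{"arpu_30d_eur": None}] * 2
print(completeness(rows, "arpu_30d_eur"), rule_passes(rows, rule))
```

Here 18 of 20 rows are populated, so the score is 0.9 and the rule passes at a threshold of 0.85.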

Drift detection on schema or distribution

dq:
  rules:
    - id: schema_stability
      type: schema
      severity: critical              # block deploy on schema change without explicit version bump

    - id: revenue_distribution_drift
      type: drift_detection
      selector: weekly_revenue
      window: P14D                    # 14-day rolling baseline (ISO 8601 duration)
      threshold: 0.20                 # alert if drift score exceeds threshold
      operator: "<="
      severity: warn

drift_detection only works when fluid verify runs on a schedule, so that each run can be compared against the baseline window. Schedule fluid verify via your CI or orchestrator (Airflow, Dagster, GitHub Actions cron): the qos block on the expose declares the target, but scheduling lives in the runtime layer.
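
This page does not define how the drift score is computed, so the Python sketch below picks one simple option, the relative shift of the mean against the baseline window, purely as an assumption. The function names are hypothetical.

```python
from statistics import fmean


def mean_shift(baseline, current):
    """Relative change of the current mean versus the baseline mean."""
    base = fmean(baseline)
    if base == 0:
        raise ValueError("baseline mean is zero; relative shift is undefined")
    return abs(fmean(current) - base) / abs(base)


def drift_rule_passes(baseline, current, threshold=0.20):
    # Mirrors operator "<=": the rule passes while the score is at or below the threshold.
    return mean_shift(baseline, current) <= threshold


baseline = [100.0, 110.0, 90.0, 100.0]  # weekly_revenue over the baseline window
print(drift_rule_passes(baseline, [104.0, 106.0]))  # mean 105, shift 0.05
print(drift_rule_passes(baseline, [125.0, 135.0]))  # mean 130, shift 0.30
```

Whatever metric the runtime actually uses, the comparison has the same shape: a score computed against the window, checked with the rule's operator and threshold.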

Freshness with two-tier severity

The schema has no grace / escalate_after field — declare two separate rules with different windows + severities to express the same intent:

dq:
  rules:
    - id: freshness_hourly_warn
      type: freshness
      window: PT1H
      severity: warn

    - id: freshness_75min_critical
      type: freshness
      window: PT75M
      severity: critical

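CI runs fluid verify, and both rules evaluate against the same last-write timestamp of the deployed table. Because the windows differ, the warn rule fires first (data older than 60 minutes) and the critical rule follows once the data is more than 75 minutes old.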
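
The two-tier behavior can be modelled in a few lines of Python. This is a sketch under assumptions: the duration parser handles only time-only durations such as PT1H or PT75M (not P14D), and the fired helper is hypothetical rather than part of the CLI.

```python
import re
from datetime import datetime, timedelta, timezone

# Time-only ISO 8601 durations with whole numbers, e.g. PT1H, PT75M, PT1H30M.
_DURATION = re.compile(r"^PT(?:(\d+)H)?(?:(\d+)M)?(?:(\d+)S)?$")


def parse_duration(text: str) -> timedelta:
    match = _DURATION.match(text)
    if not match or not any(match.groups()):
        raise ValueError(f"unsupported duration: {text!r}")
    hours, minutes, seconds = (int(g) if g else 0 for g in match.groups())
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)


def fired(rules, last_write, now):
    """Return the ids of freshness rules whose window the data age exceeds."""
    age = now - last_write
    return [r["id"] for r in rules if age > parse_duration(r["window"])]


rules = [
    {"id": "freshness_hourly_warn", "window": "PT1H", "severity": "warn"},
    {"id": "freshness_75min_critical", "window": "PT75M", "severity": "critical"},
]
now = datetime(2026, 1, 1, 12, 0, tzinfo=timezone.utc)
print(fired(rules, now - timedelta(minutes=70), now))  # only the warn rule
print(fired(rules, now - timedelta(minutes=80), now))  # both rules
```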

Valid values

The schema's valid_values rule type takes a selector plus a threshold and operator. The allowed set itself is not part of the rule: enforce it in the build's filtering, and document it in the rule's or the schema field's description. Common pattern:

dq:
  rules:
    - id: country_valid_iso
      type: valid_values
      selector: country
      threshold: 1.0
      operator: ">="
      severity: error
      description: "country must be in ISO 3166 alpha-2 (US, CA, GB, ...)"

For richer enum enforcement, gate it in the build's WHERE clause (rejecting non-conforming rows to a quarantine table) or use the schema field's description to document the allowed set.
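
The quarantine idea can be sketched outside SQL as well. The Python below splits rows by an allowed set and computes the fraction a valid_values rule would compare against its threshold. The helper name is hypothetical and the three-code set is a truncated stand-in for the full ISO 3166 alpha-2 list.

```python
def split_by_allowed(rows, column, allowed):
    """Partition rows into (valid, quarantine) by membership of column in allowed."""
    valid, quarantine = [], []
    for row in rows:
        (valid if row.get(column) in allowed else quarantine).append(row)
    return valid, quarantine


ALLOWED = {"US", "CA", "GB"}  # illustrative subset, not the full ISO list
rows = [
    {"country": "US"},
    {"country": "GB"},
    {"country": "UK"},   # common mistake: not an ISO 3166 alpha-2 code
    {"country": None},
]
valid, quarantine = split_by_allowed(rows, "country", ALLOWED)
score = len(valid) / len(rows)
print(score, [r["country"] for r in quarantine])
```

If the build writes only the valid rows to the exposed table, the rule on that table can keep a threshold of 1.0 while the quarantined rows remain available for inspection.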

Multi-window monitoring

The schema doesn't provide a built-in scheduling block — qos declares targets, runtime declares schedules. Pattern:

  1. Targets live on exposes[].qos:

    exposes:
      - exposeId: customer_360_table
        qos:
          availability: 99.5
          freshnessSLO: PT1H              # ISO 8601 duration
          latencyP95: PT0.5S              # 500 ms; ISO 8601 durations have no millisecond designator
          completenessTarget: 0.99
          errorBudget: 0.01
    
  2. Schedules live in your CI / orchestrator (Airflow / Dagster / GitHub Actions cron). For example:

    # .github/workflows/verify-fast.yml
    on:
      schedule:
        - cron: "*/15 * * * *"          # every 15 min
    jobs:
      verify:
        runs-on: ubuntu-latest
        steps:
          - run: fluid verify contract.fluid.yaml --strict --env prod
    
    # .github/workflows/verify-weekly-audit.yml
    on:
      schedule:
        - cron: "0 8 * * MON"           # Monday 08:00
    jobs:
      audit:
        runs-on: ubuntu-latest
        steps:
          - run: fluid verify contract.fluid.yaml --strict --env prod --json | tee audit.log
    

The fast schedule catches stale-data incidents (against `freshnessSLO`); the slow audit catches creeping quality drift (against `completenessTarget`). Both invoke `fluid verify` against the same contract; the contract is the source of truth.

## Lineage emission formats

`fluid generate artifacts` emits lineage in three industry-standard formats:

| Format | File | Used by |
|---|---|---|
| **OPDS** (Open Product Data Schema) | `artifacts/standards/product.opds.json` | Generic catalog ingest |
| **ODCS** (Open Data Contract Standard) | `artifacts/standards/product.odcs.yaml` | Data Mesh Manager, Atlan, Collibra (when configured) |
| **OpenLineage** | `artifacts/lineage/openlineage.json` | Marquez, DataHub, OpenLineage-compliant tools |

Pick whichever your existing catalog speaks. The contract is the source of truth; these are derived artifacts that re-emit on every `apply`.

## Where to look next

- [Governance & Policy](./governance-policy.md) — `accessPolicy` and `agentPolicy` complementing `dq.rules`
- [Builds, Exposes, Bindings](./builds-exposes-bindings.md) — where `dq.rules` lives in the schema
- [`fluid verify`](/forge_docs/cli/verify) — runtime drift detection
- [`fluid test`](/forge_docs/cli/test) — pre-deploy quality gates
Last Updated: 5/17/26, 6:10 PM
Contributors: fas89, Claude Opus 4.7 (1M context)