Snowflake Provider

Deploy data products to Snowflake Data Cloud — databases, schemas, tables, RBAC grants — using the same contract and CLI commands as every other provider.

Status: ✅ Production
Docs Baseline: CLI 0.7.9
Tested Services: Databases, Schemas, Tables, Warehouses, RBAC Grants

Compatibility note

This page preserves some older 0.7.1 examples for backward-compatibility context. Current scaffolds emit fluidVersion: 0.7.2, and new orchestration examples should prefer fluid generate schedule --scheduler airflow.


Overview

The Snowflake provider turns a FLUID contract into real Snowflake infrastructure:

  • ✅ Plan & Apply — Databases, schemas, tables, warehouses
  • ✅ RBAC Compilation — fluid policy-compile generates Snowflake GRANT statements from accessPolicy
  • ✅ Sovereignty Validation — Region constraints enforced before deployment
  • ✅ Orchestration Generation — prefer fluid generate schedule --scheduler airflow for current docs
  • ✅ Governance — Classification, column masking, row-level security, audit labels
  • ✅ Universal Pipeline — Same Jenkinsfile as GCP and AWS — zero provider logic

Choose Your Starting Path

Use Snowflake in one of these two modes:

  • Enterprise recommended path: dbt-snowflake plus explicit environment-specific warehouse, database, schema, and role settings. Start with the billing_history example and the Snowflake quickstart.
  • Minimal starter path: native SQL with the smoke example when you want the smallest contract that still proves auth, validate, plan, apply, and verify.

For production teams, make these explicit per environment:

  • SNOWFLAKE_ACCOUNT
  • SNOWFLAKE_USER
  • SNOWFLAKE_WAREHOUSE
  • SNOWFLAKE_DATABASE
  • SNOWFLAKE_SCHEMA
  • SNOWFLAKE_ROLE

For production and CI, prefer authentication methods in this order:

  1. SNOWFLAKE_PRIVATE_KEY_PATH for key-pair auth
  2. SNOWFLAKE_OAUTH_TOKEN for federated automation
  3. SNOWFLAKE_AUTHENTICATOR for interactive SSO

Password auth is still supported, but it should be treated as a fallback rather than the default production path.

If no explicit credentials are present, browser SSO is only attempted in an interactive TTY session. Non-interactive runs should supply key-pair, OAuth, or another explicit authenticator instead of relying on browser prompts.
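
The precedence described above can be sketched in a few lines of Python. This is an illustrative sketch only, not Forge's implementation: the function name `pick_auth_method` and the returned labels are hypothetical, and placing password after the three preferred methods is an assumption drawn from its description as a fallback.

```python
import sys
from typing import Mapping, Optional


def pick_auth_method(env: Mapping[str, str], interactive: Optional[bool] = None) -> str:
    """Return a label for the auth method implied by the environment.

    Hypothetical helper: mirrors the documented preference order, with
    password as a fallback and browser SSO only in an interactive TTY.
    """
    if interactive is None:
        interactive = sys.stdin.isatty()
    if env.get("SNOWFLAKE_PRIVATE_KEY_PATH"):
        return "key-pair"
    if env.get("SNOWFLAKE_OAUTH_TOKEN"):
        return "oauth"
    if env.get("SNOWFLAKE_AUTHENTICATOR"):
        return "authenticator"
    if env.get("SNOWFLAKE_PASSWORD"):
        return "password"
    if interactive:
        return "browser-sso"
    raise RuntimeError("no explicit Snowflake credentials and no interactive TTY")
```

In a CI job you would call it as `pick_auth_method(os.environ, interactive=False)`, so a missing credential surfaces as an error rather than a browser prompt that nobody can answer.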

Working Example: Bitcoin Price Tracker

This is a production-tested example that runs end-to-end in Jenkins CI.

Contract

fluidVersion: "0.7.1"
kind: DataProduct
id: crypto.bitcoin_prices_snowflake_governed
name: Bitcoin Price Index (FLUID 0.7.1 + Snowflake + Governance)
description: >
  Real-time Bitcoin price data with comprehensive governance policies
  on Snowflake Data Cloud
domain: finance

tags:
  - cryptocurrency
  - real-time
  - governed
  - gdpr-compliant
  - snowflake

labels:
  cost_center: "CC-1234"
  business_criticality: "high"
  compliance_gdpr: "true"
  compliance_soc2: "true"
  platform: "snowflake"

metadata:
  layer: Gold
  owner:
    team: data-engineering
    email: data-engineering@company.com

# ── Data Sovereignty ──────────────────────────────────────────
sovereignty:
  jurisdiction: "EU"
  dataResidency: true
  allowedRegions:
    - eu-west-1          # Snowflake AWS Europe (Ireland)
    - eu-central-1       # Snowflake AWS Europe (Frankfurt)
    - europe-west4       # Snowflake GCP Europe (Netherlands)
  deniedRegions:
    - us-east-1
    - us-west-2
    - us-central1
  crossBorderTransfer: false
  transferMechanisms:
    - SCCs
  regulatoryFramework:
    - GDPR
    - SOC2
  enforcementMode: advisory
  validationRequired: true

# ── Access Policy: Snowflake RBAC ─────────────────────────────
accessPolicy:
  grants:
    - principal: "role:DATA_ANALYST"
      permissions: [read, select, query]

    - principal: "role:FINANCE_ANALYST"
      permissions: [read, select]

    - principal: "role:TRADER"
      permissions: [read, select, query]

    - principal: "role:DATA_ENGINEER"
      permissions: [write, insert, update, delete, create]

    - principal: "user:looker_service@company.com"
      permissions: [read, select]

# ── Expose: Snowflake Table ───────────────────────────────────
exposes:
  - exposeId: bitcoin_prices_table
    title: "Bitcoin Real-time Price Feed"
    version: "1.0.0"
    kind: table

    binding:
      platform: snowflake
      format: snowflake_table
      location:
        account: "{{ env.SNOWFLAKE_ACCOUNT }}"
        database: "CRYPTO_DATA"
        schema: "MARKET_DATA"
        table: "BITCOIN_PRICES"
      properties:
        cluster_by: ["price_timestamp"]
        table_type: "STANDARD"
        data_retention_time_in_days: 7
        change_tracking: true

    # Governance policies
    policy:
      classification: Internal
      authn: snowflake_rbac
      authz:
        readers:
          - role:DATA_ANALYST
          - role:FINANCE_ANALYST
          - role:TRADER
        writers:
          - role:DATA_ENGINEER
        columnRestrictions:
          - principal: "role:JUNIOR_ANALYST"
            columns: [market_cap_usd, volume_24h_usd]
            access: deny
      privacy:
        masking:
          - column: "ingestion_timestamp"
            strategy: "hash"
            params:
              algorithm: "SHA256"
        rowLevelPolicy:
          expression: >
            price_timestamp >= DATEADD(day, -30, CURRENT_TIMESTAMP())

    # Schema contract
    contract:
      schema:
        - name: price_timestamp
          type: TIMESTAMP_NTZ
          required: true
          description: UTC timestamp when price was recorded
          sensitivity: cleartext
          semanticType: "timestamp"

        - name: price_usd
          type: NUMBER(18,2)
          required: true
          description: Bitcoin price in USD
          sensitivity: cleartext
          semanticType: "currency"

        - name: price_eur
          type: NUMBER(18,2)
          required: false
          description: Bitcoin price in EUR

        - name: price_gbp
          type: NUMBER(18,2)
          required: false
          description: Bitcoin price in GBP

        - name: market_cap_usd
          type: NUMBER(20,2)
          required: false
          description: Total market capitalization in USD
          sensitivity: internal

        - name: volume_24h_usd
          type: NUMBER(20,2)
          required: false
          description: 24-hour trading volume in USD
          sensitivity: internal

        - name: price_change_24h_pct
          type: NUMBER(10,4)
          required: false
          description: 24-hour price change percentage

        - name: last_updated
          type: TIMESTAMP_NTZ
          required: false
          description: Timestamp from CoinGecko API

        - name: ingestion_timestamp
          type: TIMESTAMP_NTZ
          required: true
          description: When data was ingested into our system

# ── Build: API Ingestion ──────────────────────────────────────
builds:
  - id: bitcoin_price_ingestion
    description: Fetch Bitcoin prices from CoinGecko API
    pattern: hybrid-reference
    engine: python
    repository: ./runtime
    properties:
      model: ingest
    execution:
      trigger:
        type: manual
        iterations: 1
        delaySeconds: 3
      runtime:
        platform: snowflake
        resources:
          warehouse: "COMPUTE_WH"
          warehouse_size: "X-SMALL"
      retries:
        count: 3
        backoff: exponential
    outputs:
      - bitcoin_prices_table

Key Schema Patterns

The 0.7.1 binding schema uses three fields to identify platform resources:

| Field | Purpose | Snowflake Values |
| --- | --- | --- |
| binding.platform | Cloud provider | snowflake |
| binding.format | Storage format | snowflake_table |
| binding.location | Resource coordinates | account, database, schema, table |

GCP and AWS contracts use the same three-field structure with different values (platform: gcp, format: bigquery_table for GCP; platform: aws, format: parquet for AWS).

CLI Commands

The provider is autodetected from binding.platform, so --provider snowflake is not required for plan, apply, verify, or test.

# Validate Snowflake connectivity with the same config the provider uses
fluid auth status snowflake

# Validate contract shape
fluid validate contract.fluid.yaml

# Generate execution plan
fluid plan contract.fluid.yaml --env dev --out plans/plan-dev.json

# Deploy database, schema, table, and build logic
fluid apply contract.fluid.yaml --env dev --yes

# Verify the deployed Snowflake object against the contract schema
fluid verify contract.fluid.yaml --strict

# Optional: run the live contract test flow
fluid test contract.fluid.yaml

# Validate governance declarations
fluid policy-check contract.fluid.yaml

# Compile RBAC / access bindings from accessPolicy grants
fluid policy-compile contract.fluid.yaml --env dev --out runtime/policy/bindings.json

# Apply RBAC bindings (dry-run or enforce)
fluid policy-apply runtime/policy/bindings.json --mode check
fluid policy-apply runtime/policy/bindings.json --mode enforce

# Generate Airflow DAG (legacy 0.7.1 command; current docs prefer fluid generate schedule --scheduler airflow)
fluid generate-airflow contract.fluid.yaml --out airflow-dags/bitcoin_snowflake.py

Recommended deployment gate for enterprise teams:

  1. fluid validate
  2. fluid plan
  3. fluid policy-check
  4. fluid policy-compile
  5. fluid apply
  6. fluid verify --strict
  7. optional fluid test
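
A minimal Python sketch of this gate as a fail-fast runner. The commands and flags are the ones shown in the CLI listing above; the helper names `gate_commands` and `run_gate` are hypothetical, and the sketch assumes that each `fluid` command exits non-zero on failure.

```python
import subprocess
from typing import Callable, List, Optional, Sequence


def gate_commands(contract: str, env: str) -> List[List[str]]:
    """Build the gate steps in the recommended order (optional `fluid test` omitted)."""
    return [
        ["fluid", "validate", contract],
        ["fluid", "plan", contract, "--env", env, "--out", f"plans/plan-{env}.json"],
        ["fluid", "policy-check", contract],
        ["fluid", "policy-compile", contract, "--env", env,
         "--out", "runtime/policy/bindings.json"],
        ["fluid", "apply", contract, "--env", env, "--yes"],
        ["fluid", "verify", contract, "--strict"],
    ]


def run_gate(commands: Sequence[Sequence[str]],
             runner: Optional[Callable[[Sequence[str]], int]] = None) -> int:
    """Run steps in order and stop at the first non-zero exit code.

    Returns the number of steps that succeeded. `runner` is injectable so
    the gate can be exercised without a Snowflake account.
    """
    if runner is None:
        runner = lambda argv: subprocess.run(list(argv)).returncode
    succeeded = 0
    for argv in commands:
        if runner(argv) != 0:
            break
        succeeded += 1
    return succeeded
```

Calling `run_gate(gate_commands("contract.fluid.yaml", "dev"))` returns 6 only when every step passes, so a pipeline can compare the result against `len(commands)` before promoting a build.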

Every Snowflake session opened through the provider carries a QUERY_TAG so statements can be attributed in Snowflake QUERY_HISTORY. In practice this means plan/apply/verify traffic can be traced back to the contract and environment that issued it.

RBAC Policy Compilation

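fluid policy-compile reads accessPolicy.grants and compiles them into Snowflake GRANT bindings, which fluid policy-apply then checks or enforces. For the contract above, the compiled bindings look like this: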

{
  "provider": "snowflake",
  "bindings": [
    {
      "role": "role:DATA_ANALYST",
      "resource": "bitcoin_prices_table",
      "permissions": [
        "SELECT on CRYPTO_DATA.MARKET_DATA.BITCOIN_PRICES",
        "USAGE on DATABASE CRYPTO_DATA",
        "USAGE on SCHEMA CRYPTO_DATA.MARKET_DATA"
      ]
    },
    {
      "role": "role:DATA_ENGINEER",
      "resource": "bitcoin_prices_table",
      "permissions": [
        "INSERT on CRYPTO_DATA.MARKET_DATA.BITCOIN_PRICES",
        "UPDATE on CRYPTO_DATA.MARKET_DATA.BITCOIN_PRICES",
        "DELETE on CRYPTO_DATA.MARKET_DATA.BITCOIN_PRICES"
      ]
    }
  ]
}

The permission mapping:

| Contract Permission | Snowflake GRANT |
| --- | --- |
| read, select, query | SELECT on table, USAGE on database + schema |
| write, insert | INSERT on table |
| update | UPDATE on table |
| delete | DELETE on table |
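
The mapping above can be sketched as a small lookup. This is an illustrative sketch, not the compiler that ships with Forge: `compile_binding` and `TABLE_PRIVILEGES` are hypothetical names, and permissions that the table does not list (such as create) are simply skipped here.

```python
from typing import Dict, List

# Contract permission -> Snowflake table privilege, taken from the mapping table.
TABLE_PRIVILEGES = {
    "read": "SELECT", "select": "SELECT", "query": "SELECT",
    "write": "INSERT", "insert": "INSERT",
    "update": "UPDATE",
    "delete": "DELETE",
}


def compile_binding(principal: str, expose_id: str, database: str,
                    schema: str, table: str, permissions: List[str]) -> Dict:
    """Build one binding entry shaped like the JSON example above."""
    fqn = f"{database}.{schema}.{table}"
    privileges: List[str] = []
    for permission in permissions:
        privilege = TABLE_PRIVILEGES.get(permission)  # unmapped permissions are skipped
        if privilege and privilege not in privileges:
            privileges.append(privilege)
    lines = [f"{privilege} on {fqn}" for privilege in privileges]
    if "SELECT" in privileges:
        # Read access also needs USAGE on the containing database and schema.
        lines.append(f"USAGE on DATABASE {database}")
        lines.append(f"USAGE on SCHEMA {database}.{schema}")
    return {"role": principal, "resource": expose_id, "permissions": lines}
```

Deduplicating privileges keeps a grant such as [read, select, query] from emitting SELECT three times.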

Governance Scope

Use the governance commands this way:

  • fluid policy-check validates governance declarations in the contract.
  • fluid policy-compile and fluid policy-apply manage Snowflake RBAC and access-policy bindings.
  • Snowflake governance during apply handles object-level controls such as tags, descriptions, and masking policies.
  • fluid verify checks deployed schema and drift. It does not perform a full RBAC or entitlement audit.

Credentials Setup

Accepted SNOWFLAKE_ACCOUNT Formats

Forge accepts the common Snowflake account identifier formats that teams usually copy from the Snowflake UI, connector docs, or browser URL and normalizes them before opening the connection.

Accepted examples:

  • org-account
  • xy12345
  • xy12345.eu-central-1
  • xy12345.eu-central-1.aws
  • xy12345.eu-central-1.privatelink
  • https://xy12345.eu-central-1.aws.snowflakecomputing.com
  • https://app-org-account.privatelink.snowflakecomputing.com

Normalization rules:

  • strips https:// and .snowflakecomputing.com
  • strips cloud suffixes such as .aws, .gcp, and .azure
  • preserves .privatelink when it is part of the effective account identifier
  • strips the leading app- prefix from Snowsight-style browser hostnames

Invalid hostnames such as https://example.com fail fast with a validation error instead of being silently misparsed.
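
The normalization rules above could be approximated as follows. This is a sketch under stated assumptions, not Forge's parser: `normalize_account` is a hypothetical name, and the sketch assumes the app- prefix is only stripped when the input was a snowflakecomputing.com hostname.

```python
CLOUD_SUFFIXES = {"aws", "gcp", "azure"}
HOST_SUFFIX = ".snowflakecomputing.com"


def normalize_account(raw: str) -> str:
    """Reduce a copied account string to a bare account identifier."""
    value = raw.strip()
    is_url = value.lower().startswith("https://")
    if is_url:
        # Drop the scheme and any path after the hostname.
        value = value[len("https://"):].split("/", 1)[0]
    from_hostname = value.lower().endswith(HOST_SUFFIX)
    if from_hostname:
        value = value[: -len(HOST_SUFFIX)]
    elif is_url:
        # A URL that is not a Snowflake hostname fails fast.
        raise ValueError(f"not a Snowflake hostname: {raw!r}")
    if from_hostname and value.lower().startswith("app-"):
        value = value[len("app-"):]  # Snowsight-style browser hostname
    parts = value.split(".")
    if len(parts) > 1 and parts[-1].lower() in CLOUD_SUFFIXES:
        parts = parts[:-1]  # .aws / .gcp / .azure are dropped
    if not all(parts):
        raise ValueError(f"empty account identifier segment in {raw!r}")
    return ".".join(parts)
```

Note that .privatelink is never in the stripped set, so it survives whether it arrives as a bare identifier or inside a URL.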

Jenkins CI (Recommended)

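Create a Jenkins Secret File credential containing your Snowflake env vars. The example below uses password auth for brevity; for production, prefer key-pair auth via SNOWFLAKE_PRIVATE_KEY_PATH, as recommended in the authentication order above: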

# File contents (plain key=value, no 'export' prefix)
SNOWFLAKE_ACCOUNT=xy12345.eu-central-1
SNOWFLAKE_USER=FLUID_SERVICE
SNOWFLAKE_PASSWORD=xxxxxxxxxx
SNOWFLAKE_WAREHOUSE=COMPUTE_WH
SNOWFLAKE_ROLE=SYSADMIN
SNOWFLAKE_DATABASE=CRYPTO_DATA
SNOWFLAKE_SCHEMA=MARKET_DATA

The Universal Pipeline auto-detects this format and sources it into every stage. No provider-specific credential logic.
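
If you want to sanity-check a credentials file before uploading it, a small parser for the plain key=value format is enough. The sketch below is hypothetical tooling, not part of the Universal Pipeline: `parse_env_file` and `missing_settings` are illustrative names, and the required list is the set of per-environment settings named earlier on this page.

```python
from typing import Dict, List

REQUIRED = [
    "SNOWFLAKE_ACCOUNT", "SNOWFLAKE_USER", "SNOWFLAKE_WAREHOUSE",
    "SNOWFLAKE_DATABASE", "SNOWFLAKE_SCHEMA", "SNOWFLAKE_ROLE",
]


def parse_env_file(text: str) -> Dict[str, str]:
    """Parse plain KEY=value lines; reject lines that carry an 'export' prefix."""
    values: Dict[str, str] = {}
    for lineno, line in enumerate(text.splitlines(), start=1):
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if line.startswith("export "):
            raise ValueError(f"line {lineno}: remove the 'export' prefix")
        key, sep, value = line.partition("=")
        if not sep or not key:
            raise ValueError(f"line {lineno}: expected KEY=value")
        values[key] = value
    return values


def missing_settings(values: Dict[str, str]) -> List[str]:
    """List required settings that are absent or empty."""
    return [name for name in REQUIRED if not values.get(name)]
```

An empty list from `missing_settings(parse_env_file(open(".env").read()))` means every per-environment setting is explicit.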

Local Development

# .env file (same format as Jenkins)
cat > .env << 'EOF'
SNOWFLAKE_ACCOUNT=xy12345.eu-central-1
SNOWFLAKE_USER=FLUID_SERVICE
SNOWFLAKE_PASSWORD=xxxxxxxxxx
SNOWFLAKE_WAREHOUSE=COMPUTE_WH
SNOWFLAKE_ROLE=SYSADMIN
SNOWFLAKE_DATABASE=CRYPTO_DATA
SNOWFLAKE_SCHEMA=MARKET_DATA
EOF

# Source and run
set -a; . .env; set +a
fluid auth status snowflake
fluid plan contract.fluid.yaml --env dev --out runtime/plan.json
fluid apply contract.fluid.yaml --env dev --yes
fluid verify contract.fluid.yaml --strict

Session Context Initialization

After connecting, Forge pins the active Snowflake session explicitly using any configured role, warehouse, database, and schema:

USE ROLE <role>;
USE WAREHOUSE <warehouse>;
USE DATABASE <database>;
USE SCHEMA <schema>;

This keeps runtime behavior aligned with the contract and credential settings across local runs, CI, and automation. If any configured value is invalid, Forge fails fast instead of leaving the session half-initialized.

DATABASE and SCHEMA settings may be dot-qualified when Snowflake accepts that shape, for example ANALYTICS.RAW.
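
As a rough illustration, the USE statements can be derived from whichever settings are configured. The helper name `session_init_statements` is hypothetical, and this sketch does no identifier quoting or validation; it only shows the ordering and the skipping of unset values.

```python
from typing import List, Optional


def session_init_statements(role: Optional[str] = None,
                            warehouse: Optional[str] = None,
                            database: Optional[str] = None,
                            schema: Optional[str] = None) -> List[str]:
    """Return USE statements for the configured values, in a fixed order."""
    ordered = [
        ("ROLE", role),
        ("WAREHOUSE", warehouse),
        ("DATABASE", database),
        ("SCHEMA", schema),
    ]
    # Unset or empty values produce no statement.
    return [f"USE {kind} {value};" for kind, value in ordered if value]
```

Setting the role first matters because the role determines which warehouse, database, and schema the session is allowed to use.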

Infrastructure Created

When you run fluid apply on a Snowflake contract, the provider creates:

| Resource | Details |
| --- | --- |
| Database | CRYPTO_DATA |
| Schema | CRYPTO_DATA.MARKET_DATA |
| Table | CRYPTO_DATA.MARKET_DATA.BITCOIN_PRICES, clustered by price_timestamp |
| Warehouse | COMPUTE_WH (X-SMALL), used for queries and ingestion |

What the Pipeline Produces

After a successful run, the table contains real ingested data, which you can query:

SELECT price_timestamp, price_usd, price_eur, market_cap_usd
FROM CRYPTO_DATA.MARKET_DATA.BITCOIN_PRICES
ORDER BY price_timestamp DESC
LIMIT 5;

Example output:
┌──────────────────────┬───────────┬───────────┬────────────────┐
│ PRICE_TIMESTAMP      │ PRICE_USD │ PRICE_EUR │ MARKET_CAP_USD │
├──────────────────────┼───────────┼───────────┼────────────────┤
│ 2025-01-30 14:30:52  │ 104809.00 │  96543.00 │ 2075000000.00  │
└──────────────────────┴───────────┴───────────┴────────────────┘

Governance Features

Data Sovereignty

The sovereignty block checks region restrictions before any infrastructure is deployed. With enforcementMode: strict, a violation blocks deployment; the example below uses advisory mode:

sovereignty:
  jurisdiction: "EU"
  allowedRegions: [eu-west-1, eu-central-1, europe-west4]
  deniedRegions: [us-east-1, us-west-2, us-central1]
  crossBorderTransfer: false
  regulatoryFramework: [GDPR, SOC2]
  enforcementMode: advisory  # or strict (blocks deployment)
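
The region check can be pictured as the following sketch. It is not Forge's validator: `check_region` is a hypothetical name, and the behavior of advisory mode (report the violation but allow deployment) is an assumption inferred from the comment that only strict blocks deployment.

```python
from typing import Iterable, Tuple


def check_region(region: str, allowed: Iterable[str], denied: Iterable[str],
                 mode: str = "advisory") -> Tuple[bool, str]:
    """Return (deploy_ok, message) for a target region."""
    allowed = list(allowed)
    denied = list(denied)
    if region in denied:
        violation = f"region {region} is in deniedRegions"
    elif allowed and region not in allowed:
        violation = f"region {region} is not in allowedRegions"
    else:
        return True, "ok"
    if mode == "strict":
        return False, violation  # strict: block deployment
    # Assumed advisory behavior: surface a warning, do not block.
    return True, f"warning: {violation}"
```

With the lists from the example contract, `check_region("us-east-1", allowed, denied, mode="strict")` returns a blocking result, while the same call in advisory mode returns a warning.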

Column-Level Security

Restrict specific columns from specific roles:

authz:
  columnRestrictions:
    - principal: "role:JUNIOR_ANALYST"
      columns: [market_cap_usd, volume_24h_usd]
      access: deny

Privacy Masking

Hash sensitive fields and limit which rows are visible with a row-level policy:

privacy:
  masking:
    - column: "ingestion_timestamp"
      strategy: "hash"
      params:
        algorithm: "SHA256"
  rowLevelPolicy:
    expression: "price_timestamp >= DATEADD(day, -30, CURRENT_TIMESTAMP())"

Row-level security expressions are intentionally validated against a narrow SQL-expression allowlist before Forge generates Snowflake row access policies. Keep these expressions to predicate-style logic such as comparisons, boolean operators, function calls, and string literals.

Forge rejects or skips unsafe expressions that contain statement separators, SQL comments, or statement-level keywords such as SELECT, USE, GRANT, DROP, or INSERT. When that happens, planning continues and the CLI emits a warning so you can fix the contract instead of applying unsafe SQL.
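
To get a feel for what is rejected, here is a simplified pre-check that covers only the unsafe patterns listed above. It is a denylist sketch rather than the allowlist validator Forge uses, and `unsafe_reason` is a hypothetical name. Because it scans the raw text, it will also flag a keyword that appears inside a string literal.

```python
import re
from typing import Optional

# Statement-level keywords named in the text above, matched as whole words.
_FORBIDDEN_KEYWORDS = re.compile(r"\b(SELECT|USE|GRANT|DROP|INSERT)\b", re.IGNORECASE)
# Statement separators and SQL comment markers.
_FORBIDDEN_TOKENS = (";", "--", "/*", "*/")


def unsafe_reason(expression: str) -> Optional[str]:
    """Return why an expression looks unsafe, or None if no pattern matched."""
    for token in _FORBIDDEN_TOKENS:
        if token in expression:
            return f"contains {token!r}"
    match = _FORBIDDEN_KEYWORDS.search(expression)
    if match:
        return f"contains statement keyword {match.group(1).upper()}"
    return None
```

The word-boundary match means a function such as CURRENT_USER() is not mistaken for the USE keyword.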

Snowflake-Native Security

The contract's governance maps naturally to Snowflake's built-in features:

| Contract Feature | Snowflake Implementation |
| --- | --- |
| accessPolicy.grants | GRANT SELECT/INSERT/UPDATE/DELETE ON TABLE ... TO ROLE ... |
| columnRestrictions | Dynamic Data Masking policies |
| rowLevelPolicy | Row Access Policies |
| sovereignty.allowedRegions | Account region validation |
| classification | Object tagging via TAG |

CI/CD Pipeline

The Snowflake example uses the exact same Jenkinsfile as GCP and AWS — the Universal Pipeline. Key stages:

| Stage | Command | What Happens |
| --- | --- | --- |
| Validate | fluid validate | Contract checked against 0.7.1 schema |
| Export | fluid odps export / fluid odcs export | Standards files generated |
| Compile RBAC | fluid policy-compile | accessPolicy → Snowflake GRANT bindings |
| Plan | fluid plan | Execution plan generated |
| Apply | fluid apply | Database + schema + table created |
| Apply RBAC | fluid policy-apply | RBAC grants enforced |
| Execute | fluid execute | ingest.py runs, inserts rows into Snowflake |
| Airflow DAG | fluid generate-airflow | Production DAG generated |

Snowflake Table Properties

The binding.properties block supports Snowflake-specific table features:

binding:
  platform: snowflake
  format: snowflake_table
  location:
    database: "ANALYTICS"
    schema: "MARTS"
    table: "CUSTOMER_METRICS"
  properties:
    cluster_by: ["customer_id", "order_date"]
    table_type: "STANDARD"              # STANDARD or TRANSIENT
    data_retention_time_in_days: 7      # Time Travel retention
    change_tracking: true               # Enable CDC streams
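
For orientation, these properties correspond to clauses of Snowflake's CREATE TABLE statement. The sketch below shows one plausible translation and is not the DDL Forge actually emits: `create_table_ddl` is a hypothetical helper, and the column list in the usage note is invented for illustration because the example binding does not define a schema.

```python
from typing import Dict, List, Tuple


def create_table_ddl(database: str, schema: str, table: str,
                     columns: List[Tuple[str, str]], properties: Dict) -> str:
    """Render a CREATE TABLE statement from binding.properties (illustrative)."""
    transient = str(properties.get("table_type", "STANDARD")).upper() == "TRANSIENT"
    kind = "TRANSIENT TABLE" if transient else "TABLE"
    cols = ", ".join(f"{name} {ctype}" for name, ctype in columns)
    ddl = f"CREATE {kind} IF NOT EXISTS {database}.{schema}.{table} ({cols})"
    cluster_by = properties.get("cluster_by")
    if cluster_by:
        ddl += f" CLUSTER BY ({', '.join(cluster_by)})"
    if "data_retention_time_in_days" in properties:
        # Time Travel retention window, in days.
        ddl += f" DATA_RETENTION_TIME_IN_DAYS = {int(properties['data_retention_time_in_days'])}"
    if "change_tracking" in properties:
        flag = "TRUE" if properties["change_tracking"] else "FALSE"
        ddl += f" CHANGE_TRACKING = {flag}"
    return ddl
```

Called with the properties above and two assumed columns, `customer_id NUMBER` and `order_date DATE`, it yields a single statement with CLUSTER BY, DATA_RETENTION_TIME_IN_DAYS, and CHANGE_TRACKING clauses in that order.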

See Also

  • Snowflake Team Collaboration Walkthrough - Role-based PR review example for Snowflake teams
  • Universal Pipeline — Same Jenkinsfile for every provider
  • AWS Provider — Amazon Web Services integration
  • GCP Provider — Google Cloud Platform integration
  • CLI Reference — Full command documentation
Last Updated: 4/16/26, 11:38 AM
Contributors: Jeff Watson, jeffwatson-ai