Fluid Forge

Google Dataplex Catalog

Source-side catalog adapter for Google Dataplex — Google Cloud's universal metadata catalog. Wraps three Dataplex SDK clients (catalog, lineage, glossary) so one adapter call materialises all three signal sources.

Recommended for: GCP-native teams using Dataplex for governance. Pairs with BigQuery for richer warehouse metadata, plus Dataplex's aspect types (data-quality scores, freshness SLAs) and business glossary.

Install

pip install "data-product-forge[gcp]"

Same extra as BigQuery — installs google-cloud-dataplex and google-cloud-bigquery together.

Privileges to grant

The adapter is read-only on metadata.

# Required: read entries (tables) from any registered entry-group.
gcloud projects add-iam-policy-binding my-proj \
  --member="user:analyst@example.com" \
  --role="roles/dataplex.metadataReader"

# Optional: read lineage links.
gcloud projects add-iam-policy-binding my-proj \
  --member="user:analyst@example.com" \
  --role="roles/datalineage.viewer"

# Optional: read business glossary.
gcloud projects add-iam-policy-binding my-proj \
  --member="user:analyst@example.com" \
  --role="roles/dataplex.glossaryReader"

If lineage / glossary roles are missing, the adapter soft-fails on those reads (forge still works).

Authentication methods

Same as BigQuery — Dataplex uses the same Google auth stack.

| Method | When to use | Setup |
| --- | --- | --- |
| adc ★ | Default | gcloud auth application-default login. |
| service_account_json | CI | Path to SA JSON key file. |
| service_account_email | GKE / Cloud Run | Email of workload-identity SA. |

Setup

fluid ai setup --source dataplex --name dataplex-prod
# ? Catalog: dataplex
# ? Project: my-proj
# ? Location: EU                   (or US, asia-northeast1, ...)
# ? Auth method:
#   ★ adc (recommended)
#     service_account_json
#     service_account_email
# ? Default entry-group: @bigquery (default; matches BQ-imported entries)
# ✓ Saved to ~/.fluid/sources.yaml

Three-client construction

When the adapter constructs its SDK clients, it materialises all three at once (catalog, lineage, glossary) and caches them on the adapter instance, so a second call returns the cached dict. This keeps the construction code in one place. Per-call lifecycle is managed by fluid mcp serve: the server exits when stdin closes, so there is no long-lived state.
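The build-once, cache-on-the-instance pattern described above can be sketched roughly as follows. This is an illustration only, not the adapter's actual code: the class name `DataplexAdapterSketch`, the `clients()` method, and the factory callables are all hypothetical, and plain placeholder objects stand in for the real Dataplex SDK clients.

```python
from typing import Any, Callable, Dict, List, Optional


class DataplexAdapterSketch:
    """Hypothetical stand-in for the adapter; not the real Fluid Forge class."""

    def __init__(self, factories: Dict[str, Callable[[], Any]]) -> None:
        # One zero-argument factory per client: catalog, lineage, glossary.
        self._factories = factories
        self._clients: Optional[Dict[str, Any]] = None

    def clients(self) -> Dict[str, Any]:
        # The first call builds all three clients together;
        # every later call returns the same cached dict.
        if self._clients is None:
            self._clients = {name: make() for name, make in self._factories.items()}
        return self._clients


built: List[str] = []  # records each construction so the caching is observable


def make_factory(name: str) -> Callable[[], Any]:
    def factory() -> Any:
        built.append(name)
        return object()  # placeholder for a real SDK client
    return factory


adapter = DataplexAdapterSketch(
    {name: make_factory(name) for name in ("catalog", "lineage", "glossary")}
)
first = adapter.clients()
second = adapter.clients()  # no new construction happens here
```

Because the cache lives on the instance, it disappears with the process, which is consistent with the no-long-lived-state note above.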

End-to-end demo

fluid ai setup --source dataplex --name dataplex-prod

# Use the default entry group (@bigquery — covers all BQ tables).
fluid forge data-model from-source \
  --source dataplex \
  --credential-id dataplex-prod \
  --database my-proj --schema analytics \
  --technique data-vault-2 \
  -o analytics.fluid.yaml

# Or scope by an explicit entry group:
fluid forge data-model from-source \
  --source dataplex \
  --credential-id dataplex-prod \
  --catalog '@custom-entry-group' \
  --database my-proj \
  -o custom.fluid.yaml

What lands where

| Dataplex source | Forge output |
| --- | --- |
| Entry displayName | OSIDataset.fields[].expression.description |
| Entry description | OSIDataset.fields[].expression.description |
| Entry aspect_types[].record (DQ scores, freshness SLA) | exposes[].qos (Fluid contract) |
| Glossary terms | OSI.ai_context.synonyms + examples |
| Lineage links | metadata.lineage.upstream[] + DV2 link inference |
| Aspect type governance.classification | agentPolicy.sensitiveData[] |

Soft-fail on optional reads

If the lineage API isn't enabled on the project, or the glossary read role is missing, the adapter returns empty results for those reads instead of erroring. The whole forge isn't blocked — you get a working contract, just without lineage / glossary signal.

# Enable lineage + glossary explicitly:
gcloud services enable datalineage.googleapis.com dataplex.googleapis.com
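As a rough illustration of the soft-fail behaviour, the sketch below wraps each optional read so that an unavailable API or a missing role produces an empty result instead of an exception. Every name here is an assumption made for the example: `OptionalReadUnavailable`, `soft_read`, and the three reader functions are hypothetical and are not part of Fluid Forge or the Dataplex SDK.

```python
from typing import Callable, Dict, List


class OptionalReadUnavailable(Exception):
    """Hypothetical error: the API is disabled or the read role is missing."""


def soft_read(read: Callable[[], List[dict]]) -> List[dict]:
    # Optional signal: degrade to an empty result rather than failing the forge.
    try:
        return read()
    except OptionalReadUnavailable:
        return []


def read_entries() -> List[dict]:
    # Required read; in this sketch its errors would propagate to the caller.
    return [{"name": "orders"}]


def read_lineage() -> List[dict]:
    raise OptionalReadUnavailable("Lineage API not enabled on the project")


def read_glossary() -> List[dict]:
    raise OptionalReadUnavailable("glossary read role missing")


signals: Dict[str, List[dict]] = {
    "entries": read_entries(),
    "lineage": soft_read(read_lineage),
    "glossary": soft_read(read_glossary),
}
```

With both optional reads failing, `signals` still carries the table entries, which matches the working-contract-without-lineage-or-glossary outcome described above.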

Common errors

CatalogConfigError: google-cloud-dataplex missing

Run pip install "data-product-forge[gcp]".

CatalogPermissionError: roles/dataplex.metadataReader required

The suggestion list attached to the error contains the IAM binding command.

Glossary terms come back empty

Likely no glossary configured in the project, or missing roles/dataplex.glossaryReader. Adapter soft-fails — forge still works.

Lineage entries come back empty

Likely the Lineage API isn't enabled on the project, or the roles/datalineage.viewer role is missing. Adapter soft-fails.

See also

  • Catalog index
  • BigQuery catalog — pairs with Dataplex
  • GCP provider page
Last Updated: 5/17/26, 6:10 PM
Contributors: fas89, Claude Opus 4.7, Claude Opus 4.7 (1M context)