[Enhancement] – Add native BigQuery GEOGRAPHY data type support via destination adapter#3855

Open
ugbotueferhire wants to merge 2 commits into dlt-hub:devel from ugbotueferhire:feat/bigquery-geography-type-3847

@ugbotueferhire

[Enhancement] – Add native BigQuery GEOGRAPHY data type support via destination adapter

Description

Context:
When loading geospatial data (e.g. from PostGIS) into BigQuery, dlt mapped geometry/geography columns to STRING. Users who needed BigQuery's native GEOGRAPHY type — which enables spatial queries via ST_DISTANCE, ST_CONTAINS, etc. — had to manually cast columns post-load with custom scripting. This was a friction point for geo-heavy workloads and a barrier to adoption (ref: #3847).

dlt already solved this exact problem for the Postgres destination via postgres_adapter(data, geometry="col"), which uses an x-postgres-geometry column hint to emit geometry(Geometry, <srid>) at DDL time. No equivalent existed for BigQuery.

Approach:
Replicated the proven x-hint adapter pattern for BigQuery. The data travels through the pipeline as text (WKT/GeoJSON strings), and only at the BigQuery destination is it materialized as GEOGRAPHY. This avoids any changes to the core type system, normalizer, coercion logic, or schema engine version.

  • bigquery_adapter.py: Added GEOGRAPHY_HINT ("x-bigquery-geography") constant and a new geography: TColumnNames parameter to bigquery_adapter(). Accepts a single column name or list. Validation and hint-setting logic follows the established pattern used by cluster and partition.
  • factory.py: Overrode BigQueryTypeMapper.to_destination_type() to check for GEOGRAPHY_HINT and return "GEOGRAPHY". Added "GEOGRAPHY": "text" to dbt_to_sct reverse mapping and explicit handling in from_destination_type() for round-trip completeness.
  • bigquery.py: Imported GEOGRAPHY_HINT alongside existing adapter hints for consistency and future use.
  • test_bigquery_table_builder.py: Added 5 focused unit tests covering adapter configuration, DDL generation, and reverse type mapping.
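
To make the adapter → hint → type mapper → DDL flow concrete, here is a minimal, dependency-free sketch. It is not the code from this PR: only the hint name `x-bigquery-geography` and the `GEOGRAPHY` ↔ `text` mapping come from the description above. The function names `apply_geography_hint` and `create_table_sql`, the trimmed-down type tables, and the dict-based column shape are hypothetical stand-ins for illustration.

```python
from typing import Dict, List, Union

# Hint name as given in the PR description; everything else is a simplified stand-in.
GEOGRAPHY_HINT = "x-bigquery-geography"

Column = Dict[str, object]

# Deliberately tiny type tables; a real mapper covers many more types.
SCT_TO_DB = {"text": "STRING", "bigint": "INT64", "double": "FLOAT64"}
DB_TO_SCT = {"STRING": "text", "INT64": "bigint", "FLOAT64": "double", "GEOGRAPHY": "text"}


def apply_geography_hint(
    columns: Dict[str, Column], geography: Union[str, List[str]]
) -> Dict[str, Column]:
    """Hypothetical adapter step: accept one name or a list and set the hint on each column."""
    names = [geography] if isinstance(geography, str) else list(geography)
    if not names or not all(isinstance(n, str) and n for n in names):
        raise ValueError("`geography` must be a column name or a non-empty list of names")
    for name in names:
        # Columns stay `text` in the schema; only the destination hint is added.
        column = columns.setdefault(name, {"name": name, "data_type": "text"})
        column[GEOGRAPHY_HINT] = True
    return columns


def to_destination_type(column: Column) -> str:
    """Check the hint first, then fall back to the regular type mapping."""
    if column.get(GEOGRAPHY_HINT):
        return "GEOGRAPHY"
    return SCT_TO_DB[str(column["data_type"])]


def from_destination_type(db_type: str) -> str:
    """Reverse mapping: GEOGRAPHY reads back as plain text."""
    return DB_TO_SCT[db_type]


def create_table_sql(table: str, columns: Dict[str, Column]) -> str:
    """Hypothetical DDL builder that asks the mapper for each column type."""
    body = ",\n".join(f"  `{c['name']}` {to_destination_type(c)}" for c in columns.values())
    return f"CREATE TABLE `{table}` (\n{body}\n)"


columns: Dict[str, Column] = {
    "name": {"name": "name", "data_type": "text"},
    "location": {"name": "location", "data_type": "text"},
}
apply_geography_hint(columns, geography="location")
ddl = create_table_sql("places", columns)
print(ddl)
```

The point of the design is visible in `to_destination_type`: the column never stops being `text` in the schema, so nothing upstream of the destination has to know about geography at all.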

Impact:

  • Functionality: Users can now declare geography columns with a single call — bigquery_adapter(data, geography="location") — and dlt will create the column as GEOGRAPHY in BigQuery. Supports WKT (POINT(-118.4 33.9)) and GeoJSON string inputs in WGS84. Enables native spatial queries immediately after load.
  • Developer Experience: Zero new concepts introduced. The pattern is identical to the existing Postgres geometry adapter, making it instantly familiar to contributors. No core type system changes, no schema engine version bump, no cross-destination impact.
  • Coverage/Robustness: Full coverage of the adapter → hint → type mapper → DDL generation code path, plus reverse type mapping verification.
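
Because the values travel as plain strings, a malformed or axis-swapped point would only surface at load time. The sketch below shows one way a user might pre-check WKT points before yielding them. It is not part of this PR; `parse_wgs84_point` is a hypothetical helper, it handles only simple 2D `POINT` literals, and it assumes the usual WGS84 bounds (longitude in [-180, 180], latitude in [-90, 90]) with WKT's longitude-first ordering.

```python
import re
from typing import Tuple

# Matches a simple 2D point such as "POINT(-118.4 33.9)"; other WKT shapes are out of scope.
_POINT_RE = re.compile(
    r"^\s*POINT\s*\(\s*(-?\d+(?:\.\d+)?)\s+(-?\d+(?:\.\d+)?)\s*\)\s*$",
    re.IGNORECASE,
)


def parse_wgs84_point(wkt: str) -> Tuple[float, float]:
    """Return (longitude, latitude) or raise ValueError for malformed or out-of-range input."""
    match = _POINT_RE.match(wkt)
    if match is None:
        raise ValueError(f"not a simple WKT POINT: {wkt!r}")
    lon, lat = float(match.group(1)), float(match.group(2))
    # WKT lists x (longitude) before y (latitude).
    if not -180.0 <= lon <= 180.0:
        raise ValueError(f"longitude out of range: {lon}")
    if not -90.0 <= lat <= 90.0:
        raise ValueError(f"latitude out of range: {lat}")
    return lon, lat


print(parse_wgs84_point("POINT(-118.4 33.9)"))
```

A swapped pair such as `POINT(33.9 -118.4)` is rejected here because -118.4 is not a valid latitude, which catches one common mistake early. Swaps where both values happen to fall inside both ranges cannot be detected this way.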

Usage

import dlt
from dlt.destinations.adapters import bigquery_adapter

@dlt.resource
def places():
    yield [
        {"name": "Null Island", "location": "POINT(0 0)"},
        {"name": "London",      "location": "POINT(-0.1276 51.5074)"},
    ]

bigquery_adapter(places, geography="location")

pipeline = dlt.pipeline("geo_pipeline", destination="bigquery")
pipeline.run(places())
# → BigQuery column `location` is GEOGRAPHY, ready for ST_* queries
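
The description also mentions GeoJSON string inputs. As a rough sketch of what the same two records could look like in that form, the snippet below serializes each point to a GeoJSON string with the standard library. `geojson_point` is a hypothetical helper, not something this PR adds, and the snippet has not been run against this branch; it only builds the rows a resource would yield.

```python
import json
from typing import Dict, List


def geojson_point(lon: float, lat: float) -> str:
    """Serialize a point as a GeoJSON string; GeoJSON orders coordinates as [longitude, latitude]."""
    return json.dumps({"type": "Point", "coordinates": [lon, lat]})


# Same records as the WKT example, with `location` carried as a GeoJSON string.
rows: List[Dict[str, str]] = [
    {"name": "Null Island", "location": geojson_point(0, 0)},
    {"name": "London", "location": geojson_point(-0.1276, 51.5074)},
]

for row in rows:
    print(row["name"], row["location"])
```

The key detail is that `location` remains a string in each row, consistent with the approach of moving geography data through the pipeline as text.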

Tests

  • New tests added
  • Existing tests updated or refactored

Unit Tests Summary:

| Test | Verifies |
| --- | --- |
| test_adapter_geography_hint_config | Single column string correctly sets x-bigquery-geography hint |
| test_adapter_geography_hint_multiple_columns | List of columns all receive the hint |
| test_geography_column_sql_create | CREATE TABLE generates GEOGRAPHY column type |
| test_geography_column_sql_alter | ALTER TABLE generates ADD COLUMN ... GEOGRAPHY |
| test_geography_from_destination_type | GEOGRAPHY db type maps back to text data type |

tests/load/bigquery/test_bigquery_table_builder.py::test_adapter_geography_hint_config PASSED
tests/load/bigquery/test_bigquery_table_builder.py::test_adapter_geography_hint_multiple_columns PASSED
================= 2 passed, 49 deselected in 20.11s =================

Note: The 3 gcp_client-fixture tests (test_geography_column_sql_create, test_geography_column_sql_alter, test_geography_from_destination_type) require BigQuery credentials available only in CI — the same pre-existing constraint that applies to all other gcp_client tests in the file.

@ugbotueferhire
Author

Hi @rudolfix, whenever you have a moment, could you please take a look at this PR for a review? Let me know if you need any changes. Thanks!

@anuunchin linked an issue on Apr 13, 2026 that may be closed by this pull request
@anuunchin requested a review from rudolfix on April 13, 2026 08:26
@ugbotueferhire
Author

Hey @rudolfix, gentle ping on this PR. It's been sitting without a review for about 3 weeks. Would appreciate your eyes on it when you get a chance.

Development

Successfully merging this pull request may close these issues.

Support for BigQuery's GEOGRAPHY data type at destination

2 participants