Problem
dlt applies column description hints to BigQuery only when creating a table (CREATE TABLE) or adding new columns (ALTER TABLE ADD COLUMN). If descriptions are added to the schema after the table already exists, they are never propagated to BigQuery on subsequent pipeline runs.
This happens because _get_table_update_sql in job_client_impl.py only processes new_columns (the schema diff). Existing columns whose metadata changed (e.g. description added) are not included.
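The effect can be illustrated with a toy model of a name-based schema diff. This is a simplified sketch and not dlt's actual implementation: `diff_new_columns` and the plain-dict column shape are hypothetical stand-ins used only to show why a changed description on an existing column never reaches the diff.

```python
from typing import Dict, List, Optional

# Simplified stand-in for a column schema: hint name -> value. Not dlt's real types.
Column = Dict[str, Optional[str]]


def diff_new_columns(
    schema_columns: Dict[str, Column], storage_columns: Dict[str, Column]
) -> List[str]:
    """Return only the columns that are missing at the destination (hypothetical helper)."""
    return [name for name in schema_columns if name not in storage_columns]


# State of the table at the destination: "id" exists without a description.
storage = {"id": {"data_type": "bigint", "description": None}}

# Current schema: "id" gained a description, "email" is a brand-new column.
schema = {
    "id": {"data_type": "bigint", "description": "Primary key"},
    "email": {"data_type": "text", "description": "Contact address"},
}

new_columns = diff_new_columns(schema, storage)
print(new_columns)  # ['email']
```

Only `email` is reported, so any SQL generated from this diff carries the description for `email` but nothing for `id`.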
Expected behavior
When a column's description changes in the dlt schema, the next pipeline run should emit an ALTER TABLE ... ALTER COLUMN ... SET OPTIONS(description=...) statement to update the BigQuery column metadata.
This is a metadata-only change (no data modification) and is safe to apply unconditionally.
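As a rough sketch of the statement in question, the helper below builds one `ALTER COLUMN ... SET OPTIONS` statement per column. The function names `bq_string_literal` and `set_description_sql` are hypothetical and not part of dlt; the escaping covers only backslashes, double quotes, and newlines, and passing `None` is assumed to clear the description via `description=NULL`.

```python
from typing import Optional


def bq_string_literal(value: str) -> str:
    """Quote a Python string as a BigQuery double-quoted string literal."""
    escaped = (
        value.replace("\\", "\\\\")  # escape backslashes first
        .replace('"', '\\"')
        .replace("\n", "\\n")
    )
    return f'"{escaped}"'


def set_description_sql(table: str, column: str, description: Optional[str]) -> str:
    """Build one ALTER COLUMN ... SET OPTIONS statement; None clears the description."""
    value = "NULL" if description is None else bq_string_literal(description)
    return (
        f"ALTER TABLE `{table}` ALTER COLUMN `{column}` "
        f"SET OPTIONS (description={value})"
    )


print(set_description_sql("my_dataset.users", "id", "Primary key"))
```

The table and column names here are placeholders; a real implementation would presumably reuse the identifier quoting that the destination client already applies.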
Reproduction
- Create a pipeline with a resource that writes to BigQuery, with column descriptions in the schema
- Run the pipeline — descriptions are applied correctly on CREATE TABLE
- Remove or change a description in the schema, run again — the BQ column description is NOT updated
- Alternatively: create the table first without descriptions, add descriptions later — they never appear in BQ
Affected code
dlt/destinations/job_client_impl.py — _get_table_update_sql only handles new_columns
dlt/destinations/impl/bigquery/bigquery.py — _get_column_def_sql correctly generates OPTIONS with description, but is only called for new/created columns
Suggested fix
In the BigQuery SQL client, when generate_alter=True, compare existing column descriptions with the schema and emit ALTER TABLE ... ALTER COLUMN <col> SET OPTIONS(description=...) for any changes. This could be added as a BigQuery-specific override since the SET OPTIONS syntax is BQ-specific.
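The comparison step could look roughly like the sketch below. It assumes the existing descriptions have already been read from BigQuery into a plain dict; `changed_descriptions` is a hypothetical helper, not an existing dlt function. Each entry in the result would translate into one `ALTER COLUMN ... SET OPTIONS` statement.

```python
from typing import Dict, Optional


def changed_descriptions(
    existing: Dict[str, Optional[str]], desired: Dict[str, Optional[str]]
) -> Dict[str, Optional[str]]:
    """Map column name -> new description for columns whose description differs.

    Columns absent from `existing` are skipped: they are new and already get
    their description through the CREATE TABLE / ADD COLUMN path.
    Empty strings and None are both treated as "no description".
    """
    changes: Dict[str, Optional[str]] = {}
    for name, wanted in desired.items():
        if name not in existing:
            continue  # new column, handled elsewhere
        current = existing[name] or None
        wanted = wanted or None
        if current != wanted:
            changes[name] = wanted
    return changes


existing = {"id": None, "email": "Contact address", "age": "Age in years"}
desired = {
    "id": "Primary key",         # description added
    "email": "Contact address",  # unchanged
    "age": None,                 # description removed
    "city": "Home city",         # new column, not part of this diff
}
print(changed_descriptions(existing, desired))
# {'id': 'Primary key', 'age': None}
```

Emitting statements only for columns that actually differ keeps repeated pipeline runs free of redundant DDL, even though the change itself is metadata-only.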
Environment
- dlt version: 1.24.0
- Destination: BigQuery
- Schema source: import_schema_path YAML files