Error writing delta table with Null columns #3316

@topagarwal

Environment

Delta-rs version: v 0.17.0

Binding:

Environment:

  • Cloud provider:
  • OS:
  • Other:

Bug

What happened:
I created an empty Delta table with a schema that has non-nullable fields. When writing a pandas DataFrame to the table, I make sure the non-nullable fields contain no null values, but I keep getting a schema mismatch error.

If the schema uses only nullable fields, the write succeeds. I tested this by writing the Delta table to a new location without specifying a schema; in that case all fields are nullable by default. However, we want to define certain fields as non-nullable.

What you expected to happen:
I expected the data to be appended to the Delta table.

How to reproduce it:

More details:

Actual data that I am writing. I picked 2 rows from the pyarrow table and converted them to a pandas DataFrame:

namespace                           ki_record_name work_center  ...                                   kt_step_results               mi_updated_at                      mi_updated_by
0       TE  2684e41e-82c5-4f07-8753-e6b9ac20.zip    S-0010  ...  [{"filename": "gots", "sequenceName": "MainSeq... 2025-03-10 17:03:10.515000+00:00  ingestion-adapter-etl
1       TE  e2be8360-30b0-4fd5-ae2b-4cd6aacb.zip    S-0006  ...  [{"filename": "VIBE", "sequenceName": "Mai... 2025-03-10 17:03:10.515000+00:00  ingestion-adapter-etl

python/3.12.7/lib/python3.12/site-packages/deltalake/writer.py", line 351, in write_deltalake
    raise ValueError(
ValueError: Schema of data does not match table schema

Data schema:
namespace: string
ki_record_name: string
work_center: string
kt_config: string
kt_parameters: string
mi_updated_at: timestamp[us, tz=UTC]
mi_updated_by: string

Table Schema:

namespace: string
ki_record_name: string
work_center: string not null
kt_config: string
kt_parameters: string
mi_updated_at: timestamp[us, tz=UTC] not null
  -- field metadata --
  comment: '"The time this record was updated"'
mi_updated_by: string not null
  -- field metadata --
  comment: '"The process that updated this record"'
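The two schema dumps above can be compared mechanically to see which fields differ. The sketch below uses only the standard library; `parse_schema` is a hypothetical helper written for this illustration, not part of deltalake or pyarrow. It treats unindented `name: type` lines as fields and skips the indented metadata lines.

```python
DATA_SCHEMA = """\
namespace: string
ki_record_name: string
work_center: string
kt_config: string
kt_parameters: string
mi_updated_at: timestamp[us, tz=UTC]
mi_updated_by: string
"""

TABLE_SCHEMA = """\
namespace: string
ki_record_name: string
work_center: string not null
kt_config: string
kt_parameters: string
mi_updated_at: timestamp[us, tz=UTC] not null
  -- field metadata --
  comment: '"The time this record was updated"'
mi_updated_by: string not null
  -- field metadata --
  comment: '"The process that updated this record"'
"""


def parse_schema(dump: str) -> dict[str, str]:
    """Map field name to its printed type, ignoring indented metadata lines."""
    fields = {}
    for line in dump.splitlines():
        # Field lines start at column zero; metadata lines are indented.
        if not line or line[0].isspace() or ":" not in line:
            continue
        name, type_text = line.split(":", 1)
        fields[name.strip()] = type_text.strip()
    return fields


data_fields = parse_schema(DATA_SCHEMA)
table_fields = parse_schema(TABLE_SCHEMA)

# Fields whose printed type differs between the data and the table.
mismatched = [name for name, t in table_fields.items() if data_fields.get(name) != t]

# Of those, the fields that differ only by a trailing "not null".
nullability_only = [
    name for name in mismatched
    if table_fields[name] == data_fields.get(name, "") + " not null"
]

print(mismatched)
print(nullability_only)
```

For the dumps in this report, every mismatched field differs only by the `not null` marker, which suggests the column types agree and only the nullability flags are in conflict.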
