[Bug]: Writing a number as a Decimal that exactly fits into its precision results in a decimal overflow #3909

@moritzkoerber

What happened?

Writing a value to a Decimal column whose precision it exactly fills (for example, 1 into Decimal(precision=1, scale=0)) raises a decimal overflow: DeltaError: Kernel error: Parser error: parse decimal overflow (1.0).

The data is still written, though, and the same error appears when the table is read afterwards. Could the 1.0 recorded for minValues/maxValues in the log stats be the cause?

{"metaData":{"id":"1fe1ce02-81e7-4b99-8149-6f9d9ba39b71","name":null,"description":null,"format":{"provider":"parquet","options":{}},"schemaString":"{\"type\":\"struct\",\"fields\":[{\"name\":\"value\",\"type\":\"decimal(1,0)\",\"nullable\":true,\"metadata\":{}}]}","partitionColumns":[],"createdTime":1761913123741,"configuration":{}}}
{"add":{"path":"part-00000-2a134546-2a44-4dd0-8c12-7f2f6ec19f8e-c000.snappy.parquet","partitionValues":{},"size":506,"modificationTime":1761913123812,"dataChange":true,"stats":"{\"numRecords\":1,\"minValues\":{\"value\":1.0},\"maxValues\":{\"value\":1.0},\"nullCount\":{\"value\":0}}","tags":null,"baseRowId":null,"defaultRowCommitVersion":null,"clusteringProvider":null}}
{"commitInfo":{"timestamp":1761913123814,"operation":"WRITE","operationParameters":{"mode":"ErrorIfExists"},"engineInfo":"delta-rs:py-1.2.1","operationMetrics":{"execution_time_ms":76,"num_added_files":1,"num_added_rows":1,"num_partitions":0,"num_removed_files":0},"clientVersion":"delta-rs.py-1.2.1"}}
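
To illustrate the suspicion about the stats, here is a minimal sketch using only the Python standard library. It reads the stats string from the add action above and counts the digits of the literal as written. The helper fits_precision is hypothetical, and the rule it applies (every written digit, including the trailing zero after the decimal point, counts against the precision) is an assumption about how a strict parser might behave, not confirmed kernel logic.

```python
import json
from decimal import Decimal

# Stats string copied from the "add" action in the log above.
STATS = (
    '{"numRecords":1,"minValues":{"value":1.0},'
    '"maxValues":{"value":1.0},"nullCount":{"value":0}}'
)


def digits_as_written(literal: str) -> int:
    """Count the digits of a decimal literal exactly as written."""
    return len(Decimal(literal).as_tuple().digits)


def fits_precision(literal: str, precision: int) -> bool:
    """Hypothetical strict check: all written digits count toward precision."""
    return digits_as_written(literal) <= precision


# parse_float=str keeps the float literal exactly as it appears in the log.
stats = json.loads(STATS, parse_float=str)
max_literal = stats["maxValues"]["value"]

print(max_literal, "fits precision 1:", fits_precision(max_literal, 1))
print(max_literal, "fits precision 2:", fits_precision(max_literal, 2))
print("1 fits precision 1:", fits_precision("1", 1))
```

Under this assumption the literal 1.0 carries two digits, which would exceed a precision of 1 but not a precision of 2, whereas the literal 1 would fit a precision of 1. That would be consistent with the behavior reported here, but it does not establish where the overflow is actually raised.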

Note: The write succeeds with one extra digit of precision: Decimal(precision=2, scale=0).

The example was run using Python 3.12.9, Polars 1.35.1, and deltalake 1.2.1.

Expected behavior

The Delta table is written and read without an error.

Operating System

macOS

Binding

Python

Bindings Version

1.2.1

Steps to reproduce

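import polars
import deltalake

# One row holding the value 1 in a Decimal column with precision 1 and scale 0.
df = polars.DataFrame([1], schema={"value": polars.Decimal(precision=1, scale=0)})

# Raises DeltaError (parse decimal overflow), although the table files are still written.
deltalake.write_deltalake("test", df)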

Relevant logs

---------------------------------------------------------------------------
DeltaError                                Traceback (most recent call last)
Cell In[11], line 5
      2 import deltalake
      4 df = pl.DataFrame([1], schema={"value": pl.Decimal(precision=1, scale=0)})
----> 5 deltalake.write_deltalake("test", df)

File ~/.cache/uv/archive-v0/HTkrjNT9lZqECGMwExN72/lib/python3.13/site-packages/deltalake/writer/writer.py:147, in write_deltalake(table_or_uri, data, partition_by, mode, name, description, configuration, schema_mode, storage_options, predicate, target_file_size, writer_properties, post_commithook_properties, commit_properties)
    131     table._table.write(
    132         data=data,
    133         batch_schema=compatible_delta_schema,
   (...)    144         post_commithook_properties=post_commithook_properties,
    145     )
    146 else:
--> 147     write_deltalake_rust(
    148         table_uri=table_uri,
    149         data=data,
    150         batch_schema=compatible_delta_schema,
    151         partition_by=partition_by,
    152         mode=mode,
    153         schema_mode=schema_mode,
    154         predicate=predicate,
    155         target_file_size=target_file_size,
    156         name=name,
    157         description=description,
    158         configuration=configuration,
    159         storage_options=storage_options,
    160         writer_properties=writer_properties,
    161         commit_properties=commit_properties,
    162         post_commithook_properties=post_commithook_properties,
    163     )

DeltaError: Kernel error: Parser error: parse decimal overflow (1.0)
