
[Bug]: _delta_log not written with external S3 provider #3925

@AntoineRoll

Description


What happened?

I'm trying to use the delta-rs library to write a Delta table on OVH object storage.

When attempting to execute the following code:

import pandas as pd
import os
from deltalake import write_deltalake

os.environ["AWS_REGION"] = "gra"
os.environ["AWS_ENDPOINT_URL"] = "https://s3.gra.io.cloud.ovh.net"
os.environ["AWS_ACCESS_KEY_ID"] = ""
os.environ["AWS_SECRET_ACCESS_KEY"] = ""

df = pd.DataFrame({'x': [1, 2, 3]})

write_deltalake('s3://opendatacatalog-storage-staging-writer/bronze/test', df)

I'm getting the following error:
write_deltalake_rust(
OSError: Generic S3 error
↳ Error performing PUT https://s3.gra.io.cloud.ovh.net/opendatacatalog-storage-staging-writer/bronze/test/_delta_log/00000000000000000000.json in 1.985295278s, after 10 retries, max_retries
↳ 10, retry_timeout
↳ 180s - Server returned non-2xx status code
↳ 501 Not Implemented

With RUST_LOG=debug, the only additional lines I see look like these:

[2025-11-02T19:35:56Z INFO object_store::client::retry] Encountered server error, backing off for 0.44229764 seconds, retry 10 of 10
[2025-11-02T19:35:57Z DEBUG hyper_util::client::legacy::pool] reuse idle connection for ("https", s3.gra.io.cloud.ovh.net)
[2025-11-02T19:35:57Z DEBUG hyper_util::client::legacy::pool] pooling idle connection for ("https", s3.gra.io.cloud.ovh.net)

My main problem is that this happens only when writing the Delta log; the Parquet files are written correctly. My Rust knowledge is too limited to debug this, but I successfully uploaded the Delta log file with the AWS CLI, which leaves me stuck. The issue likely lies with the vendor (OVHcloud), but I can't reproduce it outside the deltalake library.

Their S3 limitations are described here: https://help.ovhcloud.com/csm/en-gb-public-cloud-storage-s3-limitations?id=kb_article_view&sysparm_article=KB0047378 but I don't see anything there that could cause this.

Any help to narrow down the problem would be greatly appreciated.
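One way to narrow this down outside of deltalake is sketched below. It rests on a hypothesis that this report does not confirm: that the log commit differs from the data file uploads by being sent as a conditional PUT (`If-None-Match: *`), which a plain AWS CLI upload would not exercise. The helper names `put_object_kwargs` and `probe_conditional_put` are hypothetical, and the `IfNoneMatch` parameter requires a recent boto3 release. Credentials are expected in the same environment variables as above.

```python
def put_object_kwargs(bucket: str, key: str, body: bytes, conditional: bool) -> dict:
    """Build arguments for boto3's put_object, optionally as a conditional write."""
    kwargs = {"Bucket": bucket, "Key": key, "Body": body}
    if conditional:
        # Ask the server to create the object only if it does not exist yet.
        kwargs["IfNoneMatch"] = "*"
    return kwargs


def probe_conditional_put(bucket: str, key: str, endpoint_url: str, region: str) -> dict:
    """Send one plain and one conditional PUT and return the HTTP status of each."""
    # Imported lazily so the helper above can be used without boto3 installed.
    import boto3
    from botocore.exceptions import ClientError

    s3 = boto3.client("s3", endpoint_url=endpoint_url, region_name=region)
    results = {}
    for label, conditional in (("plain", False), ("conditional", True)):
        try:
            response = s3.put_object(
                **put_object_kwargs(bucket, f"{key}.{label}", b"{}", conditional)
            )
            results[label] = response["ResponseMetadata"]["HTTPStatusCode"]
        except ClientError as exc:
            # Server-side rejections (for example 501) surface as ClientError.
            results[label] = exc.response["ResponseMetadata"]["HTTPStatusCode"]
    return results
```

Calling `probe_conditional_put(BUCKET_NAME, "probe/test.json", "https://s3.gra.io.cloud.ovh.net", "gra")` would show whether only the conditional variant is rejected. If both succeed, the hypothesis is wrong and the cause lies elsewhere.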

Expected behavior

  • Delta tables should be written correctly to an external S3 vendor
  • Writing the transaction log should not fail after the data has been written correctly
  • The logs should be more verbose about the exact request that failed
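Until the logs are more verbose, the failing request can at least be summarized from the exception text. The following is a minimal sketch that assumes the message keeps the shape shown in the traceback in this report; the format is not a stable API and may change between versions. `summarize_s3_error` is a hypothetical helper, not part of deltalake.

```python
import re

# Patterns derived from the error text in this report (an assumption about its shape).
_REQUEST_RE = re.compile(r"Error performing (?P<method>[A-Z]+) (?P<url>\S+) in ")
_RETRIES_RE = re.compile(r"after (?P<retries>\d+) retries")
_STATUS_RE = re.compile(r"status code\W+(?P<status>\d{3})\s+(?P<reason>[^\n:]+)")


def summarize_s3_error(message: str) -> dict:
    """Extract method, URL, retry count, and HTTP status from an error message."""
    summary = {}
    request = _REQUEST_RE.search(message)
    if request:
        summary["method"] = request["method"]
        summary["url"] = request["url"]
    retries = _RETRIES_RE.search(message)
    if retries:
        summary["retries"] = int(retries["retries"])
    status = _STATUS_RE.search(message)
    if status:
        summary["status"] = int(status["status"])
        summary["reason"] = status["reason"].strip()
    return summary


# Sample copied from the error reported above.
SAMPLE = (
    "Generic S3 error\n"
    "↳ Error performing PUT https://s3.gra.io.cloud.ovh.net/"
    "opendatacatalog-storage-staging-writer/bronze/test/_delta_log/"
    "00000000000000000000.json in 1.985295278s, after 10 retries, max_retries\n"
    "↳ 10, retry_timeout\n"
    "↳ 180s - Server returned non-2xx status code\n"
    "↳ 501 Not Implemented"
)
print(summarize_s3_error(SAMPLE))
```

In practice, the call to `write_deltalake` could be wrapped in `try`/`except OSError as exc` and `summarize_s3_error(str(exc))` logged alongside the original traceback.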

Operating System

Linux

Binding

Python

Bindings Version

1.2.1

Steps to reproduce

  1. Set up an OVHcloud Object Storage bucket
  2. Run:
import pandas as pd
import os
from deltalake import write_deltalake

BUCKET_NAME = "testXYZ"

os.environ["AWS_REGION"] = "gra"
os.environ["AWS_ENDPOINT_URL"] = "https://s3.gra.io.cloud.ovh.net"
os.environ["AWS_ACCESS_KEY_ID"] = ""
os.environ["AWS_SECRET_ACCESS_KEY"] = ""

df = pd.DataFrame({'x': [1, 2, 3]})

write_deltalake(f's3://{BUCKET_NAME}/test', df)

Relevant logs

Traceback (most recent call last):
  File "/home/antoine/Documents/OpenDataCatalog/debug_s3.py", line 13, in <module>
    write_deltalake('s3a://opendatacatalog-storage-staging-writer/bronze/test', df)
  File "/home/antoine/Documents/OpenDataCatalog/.venv/lib/python3.11/site-packages/deltalake/writer/writer.py", line 147, in write_deltalake
    write_deltalake_rust(
OSError: Generic S3 error
          ↳ Error performing PUT https://s3.gra.io.cloud.ovh.net/opendatacatalog-storage-staging-writer/bronze/test/_delta_log/00000000000000000000.json in 4.637281632s, after 10 retries, max_retries
           ↳ 10, retry_timeout
            ↳ 180s  - Server returned non-2xx status code
             ↳ 501 Not Implemented
