Error while writing Pandas DataFrame to Delta Lake (S3) #2051

@vinniepsychosis

Description

write_deltalake isn't working as expected

Calling write_deltalake with a Pandas DataFrame raises the following error:

ValueError: you must provide schema if data is iterable

The same call works with a PyArrow Table; it fails only when the data is a Pandas DataFrame.

How to reproduce it:

# To reproduce, write a Pandas DataFrame to an S3 bucket.
from deltalake import write_deltalake

write_deltalake(
    "<your_s3_url>",  # placeholder: replace with your table URI
    data=df,  # df is a pandas DataFrame
    storage_options=storage_options,
    overwrite_schema=True,
    mode="overwrite",
)
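Since the report says the same call succeeds with a PyArrow Table, one possible workaround is to convert the DataFrame explicitly before writing. The following is a minimal sketch under that assumption, not a confirmed fix for the underlying bug. The helper name `write_pandas_via_arrow` is hypothetical, and `overwrite_schema` is left out here because its handling may differ between deltalake versions.

```python
def write_pandas_via_arrow(table_uri, df, storage_options=None):
    """Convert a pandas DataFrame to a PyArrow Table, then write it to Delta Lake.

    Hypothetical helper for illustration; not part of the deltalake API.
    """
    # Local imports so the helper can be defined even where the
    # packages are not installed yet.
    import pyarrow as pa
    from deltalake import write_deltalake

    # Explicit conversion; preserve_index=False keeps the pandas index
    # out of the resulting table.
    table = pa.Table.from_pandas(df, preserve_index=False)

    # Pass the Arrow table, which the report says works, instead of the DataFrame.
    write_deltalake(
        table_uri,
        table,
        storage_options=storage_options,
        mode="overwrite",
    )
    return table
```

It would be called as `write_pandas_via_arrow("<your_s3_url>", df, storage_options)`. Printing `type(df)` beforehand may also help confirm that the object really is a `pandas.DataFrame` and not a look-alike from another library.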


Labels: bug
