What happened?

Recently we ran into an issue with a corrupted delta log checkpoint on a table backed by Google Cloud Storage (GCS) buckets. Our delta tables stopped loading with the following errors:

We inspected the latest checkpoint and found that a page in the middle of the file is bad.

I am trying to figure out whether delta-rs makes use of checksums when uploading, since some object stores like GCS can validate them before committing the blobs: https://docs.cloud.google.com/storage/docs/data-validation#server-validation

Expected behavior

Uploads should not get corrupted because of external factors.

Operating System: None

Binding: None

Bindings Version: No response

Steps to reproduce: No response

Relevant logs: No response
Replies: 1 comment
To my knowledge, nothing passes checksums along with the ObjectStore.put call used underneath delta-rs (in ParquetObjectWriter).

It's hard for me to imagine how to accomplish this in delta-rs, since the entire parquet write is offloaded to the parquet crate, and the checksum of the file would not be knowable without a complete serialization of the parquet buffer into memory 😦
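To make the constraint above concrete, here is a minimal Python sketch (not delta-rs or object_store code) of why the checksum is only available once every byte has been produced. It assumes the `md5=<base64 digest>` value format that the GCS data-validation docs describe for the `x-goog-hash` header; the helper names and the fake payload are hypothetical, and no upload is performed.

```python
import base64
import hashlib


def buffered_md5_header(payload: bytes) -> str:
    """Hash a fully serialized file held in memory (the case described above)."""
    digest = hashlib.md5(payload).digest()
    return "md5=" + base64.b64encode(digest).decode("ascii")


def streaming_md5_header(chunks) -> str:
    """Hash chunk by chunk; the digest is still only final after the last chunk."""
    hasher = hashlib.md5()
    for chunk in chunks:
        hasher.update(chunk)
    return "md5=" + base64.b64encode(hasher.digest()).decode("ascii")


# Hypothetical stand-in for a serialized checkpoint file.
payload = b"PAR1" + b"\x00" * 1024 + b"PAR1"
chunks = [payload[i:i + 256] for i in range(0, len(payload), 256)]

buffered = buffered_md5_header(payload)
streamed = streaming_md5_header(chunks)

# Flipping a single byte in the middle changes the checksum, which is what
# lets a server-side check reject a corrupted upload.
corrupted = bytearray(payload)
corrupted[500] ^= 0xFF
corrupted_header = buffered_md5_header(bytes(corrupted))
```

Both helpers yield the same value, so hashing incrementally avoids holding the whole file in memory, but the value still cannot be sent as a request header before the body has been fully written. Whether a trailing or finalize-time checksum could be passed through the writer stack is a separate question that this sketch does not answer.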