chore: removing APIs and deprecation warnings: 0.30.x here we come #3962

Merged
rtyler merged 14 commits into delta-io:main from rtyler:next-version-three-zero
Dec 14, 2025
Conversation

Member

@rtyler rtyler commented Dec 2, 2025

💣 💥

This is a pretty massive change, regrettably, but it moves a tremendous amount of the 🦀 code from using conventional/goofy `&str` implementations to using `url::Url`. This ended up surfacing a number of `Url` oddities that needed to be accounted for, most notably the semantic importance of a trailing `/` in our use of `Url`, since we heavily rely on `Url::join`.
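
For readers unfamiliar with why the trailing `/` matters: `Url::join` resolves its argument as a relative reference, so the last path segment of a base without a trailing slash is treated like a file name and replaced. A minimal std-only sketch of that merge rule follows. `merge_paths` is a hypothetical helper written for illustration, not part of `url` or delta-rs, and it ignores `..` segments, queries, and absolute references.

```rust
/// Simplified model of merging a relative reference onto a base path
/// (RFC 3986, section 5.2.3): everything after the last '/' in the base
/// is dropped before the relative part is appended.
/// This is NOT `url::Url::join` itself, only an illustration of the rule
/// it follows for simple relative paths.
fn merge_paths(base_path: &str, relative: &str) -> String {
    match base_path.rfind('/') {
        // Keep the base up to and including its last slash.
        Some(idx) => format!("{}{}", &base_path[..=idx], relative),
        // No slash at all: the relative reference replaces the base.
        None => relative.to_string(),
    }
}
```

Under these assumptions, `merge_paths("/tables/events/", "_delta_log/")` keeps the table directory, while `merge_paths("/tables/events", "_delta_log/")` silently swaps `events` for `_delta_log/`.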

There are additional changes here as a result of a Rust edition upgrade to 2024 and the corresponding lints and clippy checks.

To boot, a lot of deprecated code is hereby removed in this change, thus the big negative number 🔨

@github-actions github-actions bot added the binding/python, binding/rust, and delta-inspect labels Dec 2, 2025
codecov bot commented Dec 2, 2025

Codecov Report

❌ Patch coverage is 83.09859% with 180 lines in your changes missing coverage. Please review.
✅ Project coverage is 74.50%. Comparing base (3469c37) to head (80ab73b).
⚠️ Report is 14 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| crates/core/src/kernel/models/actions.rs | 0.00% | 18 Missing ⚠️ |
| python/src/lib.rs | 0.00% | 15 Missing ⚠️ |
| crates/lakefs/src/logstore.rs | 53.33% | 12 Missing and 2 partials ⚠️ |
| crates/aws/src/lib.rs | 79.16% | 10 Missing ⚠️ |
| crates/core/src/lib.rs | 84.84% | 0 Missing and 10 partials ⚠️ |
| crates/core/src/kernel/schema/cast/merge_schema.rs | 0.00% | 7 Missing ⚠️ |
| crates/core/src/operations/write/mod.rs | 68.18% | 4 Missing and 3 partials ⚠️ |
| crates/test/src/read.rs | 76.66% | 0 Missing and 7 partials ⚠️ |
| crates/core/src/logstore/mod.rs | 93.82% | 5 Missing ⚠️ |
| crates/core/src/operations/vacuum.rs | 50.00% | 5 Missing ⚠️ |

... and 40 more
Additional details and impacted files
@@             Coverage Diff             @@
##             main    #3962       +/-   ##
===========================================
+ Coverage   26.22%   74.50%   +48.28%     
===========================================
  Files         124      152       +28     
  Lines       19885    39952    +20067     
  Branches    19885    39952    +20067     
===========================================
+ Hits         5214    29765    +24551     
+ Misses      14301     8854     -5447     
- Partials      370     1333      +963     

@rtyler rtyler force-pushed the next-version-three-zero branch 2 times, most recently from 8c9e6e6 to b5e4cd0 Compare December 2, 2025 14:49
Collaborator

@roeap roeap left a comment

first pass over pushed changes. Just one question.

we need some more clippy, but awesome to get rid of some of this.

Comment thread crates/core/src/table/mod.rs
@rtyler rtyler force-pushed the next-version-three-zero branch 2 times, most recently from 059834e to 659b756 Compare December 3, 2025 15:59
@corwinjoy
Contributor

Cool! If you need my help I can rerun something like this PR to remove the other clippy warnings.
#3940

@rtyler rtyler force-pushed the next-version-three-zero branch from 659b756 to 6740f2f Compare December 5, 2025 15:26
@github-actions github-actions bot added the documentation label Dec 5, 2025
@rtyler rtyler force-pushed the next-version-three-zero branch 3 times, most recently from 07e5913 to 30dc809 Compare December 12, 2025 17:08
Signed-off-by: R. Tyler Croy <rtyler@brokenco.de>
Signed-off-by: R. Tyler Croy <rtyler@brokenco.de>
When using LogStoreConfig it is possible to pass a Url which does not
allow for safe `join()` operations inside of delta-kernel-rs.

This change ensures that we always have a trailing slash before passing
things off into kernel

Signed-off-by: R. Tyler Croy <rtyler@brokenco.de>
Signed-off-by: R. Tyler Croy <rtyler@brokenco.de>
Signed-off-by: R. Tyler Croy <rtyler@brokenco.de>
This is almost entirely formatting changes, which is a _wee_ bit
annoying. There were no practical code changes required however.

Signed-off-by: R. Tyler Croy <rtyler@brokenco.de>
Signed-off-by: R. Tyler Croy <rtyler@brokenco.de>
Signed-off-by: R. Tyler Croy <rtyler@brokenco.de>
Trailing slashes cause all sorts of subtle equivalency confusion
when we do things with a Url. It's better for us to normalize everything
to always have a trailing slash which makes it easier to join with, and
do other things.
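
The equivalency problem described above can be illustrated with a small std-only sketch. `with_trailing_slash` is a hypothetical helper named for this example (it is not the normalization function in delta-rs), and it assumes the input carries no query string or fragment.

```rust
/// Hypothetical normalization: make sure a table location ends with '/'.
/// Assumes the input has no query string or fragment.
fn with_trailing_slash(location: &str) -> String {
    if location.ends_with('/') {
        location.to_string()
    } else {
        format!("{}/", location)
    }
}

/// Two spellings of the same table compare equal only after normalization.
fn same_table(a: &str, b: &str) -> bool {
    with_trailing_slash(a) == with_trailing_slash(b)
}
```

Compared as raw strings, `s3://bucket/table` and `s3://bucket/table/` differ; `same_table` treats them as one location because both sides are normalized first.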

Signed-off-by: R. Tyler Croy <rtyler@brokenco.de>
@rtyler rtyler force-pushed the next-version-three-zero branch from 30dc809 to df3739a Compare December 14, 2025 13:51
@rtyler rtyler marked this pull request as ready for review December 14, 2025 15:42
@rtyler rtyler requested a review from ion-elgreco as a code owner December 14, 2025 15:42
@rtyler rtyler marked this pull request as draft December 14, 2025 15:44
@rtyler rtyler marked this pull request as ready for review December 14, 2025 15:53
@rtyler rtyler enabled auto-merge (rebase) December 14, 2025 15:53
@rtyler rtyler force-pushed the next-version-three-zero branch 2 times, most recently from ee617c8 to da5c1cc Compare December 14, 2025 16:24
In this commit I am intentionally introducing a convention in our Rust
APIs:

  * anything named `uri` is expected to take an `AsRef<str>` type which
    is expected to turn into a `Url`
  * anything named `url` is expected to take a `Url` type.

As such I have renamed a number of our APIs which previously had been
converted to use `Url` but were still referring to them as `uri`
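
A rough sketch of what this naming convention looks like at a function boundary. Everything here is hypothetical: `ParsedUrl` is a stand-in for `url::Url` so the example needs no external crates, and `open_with_uri` / `open_with_url` are invented names, not delta-rs APIs.

```rust
/// Stand-in for `url::Url` (assumption: used only to keep this std-only).
#[derive(Debug, Clone, PartialEq)]
struct ParsedUrl(String);

impl ParsedUrl {
    /// Very rough validation: require a non-empty scheme followed by "://".
    fn parse(input: &str) -> Result<Self, String> {
        match input.split_once("://") {
            Some((scheme, _)) if !scheme.is_empty() => Ok(ParsedUrl(input.to_string())),
            _ => Err(format!("not a URL: {}", input)),
        }
    }
}

/// `url` in the name: the caller already holds a parsed URL type.
fn open_with_url(url: ParsedUrl) -> ParsedUrl {
    url
}

/// `uri` in the name: accepts anything string-like and converts it first.
fn open_with_uri(uri: impl AsRef<str>) -> Result<ParsedUrl, String> {
    ParsedUrl::parse(uri.as_ref()).map(open_with_url)
}
```

The point of the split is that the fallible string-to-URL conversion happens once, at the `uri` entry point, and everything named `url` can assume a valid value.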

Signed-off-by: R. Tyler Croy <rtyler@brokenco.de>
@rtyler rtyler force-pushed the next-version-three-zero branch from da5c1cc to d107d57 Compare December 14, 2025 16:33
match operation_id {
Some(op) => self.get_transaction_url(op, self.config.location().to_string()),
None => Err(DeltaTableError::InvalidData {
violations: vec!["LakeFS must use operation_ids for operations".into()],
Collaborator

Small nit, but the nested error no longer surfaces, and it gave a hint about which area of the code the failure happens in

roeap previously approved these changes Dec 14, 2025
Collaborator

@roeap roeap left a comment

🎉

Comment thread crates/aws/src/lib.rs
Comment on lines +441 to +442
// NOTE: the lack of trailing slashes is a load-bearing implementation
// detail between the Delta/Spark and delta-rs S3DynamoDbLogStore
Collaborator

🤣

Comment thread crates/core/src/table/mod.rs Outdated
Url::parse("s3://bucket/prefix with space/").unwrap(),
"/prefix%20with%20space/",
),
//(Url::parse("s3://bucket/special&chars/你好/😊").unwrap(), "/special&chars/你好/😊/"),
Collaborator

should this work, or should we remove it?

Comment thread crates/core/tests/integration.rs
Comment thread crates/lakefs/src/logstore.rs
This is a hilariously invasive change insofar as we have URL-like
strings leaking out all over the place in our codebase. This change
attempts to wrangle as much of that together as possible and has test
API changes which support the preceding commit where _uri functions
become _url

Signed-off-by: R. Tyler Croy <rtyler@brokenco.de>
The URLs used when writing entries into the table should always be
normalized to ensure that the value of table_url() is identical to the
one that is being written into the DynamoDB table.

There may be some more testing required with the Delta/Spark
implementation of S3DynamoDbLogStore to ensure that they are always generating
URLs to the table with trailing slashes. Delta/Spark stores paths as
[org.apache.hadoop.fs.Path](https://hadoop.apache.org/docs/stable/api/org/apache/hadoop/fs/Path.html)
which should have URL-like semantics. Without treating trailing slashes
as required, this could lead to inconsistencies between the two
implementations.

Signed-off-by: R. Tyler Croy <rtyler@brokenco.de>
Any place where we see hard-coded string formatting (e.g. format!())
around Url is likely to be **wrong** and needs to be removed.
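
To make the failure mode concrete, here is a small std-only sketch of the kind of `format!()` concatenation being warned about. `naive_child` is a hypothetical function written for this illustration and does not appear in the codebase.

```rust
/// The pattern to avoid: gluing a child path onto a location with `format!`.
/// The result depends on whether the caller's base happened to end in '/'.
fn naive_child(base: &str, child: &str) -> String {
    format!("{}/{}", base, child)
}

/// Returns true when two spellings of the same base yield different children,
/// which is exactly the inconsistency that string formatting introduces.
fn naive_is_inconsistent(base_without_slash: &str, child: &str) -> bool {
    let with_slash = format!("{}/", base_without_slash);
    naive_child(base_without_slash, child) != naive_child(&with_slash, child)
}
```

With a base that already ends in a slash the naive version emits a doubled `//`, so two equivalent table locations produce two different object paths.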

This commit fixes the failing integration tests with the lakefs
connector

Signed-off-by: R. Tyler Croy <rtyler@brokenco.de>
…ad-bearing

For better or worse, the trailing slashes on the `tablePath` DynamoDB
items are semantically important and a trailing slash coming from
deltalake-core causes lookup failures with multiple writers between
Spark and Rust.

This change ensures that the S3DynamoDbLogStore is always removing the
trailing slash should it exist.
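
A minimal sketch of why the key format matters for multi-writer lookups, assuming a `HashMap` as a stand-in for the DynamoDB table. `table_path_key` and `lookup_latest_version` are hypothetical names for this example, not the `S3DynamoDbLogStore` implementation.

```rust
use std::collections::HashMap;

/// Hypothetical key normalization: drop trailing slashes so the key matches
/// entries written without one.
fn table_path_key(table_url: &str) -> &str {
    table_url.trim_end_matches('/')
}

/// Look up the latest version for a table, normalizing the key first.
/// `entries` stands in for the DynamoDB table keyed by `tablePath`.
fn lookup_latest_version(entries: &HashMap<String, u64>, table_url: &str) -> Option<u64> {
    entries.get(table_path_key(table_url)).copied()
}
```

If one writer stores its entry under `s3://bucket/table`, a raw lookup with `s3://bucket/table/` misses it, while the normalized lookup finds the same item.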

Signed-off-by: R. Tyler Croy <rtyler@brokenco.de>
@rtyler rtyler force-pushed the next-version-three-zero branch from e64c8e1 to 80ab73b Compare December 14, 2025 16:53
Comment thread crates/core/src/logstore/mod.rs Outdated
}

impl LogStoreConfig {
pub fn new(mut location: Url, options: StorageConfig) -> Self {
if !location.path().ends_with('/') {
Collaborator

Why don't we use normalize_table_url here?

Collaborator

this is probably one of the many that were already sprinkled through the code, but it would be good to update as well.

Collaborator

@hntd187 hntd187 left a comment

I made it. LGTM, I left some nits, but approved whether you fix them or not.

@@ -261,7 +265,7 @@ async fn test_abort_commit_entry() -> TestResult<()> {
.await?;

// The entry should have been aborted - the latest entry should be one version lower
- if let Some(new_entry) = client.get_latest_entry(&table.table_uri()).await? {
+ if let Some(new_entry) = client.get_latest_entry(&table.table_url().as_str()).await? {
Collaborator

Is the pattern just gonna be to use .as_str() and such?

@@ -336,7 +344,7 @@ async fn test_concurrent_writers() -> TestResult<()> {
println!(">>> preparing table");
let table = prepare_table(&context, "concurrent_writes").await?;
println!(">>> table prepared");
- let table_uri = table.table_uri();
+ let table_uri = table.table_url();
Collaborator

URI or URL?

@@ -72,7 +72,7 @@ async fn test_object_store_onelake_abfs() -> TestResult {
#[allow(dead_code)]
async fn read_write_test_onelake(context: &IntegrationContext, path: &Path) -> TestResult {
let table_uri = Url::parse(&context.root_uri()).unwrap();
Collaborator

Uri or URL?

let delta_scan = DeltaScan::new(
&wire.table_url,
wire.config,
(*inputs)[0].clone(),
Collaborator

This line totally makes sense, but I know we can't do much here.

}
}

#[non_exhaustive]
Collaborator

What does this mean here?

@@ -421,7 +416,7 @@ impl std::future::IntoFuture for WriteBuilder {

fn into_future(self) -> Self::IntoFuture {
let this = self;
- let table_uri = this.log_store.root_uri();
+ let table_uri = this.log_store.root_url().clone();
Collaborator

Uri or URL?

DeltaTable::new(log_store, Default::default()),
)
} else {
let storage_url =
ensure_table_uri(self.location.clone().ok_or(CreateError::MissingLocation)?)?;
(
storage_url.as_str().to_string(),
DeltaTableBuilder::from_uri(storage_url)?
storage_url.clone(),
Collaborator

never thought I'd rather see a clone than what it used to be

let trimmed_path = url.path().trim_end_matches('/').to_owned();
url.set_path(&trimmed_path);
Ok(url)
// We should always be normalizing the table URL because trailing or redundant slashes can be
Collaborator

glad to see this go away

@@ -140,45 +137,6 @@ impl DeltaTableState {
self.snapshot.snapshot().tombstones(log_store)
}

/// Full list of add actions representing all parquet files that are part of the current
Collaborator

Love this

@rtyler rtyler merged commit 2748585 into delta-io:main Dec 14, 2025
41 of 45 checks passed
Collaborator

@ion-elgreco ion-elgreco left a comment

lgtm ;)

rtyler added a commit to rtyler/delta-rs that referenced this pull request Dec 15, 2025
…e references

This is a follow up to delta-io#3962 where some leftover comments were not
addressed

Signed-off-by: R. Tyler Croy <rtyler@brokenco.de>
rtyler added a commit that referenced this pull request Dec 16, 2025
…e references

This is a follow up to #3962 where some leftover comments were not
addressed

Signed-off-by: R. Tyler Croy <rtyler@brokenco.de>
ethan-tyler pushed a commit to ethan-tyler/delta-rs that referenced this pull request Jan 9, 2026
…e references

This is a follow up to delta-io#3962 where some leftover comments were not
addressed

Signed-off-by: R. Tyler Croy <rtyler@brokenco.de>
Labels

binding/python (Issues for the Python package)
binding/rust (Issues for the Rust crate)
documentation (Improvements or additions to documentation)

Projects

Status: Done

5 participants