fix: decode path before lookup #3976
Conversation
Codecov Report
✅ All modified and coverable lines are covered by tests.
@@             Coverage Diff              @@
##             main    #3976       +/-   ##
===========================================
+ Coverage   25.76%   74.52%   +48.75%
===========================================
  Files         127      156       +29
  Lines       20539    41537    +20998
  Branches    20539    41537    +20998
===========================================
+ Hits         5292    30954    +25662
+ Misses      14885     9196     -5689
- Partials      362     1387     +1025
rtyler
left a comment
I would like to hold off on merging this until it's better understood as a regression and has an associated test case, since we have a lot of changes in the pipeline around paths/URLs and log replay.
In the draft PR where I examine adding support for full path URIs (#3963), this double encoding (once for JSON, and once for the object_store Path) is a major source of complications. If we could remove it, that would be great. If not, I would suggest a test that uses escape characters, because this is where the double encoding gets complicated and tends to have problems. In addition, in the draft PR, AI suggested that there may be an existing bug with encoding the '%' character, which would be nice to test.
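To make the escape-character concern concrete, here is a minimal sketch in Python using only the standard library's `urllib.parse`. It is not delta-rs code: the sample path is hypothetical, and `quote` is only a stand-in for the two encoding layers mentioned above (the real JSON and object_store encoders may escape a different set of characters).

```python
from urllib.parse import quote, unquote

# Hypothetical partition path containing '=', a space, and a literal '%'.
raw = "col=a b%c/part-0.parquet"

# First layer of percent-encoding ('/' is kept as a separator).
once = quote(raw, safe="/")
# Second layer: every '%' produced by the first layer becomes '%25'.
twice = quote(once, safe="/")

print(once)   # col%3Da%20b%25c/part-0.parquet
print(twice)  # col%253Da%2520b%2525c/part-0.parquet

# Decoding must be applied exactly as many times as encoding was.
assert unquote(twice) == once
assert unquote(unquote(twice)) == raw

# Decoding one time too many silently changes a name that legitimately
# contains a percent sequence, which is why '%' deserves its own test.
literal_name = "a%20b"
assert unquote(literal_name) == "a b"
```

A regression test along these lines would round-trip a path containing `%` and a space through a write and a read, and assert that the looked-up key matches the original.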
2e30f0c to f394491
Signed-off-by: Ion Koutsouris <15728914+ion-elgreco@users.noreply.github.com>
f394491 to 57e8e2b
Signed-off-by: Ion Koutsouris <15728914+ion-elgreco@users.noreply.github.com>
rtyler
left a comment
The test demonstrates the bug, thanks for adding it @ion-elgreco
# Description
Paths in the JSON log are double encoded. When we go over logical_file_view.path() the paths are decoded, but when you access them directly in the add_actions_table they aren't, so we need to decode them here as well to be able to look them up in the map properly.
IMHO, the double encoding in the logs seems like a bug, but it seems to have been prevalent for a long time, and this is just a duct-tape fix to get the old behavior back.
# Related Issue(s)
- closes delta-io#3939
---------
Signed-off-by: Ion Koutsouris <15728914+ion-elgreco@users.noreply.github.com>
Co-authored-by: R. Tyler Croy <rtyler@brokenco.de>
Co-authored-by: Robert Pack <42610831+roeap@users.noreply.github.com>
Description
Paths in the JSON log are double encoded. When we go over logical_file_view.path() the paths are decoded, but when you access them directly in the add_actions_table they aren't, so we need to decode them here as well to be able to look them up in the map properly.
IMHO, the double encoding in the logs seems like a bug, but it seems to have been prevalent for a long time, and this is just a duct-tape fix to get the old behavior back.
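The mismatch described above can be sketched in a few lines of Python. This is an illustration only, not the delta-rs implementation: `stats_by_path`, `lookup`, and the sample path are hypothetical names, and `urllib.parse` stands in for whatever decoding the real code applies.

```python
from urllib.parse import quote, unquote

# Hypothetical map keyed by decoded paths, as a decoded file view would yield them.
stats_by_path = {"part=a b/file-0.parquet": {"num_records": 10}}

# Hypothetical value read straight from a table column: still percent-encoded.
raw_path = quote("part=a b/file-0.parquet", safe="/=")
print(raw_path)  # part=a%20b/file-0.parquet


def lookup(path: str):
    """Decode the path before using it as a key, so both sides agree."""
    return stats_by_path.get(unquote(path))


# Without decoding, the encoded key misses the map entry.
assert stats_by_path.get(raw_path) is None
# With decoding first, the lookup succeeds.
assert lookup(raw_path) == {"num_records": 10}
```

Decoding at the lookup site keeps the map's keys in a single canonical form, at the cost of leaving the encoded representation in the log untouched.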
Related Issue(s)
overwritewithpredicate #3939