Skip to content

Commit ed12c66

Browse files
ilanggarydgregory
andauthored
[VFS-862] Fix ON_RESOLVE triggering refresh on internal navigation (#761)
* [VFS-862] Fix ON_RESOLVE triggering refresh on internal navigation CacheStrategy.ON_RESOLVE is intended to refresh files when the user resolves them via the public API. However, internal navigation methods (getParent, getRoot, getChildren child resolution, symlink resolution) also called fileSystem.resolveFile(), triggering ON_RESOLVE refreshes on files the user never asked to refresh. This became a severe regression after refresh() was changed to unconditionally clear FtpFileObject.childMap: each child's getParent() refreshed the parent, clearing its childMap, forcing a new FTP LIST command per child. A directory with N files produced ~N LIST commands instead of 1. Fix: - Add resolveFileInternal() that skips the ON_RESOLVE refresh. All internal navigation call sites use it instead of resolveFile(). - After a fresh directory listing, FTP and SFTP providers propagate metadata to cached child objects in-place, preserving object identity. This establishes a clear contract: cached state is used until the user explicitly calls refresh() or resolves via the public API. Internal navigation never triggers server operations. Tests: - FtpGetChildrenListCommandTest: verifies findFiles() on a directory with 50 files issues exactly 1 LIST command with ON_RESOLVE. - SftpGetChildrenListCommandTest: verifies refresh + findFiles() returns fresh children reflecting filesystem changes. * [VFS-862] Disable HTTP persistent connections in Jackrabbit 1.x tests Jackrabbit 1.x bundles Jetty 6.x which does not drain unconsumed request bodies before sending error responses (e.g. 404) on persistent connections, violating HTTP/1.1 requirements. This leaves stale request bytes that corrupt the next request on a reused connection. The resolveFileInternal change in getParent() removed an extra PROPFIND request to the parent directory that used to happen between the child's PROPFIND (404) and the child's PUT. That extra request happened to flush the stale bytes from the connection. Without it, the server reads the 112-byte PROPFIND XML request body as the PUT file content. Fix: add Connection: close to the test HttpClient configuration so each request uses a fresh connection. Jetty 9.4+ handles this correctly (see jetty#651, jetty#4117, jetty#6168). Jackrabbit 2 tests pass without changes. No production code modified. * [VFS-862] Remove SFTP test that passes without the fix The SFTP test cannot distinguish the fix from the original ON_RESOLVE behavior because SFTP's doListChildrenResolved() pushes metadata to each child via setStat() immediately after resolution. The ON_RESOLVE refresh clears attrs, but setStat() repopulates it right after — same server traffic either way. The optimization is purely client-side. The FTP test remains and definitively proves the fix (82 LISTs without the fix, 1 LIST with). * Add comment. --------- Co-authored-by: Gary Gregory <garydgregory@users.noreply.github.com>
1 parent 2ab6374 commit ed12c66

6 files changed

Lines changed: 259 additions & 12 deletions

File tree

commons-vfs2-jackrabbit1/src/test/java/org/apache/commons/vfs2/provider/webdav/test/WebdavProviderTestCase.java

Lines changed: 22 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -36,6 +36,9 @@
3636

3737
import static org.junit.jupiter.api.Assertions.assertEquals;
3838

39+
import java.util.Collections;
40+
41+
import org.apache.commons.httpclient.Header;
3942
import org.apache.commons.io.file.PathUtils;
4043
import org.apache.commons.vfs2.AbstractProviderTestConfig;
4144
import org.apache.commons.vfs2.FileObject;
@@ -284,7 +287,25 @@ public FileObject getBaseTestFolder(final FileSystemManager manager) throws Exce
284287
.getFileSystemConfigBuilder("webdav");
285288
final FileSystemOptions opts = new FileSystemOptions();
286289
builder.setRootURI(opts, uri);
287-
return manager.resolveFile(uri, opts);
290+
final FileObject baseFolder = manager.resolveFile(uri, opts);
291+
292+
// Disable HTTP persistent connections for the embedded Jackrabbit 1.x
293+
// test server. Jackrabbit 1.x bundles Jetty 6.x which does not drain
294+
// unconsumed request bodies before sending error responses (e.g. 404),
295+
// violating HTTP/1.1 persistent connection requirements. This leaves
296+
// stale request bytes on the connection that corrupt subsequent requests.
297+
// See https://github.com/jetty/jetty.project/issues/651
298+
// See https://github.com/jetty/jetty.project/issues/4117
299+
final java.lang.reflect.Method getClient =
300+
org.apache.commons.vfs2.provider.http.HttpFileSystem.class
301+
.getDeclaredMethod("getClient");
302+
getClient.setAccessible(true);
303+
((org.apache.commons.httpclient.HttpClient) getClient.invoke(
304+
baseFolder.getFileSystem())).getParams().setParameter(
305+
"http.default-headers",
306+
Collections.singletonList(new Header("Connection", "close")));
307+
308+
return baseFolder;
288309
}
289310

290311
@Override

commons-vfs2/src/main/java/org/apache/commons/vfs2/provider/AbstractFileObject.java

Lines changed: 21 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -1306,7 +1306,7 @@ public FileObject getParent() throws FileSystemException {
13061306
if (name == null) {
13071307
return null;
13081308
}
1309-
parent = fileSystem.resolveFile(name);
1309+
parent = fileSystem.resolveFileInternal(name);
13101310
}
13111311
return parent;
13121312
}
@@ -1805,7 +1805,26 @@ private void removeChildrenCache() {
18051805
}
18061806

18071807
private FileObject resolveFile(final FileName child) throws FileSystemException {
1808-
return fileSystem.resolveFile(child);
1808+
return resolveFileInternal(child);
1809+
}
1810+
1811+
/**
1812+
* Resolves a file by name for internal navigation, skipping the
1813+
* {@link org.apache.commons.vfs2.CacheStrategy#ON_RESOLVE} refresh. Cache policy should only
1814+
* apply to external API calls, not internal VFS plumbing like
1815+
* {@link #getParent()}, {@link #getChildren()}, or symlink resolution.
1816+
* <p>
1817+
* Subclasses should use this instead of {@code getFileSystem().resolveFile()}
1818+
* when navigating to related files (parent, children, link targets).
1819+
* </p>
1820+
*
1821+
* @param name The FileName to resolve.
1822+
* @return The resolved FileObject.
1823+
* @throws FileSystemException if an error occurs.
1824+
* @since 2.11.0
1825+
*/
1826+
protected FileObject resolveFileInternal(final FileName name) throws FileSystemException {
1827+
return fileSystem.resolveFileInternal(name);
18091828
}
18101829

18111830
/**

commons-vfs2/src/main/java/org/apache/commons/vfs2/provider/AbstractFileSystem.java

Lines changed: 34 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -377,7 +377,7 @@ public FileObject getParentLayer() throws FileSystemException {
377377
*/
378378
@Override
379379
public FileObject getRoot() throws FileSystemException {
380-
return resolveFile(rootName);
380+
return resolveFileInternal(rootName);
381381
}
382382

383383
/**
@@ -525,11 +525,39 @@ public File replicateFile(final FileObject file, final FileSelector selector) th
525525
*/
526526
@Override
527527
public FileObject resolveFile(final FileName name) throws FileSystemException {
528-
return resolveFile(name, true);
528+
return resolveFile(name, true, true);
529529
}
530530

531-
private synchronized FileObject resolveFile(final FileName name, final boolean useCache)
532-
throws FileSystemException {
531+
/**
532+
* Resolves a file by name for internal navigation (e.g. {@code getParent()},
533+
* {@code resolveFiles()} in {@code getChildren()}). Skips the
534+
* {@link CacheStrategy#ON_RESOLVE} refresh, since cache policy should only
535+
* apply to external API calls, not internal VFS plumbing. Without this,
536+
* {@code ON_RESOLVE} triggers a refresh cascade where each child's
537+
* {@code getParent()} refreshes the parent, clearing its cached state and
538+
* causing O(N) redundant operations.
539+
*
540+
* @param name The FileName to resolve.
541+
* @return The resolved FileObject.
542+
* @throws FileSystemException if an error occurs.
543+
* @since 2.11.0
544+
*/
545+
synchronized FileObject resolveFileInternal(final FileName name) throws FileSystemException {
546+
return resolveFile(name, true, false);
547+
}
548+
549+
/**
550+
* Resolves a file by name.
551+
*
552+
* @param name The FileName to resolve.
553+
* @param useCache whether to use the file cache.
554+
* @param applyRefreshPolicy whether to apply the {@link CacheStrategy#ON_RESOLVE} refresh.
555+
* Should be {@code true} for external API calls and {@code false} for internal navigation.
556+
* @return The resolved FileObject.
557+
* @throws FileSystemException if an error occurs.
558+
*/
559+
private synchronized FileObject resolveFile(final FileName name, final boolean useCache,
560+
final boolean applyRefreshPolicy) throws FileSystemException {
533561
if (!rootName.getRootURI().equals(name.getRootURI())) {
534562
throw new FileSystemException("vfs.provider/mismatched-fs-for-name.error", name, rootName,
535563
name.getRootURI());
@@ -561,7 +589,8 @@ private synchronized FileObject resolveFile(final FileName name, final boolean u
561589
/*
562590
resync the file information if requested
563591
*/
564-
if (getFileSystemManager().getCacheStrategy().equals(CacheStrategy.ON_RESOLVE)) {
592+
if (applyRefreshPolicy
593+
&& getFileSystemManager().getCacheStrategy().equals(CacheStrategy.ON_RESOLVE)) {
565594
file.refresh();
566595
}
567596
return file;

commons-vfs2/src/main/java/org/apache/commons/vfs2/provider/ftp/FtpFileObject.java

Lines changed: 23 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -501,8 +501,29 @@ public FileObject[] getChildren() throws FileSystemException {
501501
* has C children, P parents, there will be (C * P) listings made with (C * (P + 1)) refreshes, when there
502502
* should really only be 1 listing and C refreshes.
503503
*/
504+
final boolean freshList = childMap == null;
504505
inRefresh.set(true);
505-
return super.getChildren();
506+
final FileObject[] result = super.getChildren();
507+
// [VFS-862] Fix ON_RESOLVE triggering refresh on internal navigation.
508+
// If a fresh LIST was done, propagate the new metadata to cached
509+
// child objects so they reflect the latest directory listing.
510+
if (freshList && childMap != null) {
511+
synchronized (getFileSystem()) {
512+
for (final FileObject child : result) {
513+
final FtpFileObject ftpChild = (FtpFileObject) FileObjectUtils
514+
.getAbstractFileObject(child);
515+
final FTPFile entry = childMap.get(
516+
UriParser.decode(child.getName().getBaseName()));
517+
if (entry != null) {
518+
ftpChild.ftpFile = entry;
519+
ftpChild.linkDestination = null;
520+
ftpChild.mdtmSet = false;
521+
ftpChild.injectType(null);
522+
}
523+
}
524+
}
525+
}
526+
return result;
506527
} finally {
507528
inRefresh.set(false);
508529
}
@@ -530,7 +551,7 @@ private FileObject getLinkDestination() throws FileSystemException {
530551
final FileName parent = getName().getParent();
531552
final FileName relativeTo = parent == null ? getName() : parent;
532553
final FileName linkDestinationName = getFileSystem().getFileSystemManager().resolveName(relativeTo, path);
533-
linkDestination = getFileSystem().resolveFile(linkDestinationName);
554+
linkDestination = resolveFileInternal(linkDestinationName);
534555
}
535556
return linkDestination;
536557
}

commons-vfs2/src/main/java/org/apache/commons/vfs2/provider/sftp/SftpFileObject.java

Lines changed: 7 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -346,10 +346,15 @@ protected FileObject[] doListChildrenResolved() throws Exception {
346346
continue;
347347
}
348348

349-
final FileObject fo = getFileSystem().resolveFile(getFileSystem().getFileSystemManager()
349+
final FileObject fo = resolveFileInternal(getFileSystem().getFileSystemManager()
350350
.resolveName(getName(), UriParser.encode(name), NameScope.CHILD));
351351

352-
((SftpFileObject) FileObjectUtils.getAbstractFileObject(fo)).setStat(stat.getAttrs());
352+
final SftpFileObject sftpChild = (SftpFileObject) FileObjectUtils.getAbstractFileObject(fo);
353+
sftpChild.setStat(stat.getAttrs());
354+
// Clear cached type so it re-evaluates from the fresh attrs.
355+
// Without this, a cached type from a prior listing is returned
356+
// without consulting the updated attrs.
357+
sftpChild.injectType(null);
353358

354359
children.add(fo);
355360
}
Lines changed: 152 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,152 @@
1+
/*
2+
* Licensed to the Apache Software Foundation (ASF) under one or more
3+
* contributor license agreements. See the NOTICE file distributed with
4+
* this work for additional information regarding copyright ownership.
5+
* The ASF licenses this file to You under the Apache License, Version 2.0
6+
* (the "License"); you may not use this file except in compliance with
7+
* the License. You may obtain a copy of the License at
8+
*
9+
* https://www.apache.org/licenses/LICENSE-2.0
10+
*
11+
* Unless required by applicable law or agreed to in writing, software
12+
* distributed under the License is distributed on an "AS IS" BASIS,
13+
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
14+
* See the License for the specific language governing permissions and
15+
* limitations under the License.
16+
*/
17+
package org.apache.commons.vfs2.provider.ftp;
18+
19+
import static org.junit.jupiter.api.Assertions.assertEquals;
20+
import static org.junit.jupiter.api.Assertions.assertNotNull;
21+
import static org.junit.jupiter.api.Assertions.assertTrue;
22+
23+
import java.io.File;
24+
import java.io.FileWriter;
25+
import java.time.Duration;
26+
import java.util.concurrent.atomic.AtomicInteger;
27+
28+
import org.apache.commons.vfs2.CacheStrategy;
29+
import org.apache.commons.vfs2.FileDepthSelector;
30+
import org.apache.commons.vfs2.FileObject;
31+
import org.apache.commons.vfs2.FileSystemOptions;
32+
import org.apache.commons.vfs2.VfsTestUtils;
33+
import org.apache.commons.vfs2.impl.DefaultFileSystemManager;
34+
import org.apache.ftpserver.command.CommandFactoryFactory;
35+
import org.apache.ftpserver.command.impl.LIST;
36+
import org.junit.jupiter.api.AfterEach;
37+
import org.junit.jupiter.api.BeforeEach;
38+
import org.junit.jupiter.api.Test;
39+
40+
/**
41+
* Tests that {@link CacheStrategy#ON_RESOLVE} does not cause redundant FTP LIST
42+
* commands due to internal VFS navigation triggering cache refreshes.
43+
* <p>
44+
* With {@code ON_RESOLVE}, every {@code fileSystem.resolveFile()} call triggers
45+
* {@code refresh()} on the returned file. This is correct for external API calls
46+
* (the user explicitly resolving a file), but internal navigation methods like
47+
* {@code getParent()}, {@code getRoot()}, and child resolution in
48+
* {@code getChildren()} also call {@code resolveFile()}, unintentionally refreshing
49+
* cached files and clearing their state.
50+
* </p>
51+
* <p>
52+
* This was always wasteful, but became a severe regression after
53+
* {@code FtpFileObject.refresh()} was changed to unconditionally clear
54+
* {@code childMap} (to fix stale cached state on never-attached files).
55+
* That change turned the unnecessary refreshes into O(N) LIST commands:
56+
* each child's {@code getParent()} refreshes the parent, clearing its
57+
* {@code childMap}, forcing a new LIST for every child.
58+
* </p>
59+
*/
60+
public class FtpGetChildrenListCommandTest {
61+
62+
private static final int FILE_COUNT = 50;
63+
private static final AtomicInteger listCommandCount = new AtomicInteger();
64+
private static final LIST defaultListCommand = new LIST();
65+
66+
private File[] testFiles;
67+
68+
@BeforeEach
69+
public void setUp() throws Exception {
70+
listCommandCount.set(0);
71+
72+
// Create test files in the FTP home directory.
73+
final File testDir = new File(VfsTestUtils.getTestDirectory());
74+
testFiles = new File[FILE_COUNT];
75+
for (int i = 0; i < FILE_COUNT; i++) {
76+
testFiles[i] = new File(testDir, "amplification-test-" + i + ".txt");
77+
try (FileWriter w = new FileWriter(testFiles[i])) {
78+
w.write("content-" + i);
79+
}
80+
}
81+
82+
// Start FTP server with a LIST command wrapper that counts invocations.
83+
final CommandFactoryFactory cmdFactory = new CommandFactoryFactory();
84+
cmdFactory.addCommand("LIST", (session, context, request) -> {
85+
listCommandCount.incrementAndGet();
86+
defaultListCommand.execute(session, context, request);
87+
});
88+
FtpProviderTest.setUpClass(VfsTestUtils.getTestDirectory(), null,
89+
cmdFactory.createCommandFactory());
90+
}
91+
92+
@AfterEach
93+
public void tearDown() {
94+
FtpProviderTest.tearDownClass();
95+
if (testFiles != null) {
96+
for (final File f : testFiles) {
97+
f.delete();
98+
}
99+
}
100+
}
101+
102+
private static FileSystemOptions createOptions() {
103+
final FileSystemOptions options = new FileSystemOptions();
104+
final FtpFileSystemConfigBuilder builder = FtpFileSystemConfigBuilder.getInstance();
105+
builder.setUserDirIsRoot(options, true);
106+
builder.setPassiveMode(options, true);
107+
builder.setConnectTimeout(options, Duration.ofSeconds(10));
108+
return options;
109+
}
110+
111+
/**
112+
* Verifies that {@code findFiles()} on a directory with {@code N} children
113+
* issues exactly 1 LIST command. {@code findFiles()} internally calls
114+
* {@code getType()} on each child via {@code traverse()}, which is where
115+
* the cascade occurs.
116+
* <p>
117+
* With the bug, each child's {@code getType()} triggers a parent LIST via
118+
* the ON_RESOLVE cascade, producing ~N LIST commands.
119+
* Without the bug, the parent's {@code childMap} is reused across children,
120+
* resulting in exactly 1 LIST.
121+
* </p>
122+
*/
123+
@Test
124+
public void testGetChildrenIssuesSingleList() throws Exception {
125+
try (final DefaultFileSystemManager manager = new DefaultFileSystemManager()) {
126+
manager.addProvider("ftp", new FtpFileProvider());
127+
manager.setCacheStrategy(CacheStrategy.ON_RESOLVE);
128+
manager.init();
129+
130+
final FileSystemOptions options = createOptions();
131+
final FileObject root = manager.resolveFile(
132+
FtpProviderTest.getConnectionUri(), options);
133+
134+
listCommandCount.set(0);
135+
root.refresh();
136+
final FileObject[] children = root.findFiles(new FileDepthSelector(1, 1));
137+
assertNotNull(children, "findFiles should return results");
138+
assertTrue(children.length >= FILE_COUNT,
139+
"Should find at least " + FILE_COUNT + " files, found: " + children.length);
140+
141+
final int listCount = listCommandCount.get();
142+
143+
// With the bug: LIST count ≈ FILE_COUNT (one per child due to cascade)
144+
// Without the bug: exactly 1 LIST (the getChildren() directory listing)
145+
assertEquals(1, listCount,
146+
"Scan should produce exactly 1 LIST command, but produced "
147+
+ listCount + ". This indicates the ON_RESOLVE childMap "
148+
+ "amplification bug: each child's getParent() re-resolution "
149+
+ "refreshes the parent, clearing its childMap and forcing a new LIST.");
150+
}
151+
}
152+
}

0 commit comments

Comments
 (0)