Distinguish cache hits from real downloads in p2 transport log#5971
Distinguish cache hits from real downloads in p2 transport log#5971vogella wants to merge 2 commits intoeclipse-tycho:mainfrom
Conversation
Tycho's p2 transport previously logged "Downloading from <url>" for every resource access, even when the response was served entirely from the local p2 bundle cache. This made builds look like they were downloading gigabytes when they were effectively offline. Logging now branches on the cache outcome: - "Fetched from cache: <url>" (DEBUG) - no network - "Up-to-date: <url>" (DEBUG) - 304 Not Modified - "Downloading from <url>" (INFO) - real 2xx download (unchanged) The log is emitted from the cache layer after the outcome is known. SharedHttpCacheStorage now records the per-URI CacheState so the transport can suppress the misleading "Downloaded from ... at X/s" summary for cache hits and revalidations while keeping it unchanged for real downloads. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Test Results1 047 files 1 047 suites 5h 9m 11s ⏱️ Results for commit 0f6b621. ♻️ This comment has been updated with latest results. |
This is actually desired to make it mostly similar how Maven works. As you can see you can even suppress these with
This logging is not meant as a debug facility of the caching framework (for what we have system properties available). |
| String id = "p2"; // TODO we might compute the id from the IRepositoryIdManager based on the URI? | ||
| if (cacheConfig.isInteractive()) { | ||
| logger.info("Downloading from " + id + ": " + source); | ||
| } |
There was a problem hiding this comment.
This log message should remain to stay consistent with how maven works and to know that the download has started.
| @Requirement | ||
| HttpCache httpCache; |
There was a problem hiding this comment.
We should not bind directly to a specific cache implementation there might be other caching techniques see comment below.
| logger.info("Downloaded from " + id + ": " + source + " (" | ||
| + FileUtils.byteCountToDisplaySize(downloadStatus.getFileSize()) + " at " | ||
| + FileUtils.byteCountToDisplaySize(downloadStatus.getTransferRate()) + "/s)"); | ||
| CacheState state = httpCache.getLastCacheState(source); |
There was a problem hiding this comment.
Instead of using a side-car approach here, the information should be transported through the DownloadStatusOutputStream
| + FileUtils.byteCountToDisplaySize(downloadStatus.getTransferRate()) + "/s)"); | ||
| CacheState state = httpCache.getLastCacheState(source); | ||
| if (state == CacheState.DOWNLOADED || state == CacheState.UNKNOWN) { | ||
| logger.info("Downloaded from " + id + ": " + source + " (" |
There was a problem hiding this comment.
The info should always be logged and we should just replace the transfer rate by a status e.g.
Downloaded from " + id + ": " + source + " ("
+ FileUtils.byteCountToDisplaySize(downloadStatus.getFileSize()) + " (from cache)"
| * for the three possible outcomes of a cache access: pure cache hit, 304 | ||
| * revalidation, and real download. | ||
| */ | ||
| class SharedHttpCacheStorageLoggingTest { |
There was a problem hiding this comment.
It has shown that mocking does not work well on the long run in Tycho as it always tests internal behavior what is hard to maintain.
Instead please add a real integration-test that uses e.g. a java 21 jdk server (we also have some examples using jetty already).
This proves it works in real world use-cases end-to-.end
| * produce accurate log output (e.g. distinguish "Downloading" from "Fetched | ||
| * from cache"). | ||
| */ | ||
| public enum CacheState { |
There was a problem hiding this comment.
Please think about better naming here that is independent from caching and agnostic from the transport technique. e.g. we also have ftp transports that have no meaning of (http) status codes.
|
@vogella do you plan to further work on this? I want to make a new Tycho release likely end of month and it could then be included if we finish before. |
|
Your initial comment indicates that this does not fit to Maven. So you agree with the change now? I can finish it in this case |
- Restore "Downloading from" INFO log to match Maven behaviour. - Drop direct HttpCache dependency from TychoRepositoryTransport. - Carry the cache-hit flag through DownloadStatusOutputStream (via a thread-local registry) instead of querying the cache as a side-car. - Always emit "Downloaded from" and only swap the suffix: "(from cache)" for cache hits / 304 revalidations, "at X/s" for real transfers. - Remove the CacheState enum; a simple fromCache boolean on the stream carries the same information without transport-specific terminology. - Replace the mock-based SharedHttpCacheStorageLoggingTest with a real integration test that exercises the cache against an actual com.sun.net.httpserver instance, and drop the mockito test dep. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
Tycho's p2 transport currently logs
Downloading from <url>whenever a resource is requested, even when the response is served entirely from the local p2 bundle cache (~/.m2/repository/.cache/tycho/) with no network transfer. This makes incremental / warm-cache builds look like they are downloading gigabytes, and makes it hard to spot real download activity in a CI log.This PR teaches
SharedHttpCacheStorageto report the outcome of every cache access, and adjustsTychoRepositoryTransportso the log matches what actually happened.Before
Every access prints the same two lines even when nothing was fetched from the network:
After
Fetched from cache: <url>Up-to-date: <url>Downloading from <url>(+Downloaded from ... at X/sas before)For a warm-cache build the noisy
Downloading from/Downloaded frompair is now silent at INFO, while real network downloads keep the existing INFO output so CI output is unchanged for that case.Implementation notes
CacheState { FROM_CACHE, NOT_MODIFIED, DOWNLOADED, UNKNOWN }enum.HttpCachegainsgetLastCacheState(URI).SharedHttpCacheStorage.CacheLine.fetchFilenow emits the three-way log at the point where it actually knows the outcome, and signals the state back to the storage via a callback.TychoRepositoryTransport.downloadArtifactno longer emits the misleading pre-downloadDownloading fromlog, and suppresses the post-downloadDownloaded from ... at X/ssummary when the state isFROM_CACHEorNOT_MODIFIED.Test plan
SharedHttpCacheStorageLoggingTestcovers all three cases (cache hit, 304, real 200) plus theUNKNOWNdefault. Uses a Mockito-mockedHttpTransportso it never touches the network.mvn -pl p2-maven-plugin -am test— 4/4 green.mvn -pl tycho-core -am verify— 581/581 green.🤖 Generated with Claude Code