Skip to content

fix: introduce file cache to simplify fetching files including file mode#1280

Merged
chingor13 merged 6 commits intomainfrom
file-cache
Feb 18, 2022
Merged

fix: introduce file cache to simplify fetching files including file mode#1280
chingor13 merged 6 commits intomainfrom
file-cache

Conversation

@chingor13
Copy link
Copy Markdown
Contributor

@chingor13 chingor13 commented Feb 3, 2022

This implements a file cache used by the GitHub instance. All file fetches will go through this file cache. It uses the git tree API. First, it tries to use the recursive=true option to fetch the entire file tree.

If we receive a non-truncated set, we will cache that response and looking all file blob SHAs from that cached tree.

If we receive a truncated set, we will use the tree API without recursive and traverse the tree structure until we find the desired node (caching each tree node along the way).

By using the tree API, we can actually fetch the file mode of the fetched file.

Fixes #1090
Fixes #1027

@bcoe
Copy link
Copy Markdown

bcoe commented Feb 4, 2022

Oh cool, so the approach you came up with doesn't use GraphQL, it just pulls the recursive logic for looking up files into a wrapper, and ensures that files are only fetched once.

If I'm understanding correctly you only cache files as they're required, and the idea is that you would be able to get both their contents and mode without two requests?

@chingor13 chingor13 marked this pull request as ready for review February 8, 2022 18:21
@chingor13 chingor13 requested review from a team February 8, 2022 18:21
@chingor13 chingor13 added this to the Next milestone Feb 10, 2022
@chingor13 chingor13 requested a review from bcoe February 14, 2022 18:21
@chingor13 chingor13 changed the title feat: introduce file cache to simplify fetching files including file mode fix: introduce file cache to simplify fetching files including file mode Feb 14, 2022
@chingor13
Copy link
Copy Markdown
Contributor Author

If I'm understanding correctly you only cache files as they're required, and the idea is that you would be able to get both their contents and mode without two requests?

We lazy cache git tree data as we need to fetch files. We avoid the simple contents API because we can't get file mode data. We still need separate requests for the file metadata and the actual contents though.

Copy link
Copy Markdown

@bcoe bcoe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like that this encapsulates the increasingly ugly file contents logic into a helper class 👍

A couple thoughts:

  • I'm curious to see how this implementation behaves on a paticularly large repository (could you test against yoshi-go perhaps?).
  • I think it would be worthwhile having some basic smoke tests for the cache behavior -- I philisophically coming around to nock being used the same way you'd use an integration tests (don't have may nock tests, but it's good to have a couple for coverage).

Comment thread test/github.ts
@chingor13 chingor13 requested a review from bcoe February 17, 2022 01:14
@chingor13
Copy link
Copy Markdown
Contributor Author

  • I'm curious to see how this implementation behaves on a paticularly large repository (could you test against yoshi-go perhaps?).

Running against google-cloud-go seems to still work. That repo is not actually big enough to trigger the recursive file fetching.

Comment thread test/util/file-cache.ts
))
)
.get(
'/repos/testOwner/testRepo/git/blobs/3c3629e647f5ddf82548912e337bea9826b434af'
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

awesome, thanks for this.

@chingor13 chingor13 merged commit d7280b7 into main Feb 18, 2022
@chingor13 chingor13 deleted the file-cache branch February 18, 2022 00:28
gcf-merge-on-green Bot pushed a commit that referenced this pull request Feb 18, 2022
🤖 I have created a release *beep* *boop*
---


### [13.4.11](v13.4.10...v13.4.11) (2022-02-18)


### Bug Fixes

* introduce file cache to simplify fetching files including file mode ([#1280](#1280)) ([d7280b7](d7280b7))

---
This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Update large files in subdirectories with getFileContentsWithDataAPI (fix created) Do not change modes of existing files

2 participants