Improve local search performance #4818

Closed
bartvanandel wants to merge 2 commits into ScoopInstaller:develop from bartvanandel:fix/4239_searchPerformance

@bartvanandel

Description

Improve local search performance by pruning candidates using git ls-files and git grep. This drastically reduces the number of files that are read into memory to find the desired results.

The code also accounts for buckets that are not maintained using git: for those, it falls back to the previous approach. I'm not sure whether such buckets actually exist in practice.
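As a rough illustration of the pruning idea described above (not the code from this PR), the sketch below assumes a hypothetical helper named `Get-CandidateManifest` that takes a bucket directory and a plain-text query. It uses `git ls-files` to find manifests whose file name contains the query and `git grep` to find manifests whose content mentions it, and it lists every manifest when the bucket is not a git checkout.

```powershell
# Hypothetical helper for illustration only; the name and parameters are assumptions.
function Get-CandidateManifest {
    param(
        [Parameter(Mandatory)] [string] $BucketDir,  # path to a bucket checkout
        [Parameter(Mandatory)] [string] $Query       # plain-text search term
    )

    $hasGit = [bool](Get-Command git -ErrorAction SilentlyContinue)
    $isRepo = Test-Path (Join-Path $BucketDir '.git')

    if (-not ($hasGit -and $isRepo)) {
        # Fallback: no git index to query, so every manifest is a candidate.
        return Get-ChildItem -Path $BucketDir -Recurse -Filter '*.json' -File |
            ForEach-Object { $_.FullName }
    }

    # Tracked manifests whose file name contains the query (case-insensitive).
    $byName = git -C $BucketDir ls-files -- '*.json' | Where-Object {
        [IO.Path]::GetFileNameWithoutExtension($_).IndexOf(
            $Query, [StringComparison]::OrdinalIgnoreCase) -ge 0
    }

    # Tracked manifests whose content mentions the query.
    # -l: print file names only, -i: ignore case, -F: treat the query as a fixed string.
    $byContent = git -C $BucketDir grep -l -i -F -e $Query -- '*.json'

    # Merge both lists, drop empty entries and duplicates, and return full paths.
    @($byName) + @($byContent) |
        Where-Object { $_ } |
        Sort-Object -Unique |
        ForEach-Object { Join-Path $BucketDir $_ }
}
```

Only the returned paths would then need to be read and parsed, for example with `Get-Content -Raw` and `ConvertFrom-Json`, instead of every manifest in the bucket.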

Motivation and Context

Closes #4239

How Has This Been Tested?

Compared the output of the previous search method with the new method. The code uses standard PowerShell and git functions that have existed for a long time.

Checklist:

  • I have read the Contributing Guide.
  • I have updated the documentation accordingly. N/A. Well, I sparsely documented the approach in code.
  • I have updated the tests accordingly. N/A, there were no pre-existing tests.

@bartvanandel bartvanandel force-pushed the fix/4239_searchPerformance branch from b925a13 to 0d59e9b Compare March 16, 2022 12:17
Comment thread libexec/scoop-search.ps1 Outdated
@niheaven
Member

Sorry, but this approach took longer on my PC:

[2 screenshots]

I've tried several times. What about yours, @rashil2000?

@bartvanandel
Author

Strange. On my system, scoop search net went down from 32 seconds to 4 seconds with this change. Likewise, scoop search zulu went from 8 seconds to 2 seconds (this contains hits in the java bucket).

BTW, due to (presumably) file system caching, subsequent runs of the same command on the same develop branch are significantly faster. Measuring this fairly may therefore be tricky, especially since in practice you rarely search for the same thing twice in a row.
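One way to reduce the effect of caching when comparing branches is to repeat each measurement and compare medians rather than single runs. The sketch below is a minimal example under that assumption; `Measure-Median`, its parameters, and its defaults are hypothetical and not part of Scoop.

```powershell
# Hypothetical benchmarking helper; name, parameters, and defaults are illustrative.
function Measure-Median {
    param(
        [Parameter(Mandatory)] [scriptblock] $Action,
        [ValidateRange(1, 1000)] [int] $Runs = 5,
        [switch] $WarmUp
    )

    # Optionally discard one run so that every timed run sees a warm cache.
    if ($WarmUp) { & $Action | Out-Null }

    # Time each run in milliseconds and sort the results.
    $ms = @(
        1..$Runs | ForEach-Object {
            (Measure-Command { & $Action | Out-Null }).TotalMilliseconds
        } | Sort-Object
    )

    # Median: the middle element, or the mean of the two middle elements.
    $mid = [int][math]::Floor($ms.Count / 2.0)
    if ($ms.Count % 2 -eq 1) { $ms[$mid] } else { ($ms[$mid - 1] + $ms[$mid]) / 2 }
}
```

It could be called on each branch in turn, for example `Measure-Median -WarmUp -Action { .\bin\scoop.ps1 search rstudio }`. A warm-up run only makes the two branches comparable with each other; it does not reproduce the cold-cache case that a first search would hit in practice.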

@bartvanandel bartvanandel force-pushed the fix/4239_searchPerformance branch from 0d59e9b to 00a10cc Compare March 17, 2022 07:54
@rashil2000
Member

rashil2000 commented Mar 17, 2022

I'm seeing a marginal improvement.

[4 screenshots]

@niheaven
Member

niheaven commented Mar 18, 2022

Search twice or use different methods?

Same here: the original method takes 8 s, this one 13 s (scoop search rstudio).

Buckets list:

Name       Source                                             Updated            Manifests
----       ------                                             -------            ---------
dorado     https://github.com/h404bi/dorado                   2022/3/18 8:10:29        221
extras     https://github.com/ScoopInstaller/Extras           2022/3/18 8:33:48       1437
java       https://github.com/ScoopInstaller/Java             2022/3/17 20:26:07       220
main       https://github.com/ScoopInstaller/Main             2022/3/18 8:32:58        992
nerd-fonts https://github.com/matthewjberger/scoop-nerd-fonts 2022/3/16 3:11:59        187
nih        https://github.com/niheaven/scoop-nih.git          2022/3/18 9:53:30         28
rasa       https://github.com/rasa/scoops                     2022/3/15 5:01:36         70
tests      https://github.com/ScoopInstaller/Tests            2022/3/17 4:32:31         62
versions   https://github.com/ScoopInstaller/Versions         2022/3/18 9:55:40        280

@rashil2000
Member

> Search twice or use different methods?

Different methods. See .\bin\scoop search zulu vs scoop search zulu

@niheaven
Member

More feedback is needed, IMO. Don't know why this one took more time.

@chawyehsu
Member

DESKTOP in current on  develop
❯ scoop bucket list

Name     Source                                     Updated            Manifests
----     ------                                     -------            ---------
akira    https://github.com/chawyehsu/scoop-akira   2022/4/13 23:52:50         2
dorado   https://github.com/chawyehsu/dorado        2022/5/13 17:08:19       217
extras   https://github.com/ScoopInstaller/Extras   2022/5/16 12:30:48      1501
java     https://github.com/ScoopInstaller/Java     2022/5/15 4:01:48        220
main     https://github.com/ScoopInstaller/Main     2022/5/16 0:30:16       1019
versions https://github.com/ScoopInstaller/Versions 2022/5/16 15:36:44       314

DESKTOP in current on  develop
❯ Measure-Command { .\bin\scoop.ps1 search rstudio }
'extras' bucket:
'versions' bucket:

Days              : 0
Hours             : 0
Minutes           : 0
Seconds           : 6
Milliseconds      : 911
Ticks             : 69116116
TotalDays         : 7.99955046296296E-05
TotalHours        : 0.00191989211111111
TotalMinutes      : 0.115193526666667
TotalSeconds      : 6.9116116
TotalMilliseconds : 6911.6116


DESKTOP in current on  develop took 6s
❯ git sw fix/4239_searchPerformance
Switched to branch 'fix/4239_searchPerformance'
DESKTOP in current on  fix/4239_searchPerformance
❯ Measure-Command { .\bin\scoop.ps1 search rstudio }
'extras' bucket:
'versions' bucket:

Days              : 0
Hours             : 0
Minutes           : 0
Seconds           : 1
Milliseconds      : 320
Ticks             : 13204317
TotalDays         : 1.52827743055556E-05
TotalHours        : 0.000366786583333333
TotalMinutes      : 0.022007195
TotalSeconds      : 1.3204317
TotalMilliseconds : 1320.4317


@niheaven
Member

Test it again, same result:

Scoop\apps\scoop took 23s ❯ git -C "current" checkout develop
Switched to branch 'develop'
Your branch is behind 'origin/develop' by 1 commit, and can be fast-forwarded.
  (use "git pull" to update your local branch)

Scoop\apps\scoop ❯ Measure-Command { scoop search rstudio }
'extras' bucket:
'versions' bucket:

Days              : 0
Hours             : 0
Minutes           : 0
Seconds           : 10
Milliseconds      : 352
Ticks             : 103529888
TotalDays         : 0.000119826259259259
TotalHours        : 0.00287583022222222
TotalMinutes      : 0.172549813333333
TotalSeconds      : 10.3529888
TotalMilliseconds : 10352.9888



Scoop\apps\scoop took 10s ❯ git -C "current" checkout fix/4239_searchPerformance
Switched to branch 'fix/4239_searchPerformance'

Scoop\apps\scoop ❯ Measure-Command { scoop search rstudio }
'extras' bucket:
'versions' bucket:

Days              : 0
Hours             : 0
Minutes           : 0
Seconds           : 23
Milliseconds      : 881
Ticks             : 238818129
TotalDays         : 0.000276409871527778
TotalHours        : 0.00663383691666667
TotalMinutes      : 0.398030215
TotalSeconds      : 23.8818129
TotalMilliseconds : 23881.8129

Really strange...

@bartvanandel
Author

Well, if the results are this unpredictable and unreliable, I'd suggest not merging this for now. I don't currently have the time to dig any further.

@rashil2000
Member

Closing in favour of #5644

@rashil2000 rashil2000 closed this Oct 2, 2023