Optimize parser on large Viper files #477
Interesting observation! I wonder if it would even be possible (and reasonable) to use some kind of specialised data structure for spatial information here.
You could also benchmark both versions against the Viper test suite, using this benchmark script to repeatedly execute a version and record runtime statistics in a CSV file.
Our test annotations, e.g. |
I don't know the parser well enough to answer this. I would say it depends on what kind of assumptions we can make. For example, if the index argument of consecutive
Nice, thanks! I didn't know that. Before the optimization: After the optimization: Verification takes ~7s less, probably due to a different seed, but parsing clearly got 5x faster.
Are those tests already run as part of |
I'm not sure, either. @fabiopakk Do you know more about this?
Nice effect on parsing!
Yes: if you run |
@mschwerhoff Should I merge or should someone review this?
// implementation of `computeFrom` used to do.
val lines = s.linesWithSeparators
_lines = lines.map(_.length).toArray
_line_offset = (lines.map(_.length) ++ Seq(0)).toArray
As described in the comment above, the `++ Seq(0)` here is used to accurately reproduce an edge case of the old behavior. However, it's not clear to me how important that edge case is. Maybe we can ignore it.
No objections from my side
Really interesting work, @fpoli. Thanks for that. I'm changing the PosComputer in the new FastParse, but I think your work can still be useful there. I'll merge this into the current version and try to get it into FastParse 2.2.2.
When profiling Silicon on a large Viper file (6k LOC), `computeFrom` in `PosComputer` is marked as "hot". This PR replaces a linear scan in its implementation with a binary search, which, even counting the preprocessing, should be faster than before on any input. On the 6k LOC Viper file this removes ~3s (tested on 4 cold runs of Silicon).
Is there any test suite that checks the precise line/column of Viper's error messages?
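To make the idea in the description concrete, here is a minimal sketch of the technique: precompute the start offset of every line once, then map a character offset to a line/column pair with a binary search. It is not the code from this PR. `LineIndex`, `lineCol` and `lineColLinear` are hypothetical names, and the sketch assumes that `\n` terminates a line and that positions are 1-based. The linear variant is included only as a reference to compare against.

```scala
import java.util.Arrays

// Hypothetical helper for illustration; not the PR's actual `PosComputer` code.
final class LineIndex(source: String) {
  // Offset of the first character of every line, computed once in O(n).
  // Assumption: '\n' ends a line (so "\r\n" works too; a lone '\r' does not).
  private val lineStarts: Array[Int] = {
    val starts = Array.newBuilder[Int]
    starts += 0
    var i = 0
    while (i < source.length) {
      if (source.charAt(i) == '\n') starts += i + 1
      i += 1
    }
    starts.result()
  }

  private def check(offset: Int): Unit =
    require(offset >= 0 && offset <= source.length, s"offset $offset out of range")

  // Binary search over the line starts: O(log #lines) per lookup.
  def lineCol(offset: Int): (Int, Int) = {
    check(offset)
    val found = Arrays.binarySearch(lineStarts, offset)
    // On a miss, binarySearch returns -(insertionPoint) - 1; the line that
    // contains `offset` is the one just before the insertion point.
    val line = if (found >= 0) found else -found - 2
    (line + 1, offset - lineStarts(line) + 1)
  }

  // Linear scan over the line starts: O(#lines) per lookup, reference only.
  def lineColLinear(offset: Int): (Int, Int) = {
    check(offset)
    var line = 0
    while (line + 1 < lineStarts.length && lineStarts(line + 1) <= offset) line += 1
    (line + 1, offset - lineStarts(line) + 1)
  }
}
```

In the absence of a test suite for exact positions, one cheap check is to compare both variants on every offset of a file, e.g. `(0 to src.length).forall(o => idx.lineCol(o) == idx.lineColLinear(o))` for an `idx` built from `src`.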