Co-authored-by: Christoph <siedlerkiller@gmail.com> Co-authored-by: Carl Christian Snethlage <50491877+calixtus@users.noreply.github.com> Co-authored-by: Houssem Nasri <housi.housi2015@gmail.com>
Co-authored-by: Christoph <siedlerkiller@gmail.com> Co-authored-by: Carl Christian Snethlage <50491877+calixtus@users.noreply.github.com> Co-authored-by: Houssem Nasri <housi.housi2015@gmail.com> Co-authored-by: Benedikt Tutzer <btut@users.noreply.github.com>
Two implementation ideas:
I think we could go for idea 2, even though file loading might be slower. See https://www.amitph.com/java-read-write-large-files-efficiently/
ParserResult expected = ParserResult.fromErrorMessage("Found git conflict markers");
assertEquals(expected, parserResult);
I got the test passing by checking for git conflict markers inside the read method after a newline character is consumed.
I also call checkForGitConflictMarker at the beginning of the file so that the first line is checked too.
private int read() throws IOException {
    int character = pushbackReader.read();
    if (!isEOFCharacter(character)) {
        pureTextFromFile.offerLast((char) character);
    }
    if (character == '\n') {
        line++;
        checkForGitConflictMarker();
    }
    return character;
}
This is the logic. It looks for a line that starts with the 'ours' marker, which is represented by the symbol <<<<<<<. Then it continues to skip lines until it reaches the 'theirs' marker >>>>>>>.
private void checkForGitConflictMarker() throws IOException {
    skipSpace();
    int markerCount = 0;
    // Looking for the 'ours' marker.
    // An EOF value from peek() never equals '<', so no separate EOF check is needed here.
    while (peek() == '<') {
        read();
        markerCount++;
    }
    if (markerCount == 7) {
        parserResult.addWarning("Found git conflict markers at line %d".formatted(line));
        // Skip 'ours' marker <<<<<<<
        skipLine();
        // Keep skipping lines until we hit the beginning of 'theirs' marker >>>>>>>
        while (peek() != '>' && !isEOFCharacter(peek())) {
            skipLine();
        }
        // Skip 'theirs' marker if we haven't hit EOF already
        if (!isEOFCharacter(peek())) {
            skipLine();
        }
    }
}
private void skipLine() throws IOException {
    while (peek() != '\n' && !isEOFCharacter(peek())) {
        read();
    }
    skipOneNewline();
}
I had to modify the test slightly to make it pass, because this logic continues parsing after the marker, which yields a parser result with two entries where the test expected none. I changed the test to check whether the warning list contains the git conflict warning.
assertTrue(parserResult.warnings().contains("Found git conflict markers at line 3"));
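For anyone who wants to try the skipping behaviour outside the parser, here is a minimal stand-alone sketch. It is line-based rather than character-based, so it is not the code from this PR: `ConflictBlockSkipper`, `Result`, and `skipConflictBlocks` are hypothetical names, and unlike the snippet above it requires the full `>>>>>>>` marker to end a block and also accepts more than seven `<` characters as a start marker.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not part of JabRef.
final class ConflictBlockSkipper {

    /** Lines that survive skipping, plus one warning per conflict block found. */
    record Result(List<String> keptLines, List<String> warnings) { }

    static Result skipConflictBlocks(Reader input) {
        List<String> kept = new ArrayList<>();
        List<String> warnings = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(input)) {
            int lineNumber = 0;
            String current;
            while ((current = reader.readLine()) != null) {
                lineNumber++;
                // Assumption: a line whose first non-blank text is "<<<<<<<" opens a conflict block.
                if (!current.stripLeading().startsWith("<<<<<<<")) {
                    kept.add(current);
                    continue;
                }
                warnings.add("Found git conflict markers at line %d".formatted(lineNumber));
                // Drop everything up to and including the 'theirs' marker, or until EOF.
                while ((current = reader.readLine()) != null) {
                    lineNumber++;
                    if (current.stripLeading().startsWith(">>>>>>>")) {
                        break;
                    }
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new Result(kept, warnings);
    }
}
```

Like the parser change, this drops both sides of the conflict and keeps going, which is why the test checks the warning list instead of expecting an empty result.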
I didn't make a commit. I got the idea for this solution while working on another PR, so I just made the changes on that PR's branch. However, you can use the code above inside BibTeXParser, and don't forget to add me as a co-author 😁.
Fixes JabRef#9167
WIP, because the parsing architecture is a bit complicated here.
We cannot "just" read the whole file, because that could be very slow for large databases.
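One way to avoid loading the whole file is to scan it line by line and stop at the first marker. The following is only a sketch under that assumption, not what this PR implements: `ConflictMarkerScanner` and `firstConflictMarkerLine` are hypothetical names, the file is assumed to be UTF-8, and a marker is assumed to start at the very beginning of a line.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.OptionalInt;

// Hypothetical helper for illustration only; not part of JabRef.
final class ConflictMarkerScanner {

    /** Returns the 1-based number of the first line starting with "<<<<<<<", if any. */
    static OptionalInt firstConflictMarkerLine(Reader input) {
        try (BufferedReader reader = new BufferedReader(input)) {
            int lineNumber = 0;
            String line;
            // Only one line is held in memory at a time; scanning stops at the first hit.
            while ((line = reader.readLine()) != null) {
                lineNumber++;
                if (line.startsWith("<<<<<<<")) {
                    return OptionalInt.of(lineNumber);
                }
            }
            return OptionalInt.empty();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Convenience overload for files; assumes UTF-8 encoding. */
    static OptionalInt firstConflictMarkerLine(Path file) {
        try {
            return firstConflictMarkerLine(Files.newBufferedReader(file, StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A separate pre-scan like this still reads the file twice in the worst case (once to scan, once to parse), so checking inside the parser's own read loop may remain the cheaper option for large databases.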