Probably one for @kermitt2 :)
I found some very long text elements in the sentence-level JSON files (more than 3000 characters).
e.g.,
file:line "quote to search"
PMC4176174.json:1502 "3D reconstruction of the EC. (A) 3D reconstruction of the yeast MtRNAP EC generated ..."
PMC3140372.json:763 " was conducted in summer 2008 for the first time. It"
PMC2963829.json:1078 "which is the concentration we use for acute treatments, has a negligible eff ect"
PMC3328383.json:1218 " events occur throughout the human 11q segment and forks move across the segment"
There are also a bunch between 2000 and 3000 characters, which seem too long as well and are probably the same issue. It's not crucial, but they result in some really large tasks for tagworks (which could probably be dropped).
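
In case it helps with reproducing this, here is a rough sketch of how such elements could be listed. It assumes nothing about the actual schema of the sentence-level JSON files: it simply walks every string value in each file and reports those above a length threshold. The function names (`iter_strings`, `find_long_strings`, `scan_directory`) and the directory argument are hypothetical, not part of any existing tooling.

```python
import json
from pathlib import Path


def iter_strings(node, path=""):
    """Yield (json_path, string) for every string value nested in node."""
    if isinstance(node, str):
        yield path, node
    elif isinstance(node, dict):
        for key, value in node.items():
            yield from iter_strings(value, f"{path}.{key}" if path else key)
    elif isinstance(node, list):
        for index, value in enumerate(node):
            yield from iter_strings(value, f"{path}[{index}]")


def find_long_strings(doc, threshold=3000):
    """Return (json_path, length) for each string longer than threshold."""
    return [(p, len(s)) for p, s in iter_strings(doc) if len(s) > threshold]


def scan_directory(directory, threshold=3000):
    """Print every over-long string found in the *.json files of a directory."""
    for json_file in sorted(Path(directory).glob("*.json")):
        with open(json_file, encoding="utf-8") as handle:
            doc = json.load(handle)
        for json_path, length in find_long_strings(doc, threshold):
            print(f"{json_file.name}: {json_path} ({length} chars)")
```

Calling `scan_directory(".", threshold=2000)` in the folder with the JSON files would also list the 2000 to 3000 character cases.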