Really long sentences in json files #667

@jameshowison

Probably one for @kermitt2 :)

I found some very long text elements in the sentence-level JSON files (more than 3000 characters each).

Examples, given as file:line "quote to search":

PMC4176174.json:1502 "3D reconstruction of the EC. (A) 3D reconstruction of the yeast MtRNAP EC generated ..."
PMC3140372.json:763 " was conducted in summer 2008 for the first time. It"
PMC2963829.json:1078 "which is the concentration we use for acute treatments, has a negligible eff ect"
PMC3328383.json:1218 " events occur throughout the human 11q segment and forks move across the segment"

There are also a bunch between 2000 and 3000 characters, which seem too long as well and are probably caused by the same issue. It's not crucial, but they result in some really large tasks for tagworks (those tasks could probably just be dropped).
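
For anyone who wants to look for similar cases, here is a rough sketch of one way to list over-long text elements. It makes no assumption about the JSON schema: it simply walks every value in each file and reports strings above a length threshold. The function names (`long_strings`, `scan_directory`) and the default threshold of 2000 are illustrative choices, not part of any existing tool, and it reports a JSON path rather than the file line numbers used in the examples above.

```python
import json
from pathlib import Path


def long_strings(node, threshold, path=""):
    """Yield (json_path, text) for every string value longer than threshold."""
    if isinstance(node, str):
        if len(node) > threshold:
            yield path, node
    elif isinstance(node, dict):
        for key, value in node.items():
            yield from long_strings(value, threshold, f"{path}/{key}")
    elif isinstance(node, list):
        for index, value in enumerate(node):
            yield from long_strings(value, threshold, f"{path}/{index}")


def scan_directory(directory, threshold=2000):
    """Yield (file name, json_path, length, preview) for each over-long string."""
    for json_file in sorted(Path(directory).glob("*.json")):
        with open(json_file, encoding="utf-8") as handle:
            document = json.load(handle)
        for json_path, text in long_strings(document, threshold):
            # Keep only a short preview so the report stays readable.
            yield json_file.name, json_path, len(text), text[:60]
```

Iterating over `scan_directory(".")` in the folder that holds the JSON files and printing each tuple gives a list that can be compared against the examples above, or used to filter out the oversized tasks.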
