Rough edges on the agent.cli.ehrql_telemetry tooling #1180

@evansd

Description

I used this successfully to ingest a large batch of recent ehrQL jobs for investigation. Just recording a few rough edges I encountered so we don't lose track of them. There's no particular urgency about resolving these.

I was using rg to find the relevant log files and then trying to supply those to the command one at a time using xargs -n 1.

The first issue I hit was when trying to use the just jobrunner/cli ehrql_telemetry command from here. This worked when running --help, but when trying to use xargs I would get the error:

input device is not a TTY

The reason for this is probably obvious if I think carefully enough about it, but I didn't want to do that. I found that manually constructing the docker compose run command and using the --no-TTY argument made it work.
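A likely explanation, stated as an assumption rather than something confirmed here: commands launched by xargs do not have a terminal on stdin, and the wrapped docker invocation asks for a TTY anyway. A minimal sketch of a wrapper that only requests a TTY when one is available is below. The function name and the service name in the usage note are hypothetical; only the `docker compose run` flags `--rm` and `--no-TTY` are real.

```python
import sys


def compose_run_argv(service, command_args, stdin_is_tty=None):
    """Build a `docker compose run` argv that is safe to use under xargs.

    `service` is a hypothetical compose service name. If `stdin_is_tty` is
    not given, it is detected from the current process's stdin.
    """
    if stdin_is_tty is None:
        stdin_is_tty = sys.stdin.isatty()

    argv = ["docker", "compose", "run", "--rm"]
    if not stdin_is_tty:
        # Without a terminal on stdin, disable pseudo-TTY allocation.
        argv.append("--no-TTY")
    argv.append(service)
    argv.extend(command_args)
    return argv
```

For example, `compose_run_argv("some-service", ["ehrql_telemetry", "--help"])` would include `--no-TTY` whenever it is invoked from a pipeline or from xargs.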

The second issue is that the command doesn't actually take paths to log files: it takes paths to log directories (or plain job IDs). I ended up using cut to remove the final part of the path, but it would be nice if it could just do the right thing here.
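One way "the right thing" could look, as a rough sketch only: accept whatever the user passes and, if it turns out to be an existing file, fall back to its containing directory. The function name is hypothetical and this is not how the current command parses its arguments.

```python
from pathlib import Path


def normalise_target(arg: str) -> str:
    """Map a log file path to its directory; leave everything else alone.

    Directories and plain job IDs are returned unchanged, so existing
    usage keeps working.
    """
    path = Path(arg)
    if path.is_file():
        # A log file was supplied: use the directory that contains it.
        return str(path.parent)
    return arg
```

This would remove the need for the cut step when piping rg output through xargs.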

There turned out to be some logs that the ehrQL tooling couldn't parse. These were more annoying to debug than they could have been because I would get a "non-zero subprocess exit" error from the tooling but nothing to tell me what the error was or what file triggered it. So a couple of helpful changes would be:

  • Print the file path for each log file before processing (which would also give a rough idea of progress).
  • Show stderr from the subprocess if it fails.
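Both suggestions can be sketched together as follows. This assumes the tooling shells out via `subprocess`; the function name and the shape of `cmd` are hypothetical and not taken from the existing code.

```python
import subprocess
import sys


def run_parser(cmd, log_file):
    """Run `cmd` on one log file, reporting progress and any failure output.

    Returns True on success and False if the subprocess exits non-zero.
    """
    # Printing the path first doubles as a rough progress indicator.
    print(f"Processing {log_file}", file=sys.stderr)

    result = subprocess.run(
        [*cmd, str(log_file)],
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        # Surface which file failed and what the subprocess actually said.
        print(f"Failed on {log_file} (exit code {result.returncode})", file=sys.stderr)
        print(result.stderr, file=sys.stderr, end="")
        return False
    return True
```

Returning a boolean rather than raising lets a batch run carry on past unparseable logs and report the failures at the end.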

There was also one mysterious job whose metadata file triggered an error because container_metadata was empty, so metadata["container_metadata"]["Config"]["Image"] threw a KeyError. Job Server shows this as "Cancelled by user" while the metadata file reports it as "cancelled by system", so possibly something racy happened here. Probably not worth worrying about, but just noting it.
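If it ever does become worth handling, a tolerant lookup along these lines would avoid the KeyError. This is a sketch: the helper name is hypothetical, and whether such a job should be skipped or recorded with a missing image is left open.

```python
from typing import Optional


def get_image(metadata: dict) -> Optional[str]:
    """Return the container image from job metadata, or None if absent.

    Handles a missing or empty `container_metadata`, as seen on the
    cancelled job described above.
    """
    container = metadata.get("container_metadata") or {}
    config = container.get("Config") or {}
    return config.get("Image")
```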

I guess the ultimate in usability would be to have a single command which just ingests the last 60 days' worth of ehrQL jobs. I think we could probably knock together something which did this without massive amounts of work, and it would make the process much easier for next time.
