munin: "rebooted" timestamp vs event-time blurring #3

@warner

While migrating the transit relay to a new host, I noticed that munin wasn't reporting any events for the first hour (in the "events since reboot" plugin, wormhole_transit_events). The server writes the actual (accurate) reboot timestamp into the usagedb, and the munin plugin uses a SQL query that only counts events with a timestamp greater than the reboot time. But the event timestamps are blurred (rounded to the nearest 3600 seconds), so they look like they happened before the last reboot, and the munin script ignores them.
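A minimal sketch of the mismatch, in case it helps anyone reproduce it. The `blur` helper and the sample timestamps are hypothetical, and it assumes the blur rounds down to the start of the hour (which is what the workaround below suggests); the real server code may differ.

```python
BLUR = 3600  # blur granularity in seconds, as described in this issue


def blur(t, blur_size=BLUR):
    # Assumption: timestamps are rounded down to the start of the blur window.
    return blur_size * (int(t) // blur_size)


def counted_by_plugin(event_time, rebooted):
    # Mirrors the plugin's current comparison: event_time > reboot
    return event_time > rebooted


rebooted = 10 * 3600 + 600   # reboot at 10 minutes past the hour (exact value)
event = rebooted + 300       # a real event, 5 minutes after the reboot
stored = blur(event)         # the blurred value that lands in the usage DB

# The stored timestamp is earlier than the exact reboot time,
# so the strict comparison drops an event that happened after the reboot.
missed = not counted_by_plugin(stored, rebooted)
```

Under these assumptions, events only start showing up once the clock crosses the next hour boundary, which would match the "no events for the first hour" symptom.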

For now, I just manually changed the "rebooted" timestamp to be one second less than the blurred value of the current time (so just before the last blur point).
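Roughly, the manual workaround computes the value below. The function name is hypothetical, and it again assumes the blur rounds down to the hour boundary.

```python
import time


def workaround_rebooted(now=None, blur_size=3600):
    # One second before the most recent blur boundary, so that blurred
    # event timestamps in the current window compare as strictly greater.
    if now is None:
        now = time.time()
    return blur_size * (int(now) // blur_size) - 1


example = workaround_rebooted(now=10 * 3600 + 600)  # 10 minutes past the hour
```

With this value in place, an event blurred to the 10:00 boundary passes the existing `event_time > reboot` check.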

The long-term fix will be to add an extra field to the current DB table, with the blurred reboot time, and have the munin plugins compare against that instead of the actual reboot time. Also, the plugins should compare event_time >= blurred_reboot, instead of the current event_time > reboot, since everything in that first half-ish hour window should be counted.
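A sketch of what the long-term fix could look like, using an in-memory SQLite database. The table and column names (`current`, `rebooted`, `blurred_rebooted`, `usage`, `started`) are assumptions for illustration and may not match the real usagedb schema; the blur is again assumed to round down.

```python
import sqlite3

BLUR = 3600


def blur(t, blur_size=BLUR):
    # Assumption: round down to the start of the blur window.
    return blur_size * (int(t) // blur_size)


def make_db(rebooted):
    # Hypothetical schema: one row of server state, one row per usage event.
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE current (rebooted INTEGER, blurred_rebooted INTEGER)")
    db.execute("CREATE TABLE usage (started INTEGER)")
    db.execute("INSERT INTO current VALUES (?, ?)", (rebooted, blur(rebooted)))
    return db


def count_old(db):
    # Current plugin behaviour: strict comparison against the exact reboot time.
    q = "SELECT COUNT(*) FROM usage WHERE started > (SELECT rebooted FROM current)"
    return db.execute(q).fetchone()[0]


def count_new(db):
    # Proposed behaviour: inclusive comparison against the blurred reboot time.
    q = ("SELECT COUNT(*) FROM usage"
         " WHERE started >= (SELECT blurred_rebooted FROM current)")
    return db.execute(q).fetchone()[0]


db = make_db(rebooted=36600)          # reboot at 10 minutes past the hour
for real_time in (32500, 36300, 36900):
    # previous hour, same hour before the reboot, same hour after the reboot
    db.execute("INSERT INTO usage VALUES (?)", (blur(real_time),))

old_count = count_old(db)
new_count = count_new(db)
```

Here the old query sees none of the three events, while the new one counts both events in the reboot's blur window, including the one that actually happened just before the reboot. After blurring, the two are indistinguishable, so the query has to either include both or exclude both.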

This will cause some inaccuracies, as some events will be double-counted. I suspect there's a hard tradeoff to be made between double-counting some events and never counting some events.
