
Monitoring performed by the systemd-monitor-counter.json file does not work for type FrequentContainerdRestart #1258

@Whisper40

Description

As described in #786, we have the same issue, with the same Helm chart version (2.4.0). We are currently testing the FrequentContainerdRestart condition.
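For reference, the rule being exercised should look roughly like the following. This is a sketch reconstructed from the manual log-counter invocation further below, assuming the chart ships the upstream systemd-monitor-counter.json defaults (the exact args in the deployed config may differ):

```json
{
  "type": "permanent",
  "condition": "FrequentContainerdRestart",
  "reason": "FrequentContainerdRestart",
  "path": "/home/kubernetes/bin/log-counter",
  "args": [
    "--journald-source=systemd",
    "--log-path=/var/log/journal",
    "--lookback=20m",
    "--count=5",
    "--pattern=Starting (containerd container runtime|containerd.service|containerd.service - containerd container runtime)..."
  ],
  "timeout": "1m"
}
```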


I0401 06:17:52.093199    4510 custom_plugin_monitor.go:81] Finish parsing custom plugin monitor config file /config/kernel-monitor-counter.json: {Plugin:custom PluginGlobalConfig:{InvokeIntervalString:0xc0003c0af0 TimeoutString:0xc0003c0b00 InvokeInterval:5m0s Timeout:1m0s MaxOutputLength:0xc000433b00 Concurrency:0xc000433b08 EnableMessageChangeBasedConditionUpdate:0x30fdf95 SkipInitialStatus:0x30fdf96} Source:kernel-monitor DefaultConditions:[{Type:FrequentUnregisterNetDevice Status: Transition:0001-01-01 00:00:00 +0000 UTC Reason:NoFrequentUnregisterNetDevice Message:node is functioning properly}] Rules:[0xc000408d90] EnableMetricsReporting:0xc000433b1e}
I0401 06:17:52.093680    4510 custom_plugin_monitor.go:81] Finish parsing custom plugin monitor config file /config/systemd-monitor-counter.json: {Plugin:custom PluginGlobalConfig:{InvokeIntervalString:0xc0003c0bd0 TimeoutString:0xc0003c0be0 InvokeInterval:5m0s Timeout:1m0s MaxOutputLength:0xc000433bf0 Concurrency:0xc000433bf8 EnableMessageChangeBasedConditionUpdate:0x30fdf95 SkipInitialStatus:0x30fdf96} Source:systemd-monitor DefaultConditions:[{Type:FrequentKubeletRestart Status: Transition:0001-01-01 00:00:00 +0000 UTC Reason:NoFrequentKubeletRestart Message:kubelet is functioning properly} {Type:FrequentDockerRestart Status: Transition:0001-01-01 00:00:00 +0000 UTC Reason:NoFrequentDockerRestart Message:docker is functioning properly} {Type:FrequentContainerdRestart Status: Transition:0001-01-01 00:00:00 +0000 UTC Reason:NoFrequentContainerdRestart Message:containerd is functioning properly}] Rules:[0xc000408fc0 0xc000409030 0xc0004090a0] EnableMetricsReporting:0xc000433c0f}
I0401 06:17:52.093973    4510 custom_plugin_monitor.go:81] Finish parsing custom plugin monitor config file /config/iptables-mode-monitor.json: {Plugin:custom PluginGlobalConfig:{InvokeIntervalString:0xc0003c0dc0 TimeoutString:0xc0003c0dd0 InvokeInterval:24h0m0s Timeout:5s MaxOutputLength:0xc000433d58 Concurrency:0xc000433d60 EnableMessageChangeBasedConditionUpdate:0x30fdf95 SkipInitialStatus:0x30fdf96} Source:iptables-mode-monitor DefaultConditions:[] Rules:[0xc000409260] EnableMetricsReporting:0xc000433d68}
I0401 06:17:52.094308    4510 custom_plugin_monitor.go:81] Finish parsing custom plugin monitor config file /config/network-problem-monitor.json: {Plugin:custom PluginGlobalConfig:{InvokeIntervalString:0xc0003c0e40 TimeoutString:0xc0003c0e50 InvokeInterval:30s Timeout:5s MaxOutputLength:0xc000433de0 Concurrency:0xc000433de8 EnableMessageChangeBasedConditionUpdate:0x30fdf95 SkipInitialStatus:0x30fdf96} Source:network-custom-plugin-monitor DefaultConditions:[] Rules:[0xc000409490 0xc000409500] EnableMetricsReporting:0xc000433df0}
I0401 06:17:52.094939    4510 log_monitor.go:80] Finish parsing log monitor config file /config/kernel-monitor.json: {WatcherConfig:{Plugin:kmsg PluginConfig:map[] SkipList:[] LogPath:/dev/kmsg Lookback:5m Delay:} BufferSize:10 Source:kernel-monitor DefaultConditions:[{Type:KernelDeadlock Status: Transition:0001-01-01 00:00:00 +0000 UTC Reason:KernelHasNoDeadlock Message:kernel has no deadlock} {Type:XfsShutdown Status: Transition:0001-01-01 00:00:00 +0000 UTC Reason:XfsHasNotShutDown Message:XFS has not shutdown} {Type:CperHardwareErrorFatal Status: Transition:0001-01-01 00:00:00 +0000 UTC Reason:CperHardwareHasNoFatalError Message:UEFI CPER has no fatal error}] Rules:[{Type:temporary Condition: Reason:OOMKilling Pattern:Killed process \d+ (.+) total-vm:\d+kB, anon-rss:\d+kB, file-rss:\d+kB.* PatternGeneratedMessageSuffix:} {Type:temporary Condition: Reason:TaskHung Pattern:task [\S ]+:\w+ blocked for more than \w+ seconds\. PatternGeneratedMessageSuffix:} {Type:temporary Condition: Reason:UnregisterNetDevice Pattern:unregister_netdevice: waiting for \w+ to become free. Usage count = \d+ PatternGeneratedMessageSuffix:} {Type:temporary Condition: Reason:KernelOops Pattern:BUG: unable to handle kernel NULL pointer dereference at .* PatternGeneratedMessageSuffix:} {Type:temporary Condition: Reason:KernelOops Pattern:divide error: 0000 \[#\d+\] SMP PatternGeneratedMessageSuffix:} {Type:temporary Condition: Reason:Ext4Error Pattern:EXT4-fs error .* PatternGeneratedMessageSuffix:} {Type:temporary Condition: Reason:Ext4Warning Pattern:EXT4-fs warning .* PatternGeneratedMessageSuffix:} {Type:temporary Condition: Reason:IOError Pattern:Buffer I/O error .* PatternGeneratedMessageSuffix:} {Type:permanent Condition:XfsShutdown Reason:XfsHasShutdown Pattern:XFS .* Shutting down filesystem.? 
PatternGeneratedMessageSuffix:} {Type:temporary Condition: Reason:MemoryReadError Pattern:CE memory read error .* PatternGeneratedMessageSuffix:} {Type:temporary Condition: Reason:CperHardwareErrorCorrected Pattern:.*\[Hardware Error\]: event severity: corrected$ PatternGeneratedMessageSuffix:} {Type:temporary Condition: Reason:CperHardwareErrorRecoverable Pattern:.*\[Hardware Error\]: event severity: recoverable$ PatternGeneratedMessageSuffix:} {Type:permanent Condition:CperHardwareErrorFatal Reason:CperHardwareErrorFatal Pattern:.*\[Hardware Error\]: event severity: fatal$ PatternGeneratedMessageSuffix:} {Type:permanent Condition:KernelDeadlock Reason:DockerHung Pattern:task docker:\w+ blocked for more than \w+ seconds\. PatternGeneratedMessageSuffix:}] EnableMetricsReporting:0xc00041241e}
I0401 06:17:52.095034    4510 log_watchers.go:40] Use log watcher of plugin "kmsg"
I0401 06:17:52.095347    4510 log_monitor.go:80] Finish parsing log monitor config file /config/readonly-monitor.json: {WatcherConfig:{Plugin:kmsg PluginConfig:map[] SkipList:[] LogPath:/dev/kmsg Lookback:5m Delay:} BufferSize:10 Source:readonly-monitor DefaultConditions:[{Type:ReadonlyFilesystem Status: Transition:0001-01-01 00:00:00 +0000 UTC Reason:FilesystemIsNotReadOnly Message:Filesystem is not read-only}] Rules:[{Type:permanent Condition:ReadonlyFilesystem Reason:FilesystemIsReadOnly Pattern:Remounting filesystem read-only PatternGeneratedMessageSuffix:}] EnableMetricsReporting:0xc000412b60}
I0401 06:17:52.095370    4510 log_watchers.go:40] Use log watcher of plugin "kmsg"
I0401 06:17:52.095683    4510 log_monitor.go:80] Finish parsing log monitor config file /config/systemd-monitor.json: {WatcherConfig:{Plugin:journald PluginConfig:map[source:systemd] SkipList:[] LogPath:/var/log/journal Lookback:5m Delay:} BufferSize:10 Source:systemd-monitor DefaultConditions:[] Rules:[{Type:temporary Condition: Reason:KubeletStart Pattern:Started (Kubernetes kubelet|kubelet.service|kubelet.service - Kubernetes kubelet). PatternGeneratedMessageSuffix:} {Type:temporary Condition: Reason:DockerStart Pattern:Starting (Docker Application Container Engine|docker.service|docker.service - Docker Application Container Engine)... PatternGeneratedMessageSuffix:} {Type:temporary Condition: Reason:ContainerdStart Pattern:Starting (containerd container runtime|containerd.service|containerd.service - containerd container runtime)... PatternGeneratedMessageSuffix:}] EnableMetricsReporting:0xc000412c1f}
I0401 06:17:52.095709    4510 log_watchers.go:40] Use log watcher of plugin "journald"
I0401 06:17:52.096482    4510 k8s_exporter.go:56] Waiting for kube-apiserver to be ready (timeout 5m0s)...
I0401 06:17:52.108894    4510 node_problem_detector.go:56] K8s exporter started.
I0401 06:17:52.109801    4510 node_problem_detector.go:60] Prometheus exporter started.
I0401 06:17:52.109829    4510 custom_plugin_monitor.go:112] Start custom plugin monitor /config/systemd-monitor-counter.json
I0401 06:17:52.109841    4510 custom_plugin_monitor.go:112] Start custom plugin monitor /config/iptables-mode-monitor.json
I0401 06:17:52.109849    4510 custom_plugin_monitor.go:112] Start custom plugin monitor /config/network-problem-monitor.json
I0401 06:17:52.109891    4510 log_monitor.go:112] Start log monitor /config/kernel-monitor.json
I0401 06:17:52.109950    4510 log_monitor.go:112] Start log monitor /config/readonly-monitor.json
I0401 06:17:52.109976    4510 log_monitor.go:112] Start log monitor /config/systemd-monitor.json
I0401 06:17:52.110692    4510 custom_plugin_monitor.go:313] Initialized conditions for /config/network-problem-monitor.json: []
I0401 06:17:52.110892    4510 custom_plugin_monitor.go:302] Sending initial status for network-custom-plugin-monitor with conditions: []
I0401 06:17:52.111819    4510 custom_plugin_monitor.go:313] Initialized conditions for /config/iptables-mode-monitor.json: []
I0401 06:17:52.111835    4510 custom_plugin_monitor.go:302] Sending initial status for iptables-mode-monitor with conditions: []
I0401 06:17:52.111780    4510 custom_plugin_monitor.go:313] Initialized conditions for /config/systemd-monitor-counter.json: [{Type:FrequentKubeletRestart Status:False Transition:2026-04-01 06:17:52.111766166 +0000 UTC m=+0.051430149 Reason:NoFrequentKubeletRestart Message:kubelet is functioning properly} {Type:FrequentDockerRestart Status:False Transition:2026-04-01 06:17:52.111766317 +0000 UTC m=+0.051430300 Reason:NoFrequentDockerRestart Message:docker is functioning properly} {Type:FrequentContainerdRestart Status:False Transition:2026-04-01 06:17:52.111766412 +0000 UTC m=+0.051430394 Reason:NoFrequentContainerdRestart Message:containerd is functioning properly}]
I0401 06:17:52.111910    4510 custom_plugin_monitor.go:302] Sending initial status for systemd-monitor with conditions: [{Type:FrequentKubeletRestart Status:False Transition:2026-04-01 06:17:52.111766166 +0000 UTC m=+0.051430149 Reason:NoFrequentKubeletRestart Message:kubelet is functioning properly} {Type:FrequentDockerRestart Status:False Transition:2026-04-01 06:17:52.111766317 +0000 UTC m=+0.051430300 Reason:NoFrequentDockerRestart Message:docker is functioning properly} {Type:FrequentContainerdRestart Status:False Transition:2026-04-01 06:17:52.111766412 +0000 UTC m=+0.051430394 Reason:NoFrequentContainerdRestart Message:containerd is functioning properly}]
I0401 06:17:52.111943    4510 log_monitor.go:237] Initialize condition generated: [{Type:KernelDeadlock Status:False Transition:2026-04-01 06:17:52.111933414 +0000 UTC m=+0.051597396 Reason:KernelHasNoDeadlock Message:kernel has no deadlock} {Type:XfsShutdown Status:False Transition:2026-04-01 06:17:52.111933556 +0000 UTC m=+0.051597539 Reason:XfsHasNotShutDown Message:XFS has not shutdown} {Type:CperHardwareErrorFatal Status:False Transition:2026-04-01 06:17:52.111933662 +0000 UTC m=+0.051597644 Reason:CperHardwareHasNoFatalError Message:UEFI CPER has no fatal error}]
I0401 06:17:52.121795    4510 log_monitor.go:237] Initialize condition generated: [{Type:ReadonlyFilesystem Status:False Transition:2026-04-01 06:17:52.121766199 +0000 UTC m=+0.061430184 Reason:FilesystemIsNotReadOnly Message:Filesystem is not read-only}]
I0401 06:17:52.125033    4510 custom_plugin_monitor.go:112] Start custom plugin monitor /config/kernel-monitor-counter.json
I0401 06:17:52.125090    4510 problem_detector.go:77] Problem detector started
I0401 06:17:52.131797    4510 log_monitor.go:237] Initialize condition generated: []
I0401 06:17:52.131869    4510 custom_plugin_monitor.go:313] Initialized conditions for /config/kernel-monitor-counter.json: [{Type:FrequentUnregisterNetDevice Status:False Transition:2026-04-01 06:17:52.131855316 +0000 UTC m=+0.071519288 Reason:NoFrequentUnregisterNetDevice Message:node is functioning properly}]
I0401 06:17:52.131914    4510 custom_plugin_monitor.go:302] Sending initial status for kernel-monitor with conditions: [{Type:FrequentUnregisterNetDevice Status:False Transition:2026-04-01 06:17:52.131855316 +0000 UTC m=+0.071519288 Reason:NoFrequentUnregisterNetDevice Message:node is functioning properly}]
I0401 06:17:52.192138    4510 log_monitor.go:161] New status generated: &{Source:systemd-monitor Events:[{Severity:warn Timestamp:2026-04-01 06:15:36.77789 +0000 UTC Reason:ContainerdStart Message:Starting containerd.service - containerd container runtime...}] Conditions:[]}
I0401 06:19:41.671949    4510 log_monitor.go:161] New status generated: &{Source:systemd-monitor Events:[{Severity:warn Timestamp:2026-04-01 06:19:41.593223 +0000 UTC Reason:ContainerdStart Message:Starting containerd.service - containerd container runtime...}] Conditions:[]}
I0401 06:20:01.626657    4510 log_monitor.go:161] New status generated: &{Source:systemd-monitor Events:[{Severity:warn Timestamp:2026-04-01 06:20:01.333858 +0000 UTC Reason:ContainerdStart Message:Starting containerd.service - containerd container runtime...}] Conditions:[]}
I0401 06:20:16.786375    4510 log_monitor.go:161] New status generated: &{Source:systemd-monitor Events:[{Severity:warn Timestamp:2026-04-01 06:20:16.519023 +0000 UTC Reason:ContainerdStart Message:Starting containerd.service - containerd container runtime...}] Conditions:[]}
I0401 06:20:22.396804    4510 log_monitor.go:161] New status generated: &{Source:systemd-monitor Events:[{Severity:warn Timestamp:2026-04-01 06:20:22.082176 +0000 UTC Reason:ContainerdStart Message:Starting containerd.service - containerd container runtime...}] Conditions:[]}
I0401 06:20:28.629390    4510 log_monitor.go:161] New status generated: &{Source:systemd-monitor Events:[{Severity:warn Timestamp:2026-04-01 06:20:28.391211 +0000 UTC Reason:ContainerdStart Message:Starting containerd.service - containerd container runtime...}] Conditions:[]}
I0401 06:21:01.648406    4510 log_monitor.go:161] New status generated: &{Source:systemd-monitor Events:[{Severity:warn Timestamp:2026-04-01 06:21:01.309319 +0000 UTC Reason:ContainerdStart Message:Starting containerd.service - containerd container runtime...}] Conditions:[]}
I0401 06:21:35.137999    4510 log_monitor.go:161] New status generated: &{Source:systemd-monitor Events:[{Severity:warn Timestamp:2026-04-01 06:21:34.906559 +0000 UTC Reason:ContainerdStart Message:Starting containerd.service - containerd container runtime...}] Conditions:[]}
I0401 06:21:44.142341    4510 log_monitor.go:161] New status generated: &{Source:systemd-monitor Events:[{Severity:warn Timestamp:2026-04-01 06:21:43.904364 +0000 UTC Reason:ContainerdStart Message:Starting containerd.service - containerd container runtime...}] Conditions:[]}
I0401 06:21:45.891443    4510 log_monitor.go:161] New status generated: &{Source:systemd-monitor Events:[{Severity:warn Timestamp:2026-04-01 06:21:45.612809 +0000 UTC Reason:ContainerdStart Message:Starting containerd.service - containerd container runtime...}] Conditions:[]}
I0401 06:22:02.079859    4510 log_monitor.go:161] New status generated: &{Source:systemd-monitor Events:[{Severity:warn Timestamp:2026-04-01 06:22:01.827101 +0000 UTC Reason:ContainerdStart Message:Starting containerd.service - containerd container runtime...}] Conditions:[]}
I0401 06:22:05.393587    4510 log_monitor.go:161] New status generated: &{Source:systemd-monitor Events:[{Severity:warn Timestamp:2026-04-01 06:22:05.156369 +0000 UTC Reason:ContainerdStart Message:Starting containerd.service - containerd container runtime...}] Conditions:[]}
I0401 06:44:06.497624    4510 log_monitor.go:161] New status generated: &{Source:systemd-monitor Events:[{Severity:warn Timestamp:2026-04-01 06:44:06.257501 +0000 UTC Reason:ContainerdStart Message:Starting containerd.service - containerd container runtime...}] Conditions:[]}
I0401 06:44:42.636206    4510 log_monitor.go:161] New status generated: &{Source:systemd-monitor Events:[{Severity:warn Timestamp:2026-04-01 06:44:42.353282 +0000 UTC Reason:ContainerdStart Message:Starting containerd.service - containerd container runtime...}] Conditions:[]}
I0401 07:02:47.651327    4510 log_monitor.go:161] New status generated: &{Source:systemd-monitor Events:[{Severity:warn Timestamp:2026-04-01 07:02:47.389301 +0000 UTC Reason:ContainerdStart Message:Starting containerd.service - containerd container runtime...}] Conditions:[]}
I0401 07:03:03.414655    4510 log_monitor.go:161] New status generated: &{Source:systemd-monitor Events:[{Severity:warn Timestamp:2026-04-01 07:03:03.027195 +0000 UTC Reason:ContainerdStart Message:Starting containerd.service - containerd container runtime...}] Conditions:[]}
I0401 07:03:05.394581    4510 log_monitor.go:161] New status generated: &{Source:systemd-monitor Events:[{Severity:warn Timestamp:2026-04-01 07:03:05.132242 +0000 UTC Reason:ContainerdStart Message:Starting containerd.service - containerd container runtime...}] Conditions:[]}
I0401 07:03:06.774630    4510 log_monitor.go:161] New status generated: &{Source:systemd-monitor Events:[{Severity:warn Timestamp:2026-04-01 07:03:06.55276 +0000 UTC Reason:ContainerdStart Message:Starting containerd.service - containerd container runtime...}] Conditions:[]}
I0401 07:03:08.057202    4510 log_monitor.go:161] New status generated: &{Source:systemd-monitor Events:[{Severity:warn Timestamp:2026-04-01 07:03:07.709293 +0000 UTC Reason:ContainerdStart Message:Starting containerd.service - containerd container runtime...}] Conditions:[]}
I0401 07:03:09.899276    4510 log_monitor.go:161] New status generated: &{Source:systemd-monitor Events:[{Severity:warn Timestamp:2026-04-01 07:03:09.613928 +0000 UTC Reason:ContainerdStart Message:Starting containerd.service - containerd container runtime...}] Conditions:[]}
I0401 07:03:14.390863    4510 log_monitor.go:161] New status generated: &{Source:systemd-monitor Events:[{Severity:warn Timestamp:2026-04-01 07:03:14.107426 +0000 UTC Reason:ContainerdStart Message:Starting containerd.service - containerd container runtime...}] Conditions:[]}
I0401 07:03:16.369108    4510 log_monitor.go:161] New status generated: &{Source:systemd-monitor Events:[{Severity:warn Timestamp:2026-04-01 07:03:16.111857 +0000 UTC Reason:ContainerdStart Message:Starting containerd.service - containerd container runtime...}] Conditions:[]}
I0401 07:03:18.645132    4510 log_monitor.go:161] New status generated: &{Source:systemd-monitor Events:[{Severity:warn Timestamp:2026-04-01 07:03:18.377895 +0000 UTC Reason:ContainerdStart Message:Starting containerd.service - containerd container runtime...}] Conditions:[]}
I0401 07:03:49.915407    4510 log_monitor.go:161] New status generated: &{Source:systemd-monitor Events:[{Severity:warn Timestamp:2026-04-01 07:03:49.887923 +0000 UTC Reason:ContainerdStart Message:Starting containerd.service - containerd container runtime...}] Conditions:[]}
I0401 07:03:53.394354    4510 log_monitor.go:161] New status generated: &{Source:systemd-monitor Events:[{Severity:warn Timestamp:2026-04-01 07:03:53.101364 +0000 UTC Reason:ContainerdStart Message:Starting containerd.service - containerd container runtime...}] Conditions:[]}
I0401 07:03:56.689072    4510 log_monitor.go:161] New status generated: &{Source:systemd-monitor Events:[{Severity:warn Timestamp:2026-04-01 07:03:56.462619 +0000 UTC Reason:ContainerdStart Message:Starting containerd.service - containerd container runtime...}] Conditions:[]}
I0401 07:04:00.034938    4510 log_monitor.go:161] New status generated: &{Source:systemd-monitor Events:[{Severity:warn Timestamp:2026-04-01 07:03:59.807404 +0000 UTC Reason:ContainerdStart Message:Starting containerd.service - containerd container runtime...}] Conditions:[]}
I0401 07:04:03.155947    4510 log_monitor.go:161] New status generated: &{Source:systemd-monitor Events:[{Severity:warn Timestamp:2026-04-01 07:04:02.894065 +0000 UTC Reason:ContainerdStart Message:Starting containerd.service - containerd container runtime...}] Conditions:[]}
I0401 07:04:05.643607    4510 log_monitor.go:161] New status generated: &{Source:systemd-monitor Events:[{Severity:warn Timestamp:2026-04-01 07:04:05.42907 +0000 UTC Reason:ContainerdStart Message:Starting containerd.service - containerd container runtime...}] Conditions:[]}
I0401 07:04:08.399255    4510 log_monitor.go:161] New status generated: &{Source:systemd-monitor Events:[{Severity:warn Timestamp:2026-04-01 07:04:08.147461 +0000 UTC Reason:ContainerdStart Message:Starting containerd.service - containerd container runtime...}] Conditions:[]}
I0401 07:04:15.120155    4510 log_monitor.go:161] New status generated: &{Source:systemd-monitor Events:[{Severity:warn Timestamp:2026-04-01 07:04:14.699852 +0000 UTC Reason:ContainerdStart Message:Starting containerd.service - containerd container runtime...}] Conditions:[]}
I0401 07:04:17.389640    4510 log_monitor.go:161] New status generated: &{Source:systemd-monitor Events:[{Severity:warn Timestamp:2026-04-01 07:04:17.1056 +0000 UTC Reason:ContainerdStart Message:Starting containerd.service - containerd container runtime...}] Conditions:[]}
I0401 07:04:20.632406    4510 log_monitor.go:161] New status generated: &{Source:systemd-monitor Events:[{Severity:warn Timestamp:2026-04-01 07:04:20.385319 +0000 UTC Reason:ContainerdStart Message:Starting containerd.service - containerd container runtime...}] Conditions:[]}
I0401 07:04:47.400920    4510 log_monitor.go:161] New status generated: &{Source:systemd-monitor Events:[{Severity:warn Timestamp:2026-04-01 07:04:47.16846 +0000 UTC Reason:ContainerdStart Message:Starting containerd.service - containerd container runtime...}] Conditions:[]}
I0401 07:04:49.644431    4510 log_monitor.go:161] New status generated: &{Source:systemd-monitor Events:[{Severity:warn Timestamp:2026-04-01 07:04:49.336994 +0000 UTC Reason:ContainerdStart Message:Starting containerd.service - containerd container runtime...}] Conditions:[]}
I0401 07:05:12.890177    4510 log_monitor.go:161] New status generated: &{Source:systemd-monitor Events:[{Severity:warn Timestamp:2026-04-01 07:05:12.671305 +0000 UTC Reason:ContainerdStart Message:Starting containerd.service - containerd container runtime...}] Conditions:[]}
I0401 07:05:16.082684    4510 log_monitor.go:161] New status generated: &{Source:systemd-monitor Events:[{Severity:warn Timestamp:2026-04-01 07:05:15.971226 +0000 UTC Reason:ContainerdStart Message:Starting containerd.service - containerd container runtime...}] Conditions:[]}
I0401 07:05:23.141585    4510 log_monitor.go:161] New status generated: &{Source:systemd-monitor Events:[{Severity:warn Timestamp:2026-04-01 07:05:22.921637 +0000 UTC Reason:ContainerdStart Message:Starting containerd.service - containerd container runtime...}] Conditions:[]}
I0401 07:05:26.405019    4510 log_monitor.go:161] New status generated: &{Source:systemd-monitor Events:[{Severity:warn Timestamp:2026-04-01 07:05:26.163813 +0000 UTC Reason:ContainerdStart Message:Starting containerd.service - containerd container runtime...}] Conditions:[]}
I0401 07:05:33.653184    4510 log_monitor.go:161] New status generated: &{Source:systemd-monitor Events:[{Severity:warn Timestamp:2026-04-01 07:05:33.424341 +0000 UTC Reason:ContainerdStart Message:Starting containerd.service - containerd container runtime...}] Conditions:[]}
I0401 07:05:58.399398    4510 log_monitor.go:161] New status generated: &{Source:systemd-monitor Events:[{Severity:warn Timestamp:2026-04-01 07:05:58.151135 +0000 UTC Reason:ContainerdStart Message:Starting containerd.service - containerd container runtime...}] Conditions:[]}
I0401 07:06:00.405801    4510 log_monitor.go:161] New status generated: &{Source:systemd-monitor Events:[{Severity:warn Timestamp:2026-04-01 07:06:00.054218 +0000 UTC Reason:ContainerdStart Message:Starting containerd.service - containerd container runtime...}] Conditions:[]}
I0401 07:06:23.895000    4510 log_monitor.go:161] New status generated: &{Source:systemd-monitor Events:[{Severity:warn Timestamp:2026-04-01 07:06:23.622262 +0000 UTC Reason:ContainerdStart Message:Starting containerd.service - containerd container runtime...}] Conditions:[]}
I0401 07:06:26.696025    4510 log_monitor.go:161] New status generated: &{Source:systemd-monitor Events:[{Severity:warn Timestamp:2026-04-01 07:06:26.451342 +0000 UTC Reason:ContainerdStart Message:Starting containerd.service - containerd container runtime...}] Conditions:[]}
I0401 07:08:33.505146    4510 log_monitor.go:161] New status generated: &{Source:systemd-monitor Events:[{Severity:warn Timestamp:2026-04-01 07:08:33.205087 +0000 UTC Reason:ContainerdStart Message:Starting containerd.service - containerd container runtime...}] Conditions:[]}
I0401 07:14:35.785010    4510 log_monitor.go:161] New status generated: &{Source:systemd-monitor Events:[{Severity:warn Timestamp:2026-04-01 07:14:35.446393 +0000 UTC Reason:ContainerdStart Message:Starting containerd.service - containerd container runtime...}] Conditions:[]}

As you can see, I manually restarted containerd on a node multiple times (more than 5):

sudo systemctl restart containerd

NPD detects each restart without any problem, as the repeated ContainerdStart events above show.

If I run the log-counter script manually, the restarts are detected:

root@node-problem-detector-xhxkh:/# /home/kubernetes/bin/log-counter \
>   --journald-source=systemd \
>   --log-path=/var/log/journal \
>   --lookback=20m \
>   --count=5 \
>   --pattern="Starting (containerd container runtime|containerd.service|containerd.service - containerd container runtime)..."
Found 32 matching logs, which meets the threshold of 5

But the FrequentContainerdRestart condition on the node never transitions to True:
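For context, here is a minimal Python sketch of the check that log-counter performs, as I understand it: it counts journal lines within the lookback window that match the rule's pattern and compares the count against the threshold; the custom plugin monitor then maps a non-zero exit status to a non-OK condition. This is a hypothetical simplification for illustration, not the actual implementation:

```python
import re

# Pattern copied from the rule config. Note the unescaped "." characters:
# in a regex, "containerd.service" also matches "containerdXservice", and
# the trailing "..." matches any three characters.
PATTERN = re.compile(
    r"Starting (containerd container runtime|containerd.service|"
    r"containerd.service - containerd container runtime)..."
)

def check_restarts(journal_lines, threshold=5):
    """Count matching journal lines and report whether the threshold is met.

    Returns (message, exit_status): status 1 (non-OK in the custom plugin
    protocol) when the count reaches the threshold, 0 otherwise.
    """
    count = sum(1 for line in journal_lines if PATTERN.search(line))
    if count >= threshold:
        return (f"Found {count} matching logs, which meets the threshold "
                f"of {threshold}", 1)
    return (f"Found {count} matching logs, which does not meet the threshold "
            f"of {threshold}", 0)
```

Given that the manual invocation above reports "Found 32 matching logs", the counting side clearly works; the gap appears to be between the plugin's non-OK result and the condition update on the node.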

Conditions:
  Type                          Status  LastHeartbeatTime                 LastTransitionTime                Reason                          Message
  ----                          ------  -----------------                 ------------------                ------                          -------
  FrequentDockerRestart         False   Wed, 01 Apr 2026 09:33:02 +0200   Wed, 01 Apr 2026 08:17:52 +0200   NoFrequentDockerRestart         docker is functioning properly
  FrequentContainerdRestart     False   Wed, 01 Apr 2026 09:33:02 +0200   Wed, 01 Apr 2026 08:17:52 +0200   NoFrequentContainerdRestart     containerd is functioning properly
  KernelDeadlock                False   Wed, 01 Apr 2026 09:33:02 +0200   Wed, 01 Apr 2026 08:17:52 +0200   KernelHasNoDeadlock             kernel has no deadlock
  XfsShutdown                   False   Wed, 01 Apr 2026 09:33:02 +0200   Wed, 01 Apr 2026 08:17:52 +0200   XfsHasNotShutDown               XFS has not shutdown
  CperHardwareErrorFatal        False   Wed, 01 Apr 2026 09:33:02 +0200   Wed, 01 Apr 2026 08:17:52 +0200   CperHardwareHasNoFatalError     UEFI CPER has no fatal error
  ReadonlyFilesystem            False   Wed, 01 Apr 2026 09:33:02 +0200   Wed, 01 Apr 2026 08:17:52 +0200   FilesystemIsNotReadOnly         Filesystem is not read-only
  FrequentUnregisterNetDevice   False   Wed, 01 Apr 2026 09:33:02 +0200   Wed, 01 Apr 2026 08:17:52 +0200   NoFrequentUnregisterNetDevice   node is functioning properly
  FrequentKubeletRestart        False   Wed, 01 Apr 2026 09:33:02 +0200   Wed, 01 Apr 2026 08:17:52 +0200   NoFrequentKubeletRestart        kubelet is functioning properly
  NetworkUnavailable            False   Wed, 01 Apr 2026 08:17:39 +0200   Wed, 01 Apr 2026 08:17:39 +0200   CalicoIsUp                      Calico is running on this node
  MemoryPressure                False   Wed, 01 Apr 2026 09:33:31 +0200   Wed, 01 Apr 2026 08:17:08 +0200   KubeletHasSufficientMemory      kubelet has sufficient memory available
  DiskPressure                  False   Wed, 01 Apr 2026 09:33:31 +0200   Wed, 01 Apr 2026 08:17:08 +0200   KubeletHasNoDiskPressure        kubelet has no disk pressure
  PIDPressure                   False   Wed, 01 Apr 2026 09:33:31 +0200   Wed, 01 Apr 2026 08:17:08 +0200   KubeletHasSufficientPID         kubelet has sufficient PID available
  Ready                         True    Wed, 01 Apr 2026 09:33:31 +0200   Wed, 01 Apr 2026 08:17:27 +0200   KubeletReady                    kubelet is posting ready status

HelmRelease:

apiVersion: helm.toolkit.fluxcd.io/v2
kind: HelmRelease
metadata:
  name: node-problem-detector
  namespace: kube-system
spec:
  interval: 2m
  releaseName: node-problem-detector
  targetNamespace: kube-system
  install:
    disableWait: true
    crds: "CreateReplace"
  chart:
    spec:
      chart: node-problem-detector
      version: "2.4.0"
      sourceRef:
        kind: HelmRepository
        name: node-problem-detector
        namespace: flux-system
      interval: 10m
  upgrade:
    force: false
    disableWait: true
    remediation:
      remediateLastFailure: false
  values:
    maxUnavailable: 10%
    hostPID: true
    tolerations:
      - effect: NoSchedule
        operator: Exists
      - effect: NoExecute
        operator: Exists
    resources:
      requests:
        cpu: 20m
        memory: 64Mi
      limits:
        memory: 256Mi

    settings:
      log_monitors:
        - /config/kernel-monitor.json
        - /config/readonly-monitor.json
        - /config/systemd-monitor.json
      custom_plugin_monitors:
        - /config/kernel-monitor-counter.json
        - /config/systemd-monitor-counter.json
        - /config/iptables-mode-monitor.json
        - /config/network-problem-monitor.json

    metrics:
      enabled: true
      serviceMonitor:
        enabled: true
        additionalLabels:
          prometheus: system
        attachMetadata:
          node: true
      prometheusRule:
        enabled: true
