Continuous Gossip Warnings (Failed determining organization) causing disk space pressure after old certificates expire (new certificates have been updated)

### Description

Hi everyone,

Possibly related issue: https://github.com/hyperledger/fabric/issues/5111

We are experiencing an issue with our peers and could really use some expert guidance.

After the old certificates of other peers in our channel expired (note: these peers had already been successfully updated with new certificates), we started seeing the following warning in our logs:

`WARN [gossip.gossip] func3 -> Unable to determine org of message tag:EMPTY alive_msg:<membership:<endpoint:...`

<img width="1371" height="651" alt="Image" src="https://github.com/user-attachments/assets/7960f74a-32f0-415a-9c7f-5d88da2c957b" />
🔺 The first peer whose credentials expired (log starts half an hour after expiration).


Shortly after, a second type of warning started appearing continuously and in massive volumes, which is now causing severe disk space pressure on our machines:

```
WARN [gossip.gossip] func3 -> Failed determining organization of d5bfxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx5aab5ef8b
WARN [gossip.gossip] func3 -> Failed determining organization of b419xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx7f5ac3ad4
```

<img width="1368" height="264" alt="Image" src="https://github.com/user-attachments/assets/41925bbc-9649-451f-abf0-0453249f5da0" />
🔺 Later, the second warning logs start showing.
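
In case it helps with triage, here is a minimal sketch of how the second warning could be tallied per identifier, to check whether the spam comes from a small fixed set of hashes. It assumes each log line contains the exact phrase `Failed determining organization of` followed by the identifier, as in the excerpt above. The function name `count_unresolved_ids` is hypothetical and not part of Fabric.

```python
import re
from collections import Counter

# Assumed line shape, based only on the warnings quoted above. Real peer logs
# typically carry a timestamp and other fields before "WARN", which
# search() tolerates.
PATTERN = re.compile(r"Failed determining organization of (\S+)")


def count_unresolved_ids(lines):
    """Return a Counter mapping each reported identifier to its warning count."""
    counts = Counter()
    for line in lines:
        match = PATTERN.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts
```

Any iterable of strings works as input, so an open log file can be passed directly, and `count_unresolved_ids(f).most_common()` lists the noisiest identifiers first. A handful of identifiers with very large counts would be consistent with a fixed set of stale identities being re-evaluated repeatedly, though that is only a hint and not a diagnosis.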


Because this second warning storm only started after the first peer's old certificate officially expired, we suspect the two events are connected. However, we were unable to reproduce the issue in a fresh environment (see "Steps to reproduce"), so we aren't sure the two warnings are actually related. 🤔

Has anyone encountered this specific behavior where the gossip filter continues to spam these hashes even after a container restart? Any guidance on how to properly clear these ghost identities or stop the log spam would be greatly appreciated!

<img width="1218" height="579" alt="Image" src="https://github.com/user-attachments/assets/a7f48705-6a86-46b5-83a1-dcea4f47e97e" />

As the graph shows, the peer continuously generates the same repeated log entries.

### Steps to reproduce

Our Environment & Troubleshooting Steps:

Versions Affected: We checked our hyperledger/fabric-peer images. We have two separate environments running v2.5.10 and v2.5.13, and both are experiencing this exact same log spam.

Container Restarts: We have tried restarting the peer containers to clear the memory. We are aware that the v2.5.11 release notes mention a fix regarding "Gossip handling of expired certificates". However, our understanding is that even without that fix (e.g., on v2.5.10), simply restarting the container should clear the idMapper memory and stop the use of the old cache. Despite the restarts, the warnings persist continuously.

Failed Reproduction: To isolate the issue, we built a completely fresh environment from scratch using the v2.5.10 peer image and repeated the certificate update and expiration process, but we were unable to reproduce the issue there.

We are currently stuck and aren't entirely sure if our issue is directly related to the v2.5.11 fix, given that v2.5.13 is also affected and container restarts aren't mitigating the log spam.
