Silent wrong-service communication when IP_COOLDOWN_PERIOD < nf_conntrack_tcp_timeout_syn_sent causes stale conntrack DNAT to route SYN to recycled pod IP #3634

@Mohijeet

Description

What happened:

A silent wrong-service communication occurs when all of the following align:

  1. Client pod initiates a TCP dial to a Service VIP
  2. Backend pod dies abruptly mid-handshake (OOMKill / crash / forced eviction)
  3. Client app times out (e.g. 5s dial timeout) and closes the socket
  4. OS reuses the same source port on retry (valid — SYN_SENT → CLOSED has no TIME_WAIT)
  5. Pod IP is recycled within IP_COOLDOWN_PERIOD window
  6. New pod starts with the same recycled IP
  7. Stale conntrack DNAT routes new SYN to new pod → TCP handshake succeeds silently
  8. Client is now talking to the wrong service with no error, no RST, no log
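The race in steps 1–8 can be reduced to a comparison of three timers. Below is a minimal pure-Python sketch of that timeline; the function and parameter names are illustrative, and only the timer defaults (5s dial timeout, 30s cooldown, 120s SYN_SENT timeout) come from this report. It simplifies by assuming the retry fires as soon as the app timeout closes the socket:

```python
# Illustrative model of the race between the stale conntrack entry's
# lifetime and pod IP recycling. Structure is hypothetical; the timer
# defaults are the ones described in this issue.

CONNTRACK_SYN_SENT_TIMEOUT = 120  # net.netfilter.nf_conntrack_tcp_timeout_syn_sent
IP_COOLDOWN_PERIOD = 30           # VPC CNI default

def retry_hits_stale_dnat(app_dial_timeout: float, ip_recycled_at: float) -> bool:
    """Return True if a retried SYN (reusing the same source port) can be
    DNAT'ed to a recycled pod IP by the stale conntrack entry.

    t=0                        first SYN; conntrack entry created (SYN_SENT)
    t=app_dial_timeout         app closes socket; src port instantly reusable
    t=ip_recycled_at           dead pod's IP handed to a new pod
    t=SYN_SENT timeout (120s)  stale conntrack entry finally expires
    """
    retry_at = app_dial_timeout  # earliest retry with the reused port
    return (retry_at < CONNTRACK_SYN_SENT_TIMEOUT
            and ip_recycled_at < CONNTRACK_SYN_SENT_TIMEOUT)

# With defaults: 5s dial timeout, IP recycled right after the 30s cooldown.
print(retry_hits_stale_dnat(5, IP_COOLDOWN_PERIOD))  # True  -> silent misroute
print(retry_hits_stale_dnat(5, 150))                 # False -> entry expired first
```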

Root cause — timer mismatch between two independent subsystems:

IP_COOLDOWN_PERIOD                          (VPC CNI)  default: 30s
net.netfilter.nf_conntrack_tcp_timeout_syn_sent  (kernel)   default: 120s

When a TCP handshake fails mid-way (SYN_SENT state), the Linux kernel retains the conntrack DNAT entry for nf_conntrack_tcp_timeout_syn_sent seconds (120s by default). This entry maps:

clientIP:srcPort → ServiceVIP:port → DNAT → deadPodIP:port

Critically:

  • The TCP socket is closed immediately after the app timeout — no TIME_WAIT for SYN_SENT state connections, port is free instantly
  • The conntrack entry stays alive for 120 seconds independently
  • The OS port allocator has no awareness of conntrack — it sees the port as free and can reuse it

If IP_COOLDOWN_PERIOD=30s allows the dead pod's IP to be recycled before the 120s conntrack entry expires, the stale DNAT silently routes the retried SYN to the new pod occupying the recycled IP.

The required invariant that is currently violated by defaults:

IP_COOLDOWN_PERIOD > nf_conntrack_tcp_timeout_syn_sent
     30s           >           120s                     ← VIOLATED
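The invariant is a one-line comparison, sketched here as a standalone check (the function name is illustrative; in practice the cooldown would come from the aws-node DaemonSet env and the kernel value from the `net.netfilter.nf_conntrack_tcp_timeout_syn_sent` sysctl on the node):

```python
def cooldown_invariant_holds(ip_cooldown_s: int, syn_sent_timeout_s: int) -> bool:
    """The recycled pod IP must stay quarantined at least as long as any
    stale SYN_SENT conntrack entry pointing at it can live."""
    return ip_cooldown_s > syn_sent_timeout_s

# Defaults from this report:
print(cooldown_invariant_holds(30, 120))   # False -> violated
# One possible mitigation: raise the cooldown above the kernel timeout.
print(cooldown_invariant_holds(130, 120))  # True
```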

Why preStop hooks do not prevent this:

preStop hooks only execute during graceful termination (SIGTERM). They are completely bypassed in:

  • OOMKill — SIGKILL is sent directly, preStop never runs
  • Node pressure eviction — kubelet may skip the graceful period entirely
  • Pod crash — container exits before preStop can fire

In all these cases no FIN exchange occurs, so conntrack never naturally transitions out of SYN_SENT state — the entry simply ages out after 120 seconds.


Why this is silent and dangerous:

Normal failure (RST / timeout):
  Client gets error → retry logic fires → recoverable ✅

This failure (stale conntrack + recycled IP):
  TCP handshake succeeds to wrong pod
  HTTP response returns 404 from wrong service
  No error, no RST, no log entry anywhere
  Client silently processes wrong data ❌

This is not detectable at the TCP or HTTP layer without mTLS identity verification or explicit pod identity headers.
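As an application-level guard, the client can refuse responses that lack the expected service identity. This is a hypothetical sketch: the header name `X-Pod-Service-Identity` and the assumption that every backend sets it are illustrative, not part of any standard:

```python
# Hypothetical guard: since the misroute is invisible at the TCP/HTTP layer,
# the client verifies an explicit identity header that each backend is
# assumed to set. Header name and values are illustrative.

EXPECTED_SERVICE = "payments"

def verify_backend_identity(headers: dict[str, str]) -> None:
    got = headers.get("X-Pod-Service-Identity")
    if got != EXPECTED_SERVICE:
        raise RuntimeError(
            f"wrong backend: expected {EXPECTED_SERVICE!r}, got {got!r}")

verify_backend_identity({"X-Pod-Service-Identity": "payments"})  # passes silently
try:
    verify_backend_identity({"X-Pod-Service-Identity": "inventory"})
except RuntimeError as e:
    print(e)  # wrong backend: expected 'payments', got 'inventory'
```

mTLS gives the same guarantee cryptographically; this header check is the minimal non-cryptographic version of the idea.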


Conditions required for this bug to trigger:

| Condition | Detail |
| -- | -- |
| Abrupt pod termination | OOMKill, crash, forced eviction — no FIN exchange |
| Client dial timeout < 120s | App closes socket before conntrack entry expires |
| OS source port reuse | Ephemeral pool reuses same port (especially under tcp_tw_reuse=1) |
| IP recycled within 120s | IP_COOLDOWN_PERIOD=30s default allows this |
| New pod gets same IP | K8s IP pool recycles IPs — common on busy nodes |

All five conditions are default behavior in a standard EKS cluster. This is not a corner case.

Environment: EKS

  • Kubernetes version (use kubectl version): 1.33
  • CNI Version v1.20.1-eksbuild.1
  • OS (e.g: cat /etc/os-release): Bottlerocket
