admin interface hangs on 24-core machine #523

@jbyler

Primary symptom: on a machine with 22 or more cores and the default server configuration, the admin interface accepts TCP connections and then never processes the requests, causing a browser to hang forever. This can happen on any machine given the wrong config parameters.

Details:

  • Server thread configuration must adhere to the following invariants, determined through debugging and trial and error:
    • maxThreads > ∑ (acceptorThreads + selectorThreads) over all applicationConnectors
    • adminMaxThreads > ∑ (acceptorThreads + selectorThreads) over all adminConnectors
  • Presumably maxThreads counts the acceptorThreads and selectorThreads as well, and only what is left over handles requests. If nothing is left over, requests queue up and are never handled.
  • If an invariant is not satisfied, the corresponding interface (applicationConnectors or adminConnectors, respectively) accepts TCP connections but never handles them.
  • Dropwizard's defaults for maxThreads (1024) and adminMaxThreads (64) are fixed, while the defaults for acceptorThreads (#CPUs/2) and selectorThreads (#CPUs) vary from machine to machine.
  • So on a machine with 22 or more cores and an admin interface with 2 connectors (one for HTTP, one for HTTPS), the invariant doesn't hold for the admin interface: at 22 cores the connectors claim 2 × (11 + 22) = 66 threads, more than the 64 in adminMaxThreads. With 342 or more cores and 2 connectors, it won't hold for the application interface either: 2 × (171 + 342) = 1026 threads against a maxThreads of 1024.
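
The thresholds above can be checked with a few lines of arithmetic. This is only a sketch: the class and method names are hypothetical and not part of Dropwizard, and it assumes #CPUs/2 is an integer division and that every connector uses the default thread counts.

```java
// Hypothetical helper, not Dropwizard code: reproduces the arithmetic
// behind the 22-core and 342-core thresholds.
final class ThreadBudget {

    // Threads one connector reserves with default settings (assumption:
    // acceptorThreads = cpus / 2 with integer division, selectorThreads = cpus).
    static int defaultThreadsPerConnector(int cpus) {
        return cpus / 2 + cpus;
    }

    // The invariant from this issue: the pool must be strictly larger than
    // the number of threads reserved by all connectors.
    static boolean invariantHolds(int maxThreads, int cpus, int connectors) {
        return maxThreads > connectors * defaultThreadsPerConnector(cpus);
    }

    // Smallest core count at which the invariant stops holding.
    static int firstFailingCoreCount(int maxThreads, int connectors) {
        if (connectors <= 0) {
            throw new IllegalArgumentException("need at least one connector");
        }
        int cpus = 1;
        while (invariantHolds(maxThreads, cpus, connectors)) {
            cpus++;
        }
        return cpus;
    }

    public static void main(String[] args) {
        // adminMaxThreads default (64) with 2 admin connectors
        System.out.println("admin: " + firstFailingCoreCount(64, 2));
        // maxThreads default (1024) with 2 application connectors
        System.out.println("application: " + firstFailingCoreCount(1024, 2));
    }
}
```

Under those assumptions the sketch yields 22 cores for the admin interface and 342 for the application interface, matching the figures above.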

There are potentially 3 parts to this bug:

  • Dropwizard's defaults should probably be chosen so that they work on all modern hardware.
  • Dropwizard should probably validate the configuration and fail to start if an invariant isn't satisfied, rather than silently hanging.
  • Perhaps the invariant should be documented.
  • Alternatively, the meaning of the parameters could be changed so that they are independent. Using maxThreads only for processing requests (and not for selector threads and acceptor threads) might be more intuitive.
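
As a rough illustration of the validation idea, a startup check could look something like the sketch below. Everything in it is hypothetical (the `ThreadPoolValidator` and `Connector` types do not exist in Dropwizard), and it only encodes the invariant as observed in this issue, not any confirmed behaviour of the underlying thread pool.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical fail-fast check; none of these types are Dropwizard APIs.
final class ThreadPoolValidator {

    // Minimal stand-in for the thread settings of one connector.
    static final class Connector {
        final int acceptorThreads;
        final int selectorThreads;

        Connector(int acceptorThreads, int selectorThreads) {
            this.acceptorThreads = acceptorThreads;
            this.selectorThreads = selectorThreads;
        }
    }

    // Throws if the pool has no threads left for handling requests once
    // every connector has reserved its acceptor and selector threads.
    static void validate(String poolName, int maxThreads, List<Connector> connectors) {
        int reserved = 0;
        for (Connector c : connectors) {
            reserved += c.acceptorThreads + c.selectorThreads;
        }
        if (maxThreads <= reserved) {
            throw new IllegalStateException(poolName + " (" + maxThreads
                    + ") must be greater than the " + reserved
                    + " acceptor and selector threads reserved by the connectors");
        }
    }

    public static void main(String[] args) {
        // Values from the reproduction config in this issue:
        // maxThreads 2, one connector with 1 acceptor and 1 selector thread.
        try {
            validate("maxThreads", 2, Arrays.asList(new Connector(1, 1)));
        } catch (IllegalStateException e) {
            System.out.println("refusing to start: " + e.getMessage());
        }
    }
}
```

Failing at startup with a message that names the offending parameter would turn the silent hang into an error the operator can act on.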

Tested with version v0.7.0.rc3.

How to reproduce: use the dropwizard-example application on a 24-core machine, or use the following server config on any machine:

server:
  minThreads: 2
  maxThreads: 2
  applicationConnectors:
    - type: http
      port: 8080
      acceptorThreads: 1
      selectorThreads: 1
  adminConnectors:
    - type: http
      port: 8081
      acceptorThreads: 1
      selectorThreads: 1

The app starts successfully, but requests to either the application or the admin interface hang.
