Add async server.start() function#4981
Conversation
54b9cbd to
cf11bcd
Compare
|
@abernix This is ready for review. I think it's basically code complete. I haven't updated docs fully yet though I've tagged a bunch of places in the docs and READMEs to look at. I do want to make sure that the various integrations are actually compatible with an async start (ie, that it's OK in something like Lambda to not assign the handler synchronously). If that doesn't work I may have to make some slight changes to those integrations and make the instructions a bit more complex. |
cf11bcd to
a0d42cf
Compare
|
I'm pretty sure that at least Lambda (and probably all of the "serverless" integrations) will not work with an API that requires you to do anything async before calling |
5e54115 to
43879ff
Compare
|
@abernix OK, I think I've resolved the serverless story. Serverless integrations now work pretty much like they used to: their constructors now call |
ec51e85 to
3ade55a
Compare
9922900 to
a33d2d5
Compare
|
|
||
| > The changes noted within this `vNEXT` section have not been released yet. New PRs and commits which introduce changes should include an entry in this `vNEXT` section as part of their development. With few exceptions, the format of the entry should follow convention (i.e., prefix with package name, use markdown `backtick formatting` for package names and code, suffix with a link to the change-set à la `[PR #YYY](https://link/pull/YYY)`, etc.). When a release is being prepared, a new header will be (manually) created below and the appropriate changes within that release will be moved into the new section. | ||
|
|
||
| - Improve startup error handling by ensuring that your server has loaded its schema and executed its `serverWillStart` handlers successfully before starting an HTTP server. If you're using the `apollo-server` package, no code changes are necessary. If you're using an integration such as `apollo-server-express` that is not a "serverless framework", you can should insert [`await server.start()`](https://www.apollographql.com/docs/apollo-server/api/apollo-server/#start) between `server = new ApolloServer()` and `server.applyMiddleware`. (If you don't call `server.start()` yourself, your server will still work, but the previous behavior of starting a web server that may fail to load its schema still applies.) The serverless framework integrations (Lambda, Azure Functions, and Cloud Functions) do not support this functionality. The protected method `willStart` has been removed; `start` or the new protected method `ensureStarting` should fulfill the same purpose if you were using it. [PR #4981](https://github.com/apollographql/apollo-server/pull/4981) |
There was a problem hiding this comment.
"you can should insert"
abernix
left a comment
There was a problem hiding this comment.
I left a few comments within. I have some resolvable concerns about the "readiness" (in terms of copy-pastability) of the textual-examples in the README.md files given that the async startApolloServer functions are not being invoked in the functions which now wrap that logic, but I can understand the approach that is taken since it would resolve a different limitation (e.g., top-level await). With top-level await being supported out of the box in Node.js 14, I suspect those examples can be changed back to a more simple usage in the not-too-distant future, presumably when Apollo Server 3.x makes the minimum supported Node.js version v14 (as I suspect it should/could!)
All in all, this is an important long-needed API and I'm glad this brings that. The documentation and some of the severity of the changes (e.g., the recommended change to the usage and the removal of the willStart method) coupled with the need (within this PR) to concern ourselves with backcompat does leave me to believe that this might have paired even better with Apollo Server 3.x but this certainly does solve problems which have manifested in one case or another for some users for some time now, so 👍 .
| app.listen({ port: 4000 }, () => | ||
| console.log(`🚀 Server ready at http://localhost:4000${server.graphqlPath}`) | ||
| ) | ||
| async function startApolloServer() { |
There was a problem hiding this comment.
This example could benefit from calling startApolloServer, otherwise it may be indiscernible to someone copying and pasting this example as to why the server hasn't started.
There was a problem hiding this comment.
I mean, the problem is... how are you supposed to call it? The whole point is I don't want to encourage people to call it in a context where errors will turn into unhandledRejection. So I could show it being called from top-level await... but then I might as well just revert back to not having a function. Or I could show it being called with .catch() but that seems like it would get distracting (and it's unclear that a particular bit of .catch() from our docs is actually the best way to integrate it into your system).
While Node does now support top-level await (as of v14.8.0), it requires you to be in a special "module" mode where your file ends with .mjs, or set a flag in package.json. And it's different in TypeScript... here you have to be in module module (which should be the case for any of these examples because of the use of import) but you also have to set the target and/or module tsconfig option appropriately...
It feels like this is a big bundle of worms, compared to "here's an async function and it's your job to know how to use it". Since this mostly affects "advanced" uses (I'm not editing any calls to apollo-server's listen()) I'm a little less worried?
I mean, the fact that our apollo-server examples tell you to call listen() with only .then and not .catch is another issue, and perhaps we should go through and change those all to use top-level await with explicit guidance to use .mjs files, but that's a project for another day...
| app.listen({ port: 4000 }, () => | ||
| console.log(`🚀 Server ready at http://localhost:4000${server.graphqlPath}`) | ||
| ); | ||
| async function startApolloServer() { |
There was a problem hiding this comment.
Same concerns as above about startApolloServer not being invoked.
| app.listen({ port: 4000 }, () => | ||
| console.log(`🚀 Server ready at http://localhost:4000${server.graphqlPath}`) | ||
| ); | ||
| async function startApolloServer() { |
There was a problem hiding this comment.
Same concerns as above about startApolloServer not being invoked within the examples. This would yield a non-functioning server right now and we have expressly (pun certainly intended for this express integration!) agreed in the past that examples should just work. Can we just invoke it?
| return await this._start(); | ||
| } | ||
|
|
||
| protected async _start(): Promise<void> { |
There was a problem hiding this comment.
Is it necessary for this to be protected? I didn't catch this being invoked outside of this particular file and I don't quite know if we're expecting someone to subclass it.
There was a problem hiding this comment.
It's called by apollo-server's listen().
I'll document that it's not guaranteed to be stable.
| }; | ||
| } | ||
|
|
||
| protected async willStart() { |
There was a problem hiding this comment.
I could very much believe that an unknown number of third-party Apollo Server integrations or any extensions to ApolloServerBase might be relying on this. Anyway to shim it?
There was a problem hiding this comment.
Yeah, the deal is that it's roughly equivalent to calling start(), but the places that use it should probably just call ensureStarting() instead. My first instinct today was to have it do a deprecation log, but fixing it is probably something you'll need to ask your integration to do, not something you can fix yourself, so the deprecation log might get annoying. I'm going to leave out the log and try to replicate the old kinda weird semantics where only some errors throw.
We can remove willStart in AS3 (#5050)
If you think I should add a deprecation warning we can do that though right now I'm feeling like just leaving it mostly working and removing it in AS3 is fine.
Previously, server startup worked like this:
- `new ApolloServer`
- If no gateway, calculate schema and schema derived data immediately
- If gateway, kick off gateway.load from the end of the constructor, and if it
async-throws, log an error once and make the server kinda broken forever
- At various spots in the framework integration code, call (but don't await)
the protected `willStart` function, which is an async function that first
waits for the gateway to load the schema if necessary and then runs
serverWillStart plugin functions; save the Promise returned by calling this.
- At request time in the framework integration code, await that Promise.
And also, if there's no schema, fail with an error.
Now server startup works like this:
- ApolloServer represents its state explicitly with a new ServerState
- `new ApolloServer`
- If no gateway, initialize all the schema-derived state directly like
before (though the state now lives inside ServerState)
- If gateway, the constructor DOES NOT KICK OFF `gateway.load()`
- You can now call `await server.start()` yourself, which will first await
`gateway.load` if necessary, and then await all serverWillStart calls.
- If you're using `apollo-server` rather than an integration, `server.listen()`
will just transparently do this for you; explicit `start()` is just for
integrations!
- The integration places that used to call willStart now call
`server.ensureStarting()` instead which will kick off server.start in the
background if you didn't (and log any errors thrown).
- The places that used to await promiseWillStart no longer do so; generally
right after that code we end up calling `graphqlServerOptions`
- `graphqlServerOptions` now awaits `server.ensureStarted` which will start the
server if necessary and throw if it threw.
The overall change to user experience:
- If you're using `apollo-server`, startup errors will cause `listen` to reject;
no code changes are necessary.
- If you're using an integration you are encouraged to call `await
server.start()` yourself immediately after the constructor, which will let
you detect startup errors.
- But if you don't do that, the server will call `start` itself eventually. When
you try to execute your first GraphQL request, `start` will happen if it
hasn't already. Also an integration call like `server.applyMiddleware` will
initiate a background `start`. If startup fails, the startup error will be
logged on *every* failed graphql request, not just the first time like
happened before.
- If you have your own ApolloServer subclass that calls the protected
`willStart` method, it won't work before that method is gone. Consider whether
you can eliminate that call by just calling `start`, or perhaps call
`ensureStarting` instead.
This is close enough to backwards-compatible to be appropriate for a v2 minor
release. We are likely to make `start()` required in Apollo Server 3 (other than
for `apollo-server`).
Also:
- Previously we used the deprecated `ApolloServer.schema` field to determine
whether to install ApolloServerPluginInlineTrace, which we want to have active
by default for federated schemas only. If you're using a gateway, this field
isn't actually set at the time that ensurePluginInstantiation reads it.
That's basically OK because we don't want to turn on the plugin automatically
in the gateway, but in the interest of avoiding use of the deprecated field, I
refactored it so that `ApolloServerPluginInlineTrace` is installed by default
(ie, if you don't install your own version or install
`ApolloServerPluginInlineTraceDisabled`) without checking the schema, and
then (if it's installed automatically) it decides whether or not to be active
by checking the schema at `serverWillStart` time.
- Similarly, schema reporting now throws in its `serverWillStart` if the schema
is federated, instead of in `ensurePluginInstantiation`. (This does mean that
if you're not using the new `start()` or `apollo-server`, that failure won't
make your app fail as fast as if the `ApolloServer` constructor threw.)
- Fix some fastify tests that used a fixed listen port to not do that.
- I am doing my best to never accidentally run `prettier` on whole files and
instead to very carefully select specific blocks of the file to format them
several times per minute. Apparently I screwed up once and ran it once on
`packages/apollo-server-core/src/ApolloServer.ts`. The ratio of "prettier
changes" to "actual changes" in that file is low enough that I'd rather just
leave the changes in this PR rather than spending time carefully reverting
them. (It's one of the files I work on the most and being able to keep it
prettier-clean will be helpful anyway.)
- Replace a hacky workaround for the lack of `start` in the op reg tests!
- Replace a use of a `Barrier` class I added recently in tests with the
`@josephg/resolvable` npm package, which does basically the same thing.
Use that package in new tests and in the core state machine itself.
- While running tests I found that some test files hung if run separately due to
lack of cleanup. I ended up refactoring the cache tests to:
- make who is responsible for calling cache.close more consistent
- make the Redis client mocks self-contained mocks of the ioredis API instead
of starting with an actual ioredis implementation and mocking out some
internals
- clean up Jest fake timers when a certain test is done
I'm not super certain exactly which of these changes fixed the hangs but it
does seem better this way. (Specifically I think the fake timer fix, which I
did last, is what actually fixed it, but the other changes made it easier for
me to reason about what was going on.) Can factor out into another PR if
helpful.
Fixes #4921. Fixes apollographql/federation#335.
TODO:
- [ ] Go through all docs and READMEs that have 'FIXME start' and add calls to
start. This involves verifying that you can actually do top-level await in
the contexts that matter. (eg if it turns out that you really can't call await
before you assign a handler in Lambda, that's interesting and may require some
other changes to this PR!)
- [ ] Actually document start() in the apollo-server reference
- [ ] Document start() in all the integrations references
- [ ] CHANGELOG
- [ ] consider whether removing the protected willStart function is OK
Don't add it for serverless because we don't do that as of today Need to investigate micro still (is it sync-only like serverless?) and write actual docs.
hooray for vercel/micro#335
Co-authored-by: Stephen Barlow <stephenbarlow@APOLLO-StephenBarlow.attlocal.net> Co-authored-by: Stephen Barlow <stephenbarlow@apollo-stephenbarlow.gateway.sonic.net>
1a29a7e to
15426d9
Compare
|
I'll merge this today but I'm happy to continue the conversation about sample code. My main instinct is I'd rather have sample code that requires users to be slightly creative than sample code that either will give weird error messages if they don't have top-level await enabled or that will encourage users to run async functions without handling possible rejections. Re the AS3 question, I agree that not thinking about back-compat could be simple, but to be honest I expect we will learn interesting things about the impact of this change when we release it post-alpha-phase, and I'd like to be able to carry those learnings over to AS3 instead of only releasing this change as part of AS3. |
|
I have an alpha out that resolves this issue. First, install the alpha in your app by installing v2.22.0-alpha.0 of whatever Apollo Server packages your app directly depends on. This might look something like If you're using the Otherwise, if you're using (say) If you're using a serverless package like Please provide feedback at #5051. More details at https://www.apollographql.com/docs/apollo-server/api/apollo-server/#start (though oops, the docs shouldn't be up until the API is in a non-alpha version, so if this doesn't get released pretty soon I might roll that back). |
|
This is released as Apollo Server v2.22.0! |
This is a regression in #4981. If the server start process is begun implicitly by the execution of an operation (ensureStarted inside graphQLServerOptions) rather than by an explicit start call (or the implicit start calls in `apollo-server` or the serverless integrations) and startup throws, the log-and-redact logic wasn't being invoked. Add some tests of this behavior and fix some TypeScript issues in the test file.
This is a regression in #4981. If the server start process is begun implicitly by the execution of an operation (ensureStarted inside graphQLServerOptions) and startup throws, the log-and-redact logic wasn't being invoked. Note that this case doesn't usually happen in practice, because: - If you're using `apollo-server`, startup is begun in `listen()` before you can serve requests - If you're using a serverless framework integration, startup is begun in the constructor - If you're using a non-serverless framework integration, the function you call to connect it to your framework begins startup with `ensureStarting()` So mostly this just affects the case that you're running `executeOperation` without calling `start()` or `listen()`, or maybe you have your own custom framework integration that doesn't call `ensureStarting()`. But it's still worth missing. Add some tests of this behavior and fix some TypeScript issues in the test file.
This is a regression in #4981. If the server start process is begun implicitly by the execution of an operation (ensureStarted inside graphQLServerOptions) and startup throws, the log-and-redact logic wasn't being invoked. Note that this case doesn't usually happen in practice, because: - If you're using `apollo-server`, startup is begun in `listen()` before you can serve requests - If you're using a serverless framework integration, startup is begun in the constructor - If you're using a non-serverless framework integration, the function you call to connect it to your framework begins startup with `ensureStarting()` So mostly this just affects the case that you're running `executeOperation` without calling `start()` or `listen()`, or maybe you have your own custom framework integration that doesn't call `ensureStarting()`. But it's still worth missing. Add some tests of this behavior and fix some TypeScript issues in the test file.
This is a regression in #4981. If the server start process is begun implicitly by the execution of an operation (ensureStarted inside graphQLServerOptions) and startup throws, the log-and-redact logic wasn't being invoked. Note that this case doesn't usually happen in practice, because: - If you're using `apollo-server`, startup is begun in `listen()` before you can serve requests - If you're using a serverless framework integration, startup is begun in the constructor - If you're using a non-serverless framework integration, the function you call to connect it to your framework begins startup with `ensureStarting()` So mostly this just affects the case that you're running `executeOperation` without calling `start()` or `listen()`, or maybe you have your own custom framework integration that doesn't call `ensureStarting()`. But it's still worth missing. Add some tests of this behavior and fix some TypeScript issues in the test file.
This is a regression in #4981. If the server start process is begun implicitly by the execution of an operation (ensureStarted inside graphQLServerOptions) and startup throws, the log-and-redact logic wasn't being invoked. Note that this case doesn't usually happen in practice, because: - If you're using `apollo-server`, startup is begun in `listen()` before you can serve requests - If you're using a serverless framework integration, startup is begun in the constructor - If you're using a non-serverless framework integration, the function you call to connect it to your framework begins startup with `ensureStarting()` So mostly this just affects the case that you're running `executeOperation` without calling `start()` or `listen()`, or maybe you have your own custom framework integration that doesn't call `ensureStarting()`. But it's still worth missing. Add some tests of this behavior and fix some TypeScript issues in the test file.
This is a regression in #4981. If the server start process is begun implicitly by the execution of an operation (ensureStarted inside graphQLServerOptions) and startup throws, the log-and-redact logic wasn't being invoked. Note that this case doesn't usually happen in practice, because: - If you're using `apollo-server`, startup is begun in `listen()` before you can serve requests - If you're using a serverless framework integration, startup is begun in the constructor - If you're using a non-serverless framework integration, the function you call to connect it to your framework begins startup with `ensureStarting()` So mostly this just affects the case that you're running `executeOperation` without calling `start()` or `listen()`, or maybe you have your own custom framework integration that doesn't call `ensureStarting()`. But it's still worth missing. Add some tests of this behavior and fix some TypeScript issues in the test file.
Previously, server startup worked like this:
new ApolloServerasync-throws, log an error once and make the server kinda broken forever
the protected
willStartfunction, which is an async function that firstwaits for the gateway to load the schema if necessary and then runs
serverWillStart plugin functions; save the Promise returned by calling this.
And also, if there's no schema, fail with an error.
Now server startup works like this:
new ApolloServerbefore (though the state now lives inside ServerState)
gateway.load()await server.start()yourself, which will first awaitgateway.loadif necessary, and then await all serverWillStart calls.apollo-serverrather than an integration,server.listen()will just transparently do this for you; explicit
start()is just forintegrations!
ApolloServerServerlessFrameworkBase)also call it automatically for you in the background because their startup has to be
synchronous; if it fails then future requests will all fail (and log) as before.
server.ensureStarting()instead which will kick off server.start in thebackground if you didn't (and log any errors thrown).
right after that code we end up calling
graphqlServerOptionsgraphqlServerOptionsnow awaitsserver.ensureStartedwhich will start theserver if necessary and throw if it threw.
The overall change to user experience:
apollo-server, startup errors will causelistento reject;no code changes are necessary.
except that the startup error will be logged on all requests instead of just
the first one.
await server.start()yourself immediately after the constructor, which will letyou detect startup errors.
startitself eventually. Whenyou try to execute your first GraphQL request,
startwill happen if ithasn't already. Also an integration call like
server.applyMiddlewarewillinitiate a background
start. If startup fails, the startup error will belogged on every failed graphql request, not just the first time like
happened before.
willStartmethod, it won't work because that method is gone. Consider whetheryou can eliminate that call by just calling
start, or perhaps callensureStartinginstead.This is close enough to backwards-compatible to be appropriate for a v2 minor
release. We are likely to make
start()required in Apollo Server 3 (other thanfor
apollo-server).Also:
ApolloServer.schemafield to determinewhether to install ApolloServerPluginInlineTrace, which we want to have active
by default for federated schemas only. If you're using a gateway, this field
isn't actually set at the time that ensurePluginInstantiation reads it.
That's basically OK because we don't want to turn on the plugin automatically
in the gateway, but in the interest of avoiding use of the deprecated field, I
refactored it so that
ApolloServerPluginInlineTraceis installed by default(ie, if you don't install your own version or install
ApolloServerPluginInlineTraceDisabled) without checking the schema, andthen (if it's installed automatically) it decides whether or not to be active
by checking the schema at
serverWillStarttime.serverWillStartif the schemais federated, instead of in
ensurePluginInstantiation. (This does mean thatif you're not using the new
start()orapollo-server, that failure won'tmake your app fail as fast as if the
ApolloServerconstructor threw.)prettieron whole files andinstead to very carefully select specific blocks of the file to format them
several times per minute. Apparently I screwed up once and ran it once on
packages/apollo-server-core/src/ApolloServer.ts. The ratio of "prettierchanges" to "actual changes" in that file is low enough that I'd rather just
leave the changes in this PR rather than spending time carefully reverting
them. (It's one of the files I work on the most and being able to keep it
prettier-clean will be helpful anyway.)
startin the op reg tests!Barrierclass I added recently in tests with the@josephg/resolvablenpm package, which does basically the same thing.Use that package in new tests and in the core state machine itself.
lack of cleanup. I ended up refactoring the cache tests to:
of starting with an actual ioredis implementation and mocking out some
internals
I'm not super certain exactly which of these changes fixed the hangs but it
does seem better this way. (Specifically I think the fake timer fix, which I
did last, is what actually fixed it, but the other changes made it easier for
me to reason about what was going on.) Can factor out into another PR if
helpful.
Fixes #4921. Fixes apollographql/federation#335.
TODO:
start.