fix(3127): Only run the qss log sync interval once the auth connection is properly up and running#3126

Merged
islathehut merged 8 commits into develop from fix/qss-log-pull-delay
Mar 31, 2026

Conversation

@islathehut
Collaborator

@islathehut islathehut commented Mar 10, 2026

Pull Request Checklist

  • I have linked this PR to a related GitHub issue.
  • I have added a description of the change (and GitHub issue number, if any) to the root CHANGELOG.md.

(Optional) Mobile checklist

Please ensure you completed the following checks if you did any changes to the mobile package:

  • I have run e2e tests for mobile
  • I have updated base screenshots for visual regression tests

@islathehut islathehut changed the title from "fix: Only run the qss log sync interval once the auth connection is properly up and running" to "fix(3127): Only run the qss log sync interval once the auth connection is properly up and running" Mar 10, 2026
@islathehut islathehut added this to Quiet Mar 11, 2026
@islathehut
Collaborator Author

The overview of why this works:

We were triggering the log pull interval in two places under normal circumstances:

  1. immediately after starting the connection if the connection was "active" and we are a member on the chain
  2. when the auth connection emits the connected event

The first trigger is what was causing issues. In this context the "active" status was based solely on whether the connection had been stopped or not. Pulling log entries was impossible at this point: both quiet and qss would reject the request because the connection wasn't fully established.

The second trigger was firing correctly, but since the interval already existed we ignored it and kept waiting for the timer to finish before starting a new attempt.
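
The two triggers above can be modeled with a small simulation. This is only a sketch under stated assumptions: the class, its methods, and the 2 second handshake time are hypothetical and do not come from the Quiet codebase; the 30 second interval matches the threshold mentioned later in this thread.

```typescript
// Hypothetical model of the pre-fix flow; none of these names come from the Quiet codebase.
type PullResult = 'ok' | 'rejected'

interface PullAttempt {
  atMs: number
  result: PullResult
}

class NaiveLogPullScheduler {
  readonly attempts: PullAttempt[] = []
  private nextRunAtMs: number | undefined = undefined
  private readonly intervalMs: number
  private readonly isAuthenticated: (nowMs: number) => boolean

  constructor(intervalMs: number, isAuthenticated: (nowMs: number) => boolean) {
    this.intervalMs = intervalMs
    this.isAuthenticated = isAuthenticated
  }

  // Both triggers call this. If the interval already exists, the call is ignored.
  start(nowMs: number): boolean {
    if (this.nextRunAtMs !== undefined) return false
    this.pull(nowMs)
    this.nextRunAtMs = nowMs + this.intervalMs
    return true
  }

  // Simulated clock: run every interval tick that is due by nowMs.
  advanceTo(nowMs: number): void {
    let next = this.nextRunAtMs
    while (next !== undefined && next <= nowMs) {
      this.pull(next)
      next += this.intervalMs
    }
    this.nextRunAtMs = next
  }

  private pull(nowMs: number): void {
    const result: PullResult = this.isAuthenticated(nowMs) ? 'ok' : 'rejected'
    this.attempts.push({ atMs: nowMs, result })
  }
}

function simulatePreFixFlow(): { secondTriggerAccepted: boolean; attempts: PullAttempt[] } {
  const handshakeDoneAtMs = 2000 // assumed time at which the identity challenge completes
  const scheduler = new NaiveLogPullScheduler(30000, nowMs => nowMs >= handshakeDoneAtMs)
  scheduler.start(0) // trigger 1: connection is merely "active"
  const secondTriggerAccepted = scheduler.start(handshakeDoneAtMs) // trigger 2: connected event
  scheduler.advanceTo(30000)
  return { secondTriggerAccepted, attempts: scheduler.attempts }
}
```

In this model the first attempt is rejected, the second trigger is dropped, and the first accepted pull only happens when the timer fires.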

The fix here is to track the LFA connection status in more detail so we know which stage of its lifecycle it's in. Actions like "start pulling logs from qss" are now gated on whether we have completed the identity challenge. The more vague "active" state of the connection (i.e. is the connection actively doing something) is only used for actions that don't rely on the LFA lifecycle, such as restarting/replacing stopped connections.
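
A minimal sketch of that status tracking, assuming three lifecycle stages. STARTING and CONNECTED are names that appear in this PR's review thread; STOPPED, the class name, and every method name are hypothetical, and a plain const object stands in for whatever enum the real code uses.

```typescript
// Illustrative only: STOPPED and all class/method names here are assumptions.
const AuthConnStatus = {
  STOPPED: 'stopped',
  STARTING: 'starting',
  CONNECTED: 'connected',
} as const
type AuthConnStatus = (typeof AuthConnStatus)[keyof typeof AuthConnStatus]

class SketchAuthConnection {
  private connStatus: AuthConnStatus = AuthConnStatus.STOPPED
  private readonly connectedListeners: Array<(teamId: string) => void> = []
  private readonly teamId: string

  constructor(teamId: string) {
    this.teamId = teamId
  }

  // Coarse "in use" check: good enough for deciding whether to restart or replace a connection.
  get active(): boolean {
    return this.connStatus === AuthConnStatus.STARTING || this.connStatus === AuthConnStatus.CONNECTED
  }

  // Fine-grained check: true only once the identity challenge has completed.
  get connected(): boolean {
    return this.connStatus === AuthConnStatus.CONNECTED
  }

  onConnected(listener: (teamId: string) => void): void {
    this.connectedListeners.push(listener)
  }

  start(): void {
    this.connStatus = AuthConnStatus.STARTING
  }

  // Stand-in for the point in the handshake where the identity challenge finishes.
  completeIdentityChallenge(): void {
    this.connStatus = AuthConnStatus.CONNECTED
    for (const listener of this.connectedListeners) listener(this.teamId)
  }

  stop(): void {
    this.connStatus = AuthConnStatus.STOPPED
  }
}

function demoGatedStart(): string[] {
  const events: string[] = []
  const conn = new SketchAuthConnection('team-1')
  // The log pull is only wired to the connected event, never to `active`.
  conn.onConnected(teamId => events.push(`start log pull for ${teamId}`))
  conn.start()
  events.push(`active=${conn.active} connected=${conn.connected}`)
  conn.completeIdentityChallenge()
  events.push(`active=${conn.active} connected=${conn.connected}`)
  return events
}
```

Here `active` becomes true as soon as the connection starts, but the log pull is requested only after the simulated identity challenge completes.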

Something odd to note here:

I originally migrated to using a new event, connectionSecured, that fired when we derived the encryption key the auth connection used, since that's guaranteed to happen after the identity challenge and I was seeing the connected event fire at two points in the LFA connection lifecycle. I migrated back because, while checking the logs to provide an accurate write-up here, I'm no longer seeing the duplicate events and I don't have the old logs to double check. I may have hallucinated it, but if I didn't and it is intermittent, we should keep it in mind if the delay shows up again or any other weird behavior happens when triggered by this event.

@islathehut islathehut removed this from Quiet Mar 11, 2026
@adrastaea adrastaea self-requested a review March 17, 2026 23:03
Collaborator

@adrastaea adrastaea left a comment

I like the improvements to the auth conn classes' status tracking, but I also think that we could have just done the simplification I describe in my QSSService comment and removed the need for them. It's fine if the first couple of sync requests fail as long as we start an interval.

I'm approving, but consider my style comment before merging please :)

   */
  public get active(): boolean {
-   return this._active
+   return [QSSAuthConnStatus.STARTING, QSSAuthConnStatus.ACTIVE].includes(this.connStatus)
Collaborator

The naming of this getter gets a little confusing when there's a status called QSSAuthConnStatus.ACTIVE. Intuitively it seems like this should only return true when the state is ACTIVE, not also when it is only STARTING.

Collaborator Author

I originally decided to rename this to running, then went back to active and instead renamed the ACTIVE status to CONNECTED, which is more accurate.


- authConnection?.on(QSSEvents.QSS_AUTH_CONNECTED, startLogPullInterval)
+ authConnection?.on(QSSEvents.QSS_AUTH_CONNECTED, (teamId: string) => {
+   this.startLogPullInterval(teamId)
Collaborator

It kind of seems like this was actually the issue: if you got the connected event from LFA but were not a member yet, we just wouldn't start an interval. So it would actually be circumstance 2 (referencing your comment) that was broken. Circumstance 1 would occur if you were an existing member or the creator, but 2 would be broken if you were joining for the first time.

Seems like we could just cut out any check on being a member or not, and just start the log pull interval as soon as we get the connected event. Basically, just remove lines 616:619 and 599:606. That would be functionally equivalent to these changes without any of the updated state tracking.
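
To make the suggestion concrete, here is a rough sketch comparing a membership-guarded handler with an unconditional one. All names are hypothetical stand-ins, not the actual QSSService code, and the guarded variant only reflects the reading of the code given in this comment.

```typescript
// Hypothetical handlers; isMember and start stand in for whatever the service actually uses.
type StartLogPull = (teamId: string) => void

// Guarded variant: skips the interval when membership is not yet known.
function makeGuardedHandler(isMember: () => boolean, start: StartLogPull): StartLogPull {
  return (teamId: string) => {
    if (isMember()) start(teamId)
  }
}

// Simplified variant suggested in this comment: always start on the connected event.
function makeUnconditionalHandler(start: StartLogPull): StartLogPull {
  return (teamId: string) => start(teamId)
}

function compareHandlers(): { guarded: string[]; unconditional: string[] } {
  const guarded: string[] = []
  const unconditional: string[] = []
  const notYetMember = () => false // a first-time joiner at the moment the event fires
  makeGuardedHandler(notYetMember, id => guarded.push(id))('team-1')
  makeUnconditionalHandler(id => unconditional.push(id))('team-1')
  return { guarded, unconditional }
}
```

With a non-member at event time, the guarded handler never starts the interval, while the unconditional one does and relies on later ticks succeeding.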

Collaborator Author

The issue was that we were optimistically starting the interval when the connection began, but at that point QSS hadn't confirmed that the user was actually a member of the community. The interval was reliably starting, but the first log pull didn't happen until it hit that 30 second threshold to rerun the interval logic. Within that time the connection became usable, since the auth connection was stable and QSS could confirm membership. This was happening with new users and with existing users.

Collaborator Author

This was primarily because the active property of the auth connection would be true as long as the connection was doing something; it was more of an "in-use" check. Since we called the interval both at the connected event and on creation if active was true, we would call the interval twice. The first time (when active was true) we had a working ws connection with QSS but no record of authentication yet, since the LFA handshake was in progress.

@islathehut islathehut merged commit 4110170 into develop Mar 31, 2026
45 of 48 checks passed
@adrastaea adrastaea deleted the fix/qss-log-pull-delay branch April 1, 2026 21:56
islathehut added a commit that referenced this pull request Apr 7, 2026
…n is properly up and running (#3126)

* Only run the qss log sync interval once the auth connection is properly up and running

* Update CHANGELOG.md

* Go back to using connected event because I'm losing my mind

* Update auth to latest

* Clarify naming convention

* Go back to active but rename event to connected
islathehut added a commit that referenced this pull request Apr 15, 2026
* update tor

* Publish

 - @quiet/desktop@7.0.1-alpha.0
 - @quiet/mobile@7.0.1-alpha.0

* Update packages CHANGELOG.md

* Publish

 - @quiet/desktop@7.0.1-alpha.1
 - @quiet/mobile@7.0.1-alpha.1

* Update packages CHANGELOG.md

* Publish

 - @quiet/desktop@7.0.1-alpha.2
 - @quiet/mobile@7.0.1-alpha.2

* Update packages CHANGELOG.md

* Publish

 - @quiet/desktop@7.0.1-alpha.3
 - @quiet/mobile@7.0.1-alpha.3

* Update packages CHANGELOG.md

* fix hidden title bar on linux

* fix: Wait for tor kill to finish before moving on (#3134)

* Wait for tor kill to finish before moving on

* Update CHANGELOG.md

* Closes #3123 Removes Superflous @peculiar/webcrypto dependency (#3129)

* Closes #3123 Removes Superflous @peculiar/webcrypto dependency. Removes hacky JavaScript that was needed to override global.crypto

- this was probably needed for along time because iOS was using nodejs12 until some recent work, hence
having no global.crypto flag.
- iOS and Android need the --experimental-global-webcrypto flag until nodejs mobile is rebuilt

* fix soon to be broken github actions that need the node24 runner (#3130)

* iOS Push Notification Support (#3125)

* Add Firebase support and Network Service Extension for push notifications

* fix extension plist build issues

* add encrypted Firebase plist

* add decryption code for GoogleService-Info.plist

* integrate with Taea's communication module

* move personal changes from .xcode.env to .xcode.env.local

* compatibility fixes

* allow easier overriding of development team and bundle id

* fix more xcodeproj woes, using a fork that works for now

* fix node version build issues for .xcode.env.local

---------

Co-authored-by: taea <taelxvie@gmail.com>

* Feat/3086 client push service (#3114)

* request permissions when ios app opens, detect changes to permission on app open

* register when permission is granted and rely solely on event channels for payloads

* simplify sagas, ensure event channels are set up first

* update changelog

* fix mobile unit tests

* pass token to backend and scaffold registration

* cache token until we are part of a community and connected to qss

* update changelog

* fix typo in changelog

* remove caching of device token in redux because it is not needed

* adjust mocks

* try to get around ci weirdness

* fix mobile tests

* add qps consts to test module

* fix state-manager test

* implement storage

* add todos for future work

* remove http related deprecated code

* fix merge conflict

* add qps consts to test module

* adjust mocks

* try to get around ci weirdness

* fix mobile tests

* fix state-manager test

* remove http related deprecated code

* fix: Backend fails to start on GrapheneOS in 6.5.1 (#3106)

* Standardize build config fields to boolean

* Remove call to `free` because it breaks the backend on graphene

* feat(3058): Self-assign member role on joining with QSS and migrate to LFA-based OrbitDB identity (#3102)

* Add lockbox service and create an invite lockbox on invite creation

* Update changelog

* Self-assign member role on join with QSS

* Fix self-assign

* Use event to trigger storage setup after self-assign

* Pull log entries when fully joined via qss (update later to handle joining with peers)

* Move identitieswithstorage

* Get janky LFA identity working with orbitdb and get syncing on join with qss working

* Also pull entries on connection to qss when already a member

* Remove debugging log

* Add comments

* Add comments and return random signature

* Update qss and auth modules to use feature branches for testing

* Update qss e2e test to include joining without peers

* Add self-assign unit tests and fix some unit tests post-LFA identity

* Fix userProfile integration tests

* Update submodules

* Remove changes left in from testing

* Fix last of integration tests and add initializing check to storage init since we can start initialization via qss or libp2p events

* Forgot a comment

* Update CHANGELOG.md

* Update CHANGELOG.md

* Allow strings

* Missed one unit test update

* PR comment fixes

* Add real signatures back to log entries

* Update lfa-identity.service.ts

* release: 6.6.0 (#3109)

* Publish

 - @quiet/desktop@6.6.0-alpha.0
 - @quiet/mobile@6.6.0-alpha.0

* Update packages CHANGELOG.md

* fix: Backend fails to start on GrapheneOS in 6.5.1 (#3106)

* Standardize build config fields to boolean

* Remove call to `free` because it breaks the backend on graphene

* Publish

 - @quiet/desktop@6.6.0-alpha.1
 - @quiet/mobile@6.6.0-alpha.1

* Update packages CHANGELOG.md

* Publish

 - @quiet/desktop@6.6.0
 - @quiet/mobile@6.6.0

* Update packages CHANGELOG.md

* basic implementation

* implement storage

* add todos for future work

* return state

* update test

* skip test for device linking

* skip test for device linking

* test fix

* match send-push name to qps-send-push

* add an individual push option, rename batch push

* match server message format for batch

* increase batch size to 500

* fixed tests

* update changelog

* fix registration flow

* revert interval use in registration

* fix registration

* formatting

* reinforce endpoint fuzzy match with env flag

---------

Co-authored-by: Isla <5048549+islathehut@users.noreply.github.com>

* rm order expectation from test (#3139)

* fix: Allow more fuzziness in team link timestamp validations and use updated logging (#3137)

* Pass logger into team/connection and update auth to use new logging

* Update auth

* Use main auth branch

* Publish

 - @quiet/desktop@7.0.1-alpha.4
 - @quiet/mobile@7.0.1-alpha.4

* Update packages CHANGELOG.md

* mess around with the deployment options

* Publish

 - @quiet/desktop@7.0.1-alpha.5
 - @quiet/mobile@7.0.1-alpha.5

* Update packages CHANGELOG.md

* try macos 15 and xcode 16

* Publish

 - @quiet/desktop@7.0.1-alpha.6
 - @quiet/mobile@7.0.1-alpha.6

* Update packages CHANGELOG.md

* Revert "try macos 15 and xcode 16"

This reverts commit 5af7de9.

* change xcode project compatibility to 16 rather than 12

* Publish

 - @quiet/desktop@7.0.1-alpha.7
 - @quiet/mobile@7.0.1-alpha.7

* Update packages CHANGELOG.md

* add IOS_FIREBASE_KEY secret to env

* Publish

 - @quiet/desktop@7.0.1-alpha.8
 - @quiet/mobile@7.0.1-alpha.8

* Update packages CHANGELOG.md

* remove unused app groups section from entitlements

* add new mobile provisioning profile for network service extension

* decrypt mobile provisioning profiles to the right spot

* Publish

 - @quiet/desktop@7.0.1-alpha.9
 - @quiet/mobile@7.0.1-alpha.9

* Update packages CHANGELOG.md

* Fix the provisioning profile specifier

* Publish

 - @quiet/desktop@7.0.1-alpha.10
 - @quiet/mobile@7.0.1-alpha.10

* Update packages CHANGELOG.md

* add missing provisioning profile mapping

* Publish

 - @quiet/desktop@7.0.1-alpha.11
 - @quiet/mobile@7.0.1-alpha.11

* Update packages CHANGELOG.md

* fix(3146): Fix slow electron startup (#3147)

* Load splash before main view

* Add extra logging to uncover issue

* Try apple silicon build

* Update desktop-build.yml

* Update desktop-build.yml

* Make logs traces

* Add prod arm64 deploy job and run e2e tests on latest macos

* Update main.ts

* Fix issue in unit tests and use latest intel mac runner

* Update CHANGELOG.md

* Update .github/workflows/desktop-build.yml

Co-authored-by: Jake McGinty <me@jakebot.org>

---------

Co-authored-by: Jake McGinty <me@jakebot.org>

* Update notification for arm mac builds (#3151)

* Publish

 - @quiet/desktop@7.0.1-alpha.12
 - @quiet/mobile@7.0.1-alpha.12

* Update packages CHANGELOG.md

* fix(3180): Add mac entitlement to fix arm64 builds (#3181)

* Add entitlement to fix arm64 binaries

* Update changelog

* Publish

 - @quiet/desktop@7.0.1-alpha.13
 - @quiet/mobile@7.0.1-alpha.13

* Update packages CHANGELOG.md

* fix(3140): Validate qss endpoint when qss is allowed to avoid registration loops (#3141)

* Validate qss endpoint when qss is allowed to avoid registration loops

* Update CHANGELOG.md

* Update desktop tests

* Update CreateCommunity.test.tsx

* fix(3127): Only run the qss log sync interval once the auth connection is properly up and running (#3126)

* Only run the qss log sync interval once the auth connection is properly up and running

* Update CHANGELOG.md

* Go back to using connected event because I'm losing my mind

* Update auth to latest

* Clarify naming convention

* Go back to active but rename event to connected

* fix hidden title bar on linux (#3153)

Co-authored-by: taea <taelxvie@gmail.com>

* Publish

 - @quiet/desktop@7.0.1-alpha.14
 - @quiet/mobile@7.0.1-alpha.14

* Update packages CHANGELOG.md

* Publish

 - @quiet/desktop@7.0.1
 - @quiet/mobile@7.0.1

* Update packages CHANGELOG.md

* Move changelog entries to the correct versions

* Update back compat version to 7.0

* Log error on before-quit

* Bump webdriver

* Bump back compat version to 7.0.1

---------

Co-authored-by: Jake McGinty <me@jakebot.org>
Co-authored-by: taea <taelxvie@gmail.com>
Co-authored-by: bitmold <dsnake@protonmail.com>
Co-authored-by: Taea <88346289+adrastaea@users.noreply.github.com>
Development

Successfully merging this pull request may close these issues.

Fix: Delay between joining/signing in to QSS and syncing historical log entries

2 participants