
checker: avoid unnecessary remove disconnected peer with multi orphan peers#7315

Merged
ti-chi-bot[bot] merged 9 commits into tikv:master from lhy1024:fix-check3
Nov 6, 2023
Conversation

@lhy1024
Contributor

@lhy1024 lhy1024 commented Nov 3, 2023

What problem does this PR solve?

Issue Number: Close #7249

When a region has multiple orphan peers, disconnected peers are not considered healthy.
When the checker decides to remove a peer, it picks a disconnected peer first.

What is changed and how does it work?

Check List

Tests

  • Unit test
  • Integration test
    image

Release note

None.

@ti-chi-bot
Contributor

ti-chi-bot bot commented Nov 3, 2023

[REVIEW NOTIFICATION]

This pull request has been approved by:

  • nolouch
  • rleungx

To complete the pull request process, please ask the reviewers in the list to review by filling /cc @reviewer in the comment.
After your PR has acquired the required number of LGTMs, you can assign this pull request to the committer in the list by filling /assign @committer in the comment to help you merge this pull request.

The full list of commands accepted by this bot can be found here.

Details

Reviewer can indicate their review by submitting an approval review.
Reviewer can cancel approval by submitting a request changes review.

@ti-chi-bot ti-chi-bot bot added the release-note-none Denotes a PR that doesn't merit a release note. label Nov 3, 2023
@ti-chi-bot ti-chi-bot bot requested review from nolouch and rleungx November 3, 2023 03:43
@ti-chi-bot ti-chi-bot bot added the size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. label Nov 3, 2023
@ti-chi-bot ti-chi-bot bot added the status/LGT1 Indicates that a PR has LGTM 1. label Nov 3, 2023
Signed-off-by: lhy1024 <admin@liudos.us>
@codecov

codecov bot commented Nov 3, 2023

Codecov Report

Merging #7315 (f12d0b4) into master (ab8bf7b) will decrease coverage by 0.03%.
The diff coverage is 84.61%.

@@            Coverage Diff             @@
##           master    #7315      +/-   ##
==========================================
- Coverage   74.49%   74.46%   -0.03%     
==========================================
  Files         446      446              
  Lines       48346    48352       +6     
==========================================
- Hits        36016    36007       -9     
- Misses       9160     9165       +5     
- Partials     3170     3180      +10     
Flag Coverage Δ
unittests 74.46% <84.61%> (-0.03%) ⬇️


@ti-chi-bot ti-chi-bot bot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. labels Nov 3, 2023
  return operator.CreateDemoteLearnerOperatorAndRemovePeer("replace-down-peer-with-orphan-peer", c.cluster, region, orphanPeer, pinDownPeer)
- case orphanPeerRole == metapb.PeerRole_Voter && destRole == metapb.PeerRole_Voter &&
-     isDisconnectedPeer(pinDownPeer) && !dstStore.IsDisconnected():
+ case orphanPeerRole == destRole && isDisconnectedPeer(pinDownPeer) && !dstStore.IsDisconnected():
Contributor Author

Allow replacing a learner as well.
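
To make the effect of the relaxed condition concrete, here is a minimal sketch. The `PeerRole` type and the two predicate functions are hypothetical stand-ins rather than PD's own types; only the shape of the two `case` conditions quoted above is taken from the change.

```go
package main

import "fmt"

// PeerRole is a hypothetical stand-in for metapb.PeerRole.
type PeerRole int

const (
	Voter PeerRole = iota
	Learner
)

// oldCanReplace mirrors the earlier condition: both roles must be Voter.
func oldCanReplace(orphanRole, destRole PeerRole, pinDownDisconnected, dstStoreDisconnected bool) bool {
	return orphanRole == Voter && destRole == Voter &&
		pinDownDisconnected && !dstStoreDisconnected
}

// newCanReplace mirrors the relaxed condition: the two roles only need to match.
func newCanReplace(orphanRole, destRole PeerRole, pinDownDisconnected, dstStoreDisconnected bool) bool {
	return orphanRole == destRole && pinDownDisconnected && !dstStoreDisconnected
}

func main() {
	// A learner orphan peer replacing a disconnected learner peer.
	fmt.Println(oldCanReplace(Learner, Learner, true, false)) // false
	fmt.Println(newCanReplace(Learner, Learner, true, false)) // true
}
```

Voter-for-voter replacement behaves the same under both predicates; the only new case admitted is a learner replacing a learner.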

if hasHealthPeer {
// there already exists a healthy orphan peer, so we can remove other orphan Peers.
ruleCheckerRemoveOrphanPeerCounter.Inc()
// if there exists a disconnected orphan peer, we will pick it to remove firstly.
Contributor Author

Avoid removing a normal peer.

Member

Consider that we have 3 orphan peers, two healthy and one disconnected. Is it possible that we remove a healthy peer first?

Member

Should we always remove disconnected peer first?

Contributor Author

@lhy1024 lhy1024 Nov 6, 2023

Should we always remove disconnected peer first?

I think so.
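
The scenario raised in this thread (three orphan peers, two healthy and one disconnected) can be walked through with a small sketch. This is not PD's implementation: the `orphanPeer` type, its fields, and `pickOrphanToRemove` are hypothetical, and the sketch assumes that a disconnected peer is never counted as healthy and is preferred for removal whenever a healthy orphan peer exists.

```go
package main

import "fmt"

// orphanPeer is a hypothetical, simplified view of an orphan peer.
type orphanPeer struct {
	StoreID      uint64
	Unhealthy    bool // assumed to mean down or pending
	Disconnected bool // assumed to mean its store has lost contact
}

// pickOrphanToRemove returns the store ID of the orphan peer to remove,
// or false when no removal is considered safe in this sketch.
func pickOrphanToRemove(orphans []orphanPeer) (uint64, bool) {
	if len(orphans) < 2 {
		return 0, false // only act when there are multiple orphan peers
	}
	var healthy []orphanPeer
	var disconnected *orphanPeer
	for i := range orphans {
		p := &orphans[i]
		switch {
		case p.Unhealthy:
			return p.StoreID, true // unhealthy orphan peers go first
		case p.Disconnected:
			if disconnected == nil {
				disconnected = p
			}
		default:
			healthy = append(healthy, *p)
		}
	}
	if len(healthy) == 0 {
		return 0, false // keep everything until a healthy orphan peer exists
	}
	if disconnected != nil {
		return disconnected.StoreID, true // prefer the disconnected peer
	}
	if len(healthy) >= 2 {
		return healthy[len(healthy)-1].StoreID, true
	}
	return 0, false
}

func main() {
	orphans := []orphanPeer{
		{StoreID: 1},
		{StoreID: 2},
		{StoreID: 3, Disconnected: true},
	}
	id, ok := pickOrphanToRemove(orphans)
	fmt.Println(id, ok) // 3 true
}
```

Under these assumptions the disconnected peer on store 3 is chosen before either healthy peer, which matches the answer given in the thread.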

@lhy1024 lhy1024 requested a review from nolouch November 3, 2023 11:46
Signed-off-by: lhy1024 <admin@liudos.us>
Signed-off-by: lhy1024 <admin@liudos.us>
@lhy1024
Contributor Author

lhy1024 commented Nov 6, 2023

@rleungx PTAL

Signed-off-by: lhy1024 <admin@liudos.us>
Signed-off-by: lhy1024 <admin@liudos.us>
@ti-chi-bot ti-chi-bot bot added status/LGT2 Indicates that a PR has LGTM 2. and removed status/LGT1 Indicates that a PR has LGTM 1. labels Nov 6, 2023
Signed-off-by: lhy1024 <admin@liudos.us>
Signed-off-by: lhy1024 <admin@liudos.us>
Signed-off-by: lhy1024 <admin@liudos.us>
@lhy1024
Contributor Author

lhy1024 commented Nov 6, 2023

image

The test ran successfully.

image
No failures in TiDB.

image

All orphan peers are removed with the down store time configured to 11 min.

@lhy1024
Contributor Author

lhy1024 commented Nov 6, 2023

/merge

@ti-chi-bot
Contributor

ti-chi-bot bot commented Nov 6, 2023

@lhy1024: It seems you want to merge this PR, I will help you trigger all the tests:

/run-all-tests

You only need to trigger /merge once, and if the CI test fails, you just re-trigger the test that failed and the bot will merge the PR for you after the CI passes.

If you have any questions about the PR merge process, please refer to pr process.


@ti-chi-bot
Contributor

ti-chi-bot bot commented Nov 6, 2023

This pull request has been accepted and is ready to merge.

Commit hash: f12d0b4

@ti-chi-bot ti-chi-bot bot added the status/can-merge Indicates a PR has been approved by a committer. label Nov 6, 2023
@ti-chi-bot ti-chi-bot bot merged commit c332ddc into tikv:master Nov 6, 2023
lhy1024 added a commit to lhy1024/pd that referenced this pull request Nov 6, 2023
… peers (tikv#7315)

close tikv#7249

Signed-off-by: lhy1024 <admin@liudos.us>
lhy1024 added a commit to lhy1024/pd that referenced this pull request Nov 8, 2023
… peers (tikv#7315)

close tikv#7249

Signed-off-by: lhy1024 <admin@liudos.us>

Labels

release-note-none Denotes a PR that doesn't merit a release note. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. status/can-merge Indicates a PR has been approved by a committer. status/LGT2 Indicates that a PR has LGTM 2.

Development

Successfully merging this pull request may close these issues.

checker: reduces the probability of deleting normal peers when the store becomes unavailable

3 participants