Commit 585d464

Improve issue dedupe detect workflow reliability (#6115)
Signed-off-by: Heng Qian <qianheng@amazon.com>
1 parent ef052ab commit 585d464

1 file changed, +76 -29 lines changed

.github/workflows/issue-dedupe-detect.yml

Lines changed: 76 additions & 29 deletions
@@ -21,6 +21,7 @@ on:
         default: ''
         description: 'Issue number to check (defaults to github.event.issue.number)'
 
+
 jobs:
   detect:
     runs-on: ubuntu-latest
@@ -42,9 +43,8 @@ jobs:
           aws-region: us-east-1
 
       - name: Run Claude Code for duplicate detection
+        id: claude
         uses: anthropics/claude-code-action@2ff1acb3ee319fa302837dad6e17c2f36c0d98ea # v1
-        env:
-          DUPLICATE_GRACE_DAYS: ${{ inputs.grace_days }}
         with:
           use_bedrock: 'true'
           github_token: ${{ secrets.GITHUB_TOKEN }}
@@ -53,16 +53,14 @@ jobs:
             --model ${{ inputs.claude_model }}
             --allowedTools
             "Bash(gh issue view *)"
-            "Bash(gh issue comment *)"
-            "Bash(gh issue edit *)"
             "Bash(gh search issues *)"
-            "Bash(gh label create *)"
+            --json-schema '{"type":"object","properties":{"duplicates":{"type":"array","items":{"type":"integer"}},"skip_reason":{"type":"string"}},"required":["duplicates"]}'
           prompt: |
             Find up to 3 likely duplicate issues for issue #${{ env.ISSUE_NUMBER }} in ${{ github.repository }}.
 
             Follow these steps precisely:
 
-            1. Use `gh issue view ${{ env.ISSUE_NUMBER }} --repo ${{ github.repository }} --comments` to read the issue and its comments. If the issue is closed, or is broad product feedback without a specific bug/feature request, or already has a duplicate detection comment (containing `<!-- duplicate-detection -->`), stop and report why you are not proceeding.
+            1. Use `gh issue view ${{ env.ISSUE_NUMBER }} --repo ${{ github.repository }} --comments` to read the issue and its comments. If the issue is closed, or is broad product feedback without a specific bug/feature request, or already has a duplicate detection comment (containing `<!-- duplicate-detection -->`), return an empty duplicates array with skip_reason explaining why.
 
             2. Summarize the issue's core problem in 2-3 sentences. Identify the key terms, error messages, and affected components.
 
@@ -74,36 +72,85 @@ jobs:
 
             4. For each candidate issue that looks like a potential match, read it with `gh issue view <number> --repo ${{ github.repository }}` to verify it is truly about the same problem. Filter out false positives — issues that merely share keywords but describe different problems.
 
-            5. If you find 1-3 genuine duplicates, you MUST run the following commands exactly as shown. Do NOT paraphrase or reformat the comment body — the `<!-- duplicate-detection -->` marker is required for automated processing.
-
-               a. For each duplicate, get its title:
-                  `gh issue view <dup_number> --repo ${{ github.repository }} --json title -q .title`
+            5. Return the confirmed duplicate issue numbers. If no duplicates found, return an empty array.
 
-               b. Ensure the duplicate label exists:
-                  `gh label create "duplicate" --description "Issue is a duplicate of an existing issue" --color "cccccc" --repo ${{ github.repository }} 2>/dev/null || true`
-
-               c. Add the label:
-                  `gh issue edit ${{ env.ISSUE_NUMBER }} --repo ${{ github.repository }} --add-label "duplicate"`
+            Important notes:
+            - Only flag issues as duplicates when you are confident they describe the **same underlying problem**
+            - Prefer open issues as duplicates, but closed issues can be referenced too
+            - Do not flag the issue as a duplicate of itself
 
-               d. Post the comment. You MUST use this EXACT command with the EXACT format below (only replace N with the count and <dup_list> with lines like `- #123 — Title`):
-               ```
-               gh issue comment ${{ env.ISSUE_NUMBER }} --repo ${{ github.repository }} --body "<!-- duplicate-detection -->
+      # The --json-schema flag in claude_args makes Claude return validated JSON
+      # via steps.claude.outputs.structured_output (instead of free-form text in execution_file).
+      # Schema: { "duplicates": [int], "skip_reason"?: string }
+      #   - duplicates: issue numbers confirmed as duplicates (empty if none or skipped)
+      #   - skip_reason: optional, explains why detection was skipped (for debugging)
+      - name: Process structured output
+        id: process
+        shell: bash
+        env:
+          STRUCTURED_OUTPUT: ${{ steps.claude.outputs.structured_output }}
+        run: |
+          echo "Structured output: $STRUCTURED_OUTPUT"
+
+          # No output means Claude failed to return structured data
+          if [ -z "$STRUCTURED_OUTPUT" ]; then
+            echo "No structured output"
+            echo "has_duplicates=false" >> "$GITHUB_OUTPUT"
+            exit 0
+          fi
+
+          # skip_reason is set when Claude decides not to proceed
+          # (e.g., issue is closed, already processed, or broad feedback).
+          # Log it for debugging — takes priority over any duplicates.
+          SKIP_REASON=$(echo "$STRUCTURED_OUTPUT" | jq -r '.skip_reason // empty')
+          if [ -n "$SKIP_REASON" ]; then
+            echo "Skipped: $SKIP_REASON"
+            echo "has_duplicates=false" >> "$GITHUB_OUTPUT"
+            exit 0
+          fi
+
+          COUNT=$(echo "$STRUCTURED_OUTPUT" | jq '.duplicates | length')
+          if [ "$COUNT" -eq 0 ]; then
+            echo "No duplicates found"
+            echo "has_duplicates=false" >> "$GITHUB_OUTPUT"
+            exit 0
+          fi
+
+          echo "has_duplicates=true" >> "$GITHUB_OUTPUT"
+          echo "count=$COUNT" >> "$GITHUB_OUTPUT"
+
+          # Format as "- #123" lines — GitHub auto-renders issue links with titles
+          DUP_LIST=$(echo "$STRUCTURED_OUTPUT" | jq -r '.duplicates[] | "- #\(.)"')
+          {
+            echo "duplicate_list<<EOF"
+            echo "$DUP_LIST"
+            echo "EOF"
+          } >> "$GITHUB_OUTPUT"
+
+      - name: Add duplicate label
+        if: steps.process.outputs.has_duplicates == 'true'
+        env:
+          GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
+        run: |
+          gh label create "duplicate" --description "Issue is a duplicate of an existing issue" --color "cccccc" --repo "${{ github.repository }}" 2>/dev/null || true
+          gh issue edit "${{ env.ISSUE_NUMBER }}" --repo "${{ github.repository }}" --add-label "duplicate"
+          echo "Added duplicate label to issue #${{ env.ISSUE_NUMBER }}"
+
+      - name: Post duplicate detection comment
+        if: steps.process.outputs.has_duplicates == 'true'
+        uses: peter-evans/create-or-update-comment@e8674b075228eee787fea43ef493e45ece1004c9 # v5
+        with:
+          issue-number: ${{ env.ISSUE_NUMBER }}
+          body: |
+            <!-- duplicate-detection -->
             ### Possible Duplicate
 
-            Found **N** possible duplicate issue(s):
+            Found **${{ steps.process.outputs.count }}** possible duplicate issue(s):
 
-            <dup_list>
+            ${{ steps.process.outputs.duplicate_list }}
 
             If this is **not** a duplicate:
             - Add a comment on this issue, or
             - 👎 this comment to prevent auto-closure
 
-            Otherwise, this issue will be **automatically closed in ${DUPLICATE_GRACE_DAYS:-7} days**."
-               ```
-
-            6. If no genuine duplicates are found, report that no duplicates were detected and take no further action.
-
-            Important notes:
-            - Only flag issues as duplicates when you are confident they describe the **same underlying problem**
-            - Prefer open issues as duplicates, but closed issues can be referenced too
-            - Do not flag the issue as a duplicate of itself
+            Otherwise, this issue will be **automatically closed in ${{ inputs.grace_days }} days**.
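
The decision order in the `Process structured output` step (empty output, then `skip_reason`, then the duplicate count) can be tried locally with a small sketch like the one below. It assumes `jq` is installed; `classify_output` and the sample JSON payloads are hypothetical and are not part of the workflow, which writes to `$GITHUB_OUTPUT` rather than printing a label.

```shell
#!/usr/bin/env bash
# Sketch only: mirrors the branch order of the "Process structured output" step.
# classify_output is a hypothetical helper; the real step writes to $GITHUB_OUTPUT.
classify_output() {
  local output="$1"

  # Empty input: no structured data was returned at all.
  if [ -z "$output" ]; then
    echo "none"
    return 0
  fi

  # A non-empty skip_reason takes priority over any duplicates.
  local skip_reason
  skip_reason=$(echo "$output" | jq -r '.skip_reason // empty')
  if [ -n "$skip_reason" ]; then
    echo "skipped: $skip_reason"
    return 0
  fi

  # An empty duplicates array means nothing to report.
  local count
  count=$(echo "$output" | jq '.duplicates | length')
  if [ "$count" -eq 0 ]; then
    echo "no-duplicates"
    return 0
  fi

  # One "- #N" line per confirmed duplicate, as in the posted comment.
  echo "$output" | jq -r '.duplicates[] | "- #\(.)"'
}

# Hypothetical sample payloads matching the schema passed to --json-schema.
classify_output ''
classify_output '{"duplicates":[],"skip_reason":"issue is closed"}'
classify_output '{"duplicates":[]}'
classify_output '{"duplicates":[123,456]}'
```

Only the last call reaches the list-formatting branch; the other three correspond to the cases where the workflow sets `has_duplicates=false` and skips the label and comment steps.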
