diff --git a/.gitignore b/.gitignore
index 6825061db1..573d5c5e10 100644
--- a/.gitignore
+++ b/.gitignore
@@ -65,7 +65,6 @@ CLAUDE.md
 node_modules/
 package-lock.json
 package.json
-AGENTS.md
 
 microbench/build/
 microbench/output/
diff --git a/AGENTS.md b/AGENTS.md
new file mode 100644
index 0000000000..0b9c48f8f5
--- /dev/null
+++ b/AGENTS.md
@@ -0,0 +1,103 @@
+# Repository Guidelines
+
+This file defines the collaboration rules and workflow entry points for this repository.
+
+- For complex tasks, follow the process first instead of jumping straight into code changes.
+- For non-trivial behavioral changes, explain the background, assumptions, risks, and validation plan.
+- If documents conflict with each other or with the code, treat the source code as the ground truth, then consult the architecture and process docs.
+
+## Planning
+
+If a task involves complex feature development, long-running debugging, performance or behavior alignment, larger refactors, or analysis that spans multiple turns, first create or update an ExecPlan according to [PLANS.md](PLANS.md) before continuing.
+
+Typical cases that should use an ExecPlan include:
+
+- gem5 / RTL behavior alignment
+- Frontend / BPU / FTQ / redirect / flush investigations
+- Performance regression analysis
+- Refactors that cross multiple modules
+- New features that need to be landed in stages
+
+## Repository Map
+
+Start with these directories:
+
+- `src/`: core source code (C++ / Python), especially `arch/riscv/`, `cpu/o3/`, and `cpu/pred/`
+- `configs/`: runtime configurations, especially `configs/example/kmhv3.py`
+- `tests/`: test entry points
+- `util/`: helper scripts and tools
+- `docs/`: documentation, including architecture and execution plans
+
+For a higher-level map of the codebase, see [ARCHITECTURE.md](ARCHITECTURE.md).
+
+## Environment Assumptions
+
+This repository is primarily developed on shared Linux servers.
+
+- Full-system, checkpoint, and difftest-related tasks normally rely on `GCBV_REF_SO`; assume it is available unless a check shows otherwise.
+- The default CI-style reference path is:
+  `GCBV_REF_SO=/nfs/home/share/gem5_ci/ref/normal/riscv64-nemu-interpreter-so`
+- The default explicit setting for `GCB_RESTORER` is:
+  `GCB_RESTORER=""`
+- Whether `GCB_RESTORER` and `AM_HOME` are needed depends on the task:
+  - restore-related workflows may require checking `GCB_RESTORER`
+  - some frontend micro-tests and bare-metal test flows may require checking `AM_HOME`
+- Before running environment-dependent tasks, check the relevant variables instead of assuming local defaults are correct.
+
+## Build, Run, and Test Entry Points
+
+Common entry points:
+
+- Build optimized binary:
+  `scons build/RISCV/gem5.opt --gold-linker -j64`
+- Build debug binary:
+  `scons build/RISCV/gem5.debug --gold-linker -j64 --debug-cycle`
+- Run the XiangShan configuration:
+  `./build/RISCV/gem5.opt ./configs/example/kmhv3.py --raw-cpt --generic-rv-cpt=<path>`
+- SE mode example:
+  `./build/RISCV/gem5.opt ./configs/example/se.py -c <binary>`
+- Build all unit tests:
+  `scons build/RISCV/unittests.opt -j100 --unit-test`
+
+If you need a more systematic understanding of module boundaries, configuration entry points, or execution flow, read [ARCHITECTURE.md](ARCHITECTURE.md) first.
+
+## Style and Naming
+
+- C / C++: follow `.clang-format`
+- Python: follow the repository's existing formatting and checking workflow
+- Naming:
+  - types / classes: UpperCamelCase
+  - functions / methods: lower_snake_case
+  - constants: ALL_CAPS
+- Use English for code comments and commit messages
+- Keep changes simple and avoid introducing functionality unrelated to the current task
+
+## Validation Expectations
+
+For non-trivial changes, do not stop at code edits alone. Validation should match the level of risk.
+
+Prefer these principles:
+
+- Behavioral changes: provide a minimal reproduction, key logs, statistics, or test results
+- Refactors: confirm there is no behavioral regression, and compare key statistics when needed
+- Frontend / BPU / timing-related changes: prefer targeted workloads, unit tests, or checkpoint-based regression
+- Analysis tasks: clearly distinguish confirmed facts, current hypotheses, and unresolved questions
+
+If full validation cannot be completed in the current environment, explicitly state the gap and the remaining risk.
+
+## Commit and PR Expectations
+
+- Use imperative English in commit messages
+- Prefer module-prefixed commit titles focused on a single change, for example:
+  `cpu-o3: Fix tage allocation`
+- PRs should explain:
+  - motivation
+  - approach
+  - scope of impact
+  - validation method and results
+- Run the repository's style checks and required tests before submission
+
+## Related Documents
+
+- [PLANS.md](PLANS.md): ExecPlan rules for complex tasks
+- [ARCHITECTURE.md](ARCHITECTURE.md): high-level architecture map of the repository
diff --git a/PLANS.md b/PLANS.md
new file mode 100644
index 0000000000..50eff30819
--- /dev/null
+++ b/PLANS.md
@@ -0,0 +1,268 @@
+# ExecPlan Guide
+
+This document is adapted from [OpenAI's ExecPlan](https://developers.openai.com/cookbook/articles/codex_exec_plans) guidance and tailored to the needs of this repository.
+
+It defines how execution plans (ExecPlans) should be written and maintained in this codebase.
+
+ExecPlans are meant for tasks like these:
+
+- Tasks that cannot realistically be completed in one or two exchanges
+- Work that spans multiple steps, files, or experiments
+- Investigations that need recorded evidence, decisions, intermediate findings, and current status
+- Long efforts where context can easily drift if it is not written down
+
+If the task is just a very small change, a simple bug fix, or a single-file adjustment, a separate ExecPlan is usually unnecessary.
+
+## 1. What an ExecPlan Is
+
+An ExecPlan is not a casual TODO list and not a lightweight checklist.
+
+It is a living execution document that should answer questions like:
+
+- What problem is being solved?
+- Why is it worth doing?
+- What is already known?
+- What exactly happens next?
+- How do we know the result is correct?
+- What did we learn along the way?
+- Why did the plan change midway?
+
+A good ExecPlan should let someone who does not know the previous conversation pick up the task and continue with reasonable confidence.
+
+## 2. When to Use an ExecPlan
+
+Create or update an ExecPlan when one or more of the following is true:
+
+1. The task will likely take significant time or span multiple conversations
+2. The work combines at least two of: research, experimentation, debugging, implementation
+3. The task touches multiple modules, files, or evidence sources
+4. There is meaningful uncertainty and assumptions need to be validated first
+5. The task needs a record of why a decision was made, not just what changed
+6. The task may be paused and resumed later
+
+Typical examples:
+
+- Aligning gem5 behavior with RTL
+- Investigating frontend / BPU / FTQ / flush / redirect behavior
+- Analyzing a performance regression
+- Landing a larger refactor
+- Building a feature that must be implemented in phases
+- Prototyping before committing to a final design
+
+## 3. Writing Principles
+
+### 3.0 Language
+
+By default, ExecPlans may be written in Chinese for internal development efficiency.
+
+Use English when the expected audience includes external contributors, or when the plan is intended to be referenced from public-facing documentation, PR discussion, or broader cross-team communication.
+
+### 3.1 Self-Contained
+
+An ExecPlan should be as self-contained as possible.
+
+Do not assume the reader remembers earlier chat history, and do not write things like "same as discussed above".
+
+If background is necessary to move the task forward, write it into the current document.
+
+### 3.2 Outcome-Oriented
+
+Do not stop at "change function X" or "add field Y".
+
+Explain:
+
+- What effect you expect
+- What the user or developer should be able to observe afterward
+- How that outcome will be validated
+
+### 3.3 Evidence-Oriented
+
+Especially for analysis tasks, do not write guesses as if they were facts.
+
+Clearly separate:
+
+- confirmed facts
+- current hypotheses
+- open questions
+- the evidence supporting a conclusion
+
+### 3.4 Continuously Updated
+
+An ExecPlan is a living document, not a one-time writeup.
+
+As the work progresses, update:
+
+- current status
+- new findings
+- decision changes
+- next actions
+
+### 3.5 Explain Why First
+
+Implementation details can be expanded later, but important decisions must include their rationale.
+
+Someone picking up the task later should be able to understand why this approach was chosen instead of another one.
+
+## 4. Recommended Location
+
+Prefer one ExecPlan file per complex task rather than putting everything into one large document.
+
+Recommended directory structure:
+
+```text
+docs/
+  exec-plans/
+    active/
+    completed/
+    blocked/
+```
+
+Meaning:
+
+- `active/`: tasks currently in progress
+- `completed/`: finished tasks
+- `blocked/`: tasks paused pending external conditions
+
+Use concise, descriptive file names, for example:
+
+- `docs/exec-plans/active/gem5-rtl-fetch-align.md`
+- `docs/exec-plans/active/bpu-override-investigation.md`
+- `docs/exec-plans/active/spec06-regression-debug.md`
+
+## 5. Recommended Structure
+
+Each ExecPlan should usually contain at least the following sections.
+
+## Title
+
+Use a short sentence that describes the goal.
+Prefer an "action + object" title over a vague name.
+
+For example:
+
+- Align gem5 frontend flush behavior with RTL
+- Investigate the IPC regression of a SPEC06 benchmark
+- Add verifiable observability for BPU override
+
+---
+
+## Background and Goal
+
+Use a few paragraphs to explain:
+
+- what the current problem is
+- why it matters
+- what result should be achieved
+- how that result will be observed
+
+Focus first on the value of the task and the end result, not on implementation details.
+
+---
+
+## Current Known Information
+
+Record the facts, observations, and constraints already confirmed.
+This can include:
+
+- relevant modules, files, and paths
+- current behavior and how it differs from expectations
+- logs, counters, traces, waveforms, or test results already observed
+- environment constraints
+
+Do not write guesses as facts in this section.
+
+## Hypotheses and Open Questions
+
+If uncertainty remains, list it explicitly. For example:
+
+- We currently suspect the issue is the timing of override activation
+- We are not yet sure whether the second target comes from mainBTB
+- We need to verify whether a counter covers the split-request case
+
+The purpose of this section is to keep the analysis from becoming muddled over time.
+
+---
+
+## Planned Steps
+
+List the next steps in order.
+Each step should ideally be written as "action + goal + expected output".
+
+For example:
+
+1. Read the frontend redirect path and confirm the actual control flow in gem5
+2. Cross-check RTL documentation and implementation, then summarize the flush taxonomy
+3. Add the required logs or counters and construct a minimal reproduction
+4. Run the chosen workload and verify whether the behavior converges
+5. Decide whether to keep the current approach or revise it based on the results
+
+Avoid cryptic shorthand that only the original author can understand.
+
+---
+
+## Validation
+
+Always specify how success will be judged.
+
+Validation may mean:
+
+- tests pass
+- logs match expectations
+- counters move in the expected direction
+- a workload now behaves like RTL
+- a performance regression is eliminated
+- a scenario is reproducible and then fixed
+
+Even for analysis-only tasks, define what "done" means. For example:
+
+- root cause confirmed
+- minimal reproduction identified
+- candidate causes ruled out
+- a concrete recommendation for the next phase is available
+
+## Progress
+
+Progress must be updated continuously.
+Use checkboxes with timestamps.
+
+Example:
+
+- [x] 2026-03-24 10:00 Read the main frontend redirect path and identify the primary entry points
+- [x] 2026-03-24 11:20 Cross-check RTL docs and discover that the flush taxonomy differs from the earlier assumption
+- [ ] Add counters for the split-request case and verify whether `inflightLoads` fully covers it
+- [ ] Construct a minimal workload to validate the second-target selection logic
+
+If a step is only partially complete, say what has been finished and what remains.
+
+---
+
+## Findings and Surprises
+
+Record important new findings that appear during the work.
+Especially note things like:
+
+- an earlier understanding was wrong
+- docs and code disagree
+- a counter definition is unreliable
+- a path is more important than expected
+- an experiment disproved an earlier hypothesis
+
+This section matters because long tasks often fail when important intermediate learning is not written down.
+
+---
+
+## Decision Log
+
+Whenever an important decision is made or the direction changes, record it.
+
+Recommended format:
+
+- Decision: ...
+- Reason: ...
+- Date: ...
+
+For example:
+
+- Decision: add observability before changing behavior
+- Reason: the root cause is not fully confirmed yet, so changing behavior immediately is too risky
+- Date: 2026-03-24
diff --git a/docs/exec-plans/completed/phr-rtl-alignment.md b/docs/exec-plans/completed/phr-rtl-alignment.md
new file mode 100644
index 0000000000..1e136b715d
--- /dev/null
+++ b/docs/exec-plans/completed/phr-rtl-alignment.md
@@ -0,0 +1,102 @@
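+# Align gem5 PHR target update semantics with RTL
+
+## Background and Goal
+
+While reviewing PR #814 (`Fix path history info target`), we found that in the old gem5 implementation
+`FullBTBPrediction::getTarget()` and `getPHistInfo()` handled indirect/return targets inconsistently.
+The PR unifies both onto the same target-resolution logic.
+
+Whether this fix actually matches RTL still needs separate verification. The core question is not
+"must the PHR use the real target", but rather:
+
+- Which target XiangShan RTL actually uses when updating the path history (PHR)
+- Whether that target is the raw BTB entry target or the final predicted target after override
+- Whether the behavior of the current gem5 PR is semantically consistent with RTL
+
+The goal of this task is a conclusion backed by code evidence rather than a judgment based on experience alone.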
+
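+## Current Known Information
+
+- gem5 updates the path history after prediction in
+  `src/cpu/pred/btb/decoupled_bpred.cc`: it obtains `(pc, target, taken)` from
+  `finalPred.getPHistInfo()` and then calls `pHistShiftIn(...)`.
+- Before PR #814, `getPHistInfo()` used `entry.target` directly, whereas
+  `getTarget()` applied overrides for indirect and return targets.
+- The old gem5 implementation could therefore let the "final predicted target" and the "target used by the PHR" diverge.
+- It is confirmed that the XiangShan RTL PHR update explicitly depends on a path hash of `(cfiPc, target)`, not on the PC alone.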
+
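+## Hypotheses and Open Questions
+
+This section has been fully verified. Between the two branches of the initial hypothesis, the final conclusions are:
+
+1. RTL does not require the speculatively updated PHR to equal the target that the backend finally executes.
+2. RTL does not accept an arbitrary approximate target either; the PHR must use the target that is actually driving fetch forward at that moment.
+3. If a later-stage override or a backend redirect occurs, RTL corrects the PHR based on the new target.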
+
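+## Planned Steps
+
+1. Read the Scala implementation under XiangShan `frontend/bpu/tage` and confirm that `tage` only consumes the folded PHR and does not maintain the PHR itself.
+2. Trace the PHR update logic in `frontend/bpu/history/phr` and confirm that the update uses `pathHash(cfiPc, target)`.
+3. Go back to `frontend/bpu/Bpu.scala` and confirm that `s1_prediction.target`, `s3_prediction.target`, and `redirect.bits.target` are all the "currently effective fetch target", including the RAS / ITTAGE / override paths.
+4. Compare against the gem5 `getPHistInfo()` and `getTarget()` paths and judge whether PR #814 is consistent with RTL.
+5. Add an analysis of the PR's `gcc12-spec06-0.8c` performance data and confirm that the direction of the gain matches what this kind of fix should produce.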
+
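+## Validation
+
+- The path history update code and its input sources are located in RTL.
+- The question "which target does the PHR update use" can be answered unambiguously.
+- That conclusion can be mapped back to the specific code changes in gem5 PR #814.
+- If CI data is already available, check whether the performance change shows up mainly in metrics tied to conditional-path learning quality.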
+
+## Conclusions
+
+### RTL Semantics
+
+- XiangShan `tage` itself only reads the folded PHR; the entry points are
+  `io.fromPhr.foldedPathHist` / `foldedPathHistForTrain` in
+  `frontend/bpu/tage/Tage.scala`.
+- The module that actually maintains the PHR is
+  `frontend/bpu/history/phr/Phr.scala`.
+- In `Phr.scala`, RTL updates the PHR with `pathHash(updateCfiPc, updateTarget)`,
+  so the update input explicitly includes the target.
+- The source priority for `updateTarget` is:
+  `redirect > s3_override > s1_valid`.
+- These targets are not some static BTB entry target:
+  - `s1_prediction.target` is overridden by uRAS in the return case;
+  - `s3_prediction.target` is overridden by RAS in the return case and can be overridden by ITTAGE in other indirect cases;
+  - `redirect.bits.target` comes from the corrected result of a backend/redirect.
+
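+### Assessment of gem5 PR #814
+
+- The problem in the old gem5 implementation was not that "the PHR did not use the real target", but that "the PHR did not use the currently effective final predicted target".
+- In the old code, `getTarget()` already applied overrides for indirect / return targets,
+  but `getPHistInfo()` still used `entry.target` directly.
+- As a result, fetch actually advanced along the overridden target while the PHR was updated with the non-overridden target.
+- XiangShan RTL clearly requires the PHR to follow the currently effective fetch target rather than letting it drift away from the fetch path.
+- PR #814, which unifies `getPHistInfo()` and `getTarget()` onto the same target-resolution logic, is therefore a fix consistent with RTL semantics.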
+
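+### Performance Results
+
+- The PR has been merged. During review, the Ideal BTB performance data for `gcc12-spec06-0.8c` was used for a comparative analysis.
+- Comparing the PR run against the mainline Ideal BTB baseline with `python3 run.py <archive> --slice gcc12` gave:
+  - Int score / GHz: `20.6907 -> 20.8375`, `+0.71%`
+  - Total branch wrong MPKI: `4.9768 -> 4.8501`
+  - Conditional branch MPKI: `4.8504 -> 4.7279`
+- The gains are concentrated in `gobmk`, `sjeng`, and `gcc`, and show up most clearly as a drop in conditional-path-related mispredictions.
+- The aggregate MPKI of indirect / return branches themselves barely changes, which matches the expectation that "a PHR/path history consistency fix mainly improves the learning quality of subsequent conditional branches".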
+
+## Progress
+
+- [x] 2026-04-09 12:10 Confirm that this is a gem5/RTL behavior alignment issue that needs its own execution plan.
+- [x] 2026-04-09 12:12 Re-check the old difference between `getTarget()` and `getPHistInfo()` in gem5.
+- [x] 2026-04-09 12:30 Read the `tage`, `history/phr`, and `Bpu.scala` paths in XiangShan RTL and confirm the PHR target sources and update priority.
+- [x] 2026-04-09 12:40 Reach a conclusion on the consistency of gem5 PR #814 with RTL.
+- [x] 2026-04-09 14:10 Analyze the PR's `gcc12-spec06-0.8c` data with `gem5_data_proc/run.py --slice gcc12` and post the conclusion as a PR comment.
+- [x] 2026-04-09 14:20 PR merged; execution plan moved to `completed/`.
+
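+## Findings and Surprises
+
+- The user raised a key counter-question: the target used by the PHR does not necessarily have to equal the real target.
+  This means a review cannot only check whether gem5 is "internally self-consistent"; it must also verify the actual design semantics of RTL.
+- For some `Manual Performance Test` runs on GitHub, the displayed metadata mixes in `xs-dev` head information, while the commit actually checked out by the perf job may differ.
+  This analysis ultimately relied on `metadata.txt` inside the archive directory to avoid picking the wrong baseline.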