Checkpoint files are not generated #873

@rseac

Description

Before start

  • I have read the XiangShan Documents.
  • I have searched the previous issues and did not find anything relevant.
  • I have searched the previous discussions and did not find anything relevant.
  • I have reproduced the problem using the latest commit on the master branch.

Describe your problem

I am trying to generate checkpoints with NEMU so that I can run them on XiangShan. I followed the documented instructions, but no checkpoint files are generated. Profiling and clustering both appear to complete, yet the checkpoint generation step does not produce any files.

What did you do before

Set up tools

git clone https://github.com/OpenXiangShan/xs-env.git
cd /xs-env && sudo -s ./setup-tools.sh && ./setup.sh && source env.sh && source update-submodule.sh

Set up NEMU and SimPoint

cd $NEMU_HOME
git submodule update --init

cd $NEMU_HOME/resource/simpoint/simpoint_repo
make clean
make

cd $NEMU_HOME
make clean
make riscv64-xs-cpt_defconfig
make -j 8

cd $NEMU_HOME/resource/gcpt_restore
make 

Set up an example from nexus-am/apps for checkpointing

cd /xs-env/nexus-am/apps/hello/

Rework hello.c so that the traps are set.

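#include <klib.h>

/* Signal codes passed to NEMU in register a0. */
#define DISABLE_TIME_INTR 0x100
#define NOTIFY_PROFILER 0x101
#define GOOD_TRAP 0x0

/* Move the signal code into a0, then issue the trap instruction (opcode 0x6B). */
void nemu_signal(int a){
    asm volatile ("mv a0, %0\n\t"
                  ".insn r 0x6B, 0, 0, x0, x0, x0\n\t"
                  :
                  : "r"(a)
                  : "a0");
}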

int main()
{

    nemu_signal(DISABLE_TIME_INTR);
    nemu_signal(NOTIFY_PROFILER);
    printf("Hello, XiangShan!\n");
    nemu_signal(GOOD_TRAP);
    return 0;
}

Compile hello

make ARCH=riscv64-xs

Run the checkpoint steps

I used the following script.

#!/bin/bash

# prepare env

export NEMU_HOME=/xs-env/NEMU
export NEMU=$NEMU_HOME/build/riscv64-nemu-interpreter
export GCPT=$NEMU_HOME/resource/gcpt_restore/build/gcpt.bin
export SIMPOINT=$NEMU_HOME/resource/simpoint/simpoint_repo/bin/simpoint

export WORKLOAD_ROOT_PATH=/xs-env/nexus-am/apps/hello/build/
export LOG_PATH=$NEMU_HOME/hello/logs
export RESULT=$NEMU_HOME/hello_result
export profiling_result_name=simpoint-profiling
export PROFILING_RES=$RESULT/$profiling_result_name
export interval=$((2))

# Profiling
# using config: riscv64-xs-cpt_defconfig
profiling(){
    set -x
    workload=$1
    log=$LOG_PATH/profiling_logs
    mkdir -p $log

    $NEMU ${WORKLOAD_ROOT_PATH}/${workload}.bin \
        -D $RESULT -w $workload -C $profiling_result_name    \
        -b --simpoint-profile --cpt-interval ${interval} > $log/${workload}-out.txt 2>${log}/${workload}-err.txt
}

export -f profiling

# Cluster

cluster(){
    set -x
    workload=$1

    export CLUSTER=$RESULT/cluster/${workload}
    mkdir -p $CLUSTER

    random1=`head -20 /dev/urandom | cksum | cut -c 1-6`
    random2=`head -20 /dev/urandom | cksum | cut -c 1-6`

    log=$LOG_PATH/cluster_logs/cluster
    mkdir -p $log

    $SIMPOINT \
        -loadFVFile $PROFILING_RES/${workload}/simpoint_bbv.gz \
        -saveSimpoints $CLUSTER/simpoints0 -saveSimpointWeights $CLUSTER/weights0 \
        -inputVectorsGzipped -maxK 30 -numInitSeeds 2 -iters 1000 -seedkm ${random1} -seedproj ${random2} \
        > $log/${workload}-out.txt 2> $log/${workload}-err.txt
}

export -f cluster
# Checkpointing
# using config: riscv64-xs-cpt_defconfig
checkpoint(){
    set -x
    workload=$1

    export CLUSTER=$RESULT/cluster
    log=$LOG_PATH/checkpoint_logs
    mkdir -p $log
    $NEMU ${WORKLOAD_ROOT_PATH}/${workload}.bin \
         -D $RESULT -w ${workload} -C spec-cpt  \
         -b -S $CLUSTER --cpt-interval $interval \
         --checkpoint-format zstd > $log/${workload}-out.txt 2>$log/${workload}-err.txt
}

export -f checkpoint

profiling hello-riscv64-xs
cluster hello-riscv64-xs
checkpoint hello-riscv64-xs

The files I see generated

tree NEMU/hello*

NEMU/hello
`-- logs
    |-- checkpoint_logs
    |   |-- hello-riscv64-xs-err.txt
    |   `-- hello-riscv64-xs-out.txt
    |-- cluster_logs
    |   `-- cluster
    |       |-- hello-riscv64-xs-err.txt
    |       `-- hello-riscv64-xs-out.txt
    `-- profiling_logs
        |-- hello-riscv64-xs-err.txt
        `-- hello-riscv64-xs-out.txt
NEMU/hello_result
|-- cluster
|   `-- hello-riscv64-xs
|       |-- simpoints0
|       `-- weights0
|-- simpoint-profiling
|   `-- hello-riscv64-xs
|       `-- simpoint_bbv.gz
`-- spec-cpt
    `-- hello-riscv64-xs
        `-- 1
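
To make the listing above easier to check on another machine, here is a minimal sketch of a per-stage summary. `check_cpt_outputs` is a hypothetical helper of my own, not part of NEMU or SimPoint. It only assumes the directory layout shown in the tree output (`simpoint-profiling`, `cluster`, `spec-cpt`) and that `simpoints0` holds one line per selected simpoint.

```shell
#!/bin/bash
# Hypothetical helper (not part of NEMU): summarize what each stage left behind.
# Usage: check_cpt_outputs RESULT_DIR WORKLOAD
check_cpt_outputs() {
    local result=$1 workload=$2
    local bbv=$result/simpoint-profiling/$workload/simpoint_bbv.gz
    local simpoints=$result/cluster/$workload/simpoints0
    local cpt_dir=$result/spec-cpt/$workload

    # Profiling stage: the BBV file should exist and be non-empty.
    if [ -s "$bbv" ]; then
        echo "bbv: ok"
    else
        echo "bbv: missing or empty"
    fi

    # Clustering stage: report how many lines simpoints0 contains.
    if [ -f "$simpoints" ]; then
        echo "simpoints: $(wc -l < "$simpoints" | tr -d ' ')"
    else
        echo "simpoints: missing"
    fi

    # Checkpoint stage: count regular files anywhere below the workload directory.
    local n=0
    if [ -d "$cpt_dir" ]; then
        n=$(find "$cpt_dir" -type f | wc -l | tr -d ' ')
    fi
    echo "checkpoint files: $n"
}
```

With the variables from my script this would be called as `check_cpt_outputs "$RESULT" hello-riscv64-xs`. A result of `checkpoint files: 0` alongside a non-zero simpoint count would match what I am describing: the first two stages leave output behind and the last one does not.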

Environment

  • XiangShan branch: master
  • XiangShan commit id: 4bbdccbb077840af5e1b65c7138d31af3966f625
  • NEMU commit id: 4a24b77
  • SPIKE commit id:
  • Operating System: Ubuntu 22.04
  • gcc version: 11.4.0
  • mill version: 0.12.10
  • java version: 11.0.26

Additional context

I also tried this with the stream application (nexus-am/apps/stream), which has been used in some of the tutorials such as ASPLOS 2025, and hit the same problem.
