Support HIP #225

Merged: E3SM-Bot merged 5 commits into master from ambrad/hip on Jun 27, 2022

@ambrad

ambrad commented Jun 15, 2022

Member
  • Make ekat.hpp the single entry point for Kokkos_Core.hpp and EkatGpuSpace, instead of including Kokkos_Core.hpp directly in various files. This way, the Kokkos-related definitions that EKAT needs can be placed right after the include, and every file is sure to see them.
  • Use EKAT_ENABLE_GPU in several spots instead of KOKKOS_ENABLE_CUDA.
  • Use EkatGpuSpace in several spots instead of Kokkos::Cuda.
  • Create some Kokkos::Experimental::HIP specializations.
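
The pattern described in the list above can be sketched roughly as follows. This is a stand-alone illustration, not EKAT's actual code: it compiles without Kokkos, the structs `Cuda`, `HIP`, and `Host` are placeholders for the real Kokkos execution spaces, and `SpaceName`, `DefaultSpace`, `on_gpu`, and `default_space_name` are hypothetical names invented for the example. Only `EKAT_ENABLE_GPU`, `EkatGpuSpace`, `KOKKOS_ENABLE_CUDA`, and `KOKKOS_ENABLE_HIP` are names taken from the PR or from Kokkos, and how EKAT actually defines them is an assumption here.

```cpp
// Stand-alone sketch: compiles WITHOUT Kokkos. Everything in namespace
// `sketch` is illustrative; it is not EKAT's real implementation.
#include <string>

namespace sketch {

struct Cuda {};  // placeholder for Kokkos::Cuda
struct HIP {};   // placeholder for Kokkos::Experimental::HIP
struct Host {};  // placeholder for a host execution space

// One central header decides which GPU backend, if any, is active.
// KOKKOS_ENABLE_CUDA / KOKKOS_ENABLE_HIP are set by the Kokkos build;
// deriving EKAT_ENABLE_GPU from them like this is an assumption.
#if defined(KOKKOS_ENABLE_CUDA)
# define EKAT_ENABLE_GPU
using EkatGpuSpace = Cuda;
#elif defined(KOKKOS_ENABLE_HIP)
# define EKAT_ENABLE_GPU
using EkatGpuSpace = HIP;
#endif

// Backend-specific behavior lives in template specializations, so
// supporting a new backend means adding one more specialization.
template <typename Space> struct SpaceName;  // hypothetical trait
template <> struct SpaceName<Host> { static std::string get() { return "Host"; } };
template <> struct SpaceName<Cuda> { static std::string get() { return "Cuda"; } };
template <> struct SpaceName<HIP>  { static std::string get() { return "HIP"; } };

// Client code tests the backend-neutral macro and uses the
// backend-neutral alias instead of naming Cuda directly.
#ifdef EKAT_ENABLE_GPU
using DefaultSpace = EkatGpuSpace;
constexpr bool on_gpu = true;
#else
using DefaultSpace = Host;
constexpr bool on_gpu = false;
#endif

inline std::string default_space_name() {
  return SpaceName<DefaultSpace>::get();
}

} // namespace sketch
```

Compiled with no backend macro defined, `sketch::default_space_name()` resolves to the host placeholder; defining `KOKKOS_ENABLE_HIP` on the command line switches the alias to the HIP placeholder without touching client code.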

ambrad added 4 commits June 14, 2022 18:05
First draft:
* Make ekat.hpp the entry point for Kokkos_Core and EkatGpuSpace.
* Use EKAT_ENABLE_GPU in several spots instead of KOKKOS_ENABLE_CUDA.
* Use EkatGpuSpace in several spots instead of Kokkos::Cuda.
* Create some Kokkos::Experimental::HIP specializations.
@request-info

request-info Bot commented Jun 15, 2022


The maintainers of this repository would appreciate it if you could provide more information.

@ambrad

ambrad commented Jun 15, 2022

Member Author

Tested on crusher with environment:

# Default env except for:
module load rocm

EKAT config:

rm -rf CMake*
cmake \
    -D CMAKE_CXX_COMPILER=CC \
    -D Kokkos_ENABLE_HIP=ON \
    -D Kokkos_ARCH_VEGA90A=ON \
    -D CMAKE_CXX_STANDARD=14 \
    ~/repo/ekat

I didn't try to sort out the MPI setup, so all tests pass except:

    8 - comm_np1 (Failed)                # and the other np values
   15 - debug_tools (Failed)
   25 - mpi_file_log_tests_np1 (Failed)  # and the other np values
   29 - console_only_log_np4 (Failed)

@ambrad ambrad added AT: WIP and removed AT: WIP labels Jun 15, 2022
@ambrad ambrad changed the title from "[WIP] Support HIP" to "Support HIP" Jun 15, 2022
@E3SM-Bot

Collaborator

Status Flag 'Pre-Test Inspection' - Auto Inspected - Inspection Is Not Necessary for this Pull Request.

@ambrad ambrad requested a review from bartgol June 15, 2022 01:34
@E3SM-Bot

Collaborator

Status Flag 'Pull Request AutoTester' - Failure: Timed out waiting for job EKAT_PullRequest_Autotester_Mappy to start: Total Wait = 1803

  • Other jobs have been previously started - We must stop them...

@E3SM-Bot

Collaborator

Status Flag 'Pre-Test Inspection' - Auto Inspected - Inspection Is Not Necessary for this Pull Request.

@E3SM-Bot

Collaborator

Status Flag 'Pull Request AutoTester' - Failure: Timed out waiting for job EKAT_PullRequest_Autotester_Weaver to start: Total Wait = 1803

  • Other jobs have been previously started - We must stop them...

@E3SM-Bot

Collaborator

Status Flag 'Pull Request AutoTester' - Failure: Timed out waiting for job EKAT_PullRequest_Autotester_Mappy to start: Total Wait = 1803

  • Other jobs have been previously started - We must stop them...

@E3SM-Bot

Collaborator

Status Flag 'Pull Request AutoTester' - Failure: Timed out waiting for job EKAT_PullRequest_Autotester_Weaver to start: Total Wait = 1803

  • Other jobs have been previously started - We must stop them...

@ambrad

ambrad commented Jun 15, 2022

Member Author

A Summit case ran fine with this version of EKAT. In the timers below, the max time for EAMxx::run in the Eulerian-transport run is 4% longer than the best value from a recent data-collection campaign, which is well within system variability. The second set of timers is from a run using SL transport, configured with:

    fac=6
    ./atmchange transport_alg=12
    ./atmchange dt_tracer_factor=$fac
    ./atmchange hypervis_subcycle_q=$fac

Eulerian transport             #GPU    callcount     timer sum        max     min
"a:EAMxx::run"                   24 5.760000e+03   1.651179e+03    71.496  65.400
"a:EAMxx::Dynamics::run"         24 5.760000e+03   7.185435e+02    30.001  29.869
"a:EAMxx::physics::run"          24 5.760000e+03   9.299047e+02    41.433  35.356
"a:EAMxx::Macrophysics::run"     24 3.456000e+04   9.677942e+01     4.258   3.935
"a:EAMxx::Microphysics::run"     24 3.456000e+04   5.307392e+02    24.998  19.379

SL transport                   #GPU    callcount     timer sum        max     min
"a:EAMxx::run"                   24 5.760000e+03   1.475551e+03    63.912  58.009
"a:EAMxx::Dynamics::run"         24 5.760000e+03   5.470805e+02    22.836  22.757
"a:EAMxx::physics::run"          24 5.760000e+03   9.257130e+02    41.028  35.125
"a:EAMxx::Macrophysics::run"     24 3.456000e+04   9.689708e+01     4.289   3.945
"a:EAMxx::Microphysics::run"     24 3.456000e+04   5.256874e+02    24.558  19.177

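For readers comparing the two timer sets above, the arithmetic can be sketched as below. The helper names `fractional_reduction` and `mean_per_gpu` are hypothetical, the constants are copied from the max column of the table, and treating "timer sum" as a sum over the 24 entries counted in the #GPU column is an assumption. Under those assumptions, the max EAMxx::run time drops by roughly 10.6% going from Eulerian to SL transport, and the max Dynamics time by roughly 23.9%, while the physics timers are nearly unchanged.

```cpp
// Sketch of the comparison between the two timer sets. Helper names are
// illustrative; the numbers are copied from the table in the comment.
#include <cmath>

namespace timing_sketch {

// Fraction by which `updated` is smaller than `baseline`
// (0.10 means a 10% reduction).
inline double fractional_reduction(double baseline, double updated) {
  return (baseline - updated) / baseline;
}

// Mean time per GPU, ASSUMING the "timer sum" column is a sum over
// the `ngpu` entries reported in the #GPU column.
inline double mean_per_gpu(double timer_sum, int ngpu) {
  return timer_sum / ngpu;
}

// "max" column, Eulerian transport vs. SL transport.
constexpr double eul_run_max = 71.496, sl_run_max = 63.912;  // a:EAMxx::run
constexpr double eul_dyn_max = 30.001, sl_dyn_max = 22.836;  // a:EAMxx::Dynamics::run

} // namespace timing_sketch
```

As a consistency check on the assumption, `mean_per_gpu(1.651179e+03, 24)` falls between the min and max reported for the Eulerian EAMxx::run row.
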
@bartgol bartgol left a comment

Contributor

Neat. Thanks!

@ambrad

ambrad commented Jun 25, 2022

Member Author

@bartgol, do I need to do something other than add the label AT:RETEST to get this PR to go through the AT? Previous attempts a week ago were blocked by weaver being down. Thanks.

@E3SM-Bot

Collaborator

Status Flag 'Pull Request AutoTester' - User Requested Retest - Label AT: RETEST will be reset after testing.

@E3SM-Bot

Collaborator

Status Flag 'Pull Request AutoTester' - Testing Jenkins Projects:

Pull Request Auto Testing STARTING

Build Information

Test Name: EKAT_PullRequest_Autotester_Mappy

  • Build Num: 314
  • Status: STARTED

Jenkins Parameters

Parameter Name       Value
EKAT_SOURCE_BRANCH   ambrad/hip
EKAT_SOURCE_REPO     https://github.com/E3SM-Project/EKAT
EKAT_SOURCE_SHA      ecfbe71
EKAT_TARGET_BRANCH   master
EKAT_TARGET_REPO     https://github.com/E3SM-Project/EKAT
EKAT_TARGET_SHA      0b3486f
PR_LABELS            AT: AUTOMERGE;AT: RETEST
PULLREQUESTNUM       225
TEST_REPO_ALIAS      EKAT

Build Information

Test Name: EKAT_PullRequest_Autotester_Weaver

  • Build Num: 411
  • Status: STARTED

Jenkins Parameters

Parameter Name       Value
EKAT_SOURCE_BRANCH   ambrad/hip
EKAT_SOURCE_REPO     https://github.com/E3SM-Project/EKAT
EKAT_SOURCE_SHA      ecfbe71
EKAT_TARGET_BRANCH   master
EKAT_TARGET_REPO     https://github.com/E3SM-Project/EKAT
EKAT_TARGET_SHA      0b3486f
PR_LABELS            AT: AUTOMERGE;AT: RETEST
PULLREQUESTNUM       225
TEST_REPO_ALIAS      EKAT

Build Information

Test Name: EKAT_PullRequest_Autotester_Blake

  • Build Num: 428
  • Status: STARTED

Jenkins Parameters

Parameter Name       Value
EKAT_SOURCE_BRANCH   ambrad/hip
EKAT_SOURCE_REPO     https://github.com/E3SM-Project/EKAT
EKAT_SOURCE_SHA      ecfbe71
EKAT_TARGET_BRANCH   master
EKAT_TARGET_REPO     https://github.com/E3SM-Project/EKAT
EKAT_TARGET_SHA      0b3486f
PR_LABELS            AT: AUTOMERGE;AT: RETEST
PULLREQUESTNUM       225
TEST_REPO_ALIAS      EKAT

Using Repos:

Repo: EKAT (E3SM-Project/EKAT)
  • Branch: ambrad/hip
  • SHA: ecfbe71
  • Mode: TEST_REPO

Pull Request Author: ambrad

@E3SM-Bot

Collaborator

Status Flag 'Pull Request AutoTester' - Jenkins Testing: all Jobs PASSED

Pull Request Auto Testing has PASSED

Build Information

Test Name: EKAT_PullRequest_Autotester_Mappy

  • Build Num: 314
  • Status: PASSED

Jenkins Parameters

Parameter Name       Value
EKAT_SOURCE_BRANCH   ambrad/hip
EKAT_SOURCE_REPO     https://github.com/E3SM-Project/EKAT
EKAT_SOURCE_SHA      ecfbe71
EKAT_TARGET_BRANCH   master
EKAT_TARGET_REPO     https://github.com/E3SM-Project/EKAT
EKAT_TARGET_SHA      0b3486f
PR_LABELS            AT: AUTOMERGE;AT: RETEST
PULLREQUESTNUM       225
TEST_REPO_ALIAS      EKAT

Build Information

Test Name: EKAT_PullRequest_Autotester_Weaver

  • Build Num: 411
  • Status: PASSED

Jenkins Parameters

Parameter Name       Value
EKAT_SOURCE_BRANCH   ambrad/hip
EKAT_SOURCE_REPO     https://github.com/E3SM-Project/EKAT
EKAT_SOURCE_SHA      ecfbe71
EKAT_TARGET_BRANCH   master
EKAT_TARGET_REPO     https://github.com/E3SM-Project/EKAT
EKAT_TARGET_SHA      0b3486f
PR_LABELS            AT: AUTOMERGE;AT: RETEST
PULLREQUESTNUM       225
TEST_REPO_ALIAS      EKAT

Build Information

Test Name: EKAT_PullRequest_Autotester_Blake

  • Build Num: 428
  • Status: PASSED

Jenkins Parameters

Parameter Name       Value
EKAT_SOURCE_BRANCH   ambrad/hip
EKAT_SOURCE_REPO     https://github.com/E3SM-Project/EKAT
EKAT_SOURCE_SHA      ecfbe71
EKAT_TARGET_BRANCH   master
EKAT_TARGET_REPO     https://github.com/E3SM-Project/EKAT
EKAT_TARGET_SHA      0b3486f
PR_LABELS            AT: AUTOMERGE;AT: RETEST
PULLREQUESTNUM       225
TEST_REPO_ALIAS      EKAT

@E3SM-Bot

Collaborator

Status Flag 'Pre-Merge Inspection' - SUCCESS: The last commit to this Pull Request has been INSPECTED AND APPROVED by [ bartgol ]!

@E3SM-Bot

Collaborator

Status Flag 'Pull Request AutoTester' - Pull Request will be Automerged

@E3SM-Bot E3SM-Bot merged commit 28529f2 into master Jun 27, 2022
@E3SM-Bot E3SM-Bot deleted the ambrad/hip branch June 27, 2022 13:22
@E3SM-Bot

Collaborator

Merge on Pull Request# 225: IS A SUCCESS - Pull Request successfully merged

@bartgol

bartgol commented Jun 27, 2022

Contributor

Sorry, just saw your comment. Looks like weaver is back online. Until next time.
