With -p ncrc2.pgi -t repro-openmp, the ocean-ice test cases crash around day 17 with the following message:
[NID 00151] 2014-05-01 20:41:14 Apid 121706: initiated application termination
[NID 00151] 2014-05-02 00:41:17 Apid 121706: OOM killer terminated this process.
Application 121706 exit signals: Killed
print_memory_usage shows a steady memory increase of roughly 8 MB per coupling timestep:
20140502 102929.609: Memuse(MB) at Main loop at coupling timestep= 1= 2.120E+02 2.236E+02 2.231E+00 2.187E+02
20140502 103440.266: Memuse(MB) at Main loop at coupling timestep= 100= 1.007E+03 1.030E+03 6.008E+00 1.015E+03
20140502 103944.863: Memuse(MB) at Main loop at coupling timestep= 200= 1.808E+03 1.835E+03 7.737E+00 1.818E+03
20140502 104046.886: Memuse(MB) at Main loop at coupling timestep= 219= 1.964E+03 1.992E+03 6.961E+00 1.975E+03
[NID 00028] 2014-05-02 10:40:50 Apid 121760: initiated application termination
[NID 00028] 2014-05-02 14:40:53 Apid 121760: OOM killer terminated this process.
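The growth rate can be estimated directly from the Memuse lines above. The following is a minimal sketch, not part of the model or its tooling: the helper names `parse_memuse` and `growth_per_step` are hypothetical, and it assumes the first numeric column after the timestep is the value of interest (the meaning of the four columns is not stated in the log).

```python
import re

# Matches lines such as:
#   20140502 102929.609: Memuse(MB) at Main loop at coupling timestep= 1= 2.120E+02 ...
# Captures the coupling timestep and the first memory column (assumed to be in MB).
MEMUSE_RE = re.compile(
    r"Memuse\(MB\) at Main loop at coupling timestep=\s*(\d+)=\s*(\S+)"
)


def parse_memuse(lines):
    """Return a list of (timestep, first_column_mb) pairs from Memuse log lines."""
    samples = []
    for line in lines:
        match = MEMUSE_RE.search(line)
        if match:
            samples.append((int(match.group(1)), float(match.group(2))))
    return samples


def growth_per_step(samples):
    """Average increase in MB per coupling timestep between first and last sample."""
    (t_first, mb_first), (t_last, mb_last) = samples[0], samples[-1]
    return (mb_last - mb_first) / (t_last - t_first)


# The four Memuse lines quoted in this report.
LOG = """\
20140502 102929.609: Memuse(MB) at Main loop at coupling timestep= 1= 2.120E+02 2.236E+02 2.231E+00 2.187E+02
20140502 103440.266: Memuse(MB) at Main loop at coupling timestep= 100= 1.007E+03 1.030E+03 6.008E+00 1.015E+03
20140502 103944.863: Memuse(MB) at Main loop at coupling timestep= 200= 1.808E+03 1.835E+03 7.737E+00 1.818E+03
20140502 104046.886: Memuse(MB) at Main loop at coupling timestep= 219= 1.964E+03 1.992E+03 6.961E+00 1.975E+03
"""

samples = parse_memuse(LOG.splitlines())
rate = growth_per_step(samples)
print(f"{rate:.2f} MB per coupling timestep")
```

Running the same parser over the stdout of a non-OpenMP build would show whether the growth is specific to the repro-openmp target.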
These are the tests that crashed:
/lustre/f1/Niki.Zadeh/tikal_201403_cjgUpdates_mom6_20140501_libs/MOM6_GOLD_SIS/ncrc2.pgi-repro-openmp/stdout/run/MOM6_GOLD_SIS_1x0m20d_32pe.o5032301
/lustre/f1/Niki.Zadeh/tikal_201403_cjgUpdates_mom6_20140501_libs/MOM6_GOLD_SIS_symmetric/ncrc2.pgi-repro-openmp/stdout/run/MOM6_GOLD_SIS_symmetric_1x0m20d_32pe.o5032302
/lustre/f1/Niki.Zadeh/tikal_201403_cjgUpdates_mom6_20140501_libs/MOM6_GOLD_SIS_icebergs/ncrc2.pgi-repro-openmp/stdout/run/MOM6_GOLD_SIS_icebergs_1x0m20d_32pe.o5032303
/lustre/f1/Niki.Zadeh/tikal_201403_cjgUpdates_mom6_20140501_libs/MOM6_SIS2/ncrc2.pgi-repro-openmp/stdout/run/MOM6_SIS2_1x0m20d_32pe.o5032311
/lustre/f1/Niki.Zadeh/tikal_201403_cjgUpdates_mom6_20140501_libs/MOM6_SIS2_cgrid/ncrc2.pgi-repro-openmp/stdout/run/MOM6_SIS2_cgrid_1x0m20d_32pe.o5032314