Geant4 Profiling and Benchmarking

Geant4 CPU Performance by Version (Geant4 10.x series and Geant4 11.0)

1) The current profiling activity is part of the Geant4 Computing Performance Task

2) Profiling Results

Since January 2021 and release 10.7.r01, profiling has migrated to an Intel Xeon CPU E5-2650 v2 @ 2.60GHz
(releases 10.5.p01, 10.6.p03, and 10.7 were re-profiled on the new resources)
(cyan: gcc 8.3.0 -O3), (light purple: gcc 11.3.0 -O3)
Geant4 Version Application Performance Summary
11.1.gcc11 SimplifiedCalo Open|Speedshop IgProf(Memory) CPU MEM
11.1.gcc11 cmsExp Open|Speedshop IgProf(Memory) CPU MEM
11.1.gcc11 cmsExpVecGeom Open|Speedshop IgProf(Memory) CPU MEM
11.1 SimplifiedCalo Open|Speedshop IgProf(Memory) CPU MEM
11.1 cmsExp Open|Speedshop IgProf(Memory) CPU MEM
11.1 cmsExpVecGeom Open|Speedshop IgProf(Memory) CPU MEM
11.1.c02 SimplifiedCalo Open|Speedshop IgProf(Memory) CPU MEM
11.1.c02 cmsExp Open|Speedshop IgProf(Memory) CPU MEM
11.1.c02 cmsExpVecGeom Open|Speedshop IgProf(Memory) CPU MEM
11.1.c01 SimplifiedCalo Open|Speedshop IgProf(Memory) CPU MEM
11.1.c01 cmsExp Open|Speedshop IgProf(Memory) CPU MEM
11.1.c01 cmsExpVecGeom Open|Speedshop IgProf(Memory) CPU MEM
11.1.c00 SimplifiedCalo Open|Speedshop IgProf(Memory) CPU MEM
11.1.c00 cmsExp Open|Speedshop IgProf(Memory) CPU MEM
11.1.c00 cmsExpVecGeom Open|Speedshop IgProf(Memory) CPU MEM
11.0.r09.gcc11 SimplifiedCalo Open|Speedshop IgProf(Memory) CPU MEM
11.0.r09.gcc11 cmsExp Open|Speedshop IgProf(Memory) CPU MEM
11.0.r09 SimplifiedCalo Open|Speedshop IgProf(Memory) CPU MEM
11.0.r09 cmsExp Open|Speedshop IgProf(Memory) CPU MEM
11.0.r09 cmsExpVecGeom Open|Speedshop IgProf(Memory) CPU MEM
11.0.r08 SimplifiedCalo Open|Speedshop IgProf(Memory) CPU MEM
11.0.r08 cmsExp Open|Speedshop IgProf(Memory) CPU MEM
11.0.r08 cmsExpVecGeom Open|Speedshop IgProf(Memory) CPU MEM
11.0.p03 SimplifiedCalo Open|Speedshop IgProf(Memory) CPU MEM
11.0.p03 cmsExp Open|Speedshop IgProf(Memory) CPU MEM
11.0.p03 cmsExpVecGeom Open|Speedshop IgProf(Memory) CPU MEM
10.7.p04 SimplifiedCalo Open|Speedshop IgProf(Memory) CPU MEM
10.7.p04 cmsExp Open|Speedshop IgProf(Memory) CPU MEM
10.7.p04 cmsExpVecGeom Open|Speedshop IgProf(Memory) CPU MEM
11.0.r07 SimplifiedCalo Open|Speedshop IgProf(Memory) CPU MEM
11.0.r07 cmsExp Open|Speedshop IgProf(Memory) CPU MEM
11.0.r07 cmsExpVecGeom Open|Speedshop IgProf(Memory) CPU MEM
11.0.r06 SimplifiedCalo Open|Speedshop IgProf(Memory) CPU MEM
11.0.r06 cmsExp Open|Speedshop IgProf(Memory) CPU MEM
11.0.r06 cmsExpVecGeom Open|Speedshop IgProf(Memory) CPU MEM
11.1.b01.c00 cmsExp Open|Speedshop IgProf(Memory) CPU MEM
11.0.r05 SimplifiedCalo Open|Speedshop IgProf(Memory) CPU MEM
11.0.r05 cmsExp Open|Speedshop IgProf(Memory) CPU MEM
11.0.r05 cmsExpVecGeom Open|Speedshop IgProf(Memory) CPU MEM
11.0.p02 SimplifiedCalo Open|Speedshop IgProf(Memory) CPU MEM
11.0.p02 cmsExp Open|Speedshop IgProf(Memory) CPU MEM
11.0.p02 cmsExpVecGeom Open|Speedshop IgProf(Memory) CPU MEM
11.0.r04 SimplifiedCalo Open|Speedshop IgProf(Memory) CPU MEM
11.0.r04 cmsExp Open|Speedshop IgProf(Memory) CPU MEM
11.0.r04 cmsExpVecGeom Open|Speedshop IgProf(Memory) CPU MEM
11.0.r03 SimplifiedCalo Open|Speedshop IgProf(Memory) CPU MEM
11.0.r03 cmsExp Open|Speedshop IgProf(Memory) CPU MEM
11.0.r03 cmsExpVecGeom Open|Speedshop IgProf(Memory) CPU MEM
11.0.p01 SimplifiedCalo Open|Speedshop IgProf(Memory) CPU MEM
11.0.p01 cmsExp Open|Speedshop IgProf(Memory) CPU MEM
11.0.p01 cmsExpVecGeom Open|Speedshop IgProf(Memory) CPU MEM
11.0.r02+MR2622 cmsExp Open|Speedshop IgProf(Memory) CPU MEM
11.0.r02 SimplifiedCalo Open|Speedshop IgProf(Memory) CPU MEM
11.0.r02 cmsExp Open|Speedshop IgProf(Memory) CPU MEM
11.0.r02 cmsExpVecGeom Open|Speedshop IgProf(Memory) CPU MEM
11.0.r01 SimplifiedCalo Open|Speedshop IgProf(Memory) CPU MEM
11.0.r01 cmsExp Open|Speedshop IgProf(Memory) CPU MEM
11.0.r01 cmsExpVecGeom Open|Speedshop IgProf(Memory) CPU MEM
11.0 SimplifiedCalo Open|Speedshop IgProf(Memory) CPU MEM
11.0-serial SimplifiedCalo Open|Speedshop IgProf(Memory) CPU MEM
11.0 cmsExp Open|Speedshop IgProf(Memory) CPU MEM
11.0-serial cmsExp Open|Speedshop IgProf(Memory) CPU MEM
11.0 cmsExpVecGeom Open|Speedshop IgProf(Memory) CPU MEM
10.7.p03 SimplifiedCalo Open|Speedshop IgProf(Memory) CPU MEM
10.7.p03 cmsExp Open|Speedshop IgProf(Memory) CPU MEM
10.7.p03 cmsExpVecGeom Open|Speedshop IgProf(Memory) CPU MEM
10.6.p03 SimplifiedCalo Open|Speedshop IgProf(Memory) CPU MEM
10.6.p03 cmsExp Open|Speedshop IgProf(Memory) CPU MEM
10.6.p03 cmsExpVecGeom Open|Speedshop IgProf(Memory) CPU MEM
10.5.p01 SimplifiedCalo Open|Speedshop IgProf(Memory) CPU MEM
10.5.p01 cmsExp Open|Speedshop IgProf(Memory) CPU MEM

Old Profiling Results: 9.4 9.5 9.6 10.0 10.1 10.2 10.3 10.4 10.5 10.6 10.7

3) CPU per Event: Summary Plots by Versions

SimplifiedCalo PYTHIA H->ZZ electrons pions protons anti-protons gamma
cmsExp PYTHIA H->ZZ

4) Total Memory Count: Summary Plots by Versions

SimplifiedCalo PYTHIA H->ZZ electrons pions protons anti-protons gamma

5) Geant4 MT/Tasking Performance

Geant4 Version Application Performance
11.1.gcc11 cmsExpTasking Intel Open|SpeedShop
11.1 cmsExpTasking Intel Open|SpeedShop
11.1.c02 cmsExpTasking Intel Open|SpeedShop
11.1.c01 cmsExpTasking Intel Open|SpeedShop
11.1.c00 cmsExpTasking Intel Open|SpeedShop
11.0.r09 cmsExpTasking Intel Open|SpeedShop
11.0.r08 cmsExpTasking Intel Open|SpeedShop
11.0.p03 cmsExpTasking Intel Open|SpeedShop
10.7.p04 cmsExpTasking Intel Open|SpeedShop
11.0.r07 cmsExpTasking Intel Open|SpeedShop
11.0.r06 cmsExpTasking Intel Open|SpeedShop
11.0.r05 cmsExpTasking Intel Open|SpeedShop
11.0.p02 cmsExpTasking Intel Open|SpeedShop
11.0.r04 cmsExpTasking Intel Open|SpeedShop
11.0.r03 cmsExpTasking Intel Open|SpeedShop
11.0.p01 cmsExpTasking Intel Open|SpeedShop
11.0.r02 cmsExpTasking Intel Open|SpeedShop
11.0.r01 cmsExpTasking Intel Open|SpeedShop
11.0.c01 cmsExpTasking Intel Open|SpeedShop
10.7.p01 cmsExpMT Intel Open|SpeedShop
10.7 cmsExpMT Intel Open|SpeedShop

6) Other Test Results and Activities

  1. Exploring details of OSS 2.4.1 when profiling a Geant4 MT application, version 10.7 (50 GeV e- + cmsExpMT)
    The following output was obtained with the environment variable CBTF_FORCE_ITIMER_SIGNAL=1 set
    (e.g. CBTF_FORCE_ITIMER_SIGNAL=1 osspcsamp "<your app>" [sampling frequency])
    Log file from running osspcsamp with 1 thread
    Log file from running ossusertime with 1 thread
    Log file from running osspcsamp with 4 threads
    Log file from running ossusertime with 4 threads
    OSS DB file from running osspcsamp with 1 thread
    OSS DB file from running ossusertime with 1 thread
    OSS DB file from running osspcsamp with 4 threads
    OSS DB file from running ossusertime with 4 threads
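
    The invocation pattern quoted above can be scripted. The following is a minimal sketch, not the procedure used to produce the linked files: the helper name build_oss_run, the application command "./cmsExpMT run.mac", and the sampling frequency of 100 are hypothetical placeholders. The page shows the optional sampling frequency only for osspcsamp, so the sketch passes it through only when given.

```python
import os
import shlex

# Convenience scripts named in the log/DB file list above.
OSS_TOOLS = ("osspcsamp", "ossusertime")


def build_oss_run(tool, app_command, sampling_frequency=None):
    """Return (argv, env) for one profiling run (hypothetical helper).

    app_command is passed as ONE argument, mirroring the quoted
    "<your app>" in the example command on this page.
    """
    if tool not in OSS_TOOLS:
        raise ValueError(f"unsupported tool: {tool!r}")
    argv = [tool, app_command]
    if sampling_frequency is not None:
        argv.append(str(sampling_frequency))
    # Copy the current environment and add the variable from the page.
    env = dict(os.environ)
    env["CBTF_FORCE_ITIMER_SIGNAL"] = "1"
    return argv, env


if __name__ == "__main__":
    # Hypothetical application command and frequency; substitute your own.
    argv, env = build_oss_run("osspcsamp", "./cmsExpMT run.mac", 100)
    # Dry run: print the equivalent shell command instead of executing it.
    print("CBTF_FORCE_ITIMER_SIGNAL=1 " + shlex.join(argv))
    # To execute for real (requires Open|SpeedShop on PATH):
    #   import subprocess
    #   subprocess.run(argv, env=env, check=True)
```

    Returning the argument list and environment separately keeps the sketch testable on a machine without Open|SpeedShop installed; the actual launch is left commented out.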

7) Useful Links for Performance Tools and Optimization

  1. Open|Speedshop: Home page
  2. IgProf: The Ignominious Profiler is a simple tool for measuring and analysing application memory and
    performance characteristics. For more information, see the IgProf home page

  3. Other HPC Performance Tools: HPCToolkit, TAU
  4. Software Optimization Resources: Agner Fog