Geant4 Profiling and Benchmarking

Geant4 CPU Performance by Version (from Geant4 10.5.p01 through Geant4 11.1)

1) The current profiling activity is part of the Geant4 Computing Performance Task

2) Profiling Results

Since January 2021 (release 10.7.r01), profiling has run on an Intel Xeon CPU E5-2650 v2 @ 2.60GHz
(releases 10.5.p01, 10.6.p03 and 10.7 were re-profiled on the new resources)
Since January 2024 (release 11.2.r01), profiling has run on EL8 (release 11.2 was re-profiled with the new OS and the updated version of gcc)
(cyan: SL7 gcc 8.3.0 -O3), (medium purple: SL7 gcc 11.3.0 -O3), (light purple: EL8 gcc 11.4.0 -O3)
Geant4 Version  Application     Performance                     Summary
11.2.r02        SimplifiedCalo  Open|Speedshop  IgProf(Memory)  CPU  MEM
11.2.r02        cmsExp          Open|Speedshop  IgProf(Memory)  CPU  MEM
11.2.r02        cmsExpVG        Open|Speedshop  IgProf(Memory)  CPU  MEM
11.2.p01        SimplifiedCalo  Open|Speedshop  IgProf(Memory)  CPU  MEM
11.2.p01        cmsExp          Open|Speedshop  IgProf(Memory)  CPU  MEM
11.2.p01        cmsExpVG        Open|Speedshop  IgProf(Memory)  CPU  MEM
11.2.r01        SimplifiedCalo  Open|Speedshop  IgProf(Memory)  CPU  MEM
11.2.r01        cmsExp          Open|Speedshop  IgProf(Memory)  CPU  MEM
11.2.r01        cmsExpVG        Open|Speedshop  IgProf(Memory)  CPU  MEM
11.2.el8        SimplifiedCalo  Open|Speedshop  IgProf(Memory)  CPU  MEM
11.2.el8        cmsExp          Open|Speedshop  IgProf(Memory)  CPU  MEM
11.2.el8        cmsExpVG        Open|Speedshop  IgProf(Memory)  CPU  MEM
11.2            SimplifiedCalo  Open|Speedshop  IgProf(Memory)  CPU  MEM
11.2            cmsExp          Open|Speedshop  IgProf(Memory)  CPU  MEM
11.2            cmsExpVecGeom   Open|Speedshop  IgProf(Memory)  CPU  MEM
11.1.p03        SimplifiedCalo  Open|Speedshop  IgProf(Memory)  CPU  MEM
11.1.p03        cmsExp          Open|Speedshop  IgProf(Memory)  CPU  MEM
11.1.p03        cmsExpVecGeom   Open|Speedshop  IgProf(Memory)  CPU  MEM
11.0.p04        SimplifiedCalo  Open|Speedshop  IgProf(Memory)  CPU  MEM
11.0.p04        cmsExp          Open|Speedshop  IgProf(Memory)  CPU  MEM
11.0.p04        cmsExpVecGeom   Open|Speedshop  IgProf(Memory)  CPU  MEM
11.0.p03.gcc11  SimplifiedCalo  Open|Speedshop  IgProf(Memory)  CPU  MEM
11.0.p03.gcc11  cmsExp          Open|Speedshop  IgProf(Memory)  CPU  MEM
11.0.p03.gcc11  cmsExpVecGeom   Open|Speedshop  IgProf(Memory)  CPU  MEM
11.0.p03        SimplifiedCalo  Open|Speedshop  IgProf(Memory)  CPU  MEM
11.0.p03        cmsExp          Open|Speedshop  IgProf(Memory)  CPU  MEM
11.0.p03        cmsExpVecGeom   Open|Speedshop  IgProf(Memory)  CPU  MEM
10.7.p04        SimplifiedCalo  Open|Speedshop  IgProf(Memory)  CPU  MEM
10.7.p04        cmsExp          Open|Speedshop  IgProf(Memory)  CPU  MEM
10.7.p04        cmsExpVecGeom   Open|Speedshop  IgProf(Memory)  CPU  MEM
10.6.p03        SimplifiedCalo  Open|Speedshop  IgProf(Memory)  CPU  MEM
10.6.p03        cmsExp          Open|Speedshop  IgProf(Memory)  CPU  MEM
10.6.p03        cmsExpVecGeom   Open|Speedshop  IgProf(Memory)  CPU  MEM
10.5.p01        SimplifiedCalo  Open|Speedshop  IgProf(Memory)  CPU  MEM
10.5.p01        cmsExp          Open|Speedshop  IgProf(Memory)  CPU  MEM

Old Profiling Results: 9.4 9.5 9.6 10.0 10.1 10.2 10.3 10.4 10.5 10.6 10.7 11.0 11.1

3) CPU per Event: Summary Plots by Versions

SimplifiedCalo PYTHIA H->ZZ electrons pions protons anti-protons gamma
cmsExp PYTHIA H->ZZ

4) Total Memory Count: Summary Plots by Versions

SimplifiedCalo PYTHIA H->ZZ electrons pions protons anti-protons gamma

5) Geant4 MT/Tasking Performance

Geant4 Version  Application     Performance
11.2.r02        cmsExpTasking   Intel  Open|SpeedShop
11.2.p01        cmsExpTasking   Intel  Open|SpeedShop
11.2.r01        cmsExpTasking   Intel  Open|SpeedShop
11.2.el8        cmsExpTasking   Intel  Open|SpeedShop
11.2            cmsExpTasking   Intel  Open|SpeedShop
11.1.p03        cmsExpTasking   Intel  Open|SpeedShop
11.0.p04        cmsExpTasking   Intel  Open|SpeedShop
11.0.p03.gcc11  cmsExpTasking   Intel  Open|SpeedShop
11.0.p03        cmsExpTasking   Intel  Open|SpeedShop
10.7.p04        cmsExpTasking   Intel  Open|SpeedShop

6) Other Test Results and Activities

  1. Exploring details of OSS 2.4.1 when profiling a Geant4 MT application, version 10.7 (50 GeV e- + cmsExpMT)
    The following output was obtained with the environment variable CBTF_FORCE_ITIMER_SIGNAL=1 set
    (e.g. CBTF_FORCE_ITIMER_SIGNAL=1 osspcsamp "<your app>" [sampling frequency])
    Log file from running osspcsamp with 1 thread
    Log file from running ossusertime with 1 thread
    Log file from running osspcsamp with 4 threads
    Log file from running ossusertime with 4 threads
    OSS DB file from running osspcsamp with 1 thread
    OSS DB file from running ossusertime with 1 thread
    OSS DB file from running osspcsamp with 4 threads
    OSS DB file from running ossusertime with 4 threads
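
The four runs listed above (osspcsamp and ossusertime, each with 1 and 4 threads) could be scripted roughly as follows. This is only a minimal sketch, not the procedure actually used for these results: the helper name build_oss_run and the application command "cmsExpMT run.g4" are hypothetical, and it assumes the application honors Geant4's G4FORCENUMBEROFTHREADS environment variable for choosing the thread count. The sketch only builds and prints the command lines; it does not launch anything.

```python
import shlex

# Open|SpeedShop convenience scripts used for the runs listed above.
OSS_TOOLS = ("osspcsamp", "ossusertime")


def build_oss_run(tool, app_cmd, n_threads, sampling=None):
    """Return (env_overrides, shell_command) for one profiling run.

    app_cmd is the full application command line; the convenience
    scripts expect it as a single quoted argument.
    """
    if tool not in OSS_TOOLS:
        raise ValueError(f"unsupported tool: {tool}")
    env = {
        # Variable named in the text above.
        "CBTF_FORCE_ITIMER_SIGNAL": "1",
        # Assumption: thread count is steered through this Geant4 variable.
        "G4FORCENUMBEROFTHREADS": str(n_threads),
    }
    cmd = f"{tool} {shlex.quote(app_cmd)}"
    if sampling is not None:
        # Optional sampling frequency, passed after the quoted command.
        cmd += f" {sampling}"
    return env, cmd


if __name__ == "__main__":
    # Hypothetical application command; replace with the real one.
    app = "cmsExpMT run.g4"
    for n_threads in (1, 4):
        for tool in OSS_TOOLS:
            env, cmd = build_oss_run(tool, app, n_threads)
            prefix = " ".join(f"{k}={v}" for k, v in env.items())
            print(f"{prefix} {cmd}")
```

To actually launch a run, the returned pair could be handed to subprocess.run(cmd, shell=True, env={**os.environ, **env}), provided the Open|SpeedShop scripts are on the PATH.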

7) Useful Links for Performance Tools and Optimization

  1. Open|Speedshop: Home page
  2. IgProf: the Ignominious Profiler is a simple tool for measuring and analysing application memory and
    performance characteristics. For more information, see the IgProf home page

  3. Other HPC Performance Tools: HPCToolkit, TAU
  4. Software Optimization Resources: Agner Fog