Geant4 Profiling and Benchmarking

1) The current profiling activity is part of the Geant4 Computing Performance Task

2) Profiling Results

*: recompiled and rerun on the date shown
Versions in pink were profiled on the Wilson cluster using AMD Opteron 6128 HE (2 GHz) processors and gcc 4.9.2
Geant4 Version        Application     Performance                       Summary
10.2.r00 (10.2)       SimplifiedCalo  Simple Profiler, Memory Profiler  CPU, MEM
10.1.r10rr (11/10)    SimplifiedCalo  Simple Profiler, Memory Profiler  CPU, MEM
10.1.r10              SimplifiedCalo  Simple Profiler, Memory Profiler  CPU, MEM
10.1.r09              SimplifiedCalo  Simple Profiler, Memory Profiler  CPU, MEM
10.1.r08              SimplifiedCalo  Simple Profiler, Memory Profiler  CPU, MEM
10.1.r07              SimplifiedCalo  Simple Profiler, Memory Profiler  CPU, MEM
10.1.r06 (10.2.beta)  SimplifiedCalo  Simple Profiler, Memory Profiler  CPU, MEM
10.1.r05              SimplifiedCalo  Simple Profiler, Memory Profiler  CPU, MEM
10.1.r04              SimplifiedCalo  Simple Profiler, Memory Profiler  CPU, MEM
10.1.r03              SimplifiedCalo  Simple Profiler, Memory Profiler  CPU, MEM
10.1.r02              SimplifiedCalo  Simple Profiler, Memory Profiler  CPU, MEM
10.1.r01              SimplifiedCalo  Simple Profiler, Memory Profiler  CPU, MEM
10.1.p02              SimplifiedCalo  Simple Profiler, Memory Profiler  CPU, MEM
10.1.p01              SimplifiedCalo  Simple Profiler, Memory Profiler  CPU, MEM
10.1.r00              SimplifiedCalo  Simple Profiler, Memory Profiler  CPU, MEM
10.0.p04              SimplifiedCalo  Simple Profiler, Memory Profiler  CPU, MEM
10.0.p03              SimplifiedCalo  Simple Profiler, Memory Profiler  CPU, MEM
10.0                  SimplifiedCalo  Simple Profiler, Memory Profiler  CPU, MEM
9.6.p04               SimplifiedCalo  Simple Profiler, Memory Profiler  CPU, MEM

Geant4 Version        Application     Performance                       Summary
10.0                  cmsExp (Calo)   Simple Profiler, Memory Profiler  CPU, MEM
9.6                   cmsExp (Calo)   Simple Profiler, Memory Profiler  CPU, MEM

Old Profiling Results: 9.4, 9.5, 9.6, 10.0

3) CPU per Event: Summary Plots by Versions

SimplifiedCalo: PYTHIA H->ZZ, electrons, pions, protons, anti-protons

4) Total Memory Count: Summary Plots by Versions

SimplifiedCalo: PYTHIA H->ZZ, electrons, pions, protons, anti-protons

5) CPU Summary Plots by Physics Lists: 9.6, 10.0

6) Geant4 MT Performance

Geant4 Version  Application  Performance
10.2.r00        cmsExpMT     Summary, Open|SpeedShop
                ParFullCMS   Summary, XeonPhi (10.2.cand03)
10.1.r10        cmsExpMT     Summary, Open|SpeedShop
                ParFullCMS   Summary, XeonPhi
10.1.r09        cmsExpMT     Summary, Open|SpeedShop
10.1.r08        cmsExpMT     Summary, Open|SpeedShop
10.1.r07        cmsExpMT     Summary, Open|SpeedShop
10.1.r06        cmsExpMT     Summary, Open|SpeedShop
                ParFullCMS   Summary, XeonPhi
10.1.r05        cmsExpMT     Summary, Open|SpeedShop
10.1.r04        cmsExpMT     Summary, Open|SpeedShop
10.1.r03        cmsExpMT     Summary, Open|SpeedShop
10.1.r03 (B=0)  cmsExpMT     Summary, Open|SpeedShop
10.1.r02        cmsExpMT     Summary, Open|SpeedShop
10.1.r01        cmsExpMT     Summary, Open|SpeedShop
10.1.p02        cmsExpMT     Summary, Open|SpeedShop
                ParFullCMS   Summary, XeonPhi
10.1.p01        cmsExpMT     Summary, Open|SpeedShop
10.1            cmsExpMT     Summary, Open|SpeedShop
10.0.p04        cmsExpMT     Summary, Open|SpeedShop
                ParFullCMS   Summary, XeonPhi
10.0            cmsExpMT     Summary, Open|SpeedShop

7) Other Test Results and Activities

  • Performance studies: 9.6, 10.0, 10.1
  • Code reviews (summary PDF files): CHIPS Physics, Field Propagation, EM Physics

8) Useful Links for Performance Tools and Optimization

    1. FAST: FAST is a set of tools for collecting, managing, and analyzing data about code performance.
      Instructions for using the FAST toolkit are available on the FAST project page.

    2. IgProf: the Ignominious Profiler is a simple tool for measuring and analysing application memory and
      performance characteristics. For more information, see the IgProf home page.

    3. HPC Performance Tools: HPCToolkit, Open|SpeedShop, TAU
    4. Software Optimization Resources: Agner Fog