hpc:bghep:benchmarks [2014/02/17 16:48] asc → [2014/05/27 16:33] edmay
{{:
This plot shows the efficiency (R_1c / (Nc * R_Nc)) vs. Nc, where

<code>
R == sec/
Nc == number of cores used in job
</code>

For a perfect speed-up this would always be 1. My experience with smaller
clusters is that 80% is usually achievable, while 20% is quite low; the usual
interpretation is that the code has a high fraction of serialization. In that
case it would be more efficient to run 8 jobs of 512 cores than 1 job of
4096 cores. This is of course speculation on my part, as I have not
identified the cause of the inefficiency!
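The efficiency calculation can be sketched numerically. A minimal Python sketch, assuming R is a job's measured rate in seconds per unit of work; the timing numbers below are illustrative placeholders, not the benchmark's actual data:

```python
def efficiency(r_1c, nc, r_nc):
    """Strong-scaling efficiency E = R_1c / (Nc * R_Nc); 1.0 is a perfect speed-up."""
    return r_1c / (nc * r_nc)

# Hypothetical sec-per-unit measurements keyed by core count (illustrative only).
timings = {1: 100.0, 512: 0.24, 4096: 0.061}

for nc, r_nc in sorted(timings.items()):
    print(f"Nc={nc:5d}  efficiency={efficiency(timings[1], nc, r_nc):.2f}")
```

With numbers like these, 512 cores sits near the 80% mark while 4096 cores falls well below it, which is the pattern the text interprets as a high serial fraction.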
===== Fig 4 =====
{{:
The ALCF experts suggested that the I/O model of 1 directory with many files
in that 1 directory would perform badly due to lock contention on the
directory! Thus the example code was modified to use a model of 1 output
ProMC data file per directory. Running the modified code produced the
following figures:
http://
Focusing on the '
improvements both at low core counts and at high core counts.
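The one-file-per-directory layout can be sketched as follows. This is a minimal sketch of the idea only; the directory naming, rank parameter, and file name are illustrative assumptions, not the benchmark code's actual paths:

```python
import os

def output_path(base_dir, rank):
    """Give each writer its own directory holding a single output file,
    so concurrent writers do not contend for locks on one shared directory."""
    rank_dir = os.path.join(base_dir, f"rank{rank:05d}")  # illustrative naming
    os.makedirs(rank_dir, exist_ok=True)
    return os.path.join(rank_dir, "output.promc")

# Old model: base_dir/output_<rank>.promc -- many files in one directory.
# New model: e.g. rank 7 writes base_dir/rank00007/output.promc.
```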

As part of the boot camp for Mira the code was moved to the BG/Q Mira and a
subset of the benchmarks were run with the new I/O model. The results are
shown in
http://
[[hpc:
--- //
--- //
--- //