
PC GAMESS v. 7.0.4 benchmarks and scalability on SKIF K-1000 - large Opteron Linux Infiniband cluster


Wall clock time (seconds) and relative speedup by number of processors used

           Test 1            Test 3            Test 5             Test 6
Procs    Time  Speedup     Time  Speedup     Time  Speedup      Time  Speedup
    2  2489.3     2.00   4715.7     2.00   5454.7     2.00   13843.6     2.00
    4  1273.4     3.91   2283.6     4.13   2958.4     3.69    7061.6     3.92
    6   862.0     5.78   1528.8     6.17   2088.6     5.22    4799.4     5.77
    8   660.8     7.53   1158.6     8.14   1720.8     6.34    3727.7     7.43
   10   540.4     9.21    928.2    10.16   1465.0     7.45    2991.3     9.26
   12   455.8    10.92    777.7    12.13   1288.1     8.47    2555.6    10.83
   14   398.9    12.48    671.6    14.04   1157.6     9.42    2221.3    12.46
   16   373.6    13.33    600.4    15.71   1117.4     9.76    2009.2    13.78
   18   323.0    15.41    528.9    17.83   1047.8    10.41    1786.4    15.50
   20   292.8    17.00    479.0    19.69    973.8    11.20    1637.4    16.91
   22   266.6    18.67    436.7    21.60    898.8    12.13    1512.1    18.31
   24   250.5    19.87    401.4    23.50    925.7    11.79    1405.6    19.70
   26   233.3    21.34    374.0    25.22    848.7    12.85    1328.4    20.84
   28   220.2    22.61    350.9    26.88    797.0    13.69    1262.3    21.93
   30   207.4    24.00    326.6    28.88    788.1    13.84    1192.8    23.21
   32   213.7    23.30    318.3    29.63    808.3    13.50    1213.4    22.82
   34   194.0    25.66    292.2    32.28    832.9    13.10    1093.8    25.31
   36   186.0    26.77    276.0    34.17    803.1    13.58    1054.9    26.25
   38   178.4    27.91    263.4    35.81    720.1    15.15    1015.6    27.26
   40   175.5    28.37    251.4    37.52    751.2    14.52     971.6    28.50
   42   168.7    29.51    240.8    39.17    694.7    15.70     945.4    29.29
   44   163.9    30.38    231.6    40.72    805.0    13.55     913.9    30.30
   46   155.4    32.04    221.8    42.52    681.9    16.00     882.3    31.38
   48   154.5    32.22    213.6    44.15    678.1    16.09     860.0    32.19
   50   143.5    34.69    209.7    44.98    680.2    16.04     827.7    33.45
   52   145.5    34.22    201.3    46.85    742.8    14.69     816.0    33.93
   54   142.1    35.04    194.9    48.39    697.5    15.64     793.3    34.90
   56   136.1    36.58    189.8    49.69    671.7    16.24     779.8    35.51
   58   132.6    37.55    183.3    51.45    671.8    16.24     774.2    35.76
   60   132.5    37.57    178.1    52.96    641.6    17.00     760.6    36.40
   62   136.0    36.61    173.1    54.49    594.7    18.34     744.7    37.18
   64   140.0    35.56    177.8    53.04    776.2    14.05     764.3    36.23
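
The page does not state how the relative speedup figures are derived. The published numbers are consistent with taking the 2-processor run as the baseline, assigning it a speedup of 2.00, and scaling the other runs by the ratio of wall clock times. The following is a minimal sketch of that reading, using a subset of the Test 1 timings; the function name and the baseline convention are assumptions made here for illustration, not something the page documents.

```python
# Wall clock times (seconds) for Test 1, copied from the benchmark data above.
# Only a subset of the processor counts is included to keep the sketch short.
test1_wall = {2: 2489.3, 4: 1273.4, 8: 660.8, 16: 373.6, 32: 213.7, 64: 140.0}

# Assumption: the smallest run (2 processors) is the baseline and is
# assigned a speedup equal to its processor count, i.e. 2.00.
BASE_PROCS = 2


def relative_speedup(wall, procs, base_procs=BASE_PROCS):
    """Speedup relative to the baseline run, scaled so the baseline equals base_procs."""
    return base_procs * wall[base_procs] / wall[procs]


for n in sorted(test1_wall):
    print(f"{n:3d} processors: speedup {relative_speedup(test1_wall, n):6.2f}")
```

Under this assumption, the values for 4 and 64 processors round to 3.91 and 35.56, matching the published Test 1 figures.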

 

Graphical representation of scalability



OS and hardware description


SKIF K-1000 cluster: 288 dual-processor nodes, each with AMD Opteron 248 (2.2 GHz) CPUs, a TYAN S2881G2NR baseboard, 8 x 512 MB DDR 333 RAM (Corsair Memory 512 MB PC2700), and a 160 GB IDE HDD. Interconnect: fat-tree InfiniBand with full bisection bandwidth, Mellanox MTS-2400 IB switch. Software: Fedora Core Linux, InfiniBand software stack OpenIB v. 1.1, mvapich version 0.9.8 (ibverbs). More information on this cluster (in Russian)



Tests description


Test 1: single-point direct DFT (B3LYP) energy plus gradient for a medium-size system (623 basis functions).

Test 3: single-point direct MP2 energy for a medium-size system (623 basis functions, the same system as the one used for Test 1).

Test 5: single-point direct CASSCF(12,12) for a medium-size system (retinal molecule, cc-pVDZ, 565 Cartesian basis functions) using the ALDET code.

Test 6: single-point direct CIS energy plus gradient of the first excited state of a medium-size system (porphyrin molecule, cc-pVTZ (aug-cc on nitrogens), 1130 Cartesian basis functions, D2h group).


Test comments


All tests were run in standard parallel mode using dynamic load balancing over the p2p interface, with two processes per dual-CPU node. Wall clock times are those measured on the master node, in seconds. Test 5 is the most communication-intensive and would scale better for a larger job. For these tests, the most commonly used MPI calls are MPI_Allreduce and MPI_Bcast. Note that none of these tests was specifically designed to test scalability on large clusters using a large number of nodes.
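
To put the remark about Test 5 in numbers, the published speedups can be converted to parallel efficiency, defined here as speedup divided by the number of processors. This definition is a common convention and is not used on the page itself; since the speedups are relative to the 2-processor run, the resulting efficiencies are relative to that baseline as well. The sketch below uses only the 64-processor speedups from the benchmark data.

```python
# Relative speedups at 64 processors, copied from the benchmark data.
speedup_at_64 = {"Test 1": 35.56, "Test 3": 53.04, "Test 5": 14.05, "Test 6": 36.23}


def parallel_efficiency(speedup, procs):
    """Fraction of ideal linear scaling achieved (assumed definition: speedup / procs)."""
    return speedup / procs


efficiency = {name: parallel_efficiency(s, 64) for name, s in speedup_at_64.items()}

# Print the tests from least to most efficient.
for name, eff in sorted(efficiency.items(), key=lambda item: item[1]):
    print(f"{name}: {eff:.0%}")
```

By this measure Test 5 has the lowest efficiency of the four tests at 64 processors and Test 3 the highest.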


We are grateful to Dr. Oleg Tchij (United Institute of Informatics Problems, NAS, Belarus) for providing access to the SKIF K-1000 cluster and for helpful comments.

Copyright © 2007 by Alex A. Granovsky

Press to visit PC GAMESS v. 7.0.4 benchmarks and scalability on 21-node Pentium 4 Infiniband Linux cluster page

Press to visit PC GAMESS' eight core systems performance comparison page

Press to visit PC GAMESS' Woodcrest vs. Opteron performance comparison page

Press to visit PC GAMESS Pentium 4 family Xeon processor benchmarks page to compare the results of these benchmarks with those obtained on Xeon DP processors.

Press to visit PC GAMESS Pentium 4 family benchmarks page to compare the results of these benchmarks with those obtained on various Netburst (Pentium 4 and Pentium D) processors.

Press to visit the PC GAMESS vs. WinGamess performance comparison page to compare the results of these benchmarks with those obtained on older processors. Input files can be found there too.