X-Git-Url: http://info.iut-bm.univ-fcomte.fr/pub/gitweb/simgrid.git/blobdiff_plain/364eee0fc6ab77fddc5437ac273527bd27711724..b9625f82f86db0674e911887addce45dca31b57f:/src/smpi/colls/reduce/reduce-rab.cpp

diff --git a/src/smpi/colls/reduce/reduce-rab.cpp b/src/smpi/colls/reduce/reduce-rab.cpp
index 5b2f4b20fb..c1e5cf7e02 100644
--- a/src/smpi/colls/reduce/reduce-rab.cpp
+++ b/src/smpi/colls/reduce/reduce-rab.cpp
@@ -46,7 +46,7 @@
 /* Fast reduce and allreduce algorithm for longer buffers and predefined operations.
-   This algorithm is explaned with the example of 13 nodes.
+   This algorithm is explained with the example of 13 nodes.
    The nodes are numbered 0, 1, 2, ... 12.
    The sendbuf content is a, b, c, ... m.
    The buffer array is notated with ABCDEFGH, this means that
@@ -65,7 +65,7 @@
    Exa.: size=13 ==> n=3, r=5 (i.e. size == 13 == 2**n+r == 2**3 + 5)
-   The algoritm needs for the execution of one Colls::reduce
+   The algorithm needs for the execution of one colls::reduce
    - for r==0
      exec_time = n*(L1+L2) + buf_lng * (1-1/2**n) * (T1 + T2 + O/d)
@@ -207,7 +207,7 @@ Step 5.n)
      7: { [(a+b)+(c+d)] + [(e+f)+(g+h)] } + { [(i+j)+k] + [l+m] } for H
-For Colls::allreduce:
+For colls::allreduce:
 ------------------
 Step 6.1)
@@ -249,7 +249,7 @@ Step 7)
      on all nodes 0..12
-For Colls::reduce:
+For colls::reduce:
 ---------------
 Step 6.0)
@@ -376,10 +376,10 @@ Benchmark results on CRAY T3E
    2) This line shows the limit for the count argument.
       If count < limit then the vendor protocol is used,
       otherwise the new protocol is used (see variable Ldb).
-   3) These lines show the bandwidth (=bufer length / execution time)
+   3) These lines show the bandwidth (= buffer length / execution time)
       for both protocols.
-   4) This line shows that the limit is choosen well if the ratio is
-      between 0.95 (loosing 5% for buffer length near and >=limit)
+   4) This line shows that the limit is chosen well if the ratio is
+      between 0.95 (losing 5% for buffer length near and >=limit)
       and 1.10 (not gaining 10% for buffer length near and