#! ./tesh p Testing a simple master/worker example application handling failures TCP crosstraffic DISABLED ! output sort 19 $ $SG_TEST_EXENV ${bindir:=.}/platform-failures$EXEEXT --log=xbt_cfg.thres:critical --log=no_loc ${srcdir:=.}/small_platform_with_failures.xml ${srcdir:=.}/../msg/app-masterworker/app-masterworker_d.xml --cfg=path:${srcdir} --cfg=network/crosstraffic:0 "--log=root.fmt:[%10.6r]%e(%i:%P@%h)%e%m%n" > [ 0.000000] (0:maestro@) Cannot launch process 'worker' on failed host 'Fafard' > [ 0.000000] (1:master@Tremblay) Got 5 workers and 20 tasks to process > [ 0.010309] (1:master@Tremblay) Send completed > [ 0.010309] (2:worker@Tremblay) Received "Task" > [ 0.010309] (2:worker@Tremblay) Communication time : "0.010309" > [ 0.010309] (2:worker@Tremblay) Processing "Task" > [ 1.000000] (0:maestro@) Restart processes on host: Fafard > [ 1.000000] (1:master@Tremblay) Mmh. Something went wrong with 'worker-1'. Nevermind. Let's keep going! > [ 1.000000] (3:worker@Jupiter) Gloups. The cpu on which I'm running just turned off!. See you! > [ 2.000000] (0:maestro@) Restart processes on host: Jupiter > [ 2.010309] (2:worker@Tremblay) "Task" done > [ 11.000000] (1:master@Tremblay) Mmh. Got timeouted while speaking to 'worker-2'. Nevermind. Let's keep going! > [ 12.030928] (1:master@Tremblay) Send completed > [ 12.030928] (4:worker@Ginette) Received "Task" > [ 12.030928] (4:worker@Ginette) Communication time : "1.030928" > [ 12.030928] (4:worker@Ginette) Processing "Task" > [ 13.061856] (1:master@Tremblay) Send completed > [ 13.061856] (5:worker@Bourassa) Received "Task" > [ 13.061856] (5:worker@Bourassa) Communication time : "1.030928" > [ 13.061856] (5:worker@Bourassa) Processing "Task" > [ 13.072165] (1:master@Tremblay) Send completed > [ 13.072165] (2:worker@Tremblay) Received "Task" > [ 13.072165] (2:worker@Tremblay) Communication time : "0.010309" > [ 13.072165] (2:worker@Tremblay) Processing "Task" > [ 14.030928] (4:worker@Ginette) "Task" done > [ 14.103093] (1:master@Tremblay) Send completed > [ 14.103093] (6:worker@Jupiter) Received "Task" > [ 14.103093] (6:worker@Jupiter) Communication time : "1.030928" > [ 14.103093] (6:worker@Jupiter) Processing "Task" > [ 15.061856] (5:worker@Bourassa) "Task" done > [ 15.072165] (2:worker@Tremblay) "Task" done > [ 16.103093] (6:worker@Jupiter) "Task" done > [ 24.103093] (1:master@Tremblay) Mmh. Got timeouted while speaking to 'worker-2'. Nevermind. Let's keep going! > [ 24.103093] (1:master@Tremblay) Mmh. Something went wrong with 'worker-3'. Nevermind. Let's keep going! > [ 24.103093] (4:worker@Ginette) Mmh. Something went wrong. Nevermind. Let's keep going! > [ 25.134021] (1:master@Tremblay) Send completed > [ 25.134021] (5:worker@Bourassa) Received "Task" > [ 25.134021] (5:worker@Bourassa) Communication time : "1.030928" > [ 25.134021] (5:worker@Bourassa) Processing "Task" > [ 25.144330] (1:master@Tremblay) Send completed > [ 25.144330] (2:worker@Tremblay) Received "Task" > [ 25.144330] (2:worker@Tremblay) Communication time : "0.010309" > [ 25.144330] (2:worker@Tremblay) Processing "Task" > [ 26.175258] (1:master@Tremblay) Send completed > [ 26.175258] (6:worker@Jupiter) Received "Task" > [ 26.175258] (6:worker@Jupiter) Communication time : "1.030928" > [ 26.175258] (6:worker@Jupiter) Processing "Task" > [ 27.134021] (5:worker@Bourassa) "Task" done > [ 27.144330] (2:worker@Tremblay) "Task" done > [ 28.175258] (6:worker@Jupiter) "Task" done > [ 36.175258] (1:master@Tremblay) Mmh. Got timeouted while speaking to 'worker-2'. Nevermind. Let's keep going! > [ 37.206186] (1:master@Tremblay) Send completed > [ 37.206186] (1:master@Tremblay) Mmh. Something went wrong with 'worker-4'. Nevermind. Let's keep going! > [ 37.206186] (4:worker@Ginette) Received "Task" > [ 37.206186] (4:worker@Ginette) Communication time : "1.030928" > [ 37.206186] (4:worker@Ginette) Processing "Task" > [ 37.206186] (5:worker@Bourassa) Mmh. Something went wrong. Nevermind. Let's keep going! > [ 37.216495] (1:master@Tremblay) Send completed > [ 37.216495] (2:worker@Tremblay) Received "Task" > [ 37.216495] (2:worker@Tremblay) Communication time : "0.010309" > [ 37.216495] (2:worker@Tremblay) Processing "Task" > [ 38.247423] (1:master@Tremblay) Send completed > [ 38.247423] (6:worker@Jupiter) Received "Task" > [ 38.247423] (6:worker@Jupiter) Communication time : "1.030928" > [ 38.247423] (6:worker@Jupiter) Processing "Task" > [ 39.206186] (4:worker@Ginette) "Task" done > [ 39.216495] (2:worker@Tremblay) "Task" done > [ 40.247423] (6:worker@Jupiter) "Task" done > [ 48.247423] (1:master@Tremblay) Mmh. Got timeouted while speaking to 'worker-2'. Nevermind. Let's keep going! > [ 49.278351] (1:master@Tremblay) Send completed > [ 49.278351] (4:worker@Ginette) Received "Task" > [ 49.278351] (4:worker@Ginette) Communication time : "1.030928" > [ 49.278351] (4:worker@Ginette) Processing "Task" > [ 50.000000] (4:worker@Ginette) Gloups. The cpu on which I'm running just turned off!. See you! > [ 50.309278] (1:master@Tremblay) Send completed > [ 50.309278] (1:master@Tremblay) All tasks have been dispatched. Let's tell everybody the computation is over. > [ 50.309278] (2:worker@Tremblay) Received "finalize" > [ 50.309278] (2:worker@Tremblay) I'm done. See you! > [ 50.309278] (5:worker@Bourassa) Received "Task" > [ 50.309278] (5:worker@Bourassa) Communication time : "1.030928" > [ 50.309278] (5:worker@Bourassa) Processing "Task" > [ 50.309278] (6:worker@Jupiter) Received "finalize" > [ 50.309278] (6:worker@Jupiter) I'm done. See you! > [ 51.309278] (1:master@Tremblay) Mmh. Got timeouted while speaking to 'worker-2'. Nevermind. Let's keep going! > [ 52.309278] (0:maestro@) Simulation time 52.3093 > [ 52.309278] (1:master@Tremblay) Mmh. Got timeouted while speaking to 'worker-3'. Nevermind. Let's keep going! > [ 52.309278] (1:master@Tremblay) Goodbye now! > [ 52.309278] (5:worker@Bourassa) "Task" done > [ 52.309278] (5:worker@Bourassa) Received "finalize" > [ 52.309278] (5:worker@Bourassa) I'm done. See you! p Testing a simple master/worker example application handling failures. TCP crosstraffic ENABLED ! output sort 19 $ $SG_TEST_EXENV ${bindir:=.}/platform-failures$EXEEXT --log=xbt_cfg.thres:critical --log=no_loc ${srcdir:=.}/small_platform_with_failures.xml ${srcdir:=.}/../msg/app-masterworker/app-masterworker_d.xml --cfg=path:${srcdir} "--log=root.fmt:[%10.6r]%e(%i:%P@%h)%e%m%n" > [ 0.000000] (0:maestro@) Cannot launch process 'worker' on failed host 'Fafard' > [ 0.000000] (1:master@Tremblay) Got 5 workers and 20 tasks to process > [ 0.010825] (1:master@Tremblay) Send completed > [ 0.010825] (2:worker@Tremblay) Received "Task" > [ 0.010825] (2:worker@Tremblay) Communication time : "0.010825" > [ 0.010825] (2:worker@Tremblay) Processing "Task" > [ 1.000000] (0:maestro@) Restart processes on host: Fafard > [ 1.000000] (1:master@Tremblay) Mmh. Something went wrong with 'worker-1'. Nevermind. Let's keep going! > [ 1.000000] (3:worker@Jupiter) Gloups. The cpu on which I'm running just turned off!. See you! > [ 2.000000] (0:maestro@) Restart processes on host: Jupiter > [ 2.010825] (2:worker@Tremblay) "Task" done > [ 11.000000] (1:master@Tremblay) Mmh. Got timeouted while speaking to 'worker-2'. Nevermind. Let's keep going! > [ 12.082474] (1:master@Tremblay) Send completed > [ 12.082474] (4:worker@Ginette) Received "Task" > [ 12.082474] (4:worker@Ginette) Communication time : "1.082474" > [ 12.082474] (4:worker@Ginette) Processing "Task" > [ 13.164948] (1:master@Tremblay) Send completed > [ 13.164948] (5:worker@Bourassa) Received "Task" > [ 13.164948] (5:worker@Bourassa) Communication time : "1.082474" > [ 13.164948] (5:worker@Bourassa) Processing "Task" > [ 13.175773] (1:master@Tremblay) Send completed > [ 13.175773] (2:worker@Tremblay) Received "Task" > [ 13.175773] (2:worker@Tremblay) Communication time : "0.010825" > [ 13.175773] (2:worker@Tremblay) Processing "Task" > [ 14.082474] (4:worker@Ginette) "Task" done > [ 14.258247] (1:master@Tremblay) Send completed > [ 14.258247] (6:worker@Jupiter) Received "Task" > [ 14.258247] (6:worker@Jupiter) Communication time : "1.082474" > [ 14.258247] (6:worker@Jupiter) Processing "Task" > [ 15.164948] (5:worker@Bourassa) "Task" done > [ 15.175773] (2:worker@Tremblay) "Task" done > [ 16.258247] (6:worker@Jupiter) "Task" done > [ 24.258247] (1:master@Tremblay) Mmh. Got timeouted while speaking to 'worker-2'. Nevermind. Let's keep going! > [ 24.258247] (1:master@Tremblay) Mmh. Something went wrong with 'worker-3'. Nevermind. Let's keep going! > [ 24.258247] (4:worker@Ginette) Mmh. Something went wrong. Nevermind. Let's keep going! > [ 25.340722] (1:master@Tremblay) Send completed > [ 25.340722] (5:worker@Bourassa) Received "Task" > [ 25.340722] (5:worker@Bourassa) Communication time : "1.082474" > [ 25.340722] (5:worker@Bourassa) Processing "Task" > [ 25.351546] (1:master@Tremblay) Send completed > [ 25.351546] (2:worker@Tremblay) Received "Task" > [ 25.351546] (2:worker@Tremblay) Communication time : "0.010825" > [ 25.351546] (2:worker@Tremblay) Processing "Task" > [ 26.434021] (1:master@Tremblay) Send completed > [ 26.434021] (6:worker@Jupiter) Received "Task" > [ 26.434021] (6:worker@Jupiter) Communication time : "1.082474" > [ 26.434021] (6:worker@Jupiter) Processing "Task" > [ 27.340722] (5:worker@Bourassa) "Task" done > [ 27.351546] (2:worker@Tremblay) "Task" done > [ 28.434021] (6:worker@Jupiter) "Task" done > [ 36.434021] (1:master@Tremblay) Mmh. Got timeouted while speaking to 'worker-2'. Nevermind. Let's keep going! > [ 37.516495] (1:master@Tremblay) Send completed > [ 37.516495] (1:master@Tremblay) Mmh. Something went wrong with 'worker-4'. Nevermind. Let's keep going! > [ 37.516495] (4:worker@Ginette) Received "Task" > [ 37.516495] (4:worker@Ginette) Communication time : "1.082474" > [ 37.516495] (4:worker@Ginette) Processing "Task" > [ 37.516495] (5:worker@Bourassa) Mmh. Something went wrong. Nevermind. Let's keep going! > [ 37.527320] (1:master@Tremblay) Send completed > [ 37.527320] (2:worker@Tremblay) Received "Task" > [ 37.527320] (2:worker@Tremblay) Communication time : "0.010825" > [ 37.527320] (2:worker@Tremblay) Processing "Task" > [ 38.609794] (1:master@Tremblay) Send completed > [ 38.609794] (6:worker@Jupiter) Received "Task" > [ 38.609794] (6:worker@Jupiter) Communication time : "1.082474" > [ 38.609794] (6:worker@Jupiter) Processing "Task" > [ 39.516495] (4:worker@Ginette) "Task" done > [ 39.527320] (2:worker@Tremblay) "Task" done > [ 40.609794] (6:worker@Jupiter) "Task" done > [ 48.609794] (1:master@Tremblay) Mmh. Got timeouted while speaking to 'worker-2'. Nevermind. Let's keep going! > [ 49.692268] (1:master@Tremblay) Send completed > [ 49.692268] (4:worker@Ginette) Received "Task" > [ 49.692268] (4:worker@Ginette) Communication time : "1.082474" > [ 49.692268] (4:worker@Ginette) Processing "Task" > [ 50.000000] (4:worker@Ginette) Gloups. The cpu on which I'm running just turned off!. See you! > [ 50.774742] (1:master@Tremblay) Send completed > [ 50.774742] (1:master@Tremblay) All tasks have been dispatched. Let's tell everybody the computation is over. > [ 50.774742] (2:worker@Tremblay) Received "finalize" > [ 50.774742] (2:worker@Tremblay) I'm done. See you! > [ 50.774742] (5:worker@Bourassa) Received "Task" > [ 50.774742] (5:worker@Bourassa) Communication time : "1.082474" > [ 50.774742] (5:worker@Bourassa) Processing "Task" > [ 50.774742] (6:worker@Jupiter) Received "finalize" > [ 50.774742] (6:worker@Jupiter) I'm done. See you! > [ 51.774742] (1:master@Tremblay) Mmh. Got timeouted while speaking to 'worker-2'. Nevermind. Let's keep going! > [ 52.774742] (0:maestro@) Simulation time 52.7747 > [ 52.774742] (1:master@Tremblay) Mmh. Got timeouted while speaking to 'worker-3'. Nevermind. Let's keep going! > [ 52.774742] (1:master@Tremblay) Goodbye now! > [ 52.774742] (5:worker@Bourassa) "Task" done > [ 52.774742] (5:worker@Bourassa) Received "finalize" > [ 52.774742] (5:worker@Bourassa) I'm done. See you! p Testing a simple master/worker example application handling failures. CPU_TI optimization enabled ! output sort 19 $ $SG_TEST_EXENV ${bindir:=.}/platform-failures$EXEEXT --log=xbt_cfg.thres:critical --log=no_loc ${srcdir:=.}/small_platform_with_failures.xml ${srcdir:=.}/../msg/app-masterworker/app-masterworker_d.xml --cfg=path:${srcdir} -cfg=cpu/optim:TI "--log=root.fmt:[%10.6r]%e(%i:%P@%h)%e%m%n" > [ 0.000000] (0:maestro@) Cannot launch process 'worker' on failed host 'Fafard' > [ 0.000000] (1:master@Tremblay) Got 5 workers and 20 tasks to process > [ 0.010825] (1:master@Tremblay) Send completed > [ 0.010825] (2:worker@Tremblay) Received "Task" > [ 0.010825] (2:worker@Tremblay) Communication time : "0.010825" > [ 0.010825] (2:worker@Tremblay) Processing "Task" > [ 1.000000] (0:maestro@) Restart processes on host: Fafard > [ 1.000000] (1:master@Tremblay) Mmh. Something went wrong with 'worker-1'. Nevermind. Let's keep going! > [ 1.000000] (3:worker@Jupiter) Gloups. The cpu on which I'm running just turned off!. See you! > [ 2.000000] (0:maestro@) Restart processes on host: Jupiter > [ 2.010825] (2:worker@Tremblay) "Task" done > [ 11.000000] (1:master@Tremblay) Mmh. Got timeouted while speaking to 'worker-2'. Nevermind. Let's keep going! > [ 12.082474] (1:master@Tremblay) Send completed > [ 12.082474] (4:worker@Ginette) Received "Task" > [ 12.082474] (4:worker@Ginette) Communication time : "1.082474" > [ 12.082474] (4:worker@Ginette) Processing "Task" > [ 13.164948] (1:master@Tremblay) Send completed > [ 13.164948] (5:worker@Bourassa) Received "Task" > [ 13.164948] (5:worker@Bourassa) Communication time : "1.082474" > [ 13.164948] (5:worker@Bourassa) Processing "Task" > [ 13.175773] (1:master@Tremblay) Send completed > [ 13.175773] (2:worker@Tremblay) Received "Task" > [ 13.175773] (2:worker@Tremblay) Communication time : "0.010825" > [ 13.175773] (2:worker@Tremblay) Processing "Task" > [ 14.082474] (4:worker@Ginette) "Task" done > [ 14.258247] (1:master@Tremblay) Send completed > [ 14.258247] (6:worker@Jupiter) Received "Task" > [ 14.258247] (6:worker@Jupiter) Communication time : "1.082474" > [ 14.258247] (6:worker@Jupiter) Processing "Task" > [ 15.164948] (5:worker@Bourassa) "Task" done > [ 15.175773] (2:worker@Tremblay) "Task" done > [ 16.258247] (6:worker@Jupiter) "Task" done > [ 24.258247] (1:master@Tremblay) Mmh. Got timeouted while speaking to 'worker-2'. Nevermind. Let's keep going! > [ 24.258247] (1:master@Tremblay) Mmh. Something went wrong with 'worker-3'. Nevermind. Let's keep going! > [ 24.258247] (4:worker@Ginette) Mmh. Something went wrong. Nevermind. Let's keep going! > [ 25.340722] (1:master@Tremblay) Send completed > [ 25.340722] (5:worker@Bourassa) Received "Task" > [ 25.340722] (5:worker@Bourassa) Communication time : "1.082474" > [ 25.340722] (5:worker@Bourassa) Processing "Task" > [ 25.351546] (1:master@Tremblay) Send completed > [ 25.351546] (2:worker@Tremblay) Received "Task" > [ 25.351546] (2:worker@Tremblay) Communication time : "0.010825" > [ 25.351546] (2:worker@Tremblay) Processing "Task" > [ 26.434021] (1:master@Tremblay) Send completed > [ 26.434021] (6:worker@Jupiter) Received "Task" > [ 26.434021] (6:worker@Jupiter) Communication time : "1.082474" > [ 26.434021] (6:worker@Jupiter) Processing "Task" > [ 27.340722] (5:worker@Bourassa) "Task" done > [ 27.351546] (2:worker@Tremblay) "Task" done > [ 28.434021] (6:worker@Jupiter) "Task" done > [ 36.434021] (1:master@Tremblay) Mmh. Got timeouted while speaking to 'worker-2'. Nevermind. Let's keep going! > [ 37.516495] (1:master@Tremblay) Send completed > [ 37.516495] (1:master@Tremblay) Mmh. Something went wrong with 'worker-4'. Nevermind. Let's keep going! > [ 37.516495] (4:worker@Ginette) Received "Task" > [ 37.516495] (4:worker@Ginette) Communication time : "1.082474" > [ 37.516495] (4:worker@Ginette) Processing "Task" > [ 37.516495] (5:worker@Bourassa) Mmh. Something went wrong. Nevermind. Let's keep going! > [ 37.527320] (1:master@Tremblay) Send completed > [ 37.527320] (2:worker@Tremblay) Received "Task" > [ 37.527320] (2:worker@Tremblay) Communication time : "0.010825" > [ 37.527320] (2:worker@Tremblay) Processing "Task" > [ 38.609794] (1:master@Tremblay) Send completed > [ 38.609794] (6:worker@Jupiter) Received "Task" > [ 38.609794] (6:worker@Jupiter) Communication time : "1.082474" > [ 38.609794] (6:worker@Jupiter) Processing "Task" > [ 39.516495] (4:worker@Ginette) "Task" done > [ 39.527320] (2:worker@Tremblay) "Task" done > [ 40.609794] (6:worker@Jupiter) "Task" done > [ 48.609794] (1:master@Tremblay) Mmh. Got timeouted while speaking to 'worker-2'. Nevermind. Let's keep going! > [ 49.692268] (1:master@Tremblay) Send completed > [ 49.692268] (4:worker@Ginette) Received "Task" > [ 49.692268] (4:worker@Ginette) Communication time : "1.082474" > [ 49.692268] (4:worker@Ginette) Processing "Task" > [ 50.000000] (4:worker@Ginette) Gloups. The cpu on which I'm running just turned off!. See you! > [ 50.774742] (1:master@Tremblay) Send completed > [ 50.774742] (1:master@Tremblay) All tasks have been dispatched. Let's tell everybody the computation is over. > [ 50.774742] (2:worker@Tremblay) Received "finalize" > [ 50.774742] (2:worker@Tremblay) I'm done. See you! > [ 50.774742] (5:worker@Bourassa) Received "Task" > [ 50.774742] (5:worker@Bourassa) Communication time : "1.082474" > [ 50.774742] (5:worker@Bourassa) Processing "Task" > [ 50.774742] (6:worker@Jupiter) Received "finalize" > [ 50.774742] (6:worker@Jupiter) I'm done. See you! > [ 51.774742] (1:master@Tremblay) Mmh. Got timeouted while speaking to 'worker-2'. Nevermind. Let's keep going! > [ 52.774742] (0:maestro@) Simulation time 52.7747 > [ 52.774742] (1:master@Tremblay) Mmh. Got timeouted while speaking to 'worker-3'. Nevermind. Let's keep going! > [ 52.774742] (1:master@Tremblay) Goodbye now! > [ 52.774742] (5:worker@Bourassa) "Task" done > [ 52.774742] (5:worker@Bourassa) Received "finalize" > [ 52.774742] (5:worker@Bourassa) I'm done. See you!