Augustin Degomme [Thu, 24 Jul 2014 13:22:33 +0000 (15:22 +0200)]
indent
Augustin Degomme [Thu, 24 Jul 2014 13:20:49 +0000 (15:20 +0200)]
protect these calls against MPI_DATATYPE_NULL errors
Augustin Degomme [Thu, 24 Jul 2014 13:20:25 +0000 (15:20 +0200)]
clean up the code a bit, ensure tests pass
Augustin Degomme [Thu, 24 Jul 2014 09:06:29 +0000 (11:06 +0200)]
add mvapich allreduce rs algorithm
Augustin Degomme [Thu, 24 Jul 2014 08:41:00 +0000 (10:41 +0200)]
manually privatize allred test, to have it work on non-mmap systems also
Augustin Degomme [Thu, 24 Jul 2014 00:04:40 +0000 (02:04 +0200)]
Add last collectives from mvapich selector : bcast reduce reduce_scatter scatter
bcast still defaults to mpich one, as they need smp support
Augustin Degomme [Wed, 23 Jul 2014 15:35:52 +0000 (17:35 +0200)]
New collectives for mvapich2 selector : allgatherv, allreduce, alltoallv, barrier
Augustin Degomme [Wed, 23 Jul 2014 12:26:17 +0000 (14:26 +0200)]
Begin to add a MVAPICH2 collectives selector. Alltoall, Allgather and gather done.
Problems :
- code is (for now) quite a mess, with several versions of tuning available at the same time (coll folder from mvapich has currently 145k cloc).
- code is copy-pasted directly ... So, no comments
- only Stampede calibration is imported for now. Some others are available, we should provide a mechanism to switch to another calibration.
- MVAPICH collectives are SMP aware. SMPI is not really ... A mechanism to automatically generate an "internal" communicator for processes sharing a physical node will be needed. Gather actually defaults to mpich one as a result of this.
Paul Bédaride [Thu, 24 Jul 2014 13:16:12 +0000 (15:16 +0200)]
Fix mallocator tesh
Gabriel Corona [Thu, 24 Jul 2014 12:43:13 +0000 (14:43 +0200)]
[mc] Use mc_region_contain where it could be used
Gabriel Corona [Thu, 24 Jul 2014 12:09:55 +0000 (14:09 +0200)]
[mc] Update doxygen comments
Paul Bédaride [Thu, 24 Jul 2014 09:53:48 +0000 (11:53 +0200)]
Add test for host on off and some fixes
Augustin Degomme [Wed, 23 Jul 2014 09:07:42 +0000 (11:07 +0200)]
Set 4 tests that fail on Windows and won't ever pass as "expected to fail"
This will make it easier to tell real bugs from wontfix non-issues.
One of them fails because random returns different values on Windows and generates different colors in the resulting trace. Duh.
The three others are expected to crash, and do crash correctly on Windows, but the return code is different from the one expected (SIGABRT). Meh
Gabriel Corona [Tue, 22 Jul 2014 13:42:44 +0000 (15:42 +0200)]
[mc] Add tests for sparse snapshot
Gabriel Corona [Tue, 22 Jul 2014 13:15:17 +0000 (15:15 +0200)]
[mc] Disable soft-dirty page tracking by default
In all tests I ran, it has a negative impact on performance.
Augustin Degomme [Mon, 21 Jul 2014 13:22:50 +0000 (15:22 +0200)]
sanitize get/set_name functions for fortran use
Gabriel Corona [Mon, 21 Jul 2014 12:36:16 +0000 (14:36 +0200)]
[mc] Disable optimisation for xbt when using MC
In some cases, it breaks the state comparison for some reason. This
was observed with mm.c and dynar.c.
Gabriel Corona [Mon, 21 Jul 2014 11:38:08 +0000 (13:38 +0200)]
[mc] Do not handle mc_model_checker->parent_snapshot unless it is necessary
Augustin Degomme [Mon, 21 Jul 2014 09:33:08 +0000 (11:33 +0200)]
Add MPI_Win_get_name and MPI_Win_set_name support
Augustin Degomme [Mon, 21 Jul 2014 09:32:40 +0000 (11:32 +0200)]
reactivate allred test
But have its compilation flags set to O0 to avoid issues with ci slaves (compiling all the macros used with optimizations takes too long and uses too much memory)
Gabriel Corona [Thu, 17 Jul 2014 14:30:45 +0000 (16:30 +0200)]
[mmalloc] Force metadata update in mmalloc/mrealloc
Gabriel Corona [Thu, 17 Jul 2014 08:41:20 +0000 (10:41 +0200)]
[mmalloc] Add mmcheck() which checks mmalloc heap consistency
Gabriel Corona [Thu, 10 Jul 2014 13:48:16 +0000 (15:48 +0200)]
[mmalloc] Add new block type for heapinfo blocks
Augustin Degomme [Thu, 17 Jul 2014 16:38:50 +0000 (18:38 +0200)]
revert changes on allgatherv4, which needed manual privatization to run on freebsd
Augustin Degomme [Thu, 17 Jul 2014 16:15:02 +0000 (18:15 +0200)]
remove warning with mc
Augustin Degomme [Thu, 17 Jul 2014 16:00:47 +0000 (18:00 +0200)]
remove warning
Augustin Degomme [Thu, 17 Jul 2014 15:52:24 +0000 (17:52 +0200)]
Finish pulling changes from mpich trunk testsuite
Augustin Degomme [Thu, 17 Jul 2014 14:54:35 +0000 (16:54 +0200)]
Update comm, datatype from mpich trunk
Augustin Degomme [Thu, 17 Jul 2014 13:38:45 +0000 (15:38 +0200)]
enforce a scatter error in some cases
Augustin Degomme [Thu, 17 Jul 2014 13:38:28 +0000 (15:38 +0200)]
update collectives teshsuite from mpich git (only minor changes)
Augustin Degomme [Thu, 17 Jul 2014 09:06:53 +0000 (11:06 +0200)]
tesh update for fabien's work on surf
Augustin Degomme [Thu, 17 Jul 2014 09:06:12 +0000 (11:06 +0200)]
fabien's work on surf
Adrien Lebre [Wed, 16 Jul 2014 17:39:34 +0000 (19:39 +0200)]
Push Takahiro Patch and fix cloud.tesh - Adrien
Adrien Lebre [Wed, 16 Jul 2014 16:09:41 +0000 (18:09 +0200)]
Merge branch 'master' of git+ssh://scm.gforge.inria.fr//gitroot/simgrid/simgrid
Adrien Lebre [Wed, 16 Jul 2014 16:09:37 +0000 (18:09 +0200)]
Fix a typo in the java-cloud example and add one TODO related to the migration invocation in VM.java - adrien
Augustin Degomme [Wed, 16 Jul 2014 16:00:55 +0000 (18:00 +0200)]
fix dist
Augustin Degomme [Wed, 16 Jul 2014 15:44:00 +0000 (17:44 +0200)]
rename function
Augustin Degomme [Wed, 16 Jul 2014 15:43:52 +0000 (17:43 +0200)]
requalify tesh
Augustin Degomme [Wed, 16 Jul 2014 15:16:50 +0000 (17:16 +0200)]
use global variables to store values that may be used millions of times..
Augustin Degomme [Wed, 16 Jul 2014 14:52:03 +0000 (16:52 +0200)]
add time injection in MPI_Wtime and MPI_Test, to match what was done in iprobe
Augustin Degomme [Wed, 16 Jul 2014 12:49:18 +0000 (14:49 +0200)]
add comment for magic value
Augustin Degomme [Wed, 16 Jul 2014 12:46:37 +0000 (14:46 +0200)]
add F90 rma tests
Takahiro Hirofuchi [Wed, 16 Jul 2014 07:14:23 +0000 (16:14 +0900)]
migration: minor cleanup and update TODO
Use MSG_process_create() instead of that of _with_arguments().
Takahiro Hirofuchi [Wed, 16 Jul 2014 07:04:53 +0000 (16:04 +0900)]
migration: fix status check of migration
When a migration of a VM is already ongoing, do not allow
MSG_vm_migrate() for the VM.
degomme [Tue, 15 Jul 2014 18:07:44 +0000 (20:07 +0200)]
remove the allred test
ci slaves have trouble building it (not enough memory probably)
degomme [Tue, 15 Jul 2014 16:29:37 +0000 (18:29 +0200)]
allred test needed unsigned char support, which was forgotten.
TODO : This part of the code is ugly and should be replaced by macros asap, as it may lead to nasty bugs
degomme [Tue, 15 Jul 2014 16:28:21 +0000 (18:28 +0200)]
Add MPI_Type_set_name and MPI_Type_get_name and activate tests
degomme [Tue, 15 Jul 2014 14:14:52 +0000 (16:14 +0200)]
activate scatterv test, which needed Carts and Dim
degomme [Mon, 14 Jul 2014 22:32:57 +0000 (00:32 +0200)]
add definitions
degomme [Mon, 14 Jul 2014 21:41:35 +0000 (23:41 +0200)]
fix dist
degomme [Mon, 14 Jul 2014 21:36:43 +0000 (23:36 +0200)]
set default size of Aint to integer*8 ... not ideal, though.
Real mpich testsuite uses autoconf to configure the size
degomme [Sat, 12 Jul 2014 00:00:35 +0000 (02:00 +0200)]
adapt mpif.h for rma
degomme [Fri, 11 Jul 2014 23:52:13 +0000 (01:52 +0200)]
Add f77 RMA tests
degomme [Fri, 11 Jul 2014 20:19:17 +0000 (22:19 +0200)]
mpich testsuite: add f77 topo test
degomme [Fri, 11 Jul 2014 20:18:21 +0000 (22:18 +0200)]
mpich testsuite: activate now working datatype test
Gabriel Corona [Fri, 11 Jul 2014 10:06:47 +0000 (12:06 +0200)]
[mmalloc] Add documentation
Gabriel Corona [Fri, 11 Jul 2014 10:04:34 +0000 (12:04 +0200)]
[mmalloc] Avoid useless memset0
Gabriel Corona [Fri, 11 Jul 2014 08:15:59 +0000 (10:15 +0200)]
[mc] Remove 'previous' variable in mmalloc_compare_heap()
It was not used.
Gabriel Corona [Fri, 11 Jul 2014 08:02:18 +0000 (10:02 +0200)]
[mc] Fix test on type in mc_diff
Gabriel Corona [Thu, 10 Jul 2014 11:32:29 +0000 (13:32 +0200)]
[mc] Fix name of mc_snapshot_memcmp()
Gabriel Corona [Thu, 10 Jul 2014 10:44:22 +0000 (12:44 +0200)]
[mc] Add unit tests for reading/comparing the whole region in mc_snapshot
Gabriel Corona [Thu, 10 Jul 2014 10:30:10 +0000 (12:30 +0200)]
[mc] Test flat snapshots as well
Gabriel Corona [Thu, 10 Jul 2014 09:36:32 +0000 (11:36 +0200)]
[mc] Add unit test for mc_snapshot
Augustin Degomme [Tue, 8 Jul 2014 22:51:01 +0000 (00:51 +0200)]
Fix distcheck
Augustin Degomme [Tue, 8 Jul 2014 22:43:50 +0000 (00:43 +0200)]
file was not in the dist
Augustin Degomme [Tue, 8 Jul 2014 22:21:24 +0000 (00:21 +0200)]
activate working test
Augustin Degomme [Tue, 8 Jul 2014 22:03:47 +0000 (00:03 +0200)]
Add perf mpich3 tests
Augustin Degomme [Tue, 8 Jul 2014 16:09:28 +0000 (18:09 +0200)]
reduce size of recently added tests, to avoid them taking so long to complete
Augustin Degomme [Tue, 8 Jul 2014 15:26:15 +0000 (17:26 +0200)]
Previous commit exposes a bug in OpenMPI, switch algorithm for bcast to avoid it
The bug is that OpenMPI selects the algorithm based on the message count.
However, this count can differ from one process to another for bcast.
For example, this test has a sender sending one element of 4096 bytes, and receivers receiving 1024 elements of 4 bytes.
Yes, this is legal in MPI (but stupid ... but legal).
The OpenMPI collective algorithm selector selects a different algorithm for these processes, thus deadlocking.
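The mismatch described above can be sketched in a few lines of C. This is only an illustration: the two algorithm names, the 512 threshold, and both selector functions are hypothetical and are not taken from the OpenMPI or SMPI sources. It shows why a decision keyed on the element count can diverge between ranks, while one keyed on the total byte size cannot for a legal bcast. The commit itself switched the bcast algorithm to avoid the problem; the byte-based selector is shown only for contrast.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical algorithm identifiers, for illustration only. */
enum bcast_algo { BCAST_BINOMIAL, BCAST_PIPELINE };

/* Hypothetical switch point between the two algorithms. */
#define SMALL_LIMIT 512

/* Fragile: decides from the element count, which may legally differ
 * between the root and the receivers of the same bcast. */
enum bcast_algo select_by_count(size_t count)
{
    return count < SMALL_LIMIT ? BCAST_BINOMIAL : BCAST_PIPELINE;
}

/* Consistent: decides from the total number of bytes, which must be
 * the same on every rank for the call to be legal. */
enum bcast_algo select_by_bytes(size_t count, size_t elem_size)
{
    assert(elem_size > 0);
    return count * elem_size < SMALL_LIMIT ? BCAST_BINOMIAL : BCAST_PIPELINE;
}
```

With the values from the test (root: 1 element of 4096 bytes, receivers: 1024 elements of 4 bytes), select_by_count() returns different algorithms on the two sides, whereas select_by_bytes() returns the same one.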
Augustin Degomme [Tue, 8 Jul 2014 15:21:30 +0000 (17:21 +0200)]
activate two tests in collective communications
Gabriel Corona [Tue, 8 Jul 2014 14:41:35 +0000 (16:41 +0200)]
[mc] Move page store test from the integration test into the unit test infrastructure
Gabriel Corona [Tue, 8 Jul 2014 14:27:51 +0000 (16:27 +0200)]
Add support for C++ in tools/sg_unit_extractor.pl
Augustin Degomme [Tue, 8 Jul 2014 13:32:42 +0000 (15:32 +0200)]
leak --
Gabriel Corona [Tue, 8 Jul 2014 13:12:00 +0000 (15:12 +0200)]
[mc] Use mc_snapshot_region_memcmp() in compare_areas_with_type()
Gabriel Corona [Tue, 8 Jul 2014 13:04:17 +0000 (15:04 +0200)]
[mc] Avoid memory allocation for flat snapshots in mc_snapshot_region_memcmp()
Gabriel Corona [Tue, 8 Jul 2014 12:48:42 +0000 (14:48 +0200)]
[mc] Fix name of mc_snapshot_region_memcmp
Gabriel Corona [Tue, 8 Jul 2014 11:43:29 +0000 (13:43 +0200)]
Merge branch 'mc-fix' into mc-fastsnapshot
Conflicts:
src/mc/mc_diff.c
Gabriel Corona [Tue, 8 Jul 2014 10:54:59 +0000 (12:54 +0200)]
[mc] Fix check on type name
The name was changed to include 'struct/class'.
Gabriel Corona [Fri, 4 Jul 2014 11:22:00 +0000 (13:22 +0200)]
[mc] Fix bad parameter passed in mc_diff
In the following calls:
    compare_heap_area_with_type(state,
                                area1, area2,
                                area1_to_compare, area2_to_compare,
                                snapshot1, snapshot2,
                                previous, type, size, check_ignore,
                                pointer_level);
    compare_heap_area_without_type(state,
                                   area1, area2,
                                   area1_to_compare, area2_to_compare,
                                   snapshot1, snapshot2,
                                   previous, size, check_ignore);
areaX and real_areaX_to_compare do not point to the same data in different
address spaces in some cases.
Sometimes real_areaX_to_compare is adjusted to point to the beginning
of the block or fragment:
    area1_to_compare = addr_block1;
    // or
    area1_to_compare = (char *) addr_frag1 + offset1;
    // (when offset1==0)
but areaX is not adjusted accordingly and still points to the original
data: in the called compare_heap_area_with[out]_type(), the two values
are inconsistent.
Moreover, in some cases the type does not correspond:
* areaX_to_compareX is the beginning of the fragment;
* type is the type of areaX, not the type of the fragment.
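The inconsistency described above comes down to moving one of two related pointers without moving the other. A minimal sketch of keeping such a pair in step, assuming a hypothetical struct area_pair and helper rewind_to_block() (neither exists in mc_diff.c, and this is not the actual fix):

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical pair of views on the same bytes: `real` is the address
 * in the original address space, `copy` is where those bytes live in
 * a second address space (for example a snapshot). */
struct area_pair {
    const char *real;
    const char *copy;
};

/* Move both views back to the start of the enclosing block.
 * `real_block_start` must belong to the same object as p.real and
 * must not lie after it. */
struct area_pair rewind_to_block(struct area_pair p,
                                 const char *real_block_start)
{
    ptrdiff_t delta = p.real - real_block_start;
    assert(delta >= 0);
    p.real -= delta; /* now equal to real_block_start */
    p.copy -= delta; /* same shift applied to the other view */
    return p;
}
```

Applying the shift to only one member of the pair reproduces the situation in the commit message: the two values no longer designate the same data.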
Augustin Degomme [Mon, 7 Jul 2014 15:22:04 +0000 (17:22 +0200)]
add minloc and maxloc for long_long
Augustin Degomme [Mon, 7 Jul 2014 15:14:50 +0000 (17:14 +0200)]
this should be err_rank, not err_comm
Augustin Degomme [Mon, 7 Jul 2014 15:09:11 +0000 (17:09 +0200)]
add a non-existing MPI_2LONG datatype to handle some corner cases in fortran
Paul Bédaride [Mon, 7 Jul 2014 14:51:33 +0000 (16:51 +0200)]
Xml platform cleaning teshsuite/msg
Gabriel Corona [Mon, 7 Jul 2014 08:41:29 +0000 (10:41 +0200)]
[mc] Fix lookup of malloc fragment type
In some cases, the type was looked up in the current heapinfo and not
in the snapshot one:
* in some cases this is not an issue, as state->heapinfo2 is currently
the same as the current heapinfo, but this solution is more robust;
* if we were supposed to look it up in state->heapinfo1, this is wrong.
Augustin Degomme [Mon, 7 Jul 2014 14:07:14 +0000 (16:07 +0200)]
activate working tests
Lower sizes and number of messages for pingping to avoid taking too long
Augustin Degomme [Mon, 7 Jul 2014 13:33:32 +0000 (15:33 +0200)]
use the right types in fortran for some platforms (based on f2c matching, hopefully correct)
Augustin Degomme [Mon, 7 Jul 2014 13:51:34 +0000 (15:51 +0200)]
define types used by fortran even when only C code is used (needed sometimes)
Conflicts:
include/smpi/smpi.h
suter [Mon, 7 Jul 2014 13:45:17 +0000 (15:45 +0200)]
temporary hack to be able to replay traces of fortran codes through the
simulation of a C code
Gabriel Corona [Fri, 4 Jul 2014 12:27:49 +0000 (14:27 +0200)]
[mc] Fix bound check in mc_snapshot_read_region
Gabriel Corona [Thu, 3 Jul 2014 10:58:34 +0000 (12:58 +0200)]
[mc] Add mc_snapshot_read_pointer()
Gabriel Corona [Thu, 3 Jul 2014 10:01:00 +0000 (12:01 +0200)]
[mc] Bug: MC was reading from the wrong region
Marion Guthmuller [Thu, 3 Jul 2014 13:24:07 +0000 (15:24 +0200)]
don't destroy detached comm from the sender side during process cleanup
Marion Guthmuller [Thu, 3 Jul 2014 10:36:51 +0000 (12:36 +0200)]
fix debug message (wrong buffer was printed)
Marion Guthmuller [Thu, 3 Jul 2014 08:23:02 +0000 (10:23 +0200)]
model-checker : update tesh
Marion Guthmuller [Thu, 3 Jul 2014 08:22:37 +0000 (10:22 +0200)]
model-checker : cosmetic in log message
Marion Guthmuller [Thu, 3 Jul 2014 08:07:29 +0000 (10:07 +0200)]
model-checker : check dict content before removing value
Marion Guthmuller [Thu, 3 Jul 2014 08:06:01 +0000 (10:06 +0200)]
model-checker : remove useless condition
Gabriel Corona [Thu, 3 Jul 2014 10:12:34 +0000 (12:12 +0200)]
[mc] Fast path when comparing NULL against non-NULL pointers
Gabriel Corona [Thu, 3 Jul 2014 10:01:00 +0000 (12:01 +0200)]
[mc] Bug: MC was reading from the wrong region
Martin Quinson [Thu, 3 Jul 2014 08:12:33 +0000 (10:12 +0200)]
Little script to report on our MPICH3 coverage
CMakeLists.txt files are used as a source of information.
I had to slightly change one of them to make it easier to parse.