X-Git-Url: http://info.iut-bm.univ-fcomte.fr/pub/gitweb/simgrid.git/blobdiff_plain/d12402e6d9999ad72d856bcaabedd6f9c03c34d6..1f6a008d060e1ffc86348cfa7a9750688d871314:/ChangeLog
diff --git a/ChangeLog b/ChangeLog
index 8cfffdc58b..42cd05a2d6 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,4 +1,274 @@
-SimGrid (3.27.1) NOT RELEASED YET (v3.28 expected June 21. 2021, 03:32 UTC)
+----------------------------------------------------------------------------
+
+SimGrid (3.30) January 30. 2022.
+
+The Sunday Bloody Sunday release.
+
+Main user-visible changes:
+ - The SimDag API for the simulation of the scheduling of Directed Acyclic
+   Graphs has been dropped. It had been marked as deprecated for a couple of years.
+   We finally completed the implementation of what has been called SimDag++
+   internally, i.e., porting the different features of SimDag on top of S4U.
+   The new way to simulate the execution of dependent activities directly by
+   maestro (without any other actor) is detailed in the examples/cpp/dag-* series
+   of examples.
+ - The removal of SimDag led us to also remove the export to Jedule files that
+   was tightly coupled to SimDag. The instrumentation of DAG simulation is still
+   possible through the regular instrumentation API based on the Paje format.
+ - We also dropped the old and clumsy Lua bindings to create platforms in a
+   programmatic way. It can be done in C++ in a much cleaner way now, which
+   motivates this removal.
+
+S4U:
+ - Introduce on_X_cb() functions for all signals, to attach a new
+   callback to the signal X. The signal variables are now hidden and
+   only these functions should be used.
+   Rationale: this enables the usual deprecation scheme where functions
+   remain for 4 releases if we need to modify the signals, while the
+   current code with the signal variables directly visible prevents any
+   smooth transition.
+ - New function: Engine::run_until(date), to split the simulation.
+ - New signal: Activity::on_veto, to detect when an activity fails to start.
+ - Signal change: Comm::on_start(Comm&, bool) has been replaced by
+   Comm::on_send and Comm::on_recv. These two signals respectively correspond to
+   when the sending or receiving side of a Comm is ready. They are raised at
+   the same locations as the former Comm::on_start signal.
+ - New function: Engine::track_vetoed_activities() to interrupt run()
+   when an activity fails to start, and to keep track of such activities.
+   Please see the corresponding example for more info.
+ - New functions: s4u::Comm::{sendto_init, set_source, set_destination} to enable
+   the use of vetoers with direct host-to-host communications. Both source and
+   destination have to be set for a comm to start. Each call to these setters checks
+   if all vetoes are satisfied. When it is the case, the comm starts. A use case of
+   these functions is given in examples/cpp/dag-scheduling.
+ - New functions: {Exec, Io}::update_priority allow you to modify the priority of
+   these kinds of activities during their execution. Behavior is detailed in
+   examples/cpp/io-priority/.
+
+SMPI:
+ - Dynamic costs for MPI operations: New API to allow users to dynamically
+   change injected costs for MPI_Recv, MPI_Send and MPI_Isend operations.
+   Alternative to the smpi/or, smpi/os and smpi/ois configuration options.
+ - Fix some issues with the replay mechanism.
+
+XBT:
+ - Function xbt::Extendable::get_data() is now templated with the type of the
+   pointee. The untyped function is deprecated. Use get_data<void>() if you still
+   want to retrieve void*.
+
+Documentation:
+ - New section: "SimGrid MPI calibration of a Grid5000 cluster"
+   presenting how to properly calibrate MPI communications in SimGrid.
+ - Reword the platform section, which is now complete.
+
+Python:
+ - Thread contexts are used by default with Python bindings. Other kinds of
+   contexts proved unstable, especially starting with pybind11 v2.8.0.
+
+Fixed bugs (FG#.. 
-> FramaGit bugs; FG!.. -> FG merge requests)
+   (FG: issues on Framagit; GH: issues on GitHub)
+ - FG#95: Wrong computation time for multicore execution after pstate change
+ - FG#97: Wrong computation time for ptask+multicore+pstates
+ - FG#98: SMPI offline simulation is inconsistent with the online simulation
+   (deadlocks / message truncation)
+ - FG#99: Weird segfault when not sealing an host
+
+----------------------------------------------------------------------------
+
+SimGrid (3.29) October 7. 2021
+
+The "Ask a stupid question" release.
+
+We wish that every user asks one question about SimGrid to celebrate,
+on Mattermost, Stack Overflow or using the issue tracker.
+
+
+New modeling features:
+ - Non-linear resource sharing, modeling resources whose performance heavily degrades with contention:
+   - The total capacity may be updated dynamically through a callback
+     and depends mainly on the number of concurrent flows.
+   - Examples (both cpp and python): io-degradation, network-nonlinear, exec-cpu-nonlinear
+
+ - Dynamic factors: model variability in the speed of activities
+   - Each action can now have a factor that affects its progression.
+     This multiplicative factor is applied when updating the amount of work
+     remaining, so an activity with factor=0.5 only uses half of the
+     instantaneous power/bandwidth it is allocated and appears twice
+     as slow as what it actually consumes.
+   - This can be used to model an overhead (e.g., there is a 20-byte
+     header in a 480-byte TCP packet, so the factor is 0.9583) but the novelty
+     is that this factor can now easily be adjusted depending on the activity's and
+     resources' characteristics.
+   - This existed for network (e.g., the effective bandwidth depends
+     on the message size in the SMPI piecewise-linear network model) but it is now
+     more general (the factor may depend on the source and destination and
+     thus account for different behaviors for intra-node communications and
+     extra-node communications) and is available for CPUs (e.g., if you
+     want to model an affinity as in the "Unrelated Machines" problem in
+     scheduling) and disks (e.g., if you want to model a stochastic
+     capacity) too.
+   - For that, resources can be provided with a callback that computes
+     the activity factor when creating the action.
+   - Example: examples/cpp/exec-cpu-factors
+   - The same mechanism is also available for the latency, which
+     makes it easy to introduce complex variability patterns.
+
+Python:
+ - Added support for programmatic platform creation in Python.
+   Example: examples/python/clusters-multicpu
+
+S4U:
+ - Disk and Host now have a set_sharing_policy() too, for non-linear sharing.
+   This can only be set through the API, not through XML files.
+
+SMPI:
+ - TI Tracing/Replay:
+   - Multiple fixes to ensure reproducibility of tracing
+   - scan/exscan can now be replayed
+   - the wait action now uses ranks and not pids, like the other ones.
+   - smpi/init and smpi/finalization-barrier are now valid for replays.
+ - exit() is now intercepted by SMPI to avoid premature shutdown of
+   the simulation. The first non-zero return code is returned as the simulation
+   return code.
+
+Documentation:
+ * New section "Release Notes" documenting recent and current developments.
+ * New section "Modeling I/O: the realistic way" presenting how to properly model disks in SimGrid.
+ * Improvements in API Reference for C++ and Python interfaces.
+
+ns-3 model:
+ - Make wifi creation compatible with ns-3 version 3.34 too.
+
+Fixed bugs (FG#.. -> FramaGit bugs; FG!.. 
-> FG merge requests)
+   (FG: issues on Framagit; GH: issues on GitHub)
+ - FG#77: Search feature of doc is broken (update sphinx theme version)
+ - FG#78: Multiple fixes for SMPI replay:
+   - TI tracing of alltoallv/w was outputting wrong values
+   - MPI_LOGICAL in Fortran is actually 32 bits wide, and not 8.
+----------------------------------------------------------------------------
+
+SimGrid (3.28) July 14. 2021
+
+The Victoriadagarna Release.
+
+New features:
+ - C++ platform interface: Users can now describe their platform directly in C++.
+   This greatly improves flexibility and performance for complex
+   platforms. Main features:
+   - Fat-Tree/DragonFly/Torus composing: allows you to create clusters of
+     "zones", instead of single hosts. This feature enables the description
+     of clusters with complex hosts, composed of several CPUs, GPUs, etc.
+   - StarZone: new zone with a Star-like topology. The routes are defined
+     as a set of links used to communicate from a node to everybody (node<->ALL).
+   - Split-Duplex links: auxiliary method to create split-duplex links in
+     the platform, easing their use. It automatically creates both UP
+     and DOWN links (as is done in XML).
+   - Please refer to the documentation and the examples included:
+     e.g. examples/cpp/clusters-multicpu/ and examples/platforms/*.cpp.
+ - New plugin: Producer-Consumer with monitor. To use it, just include the
+   include/simgrid/plugins/ProducerConsumer.hpp header. See the
+   associated example (examples/cpp/plugin-prodcons).
+
+S4U:
+ - New: s4u::Comm::wait_all_for() (like s4u::Comm::wait_all, but with a timeout),
+   s4u::Io::wait_any(), s4u::Io::wait_any_for().
+ - Methods test_all/test_any/wait_all/wait_any in s4u now take their vector
+   parameter by reference, instead of a pointer.
+ - Fixed a bug where Activity::wait_for() killed the activity on timeout.
+   Explicitly cancel the activity to get back to the previous behavior.
+ - New: Link::set_concurrency_limit() to limit the number of concurrent flows.
+
+SMPI:
+ - The default SMPI compiler flags are no longer taken from the environment.
+   They can be explicitly set through cmake parameters SMPI_C_FLAGS,
+   SMPI_CXX_FLAGS, or SMPI_Fortran_FLAGS.
+ - New options:
+   --cfg=smpi/finalization-barrier: can be used to add
+   a barrier inside MPI_Finalize. This can help for some codes which clean up
+   data attached to a process, but still used in other SMPI processes.
+   --cfg=smpi/errors-are-fatal: True by default. When set to false, behaves as if
+   MPI_ERRORS_RETURN is active, to keep going after a small error.
+   --cfg=smpi/pedantic: True by default. When set to false, does not report some
+   minor MPI errors which may or may not be problematic in the end.
+ - Sampling:
+   - fix behaviour, as maximum iteration count could be ignored
+   - add SMPI_SAMPLE_LOCAL_TAG and SMPI_SAMPLE_GLOBAL_TAG macros, to allow users to
+     use sampling when the same kernel is called with a different set of parameters
+     which have an impact on the timing.
+ - realloc is now intercepted, for consistency, as malloc/calloc/free already were.
+   It should now work with smpi/auto-shared-malloc-thresh.
+ - Improve error handling and reporting in multiple places
+ - Improve correctness checks on the MPI code (MPI_Op and MPI_Datatype
+   validity checks, truncated messages are now an error, return errors
+   when explicitly deleted handles are reused, ...)
+ - RMA: multiple fixes and stability improvements.
+ - analysis (-analyze flag in smpirun):
+   - SMPI can now report buffer leaks as well as MPI handle leaks,
+     if code was compiled without SMPI_NO_OVERRIDE_MALLOC.
+   - if -trace-call-location is used when compiling, SMPI can report
+     the origin of leaked handles/buffers
+   - group leaks by type/origin in output message if possible
+ - Newly implemented MPI calls: MPI_Comm_test_inter
+
+Models:
+ - Changed internal implementation of bandwidth factors in network models.
+   Models affected: CM02, LV08 (default), SMPI, IB.
+   Configuration affected: "network/bandwidth-factors" and "smpi/bw-factors".
+   Bandwidth factors are applied to communications to model the fact that users
+   cannot use 100% of the available bandwidth. For example, the default network model,
+   LV08, applies a factor of 0.97 to the bandwidth. In older versions, this
+   behavior was implemented by limiting the bandwidth available in the LMM
+   system for this flow. This could give the false impression that there is
+   bandwidth available for other flows due to its underutilization, especially
+   for the dynamic bandwidth factors used in SMPI models.
+   To avoid this, we have modified the implementation so that each flow uses the
+   maximum physical bandwidth according to the LMM system.
+   However, the actual throughput of the flow seen by the user is defined by
+   the physical bandwidth multiplied by the bandwidth factor.
+   This change impacts the simulation results for all network models on
+   which we have bandwidth factors configured.
+   *****************************************
+   *DO NOT MIX 3.28 RESULTS WITH OLDER ONES*
+   *****************************************
+   This change may impact the timing of your simulation results.
+   Take care when comparing simulations from different SimGrid
+   versions. Sorry for the inconvenience.
+ - Dynamic network factors: users can configure a callback to define
+   the network factors dynamically. This API is available in
+   simgrid::kernel::resource::NetworkModelIntf.
+   - Users have access to complete information about the current communication
+     to decide which factor to apply. This includes: message size, source and
+     destination hosts, links and zones traversed.
+   - Dynamic factors for both latency and bandwidth.
+   - For more details, see the example (examples/cpp/network-factors).
+ - Plugin host_energy: the "watt_off" and "watt_per_state" host properties,
+   deprecated since version 3.24, are no longer supported. 
Instead, use
+   "wattage_off" and "wattage_per_state".
+
+XBT:
+ - xbt_assert is not disabled anymore, even when built with enable_debug=off.
+
+Documentation:
+ - New tutorial: Model-checking and formal assessment
+ - New sections: "Demystifying the routing" and "C++ platforms"
+ - Update and improve the part on visualization in MPI and Algo tutorials.
+ - Improve the section on routing: how to define it, how it's used internally
+ - Fix many issues, broken links and missing references in doxygen and Sphinx
+
+LUA:
+ - Lua platform files are deprecated. Their support will be dropped after v3.31.
+
+Simix:
+ - Legacy functions deprecated in this release: SIMIX_get_clock(), SIMIX_run().
+
+Fixed bugs (FG#.. -> FramaGit bugs; FG!.. -> FG merge requests)
+   (FG: issues on Framagit; GH: issues on GitHub)
+ - FG#47: Complete and fix tests from teshsuite/s4u/activity-lifecycle
+ - FG#64: Configuring smpi/IB-penalty-factors
+ - FG#67: Running computation concurrently with MPI_Iallreduce
+ - FG#69: Tutorial misleading users of pre-v3.26 versions of SimGrid
+ - FG#71: Segmentation fault on invalid gw_src/gw_dst
+ - GH#322: Issue when an actor kills his host vm
 ----------------------------------------------------------------------------
@@ -96,7 +366,7 @@ C binding and interface:
    available as sg_actor_start_() and sg_actor_create_().
 Fixed bugs (FG#.. -> FramaGit bugs; FG!.. -> FG merge requests)
-   (FG: issues on Framagit; GF: issues on GForge; GH: issues on GitHub)
+   (FG: issues on Framagit; GH: issues on GitHub)
  - FG#37: Parallel tasks are limited to 1 core per host
  - FG#62: Running "smpirun -replay" on large networks
  - FG!46: Fix a few potential memory leaks in SMPI colls