From: Bruno Donassolo Date: Fri, 4 Mar 2022 10:14:37 +0000 (+0100) Subject: ptask_BMF: High-level documentation of BMF and the algorithm X-Git-Tag: v3.31~205^2~3 X-Git-Url: http://info.iut-bm.univ-fcomte.fr/pub/gitweb/simgrid.git/commitdiff_plain/00487352c466435e1418fdb2123641aece28abf3 ptask_BMF: High-level documentation of BMF and the algorithm Add eigen in installation instructions Add doc for ptask_BMF Changelog --- diff --git a/ChangeLog b/ChangeLog index 0e36dac86b..ebf5686e78 100644 --- a/ChangeLog +++ b/ChangeLog @@ -34,6 +34,21 @@ New plugin: the Chaos Monkey (killing actors at any time) - It is mostly intended to test the simgrid core in extreme conditions, but users may find it interesting too. +Models: + - New model for parallel task: ptask_BMF. + - More realistic sharing of heterogeneous resources compared to ptask_L07. + - Implement the BMF (Bottleneck max fairness) fairness. + - Improved resource sharing for parallel tasks with sub-flows (parallel + communications between same source and destination inside the ptask). + - Parameters: + - "--cfg=host/model:ptask_BMF": enable the model. + - "--cfg=bmf/max-iterations: " - maximum number of iterations performed + by BMF solver (default: 1000). + - "--cfg=bmf/selective-update:" - enable/disable the + selective-update optimization. Only invalidates and recomputes modified + parts of inequations system. May speed up simulation if sparse resource + utilization (default: false). + XBT: - Drop xbt_dynar_shrink(). diff --git a/docs/source/Configuring_SimGrid.rst b/docs/source/Configuring_SimGrid.rst index 1ef79f157c..7b90c7debd 100644 --- a/docs/source/Configuring_SimGrid.rst +++ b/docs/source/Configuring_SimGrid.rst @@ -251,6 +251,9 @@ models for all existing resources. - **ptask_L07:** Host model somehow similar to Cas01+CM02 but allowing "parallel tasks", that are intended to model the moldable tasks of the grid scheduling literature. + - **ptask_BMF:** More realistic model for heterogeneous resource sharing. + Implements BMF (Bottleneck max fairness) fairness. To be used with + parallel tasks instead of ptask_L07. - ``storage/model``: specify the used storage model. Only one model is provided so far. diff --git a/docs/source/Installing_SimGrid.rst b/docs/source/Installing_SimGrid.rst index 510603d527..1089cbd0e2 100644 --- a/docs/source/Installing_SimGrid.rst +++ b/docs/source/Installing_SimGrid.rst @@ -123,6 +123,9 @@ cmake (v3.5). boost (at least v1.48, v1.59 recommended) - On Debian / Ubuntu: ``apt install libboost-dev libboost-context-dev`` - On macOS with homebrew: ``brew install boost`` +Eigen3 + - On Debian / Ubuntu: ``apt install libeigen3-dev`` + - On macOS with homebrew: ``brew install eigen`` Java (optional): - Debian / Ubuntu: ``apt install default-jdk libgcj18-dev`` (or any version of libgcj) diff --git a/src/kernel/lmm/bmf.hpp b/src/kernel/lmm/bmf.hpp index a6d14cf4b4..ec8300faab 100644 --- a/src/kernel/lmm/bmf.hpp +++ b/src/kernel/lmm/bmf.hpp @@ -15,18 +15,56 @@ namespace simgrid { namespace kernel { namespace lmm { +/** @brief Generate all combinations of valid allocation */ class XBT_PUBLIC AllocationGenerator { public: explicit AllocationGenerator(Eigen::MatrixXd A); + /** + * @brief Get next valid allocation + * + * @param next_alloc Allocation (OUTPUT) + * @return true if there's an allocation not tested yet, false otherwise + */ bool next(std::vector& next_alloc); private: - Eigen::MatrixXd A_; //!< A_ji: resource usage matrix, each row j represents a resource and col i a flow/player + Eigen::MatrixXd A_; std::vector alloc_; bool first_ = true; }; +/** + * @beginrst + * + * Despite the simplicity of BMF fairness definition, it's quite hard to + * find a BMF allocation in the general case. + * + * This solver implements one possible algorithm to find a BMF, as proposed + * at: https://hal.archives-ouvertes.fr/hal-01552739. + * + * The idea of this algorithm is that each player/flow "selects" a resource to + * saturate. Then, we calculate the rate each flow would have with this allocation. + * If the allocation is a valid BMF and no one needs to move, it's over. Otherwise, + * each player selects a new resource to saturate based on the minimim rate possible + * between all resources. + * + * The steps: + * 1) Given an initial allocation B_i + * 2) Build a matrix A'_ji and C'_ji which assures that the player receives the most + * share at selected resources + * 3) Solve: A'_ji * rho_i = C'_j + * 4) Calculate the minimum fair rate for each resource j: f_j. The f_j represents + * the maximum each flow can receive at the resource j. + * 5) Builds a new vector B'_i = arg min(f_j/A_ji). + * 6) Stop if B == B' (nobody needs to move), go to step 2 otherwise + * + * Despite the overall good performance of this algorithm, which converges in a few + * iterations, we don't have any assurance about its convergence. In the worst case, + * it may be needed to test all possible combination of allocations (which is exponential). + * + * @endrst + */ class XBT_PUBLIC BmfSolver { public: /** @@ -139,6 +177,41 @@ private: int max_iteration_ = sg_bmf_max_iterations; //!< number maximum of iterations of BMF algorithm }; +/** + * @beginrst + * + * A BMF (bottleneck max fairness) solver to resolve inequation systems. + * + * Usually, SimGrid relies on a *max-min fairness* solver to share the resources. + * Max-min is great when sharing homogenous resources, however it cannot be used with heterogeneous resources. + * + * BMF is a natural alternative to max-min, providing a fair-sharing of heterogeneous resources (CPU, network, disk). + * It is specially relevant for the implementation of parallel tasks whose sharing involves different + * kinds of resources. + * + * BMF assures that every flow receives the maximum share possible in at least 1 bottleneck (fully used) resource. + * + * The BMF is characterized by: + * - A_ji: a matrix of requirement for flows/player. For each resource j, and flow i, A_ji represents the utilization + * of resource j for 1 unit of the flow i. + * - rho_i: the rate allocated for flow i (same among all resources) + * - C_j: the capacity of each resource (can be bytes/s, flops/s, etc) + * + * Therefore, these conditions need to satisfied to an allocation be considered a BMF: + * 1) All constraints are respected (flows cannot use more than the resource has available) + * - for all resource j and player i: A_ji * rho_i <= C_j + * 2) At least 1 resource is fully used (bottleneck). + * - for some resource j: A_ji * rho_i = C_j + * 3) Each flow (player) receives the maximum share in at least 1 bottleneck. + * - for all player i: exist a resource j: A_ji * rho_i >= A_jk * rho_k for all other player k + * + * Despite the prove of existence of a BMF allocation in the general case, it may not + * be unique, which leads to possible different rate for the applications. + * + * More details about BMF can be found at: https://hal.inria.fr/hal-01243985/document + * + * @endrst + */ /** * @brief Bottleneck max-fair system */