-
-
-
+.. This file has "irst" as an extension to ensure that it's not parsed by Sphinx as is. Instead, it's included in another file that is parsed.
.. _howto_disk:
Modeling I/O: the realistic way
--------------------------------
+*******************************
Introduction
-~~~~~~~~~~~~
+============
This tutorial shows how to perform faithful I/O experiments in
SimGrid. It is based on the paper "Adding Storage Simulation
evolved and may not be available anymore.
Running this tutorial
-^^^^^^^^^^^^^^^^^^^^^
+---------------------
A Dockerfile is available in ``docs/source/tuto_disk``. It allows you to
re-run this tutorial. For that, build the image and run the container:
- ``docker run -it tuto_disk``
Analyzing the experimental data
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+===============================
We start by analyzing and extracting the real data available.
Scripts
-^^^^^^^
+-------
We use the ``dhist`` function, which builds histograms with non-uniform
bin widths, to represent the noise in I/O operations.
Copied from: `https://rdrr.io/github/dlebauer/pecan-priors/src/R/plots.R <https://rdrr.io/github/dlebauer/pecan-priors/src/R/plots.R>`_
Data preparation
-^^^^^^^^^^^^^^^^
+----------------
Initial configuration and list of required packages.
dfrange$Jobs=16
Griffon (SATA)
-^^^^^^^^^^^^^^
+--------------
Modeling resource sharing w/ concurrent access
-::::::::::::::::::::::::::::::::::::::::::::::
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
This figure presents the overall performance of I/O operations under
concurrent access to the disk. Note that the image is different
geom_errorbar(data=dfd, aes(x=Jobs, ymin=BW-ci, ymax=BW+ci),color="black",width=.6) +
xlab("Number of concurrent operations") + ylab("Aggregated Bandwidth (MiB/s)") + guides(color=FALSE) + xlim(0,NA) + ylim(0,NA)
-.. image:: fig/griffon_deg.png
+.. image:: tuto_disk/fig/griffon_deg.png
Read
-''''
+""""
Getting read data for Griffon from 1 to 15 concurrent reads.
}
Write
-'''''
+"""""
Same for write operations.
}
Modeling read/write bandwidth variability
-:::::::::::::::::::::::::::::::::::::::::
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Fig. 5 in the paper presents the noise in read/write operations on
the Griffon SATA disk.
more precise information over the highly dense areas around the mean.
Read
-''''
+""""
First, we present the histogram for read operations.
griffon_read = df %>% filter(grepl("^Griffon", Cluster)) %>% filter(Operation == "Read") %>% select(Bwi)
dhist(1/griffon_read$Bwi)
-.. image:: fig/griffon_read_dhist.png
+.. image:: tuto_disk/fig/griffon_read_dhist.png
Saving it to be exported in JSON format.
}
Write
-'''''
+"""""
Same analysis for write operations.
griffon_write = df %>% filter(grepl("^Griffon", Cluster)) %>% filter(Operation == "Write") %>% select(Bwi)
dhist(1/griffon_write$Bwi)
-.. image:: fig/griffon_write_dhist.png
+.. image:: tuto_disk/fig/griffon_write_dhist.png
.. code:: R
}
Edel (SSD)
-^^^^^^^^^^
+----------
This section presents exactly the same analysis for the Edel SSDs.
Modeling resource sharing w/ concurrent access
-::::::::::::::::::::::::::::::::::::::::::::::
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Read
-''''
+""""
Getting read data for Edel from 1 to 15 concurrent operations.
}
Write
-'''''
+"""""
Same for write operations.
}
Modeling read/write bandwidth variability
-:::::::::::::::::::::::::::::::::::::::::
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Read
-''''
+""""
.. code:: R
edel_read = df %>% filter(grepl("^Edel", Cluster)) %>% filter(Operation == "Read") %>% select(Bwi)
dhist(1/edel_read$Bwi)
-.. image:: fig/edel_read_dhist.png
+.. image:: tuto_disk/fig/edel_read_dhist.png
Saving it to be exported in JSON format.
}
Write
-'''''
+"""""
.. code:: R
edel_write = df %>% filter(grepl("^Edel", Cluster)) %>% filter(Operation == "Write") %>% select(Bwi)
dhist(1/edel_write$Bwi)
-.. image:: fig/edel_write_dhist.png
+.. image:: tuto_disk/fig/edel_write_dhist.png
Saving it to be exported later.
}
Exporting to JSON
-~~~~~~~~~~~~~~~~~
+=================
Finally, let's save it to a file to be opened by our simulator.
cat(json, file="IO_noise.json")
Injecting this data in SimGrid
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+==============================
To mimic this behavior in SimGrid, we use two features in the platform
description: non-linear sharing policy and bandwidth factors. For more
details, please see the source code in ``tuto_disk.cpp``.
Modeling resource sharing w/ concurrent access
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+----------------------------------------------
The ``set_sharing_policy`` method allows the user to set a callback to
dynamically change the disk capacity. The callback is called each time
disk->set_sharing_policy(sg4::Disk::Operation::READ, sg4::Disk::SharingPolicy::NONLINEAR, &disk_dynamic_sharing);
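
To illustrate what such a callback can look like, here is a minimal, self-contained sketch that does not depend on SimGrid. It assumes the callback receives the nominal capacity and the number of concurrent activities and returns the effective capacity; this two-argument shape is an assumption, so check ``tuto_disk.cpp`` for the actual code. The table ``kReadBwMiB`` and its values are hypothetical placeholders, not the measurements analyzed above.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>

// Hypothetical table: aggregated bandwidth (MiB/s) for 1..5 concurrent reads.
// Placeholder values, NOT the measurements from the paper.
static constexpr std::array<double, 5> kReadBwMiB = {100.0, 95.0, 90.0, 85.0, 80.0};

// Assumed callback shape: (nominal capacity, number of concurrent activities)
// -> effective capacity in bytes per second.
static double disk_dynamic_sharing(double capacity, int n)
{
  // Without any activity there is nothing to degrade: keep the nominal value.
  if (n <= 0)
    return capacity;
  // Beyond the measured range, reuse the last known entry.
  const std::size_t idx = std::min<std::size_t>(static_cast<std::size_t>(n), kReadBwMiB.size()) - 1;
  return kReadBwMiB[idx] * 1024.0 * 1024.0;
}
```

A lookup table keeps the sketch close to the measured data: each entry can be filled directly from the per-concurrency averages computed in the analysis, and no curve fitting is needed.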
Modeling read/write bandwidth variability
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+-----------------------------------------
The noise in I/O operations can be obtained by applying a factor to
the I/O bandwidth of the disk. This factor is applied when we update
disk->set_factor_cb(&disk_variability);
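
One possible way to draw such a factor from a non-uniform histogram is sketched below, again without depending on SimGrid. The ``NoiseHistogram`` structure, the ``sample_factor`` helper and all numeric values are hypothetical. The callback registered in ``tuto_disk.cpp`` may take additional arguments, which are omitted here.

```cpp
#include <random>
#include <vector>

// Hypothetical stand-in for a histogram exported by the R analysis:
// bin edges of the multiplicative factor and the relative mass of each bin.
struct NoiseHistogram {
  std::vector<double> breaks;  // n + 1 bin edges, strictly increasing
  std::vector<double> weights; // n relative bin masses
};

// Draws one factor: a bin is picked according to its mass, then a value is
// drawn uniformly inside that bin. std::piecewise_constant_distribution
// performs this two-step draw.
static double sample_factor(const NoiseHistogram& h, std::mt19937& rng)
{
  std::piecewise_constant_distribution<double> dist(h.breaks.begin(), h.breaks.end(), h.weights.begin());
  return dist(rng);
}

// Placeholder values, NOT the data from the paper.
static const NoiseHistogram kReadNoise = {{0.80, 0.95, 1.00, 1.05, 1.20}, {0.1, 0.4, 0.4, 0.1}};

// Plays the role of the variability callback in this sketch.
static double disk_variability()
{
  static std::mt19937 rng(42); // fixed seed so that runs are reproducible
  return sample_factor(kReadNoise, rng);
}
```

Narrow bins around the mean and wide bins in the tails mirror the non-uniform histograms built during the analysis, so the sampled factors are most detailed where the data is densest.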
Running our simulation
-^^^^^^^^^^^^^^^^^^^^^^
+----------------------
The binary was compiled in the provided Docker container.
./tuto_disk > ./simgrid_disk.csv
Analyzing the SimGrid results
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+=============================
The figure below presents the results obtained by SimGrid.
facet_wrap(disk~op,ncol=2,scale="free_y")+ # ) + #
xlab("Number of concurrent operations") + ylab("Aggregated Bandwidth (MiB/s)") + guides(color=FALSE) + xlim(0,NA) + ylim(0,NA)
-.. image:: fig/simgrid_results.png
+.. image:: tuto_disk/fig/simgrid_results.png
Note: The variability in Griffon read operations seems to decrease when
we have more concurrent operations. This is a particularity of the