Modeling I/O: the realistic way
-------------------------------

This tutorial shows how to perform faithful I/O experiments in
SimGrid. It is based on the paper "Adding Storage Simulation
Capacities to the SimGrid Toolkit: Concepts, Models, and API".

The paper presents a series of experiments to analyze the performance
of I/O operations (read/write) on different kinds of disks (SATA, SAS,
SSD). In this tutorial, we present a detailed example of how to
extract the experimental data needed to simulate: i) the performance
degradation with concurrent operations (Fig. 8 in the paper) and ii)
the variability of I/O operations (Fig. 5 to 7).

- Link for the paper: `https://hal.inria.fr/hal-01197128 <https://hal.inria.fr/hal-01197128>`_

- Link for the data: `https://figshare.com/articles/dataset/Companion_of_the_SimGrid_storage_modeling_article/1175156 <https://figshare.com/articles/dataset/Companion_of_the_SimGrid_storage_modeling_article/1175156>`_

- The purpose of this document is to illustrate how we can
  extract data from experiments and inject it into SimGrid. However, the
  data shown on this page may **not** reflect reality.

- You must run similar experiments on your hardware to get realistic
  data for your context.

- SimGrid has been in active development since the paper's release in
  2015, so the MSG and XML descriptions used in the paper may have
  evolved and may not be available anymore.

A Dockerfile is available in ``docs/source/tuto_disk``. It allows you to
re-run this tutorial. For that, build the image and run the container:

- ``docker build -t tuto_disk .``

- ``docker run -it tuto_disk``

Analyzing the experimental data
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

We start by analyzing and extracting the real data available.

We use a special method to create non-uniform histograms that represent
the noise of the I/O operations.

As we were unable to install the library properly, we copied the
important methods here.

Copied from: `https://rdrr.io/github/dlebauer/pecan-priors/src/R/plots.R <https://rdrr.io/github/dlebauer/pecan-priors/src/R/plots.R>`_

Some initial configuration and the list of packages used (``plyr``,
``dplyr``, ``ggplot2``, ``gridExtra`` and ``jsonlite``). Loading them
produces the messages below; use ``suppressPackageStartupMessages()`` to
eliminate package startup messages.

::

   Attaching package: 'dplyr'

   The following objects are masked from 'package:plyr':

       arrange, count, desc, failwith, id, mutate, rename, summarise,
       summarize

   The following objects are masked from 'package:stats':

       filter, lag

   The following objects are masked from 'package:base':

       intersect, setdiff, setequal, union

   Attaching package: 'gridExtra'

   The following object is masked from 'package:dplyr':

       combine

This was copied from the ``sg_storage_ccgrid15.org`` file available in
the figshare companion of the paper. Before executing this code, please
download and decompress the appropriate file.

.. code-block:: sh

   curl -O -J -L "https://ndownloader.figshare.com/files/1928095"

Preparing the data for the variability analysis.

.. code-block:: r

   # Packages used throughout this analysis
   library(plyr)
   library(dplyr)
   library(ggplot2)
   library(gridExtra)
   library(jsonlite)

   clean_up <- function (df, infra){
     names(df) <- c("Hostname","Date","DirectIO","IOengine","IOscheduler","Error","Operation","Jobs","BufferSize","FileSize","Runtime","Bandwidth","BandwidthMin","BandwidthMax","Latency","LatencyMin","LatencyMax","IOPS")
     df=subset(df,Error=="0")
     df=subset(df,DirectIO=="1")
     df <- merge(df,infra,by="Hostname")
     df$Hostname = sapply(strsplit(df$Hostname, "[.]"), "[", 1)
     df$HostModel = paste(df$Hostname, df$Model, sep=" - ")
     df$Duration = df$Runtime/1000 # fio outputs runtime in msec, we want to display seconds
     df$Size = df$FileSize/1024/1024
     df=subset(df,Duration!=0.000)
     df$Bwi=df$Duration/df$Size
     df[df$Operation=="read",]$Operation <- "Read"
     df[df$Operation=="write",]$Operation <- "Write"
     df
   }

   grenoble <- read.csv('./bench/grenoble.csv', header=FALSE, sep=";", stringsAsFactors=FALSE)
   luxembourg <- read.csv('./bench/luxembourg.csv', header=FALSE, sep=";", stringsAsFactors=FALSE)
   nancy <- read.csv('./bench/nancy.csv', header=FALSE, sep=";", stringsAsFactors=FALSE)
   all <- rbind(grenoble, nancy, luxembourg)
   infra <- read.csv('./bench/infra.csv', header=FALSE, sep=";", stringsAsFactors=FALSE)
   names(infra) <- c("Hostname","Model","DiskSize")

   all = clean_up(all, infra)
   griffon = subset(all, grepl("^griffon", Hostname))
   griffon$Cluster <- "Griffon (SATA II)"
   edel = subset(all, grepl("^edel", Hostname))
   edel$Cluster <- "Edel (SSD)"

   df = rbind(griffon[griffon$Jobs=="1" & griffon$IOscheduler=="cfq",],
              edel[edel$Jobs=="1" & edel$IOscheduler=="cfq",])
   # Get rid of the 64 GB disks of Edel as they behave differently (used to be "edel-51")
   df = df[!(grepl("^Edel",df$Cluster) & df$DiskSize=="64 GB"),]

Preparing the data for the analysis of concurrent operations.

.. code-block:: r

   dfc = rbind(griffon[griffon$Jobs>1 & griffon$IOscheduler=="cfq",],
               edel[edel$Jobs>1 & edel$IOscheduler=="cfq",])
   dfc2 = rbind(griffon[griffon$Jobs==1 & griffon$IOscheduler=="cfq",],
                edel[edel$Jobs==1 & edel$IOscheduler=="cfq",])
   dfc = rbind(dfc, dfc2[sample(nrow(dfc2), size=200),])

   # Skeleton of an extra row for the concurrency analysis. The commented
   # values recall how each column was filled in the original script.
   dd <- data.frame(Hostname = NA,
                    Date = NA, #tmpl$Date,
                    DirectIO = NA,
                    IOengine = NA,
                    IOscheduler = NA,
                    Error = NA,
                    Operation = NA, #tmpl$Operation,
                    Jobs = NA, #d$nb.of.concurrent.access,
                    BufferSize = NA, #d$bs,
                    FileSize = NA, #d$size,
                    Runtime = NA,
                    Bandwidth = NA,
                    BandwidthMin = NA,
                    BandwidthMax = NA,
                    Latency = NA,
                    LatencyMin = NA,
                    LatencyMax = NA,
                    IOPS = NA,
                    Model = NA, #tmpl$Model,
                    DiskSize = NA, #tmpl$DiskSize,
                    HostModel = NA,
                    Duration = NA, #d$time,
                    Size = NA,
                    Bwi = NA,
                    Cluster = NA) #tmpl$Cluster)

   dd$Size = dd$FileSize/1024/1024
   dd$Bwi = dd$Duration/dd$Size

   # Let's get rid of small files!
   dfc = subset(dfc,Size >= 10)
   # Let's get rid of the 64 GB disks of Edel
   dfc = dfc[!(grepl("^Edel",dfc$Cluster) & dfc$DiskSize=="64 GB"),]

   dfc$TotalSize = dfc$Size * dfc$Jobs
   dfc$BW = (dfc$TotalSize) / dfc$Duration
   dfc = dfc[dfc$BW>=20,] # get rid of one point that is typically an outlier and does not make sense

   # Select the regression method used for each curve in the figure below
   dfc$method = "lm"
   dfc[dfc$Cluster=="Edel (SSD)" & dfc$Operation=="Read",]$method = "loess"
   dfc[dfc$Cluster=="Edel (SSD)" & dfc$Operation=="Write" & dfc$Jobs==1,]$method = ""
   dfc[dfc$Cluster=="Griffon (SATA II)" & dfc$Operation=="Write",]$method = "lm"
   dfc[dfc$Cluster=="Griffon (SATA II)" & dfc$Operation=="Write" & dfc$Jobs==1,]$method = ""

   dfd = dfc[dfc$Operation=="Write" & dfc$Jobs==1 &
             (dfc$Cluster %in% c("Griffon (SATA II)", "Edel (SSD)")),]
   dfd = ddply(dfd, c("Cluster","Operation","Jobs","DiskSize"), summarize,
               mean = mean(BW), num = length(BW), sd = sd(BW))

   dfd$ci = 2*dfd$sd/sqrt(dfd$num)

   # dfrange: per-cluster maximum BW, plotted as an invisible point below to
   # align the y scales of the facets
   dfrange = ddply(dfc, c("Cluster","Operation","DiskSize"), summarize,
                   Jobs = 1, BW = max(BW))
   dfrange = ddply(dfrange, c("Cluster","DiskSize"), mutate,
                   BW = max(BW))

Griffon (SATA II)
^^^^^^^^^^^^^^^^^

Modeling resource sharing w/ concurrent access
::::::::::::::::::::::::::::::::::::::::::::::

This figure presents the overall performance of the I/O operations with
concurrent access to the disk. Note that the figure differs from the one
in the paper: the available data would probably need further cleaning to
reproduce exactly the same results.

.. code-block:: r

   ggplot(data=dfc, aes(x=Jobs, y=BW, color=Operation)) + theme_bw() +
     geom_point(alpha=.3) +
     geom_point(data=dfrange, size=0) +
     facet_wrap(Cluster~Operation, ncol=2, scale="free_y") +
     geom_smooth(data=dfc[dfc$method=="loess",], color="black", method=loess, se=TRUE, fullrange=T) +
     geom_smooth(data=dfc[dfc$method=="lm",], color="black", method=lm, se=TRUE) +
     geom_point(data=dfd, aes(x=Jobs, y=mean), color="black", shape=21, fill="white") +
     geom_errorbar(data=dfd, aes(x=Jobs, ymin=mean-ci, ymax=mean+ci), color="black", width=.6) +
     xlab("Number of concurrent operations") + ylab("Aggregated Bandwidth (MiB/s)") +
     guides(color=FALSE) + xlim(0,NA) + ylim(0,NA)

.. image:: fig/griffon_deg.png

Getting the read data for Griffon, from 1 to 15 concurrent reads.

.. code-block:: r

   IO_INFO = list() # this list gathers everything we export to JSON at the end
   deg_griffon = dfc %>% filter(grepl("^Griffon", Cluster)) %>% filter(Operation == "Read")
   model = lm(BW~Jobs, data = deg_griffon)
   IO_INFO[["griffon"]][["degradation"]][["read"]] = predict(model, data.frame(Jobs=seq(1,15)))
   toJSON(IO_INFO, pretty = TRUE)
273 "read": [66.6308, 64.9327, 63.2346, 61.5365, 59.8384, 58.1403, 56.4423, 54.7442, 53.0461, 51.348, 49.6499, 47.9518, 46.2537, 44.5556, 42.8575]

Same for the write operations.

.. code-block:: r

   deg_griffon = dfc %>% filter(grepl("^Griffon", Cluster)) %>% filter(Operation == "Write") %>% filter(Jobs > 2)
   mean_job_1 = dfc %>% filter(grepl("^Griffon", Cluster)) %>% filter(Operation == "Write") %>% filter(Jobs == 1) %>% summarize(mean = mean(BW))
   model = lm(BW~Jobs, data = deg_griffon)
   IO_INFO[["griffon"]][["degradation"]][["write"]] = c(mean_job_1$mean, predict(model, data.frame(Jobs=seq(2,15))))
   toJSON(IO_INFO, pretty = TRUE)
297 "read": [66.6308, 64.9327, 63.2346, 61.5365, 59.8384, 58.1403, 56.4423, 54.7442, 53.0461, 51.348, 49.6499, 47.9518, 46.2537, 44.5556, 42.8575],
298 "write": [49.4576, 26.5981, 27.7486, 28.8991, 30.0495, 31.2, 32.3505, 33.501, 34.6515, 35.8019, 36.9524, 38.1029, 39.2534, 40.4038, 41.5543]

Modeling read/write bandwidth variability
:::::::::::::::::::::::::::::::::::::::::

Fig. 5 in the paper presents the noise of the read/write operations on
the Griffon SATA disk.

The paper uses a regular histogram to illustrate the distribution of the
effective bandwidth. However, in this tutorial, we use dhist
(`https://rdrr.io/github/dlebauer/pecan-priors/man/dhist.html <https://rdrr.io/github/dlebauer/pecan-priors/man/dhist.html>`_) to get
more precise information over the highly dense areas around the mean.

First, we present the histogram for the read operations.

.. code-block:: r

   griffon_read = df %>% filter(grepl("^Griffon", Cluster)) %>% filter(Operation == "Read") %>% select(Bwi)
   dhist(1/griffon_read$Bwi)

.. image:: fig/griffon_read_dhist.png

Saving it to be exported later in JSON format.

.. code-block:: r

   griffon_read_dhist = dhist(1/griffon_read$Bwi, plot=FALSE)
   IO_INFO[["griffon"]][["noise"]][["read"]] = c(breaks=list(griffon_read_dhist$xbr), heights=list(unclass(griffon_read_dhist$heights)))
   IO_INFO[["griffon"]][["read_bw"]] = mean(1/griffon_read$Bwi)
   toJSON(IO_INFO, pretty = TRUE)

::

   Warning message:
   In hist.default(x, breaks = cut.pt, plot = FALSE, probability = TRUE) :
     argument 'probability' is not made use of

   {
     "griffon": {
       "degradation": {
         "read": [66.6308, 64.9327, 63.2346, 61.5365, 59.8384, 58.1403, 56.4423, 54.7442, 53.0461, 51.348, 49.6499, 47.9518, 46.2537, 44.5556, 42.8575],
         "write": [49.4576, 26.5981, 27.7486, 28.8991, 30.0495, 31.2, 32.3505, 33.501, 34.6515, 35.8019, 36.9524, 38.1029, 39.2534, 40.4038, 41.5543]
       },
       "noise": {
         "read": {
           "breaks": [39.257, 51.3413, 60.2069, 66.8815, 71.315, 74.2973, 80.8883, 95.1944, 109.6767, 125.0231, 140.3519, 155.6807, 171.0094, 186.25],
           "heights": [15.3091, 41.4578, 73.6826, 139.5982, 235.125, 75.3357, 4.1241, 3.3834, 0, 0.0652, 0.0652, 0.0652, 0.3937]
         }
       },
       "read_bw": [68.5425]
     }
   }

The same analysis for the write operations.

.. code-block:: r

   griffon_write = df %>% filter(grepl("^Griffon", Cluster)) %>% filter(Operation == "Write") %>% select(Bwi)
   dhist(1/griffon_write$Bwi)

.. image:: fig/griffon_write_dhist.png

Saving it to be exported later.

.. code-block:: r

   griffon_write_dhist = dhist(1/griffon_write$Bwi, plot=FALSE)
   IO_INFO[["griffon"]][["noise"]][["write"]] = c(breaks=list(griffon_write_dhist$xbr), heights=list(unclass(griffon_write_dhist$heights)))
   IO_INFO[["griffon"]][["write_bw"]] = mean(1/griffon_write$Bwi)
   toJSON(IO_INFO, pretty = TRUE)

::

   Warning message:
   In hist.default(x, breaks = cut.pt, plot = FALSE, probability = TRUE) :
     argument 'probability' is not made use of

   {
     "griffon": {
       "degradation": {
         "read": [66.6308, 64.9327, 63.2346, 61.5365, 59.8384, 58.1403, 56.4423, 54.7442, 53.0461, 51.348, 49.6499, 47.9518, 46.2537, 44.5556, 42.8575],
         "write": [49.4576, 26.5981, 27.7486, 28.8991, 30.0495, 31.2, 32.3505, 33.501, 34.6515, 35.8019, 36.9524, 38.1029, 39.2534, 40.4038, 41.5543]
       },
       "noise": {
         "read": {
           "breaks": [39.257, 51.3413, 60.2069, 66.8815, 71.315, 74.2973, 80.8883, 95.1944, 109.6767, 125.0231, 140.3519, 155.6807, 171.0094, 186.25],
           "heights": [15.3091, 41.4578, 73.6826, 139.5982, 235.125, 75.3357, 4.1241, 3.3834, 0, 0.0652, 0.0652, 0.0652, 0.3937]
         },
         "write": {
           "breaks": [5.2604, 21.0831, 31.4773, 39.7107, 45.5157, 50.6755, 54.4726, 59.7212, 67.8983, 81.2193, 95.6333, 111.5864, 127.8409, 144.3015],
           "heights": [1.7064, 22.6168, 38.613, 70.8008, 84.4486, 128.5118, 82.3692, 39.1431, 9.2256, 5.6195, 1.379, 0.6429, 0.1549]
         }
       },
       "read_bw": [68.5425],
       "write_bw": [50.6045]
     }
   }

Edel (SSD)
^^^^^^^^^^

This section presents exactly the same analysis for the Edel SSDs.

Modeling resource sharing w/ concurrent access
::::::::::::::::::::::::::::::::::::::::::::::

Getting the read data for Edel, from 1 to 15 concurrent operations.

.. code-block:: r

   deg_edel = dfc %>% filter(grepl("^Edel", Cluster)) %>% filter(Operation == "Read")
   model = loess(BW~Jobs, data = deg_edel)
   IO_INFO[["edel"]][["degradation"]][["read"]] = predict(model, data.frame(Jobs=seq(1,15)))
   toJSON(IO_INFO, pretty = TRUE)
429 "read": [66.6308, 64.9327, 63.2346, 61.5365, 59.8384, 58.1403, 56.4423, 54.7442, 53.0461, 51.348, 49.6499, 47.9518, 46.2537, 44.5556, 42.8575],
430 "write": [49.4576, 26.5981, 27.7486, 28.8991, 30.0495, 31.2, 32.3505, 33.501, 34.6515, 35.8019, 36.9524, 38.1029, 39.2534, 40.4038, 41.5543]
434 "breaks": [39.257, 51.3413, 60.2069, 66.8815, 71.315, 74.2973, 80.8883, 95.1944, 109.6767, 125.0231, 140.3519, 155.6807, 171.0094, 186.25],
435 "heights": [15.3091, 41.4578, 73.6826, 139.5982, 235.125, 75.3357, 4.1241, 3.3834, 0, 0.0652, 0.0652, 0.0652, 0.3937]
438 "breaks": [5.2604, 21.0831, 31.4773, 39.7107, 45.5157, 50.6755, 54.4726, 59.7212, 67.8983, 81.2193, 95.6333, 111.5864, 127.8409, 144.3015],
439 "heights": [1.7064, 22.6168, 38.613, 70.8008, 84.4486, 128.5118, 82.3692, 39.1431, 9.2256, 5.6195, 1.379, 0.6429, 0.1549]
442 "read_bw": [68.5425],
443 "write_bw": [50.6045]
447 "read": [150.5119, 167.4377, 182.2945, 195.1004, 205.8671, 214.1301, 220.411, 224.6343, 227.7141, 230.6843, 233.0923, 235.2027, 236.8369, 238.0249, 238.7515]

Same for the write operations.

.. code-block:: r

   deg_edel = dfc %>% filter(grepl("^Edel", Cluster)) %>% filter(Operation == "Write") %>% filter(Jobs > 2)
   mean_job_1 = dfc %>% filter(grepl("^Edel", Cluster)) %>% filter(Operation == "Write") %>% filter(Jobs == 1) %>% summarize(mean = mean(BW))
   model = lm(BW~Jobs, data = deg_edel)
   IO_INFO[["edel"]][["degradation"]][["write"]] = c(mean_job_1$mean, predict(model, data.frame(Jobs=seq(2,15))))
   toJSON(IO_INFO, pretty = TRUE)
471 "read": [66.6308, 64.9327, 63.2346, 61.5365, 59.8384, 58.1403, 56.4423, 54.7442, 53.0461, 51.348, 49.6499, 47.9518, 46.2537, 44.5556, 42.8575],
472 "write": [49.4576, 26.5981, 27.7486, 28.8991, 30.0495, 31.2, 32.3505, 33.501, 34.6515, 35.8019, 36.9524, 38.1029, 39.2534, 40.4038, 41.5543]
476 "breaks": [39.257, 51.3413, 60.2069, 66.8815, 71.315, 74.2973, 80.8883, 95.1944, 109.6767, 125.0231, 140.3519, 155.6807, 171.0094, 186.25],
477 "heights": [15.3091, 41.4578, 73.6826, 139.5982, 235.125, 75.3357, 4.1241, 3.3834, 0, 0.0652, 0.0652, 0.0652, 0.3937]
480 "breaks": [5.2604, 21.0831, 31.4773, 39.7107, 45.5157, 50.6755, 54.4726, 59.7212, 67.8983, 81.2193, 95.6333, 111.5864, 127.8409, 144.3015],
481 "heights": [1.7064, 22.6168, 38.613, 70.8008, 84.4486, 128.5118, 82.3692, 39.1431, 9.2256, 5.6195, 1.379, 0.6429, 0.1549]
484 "read_bw": [68.5425],
485 "write_bw": [50.6045]
489 "read": [150.5119, 167.4377, 182.2945, 195.1004, 205.8671, 214.1301, 220.411, 224.6343, 227.7141, 230.6843, 233.0923, 235.2027, 236.8369, 238.0249, 238.7515],
490 "write": [132.2771, 170.174, 170.137, 170.1, 170.063, 170.026, 169.9889, 169.9519, 169.9149, 169.8779, 169.8408, 169.8038, 169.7668, 169.7298, 169.6927]

Modeling read/write bandwidth variability
:::::::::::::::::::::::::::::::::::::::::

First, we present the histogram for the read operations.

.. code-block:: r

   edel_read = df %>% filter(grepl("^Edel", Cluster)) %>% filter(Operation == "Read") %>% select(Bwi)
   dhist(1/edel_read$Bwi)

.. image:: fig/edel_read_dhist.png

Saving it to be exported later in JSON format.

.. code-block:: r

   edel_read_dhist = dhist(1/edel_read$Bwi, plot=FALSE)
   IO_INFO[["edel"]][["noise"]][["read"]] = c(breaks=list(edel_read_dhist$xbr), heights=list(unclass(edel_read_dhist$heights)))
   IO_INFO[["edel"]][["read_bw"]] = mean(1/edel_read$Bwi)
   toJSON(IO_INFO, pretty = TRUE)

::

   Warning message:
   In hist.default(x, breaks = cut.pt, plot = FALSE, probability = TRUE) :
     argument 'probability' is not made use of

   {
     "griffon": {
       "degradation": {
         "read": [66.6308, 64.9327, 63.2346, 61.5365, 59.8384, 58.1403, 56.4423, 54.7442, 53.0461, 51.348, 49.6499, 47.9518, 46.2537, 44.5556, 42.8575],
         "write": [49.4576, 26.5981, 27.7486, 28.8991, 30.0495, 31.2, 32.3505, 33.501, 34.6515, 35.8019, 36.9524, 38.1029, 39.2534, 40.4038, 41.5543]
       },
       "noise": {
         "read": {
           "breaks": [39.257, 51.3413, 60.2069, 66.8815, 71.315, 74.2973, 80.8883, 95.1944, 109.6767, 125.0231, 140.3519, 155.6807, 171.0094, 186.25],
           "heights": [15.3091, 41.4578, 73.6826, 139.5982, 235.125, 75.3357, 4.1241, 3.3834, 0, 0.0652, 0.0652, 0.0652, 0.3937]
         },
         "write": {
           "breaks": [5.2604, 21.0831, 31.4773, 39.7107, 45.5157, 50.6755, 54.4726, 59.7212, 67.8983, 81.2193, 95.6333, 111.5864, 127.8409, 144.3015],
           "heights": [1.7064, 22.6168, 38.613, 70.8008, 84.4486, 128.5118, 82.3692, 39.1431, 9.2256, 5.6195, 1.379, 0.6429, 0.1549]
         }
       },
       "read_bw": [68.5425],
       "write_bw": [50.6045]
     },
     "edel": {
       "degradation": {
         "read": [150.5119, 167.4377, 182.2945, 195.1004, 205.8671, 214.1301, 220.411, 224.6343, 227.7141, 230.6843, 233.0923, 235.2027, 236.8369, 238.0249, 238.7515],
         "write": [132.2771, 170.174, 170.137, 170.1, 170.063, 170.026, 169.9889, 169.9519, 169.9149, 169.8779, 169.8408, 169.8038, 169.7668, 169.7298, 169.6927]
       },
       "noise": {
         "read": {
           "breaks": [104.1667, 112.3335, 120.5003, 128.6671, 136.8222, 144.8831, 149.6239, 151.2937, 154.0445, 156.3837, 162.3555, 170.3105, 178.3243],
           "heights": [0.1224, 0.1224, 0.1224, 0.2452, 1.2406, 61.6128, 331.2201, 167.6488, 212.1086, 31.3996, 2.3884, 1.747]
         }
       },
       "read_bw": [152.7139]
     }
   }

The same analysis for the write operations.

.. code-block:: r

   edel_write = df %>% filter(grepl("^Edel", Cluster)) %>% filter(Operation == "Write") %>% select(Bwi)
   dhist(1/edel_write$Bwi)

.. image:: fig/edel_write_dhist.png

Saving it to be exported later.

.. code-block:: r

   edel_write_dhist = dhist(1/edel_write$Bwi, plot=FALSE)
   IO_INFO[["edel"]][["noise"]][["write"]] = c(breaks=list(edel_write_dhist$xbr), heights=list(unclass(edel_write_dhist$heights)))
   IO_INFO[["edel"]][["write_bw"]] = mean(1/edel_write$Bwi)
   toJSON(IO_INFO, pretty = TRUE)

::

   Warning message:
   In hist.default(x, breaks = cut.pt, plot = FALSE, probability = TRUE) :
     argument 'probability' is not made use of

   {
     "griffon": {
       "degradation": {
         "read": [66.6308, 64.9327, 63.2346, 61.5365, 59.8384, 58.1403, 56.4423, 54.7442, 53.0461, 51.348, 49.6499, 47.9518, 46.2537, 44.5556, 42.8575],
         "write": [49.4576, 26.5981, 27.7486, 28.8991, 30.0495, 31.2, 32.3505, 33.501, 34.6515, 35.8019, 36.9524, 38.1029, 39.2534, 40.4038, 41.5543]
       },
       "noise": {
         "read": {
           "breaks": [39.257, 51.3413, 60.2069, 66.8815, 71.315, 74.2973, 80.8883, 95.1944, 109.6767, 125.0231, 140.3519, 155.6807, 171.0094, 186.25],
           "heights": [15.3091, 41.4578, 73.6826, 139.5982, 235.125, 75.3357, 4.1241, 3.3834, 0, 0.0652, 0.0652, 0.0652, 0.3937]
         },
         "write": {
           "breaks": [5.2604, 21.0831, 31.4773, 39.7107, 45.5157, 50.6755, 54.4726, 59.7212, 67.8983, 81.2193, 95.6333, 111.5864, 127.8409, 144.3015],
           "heights": [1.7064, 22.6168, 38.613, 70.8008, 84.4486, 128.5118, 82.3692, 39.1431, 9.2256, 5.6195, 1.379, 0.6429, 0.1549]
         }
       },
       "read_bw": [68.5425],
       "write_bw": [50.6045]
     },
     "edel": {
       "degradation": {
         "read": [150.5119, 167.4377, 182.2945, 195.1004, 205.8671, 214.1301, 220.411, 224.6343, 227.7141, 230.6843, 233.0923, 235.2027, 236.8369, 238.0249, 238.7515],
         "write": [132.2771, 170.174, 170.137, 170.1, 170.063, 170.026, 169.9889, 169.9519, 169.9149, 169.8779, 169.8408, 169.8038, 169.7668, 169.7298, 169.6927]
       },
       "noise": {
         "read": {
           "breaks": [104.1667, 112.3335, 120.5003, 128.6671, 136.8222, 144.8831, 149.6239, 151.2937, 154.0445, 156.3837, 162.3555, 170.3105, 178.3243],
           "heights": [0.1224, 0.1224, 0.1224, 0.2452, 1.2406, 61.6128, 331.2201, 167.6488, 212.1086, 31.3996, 2.3884, 1.747]
         },
         "write": {
           "breaks": [70.9593, 79.9956, 89.0654, 98.085, 107.088, 115.9405, 123.5061, 127.893, 131.083, 133.6696, 135.7352, 139.5932, 147.4736],
           "heights": [0.2213, 0, 0.3326, 0.4443, 1.4685, 11.8959, 63.869, 110.286, 149.9741, 202.887, 80.8298, 9.0298]
         }
       },
       "read_bw": [152.7139],
       "write_bw": [131.7152]
     }
   }

Finally, let's save it to a file to be opened by our simulator.

.. code-block:: r

   json = toJSON(IO_INFO, pretty = TRUE)
   cat(json, file="IO_noise.json")

Injecting this data in SimGrid
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

To mimic this behavior in SimGrid, we use two features of the platform
description: the non-linear sharing policy and the bandwidth factors. For
more details, please see the source code in ``tuto_disk.cpp``.
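
As an illustration, the ``IO_noise.json`` file produced above could be
loaded as follows. This is a minimal sketch assuming the
`nlohmann/json <https://github.com/nlohmann/json>`_ library; the
``DiskModel`` structure and the ``load_disk_model()`` helper are
hypothetical names introduced for this example, not necessarily how
``tuto_disk.cpp`` proceeds.

.. code-block:: cpp

   // Hypothetical loader for the IO_noise.json file generated by the R analysis
   #include <nlohmann/json.hpp>
   #include <fstream>
   #include <string>
   #include <vector>

   struct DiskModel {
     std::vector<double> degradation_read;  // aggregated bandwidth for 1..15 concurrent reads (MiB/s)
     std::vector<double> degradation_write; // same for writes
     double read_bw;                        // mean read bandwidth (MiB/s)
     double write_bw;                       // mean write bandwidth (MiB/s)
   };

   static DiskModel load_disk_model(const std::string& path, const std::string& disk)
   {
     std::ifstream input(path);
     const nlohmann::json j = nlohmann::json::parse(input);
     DiskModel model;
     model.degradation_read  = j.at(disk).at("degradation").at("read").get<std::vector<double>>();
     model.degradation_write = j.at(disk).at("degradation").at("write").get<std::vector<double>>();
     // jsonlite::toJSON() wraps scalars in one-element arrays, hence the [0]
     model.read_bw  = j.at(disk).at("read_bw")[0].get<double>();
     model.write_bw = j.at(disk).at("write_bw")[0].get<double>();
     return model;
   }

   // Usage: const DiskModel griffon = load_disk_model("IO_noise.json", "griffon");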

Modeling resource sharing w/ concurrent access
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The ``set_sharing_policy`` method allows the user to set a callback to
dynamically change the disk capacity. The callback is called each time
SimGrid shares the disk between a set of I/O operations.

The callback has access to the number of activities sharing the
resource and to its current capacity. It must return the new capacity
of the resource.

.. code-block:: cpp

   static double disk_dynamic_sharing(double capacity, int n)
   {
     return capacity; // useless callback: always keep the full capacity
   }

   auto* disk = host->create_disk("dump", 1e6, 1e6);
   disk->set_sharing_policy(sg4::Disk::Operation::READ, sg4::Disk::SharingPolicy::NONLINEAR,
                            &disk_dynamic_sharing);
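
For illustration, a non-trivial callback could inject the Griffon read
degradation extracted earlier, returning the measured aggregated
bandwidth for ``n`` concurrent reads as the new capacity. This is only a
sketch under our own assumptions (values hard-coded from
``IO_noise.json``, clamping beyond 15 operations, MiB-to-bytes
conversion); it is not the exact code of ``tuto_disk.cpp``.

.. code-block:: cpp

   #include <simgrid/s4u.hpp>
   #include <algorithm>
   #include <vector>
   namespace sg4 = simgrid::s4u;

   // Aggregated read bandwidth (MiB/s) measured on Griffon for 1 to 15
   // concurrent reads ("griffon/degradation/read" in IO_noise.json)
   static const std::vector<double> griffon_read_deg = {
       66.6308, 64.9327, 63.2346, 61.5365, 59.8384, 58.1403, 56.4423, 54.7442,
       53.0461, 51.348,  49.6499, 47.9518, 46.2537, 44.5556, 42.8575};

   // Return the total read capacity shared by the n concurrent operations,
   // clamping at the last measured point when n > 15
   static double griffon_read_sharing(double /*capacity*/, int n)
   {
     const int idx = std::min(n, static_cast<int>(griffon_read_deg.size())) - 1;
     return griffon_read_deg[idx] * 1024 * 1024; // MiB/s expressed in bytes/s
   }

   // Given a disk created as above:
   disk->set_sharing_policy(sg4::Disk::Operation::READ, sg4::Disk::SharingPolicy::NONLINEAR,
                            &griffon_read_sharing);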

Modeling read/write bandwidth variability
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The noise in the I/O operations can be obtained by applying a factor to
the I/O bandwidth of the disk. This factor is applied when we update
the remaining amount of bytes to be transferred, increasing or
decreasing the effective disk bandwidth.

The ``set_factor_cb`` method allows the user to set a callback to
dynamically change the factor to be applied to each I/O operation.
The callback has access to the size of the operation and to its type
(read or write). It must return a multiplicative factor (e.g. 1.0 for
doing nothing).

.. code-block:: cpp

   static double disk_variability(sg_size_t size, sg4::Io::OpType op)
   {
     return 1.0; // useless callback: no variability at all
   }

   auto* disk = host->create_disk("dump", 1e6, 1e6);
   disk->set_factor_cb(&disk_variability);
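
For illustration, the sketch below draws an effective bandwidth from the
non-uniform histogram extracted for the Griffon reads and converts it
into a factor relative to the mean bandwidth. Sampling with
``std::piecewise_constant_distribution`` is a choice made for this
sketch, not necessarily what ``tuto_disk.cpp`` does.

.. code-block:: cpp

   #include <simgrid/s4u.hpp>
   #include <random>
   #include <vector>
   namespace sg4 = simgrid::s4u;

   // Non-uniform histogram of the effective Griffon read bandwidth (MiB/s),
   // taken from "griffon/noise/read" in IO_noise.json
   static const std::vector<double> breaks  = {39.257, 51.3413, 60.2069, 66.8815, 71.315,
                                               74.2973, 80.8883, 95.1944, 109.6767, 125.0231,
                                               140.3519, 155.6807, 171.0094, 186.25};
   static const std::vector<double> heights = {15.3091, 41.4578, 73.6826, 139.5982, 235.125,
                                               75.3357, 4.1241, 3.3834, 0, 0.0652,
                                               0.0652, 0.0652, 0.3937};
   static const double mean_read_bw = 68.5425; // "griffon/read_bw"

   static double griffon_read_variability(sg_size_t /*size*/, sg4::Io::OpType op)
   {
     if (op != sg4::Io::OpType::READ)
       return 1.0; // leave writes unchanged in this sketch

     // piecewise_constant_distribution expects one weight per interval,
     // proportional to its probability mass: height * width
     static const std::vector<double> weights = [] {
       std::vector<double> w;
       for (size_t i = 0; i + 1 < breaks.size(); ++i)
         w.push_back(heights[i] * (breaks[i + 1] - breaks[i]));
       return w;
     }();
     static std::mt19937 gen(42);
     static std::piecewise_constant_distribution<double> dist(breaks.begin(), breaks.end(),
                                                              weights.begin());
     return dist(gen) / mean_read_bw; // sampled bandwidth relative to the mean
   }

   // Given a disk created as above:
   disk->set_factor_cb(&griffon_read_variability);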

Running our simulation
^^^^^^^^^^^^^^^^^^^^^^

The binary was compiled in the provided docker container.

.. code-block:: sh

   ./tuto_disk > ./simgrid_disk.csv

Analyzing the SimGrid results
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The figure below presents the results obtained with SimGrid.

The experiment performs I/O operations, varying the number of
concurrent operations from 1 to 15. We run only 20 simulations for
each configuration, but we can see that the graphs are quite similar
to the ones obtained on the real platform.

.. code-block:: r

   sg_df = read.csv("./simgrid_disk.csv")
   sg_df = sg_df %>% group_by(disk, op, flows) %>%
     mutate(bw=((size*flows)/elapsed)/10^6, method=if_else(disk=="edel" & op=="read", "loess", "lm"))
   sg_dfd = sg_df %>% filter(flows==1 & op=="write") %>% group_by(disk, op, flows) %>%
     summarize(mean = mean(bw), sd = sd(bw), se=sd/sqrt(n()))

   sg_df[sg_df$op=="write" & sg_df$flows==1,]$method=""

   ggplot(data=sg_df, aes(x=flows, y=bw, color=op)) + theme_bw() +
     geom_point(alpha=.3) +
     geom_smooth(data=sg_df[sg_df$method=="loess",], color="black", method=loess, se=TRUE, fullrange=T) +
     geom_smooth(data=sg_df[sg_df$method=="lm",], color="black", method=lm, se=TRUE) +
     geom_errorbar(data=sg_dfd, aes(x=flows, y=mean, ymin=mean-2*se, ymax=mean+2*se), color="black", width=.6) +
     facet_wrap(disk~op, ncol=2, scale="free_y") +
     xlab("Number of concurrent operations") + ylab("Aggregated Bandwidth (MiB/s)") +
     guides(color=FALSE) + xlim(0,NA) + ylim(0,NA)

.. image:: fig/simgrid_results.png

Note: the variability of the griffon read operations seems to decrease
when we have more concurrent operations. This is a particularity of the
griffon read speed profile and of the way the elapsed time is computed.

- Each point represents the time to perform the N I/O operations.

- Griffon's read speed decreases with the number of concurrent
  operations.

With 15 read operations:

- At the beginning, every read gets the same share of the bandwidth:
  about 42.9 MiB/s of aggregated bandwidth, i.e. roughly 2.9 MiB/s per
  operation.

- We sample the noise of the I/O operations, so some reads will be
  faster than others (e.g. factor > 1).

When the first read operation finishes:

- We recalculate the bandwidth sharing, now considering that 14 read
  operations are still active. This increases the bandwidth of each
  operation (the aggregated bandwidth becomes about 44.6 MiB/s, the
  value of the degradation model for 14 jobs).

- The remaining "slower" activities are consequently sped up.

This behavior keeps happening until the end of the 15 operations: at
each step, we speed up the slowest operations a little and,
consequently, decrease the variability we observe.