Analyzing Batsim results
Prerequisites
This tutorial assumes that you have completed the Running your first simulation tutorial and have kept the simulation results it produced.
Files overview
All Batsim output files are text files written to the same directory; they share a common path prefix (see Command-line Interface).
PREFIX_jobs.csv
is the main output file. It contains information about the execution of each job (see Jobs).
PREFIX_schedule.csv
contains aggregated information about the whole simulation, such as the makespan, the mean waiting time or the total consumed energy (see Schedule).
PREFIX_schedule.trace
is a Pajé trace of the simulation. It can be visualized with tools such as ViTE.
PREFIX_machine_states.csv
is a time series of the platform usage. It stores how many machines are in each state for each time interval. This file is mostly used to get a scalable view of the platform usage over time, which is useful when the number of jobs is large.
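Since all output files share one path prefix, their locations can be derived from that prefix alone. The following is a minimal Python sketch of that idea. The suffixes come from the list above, while the helper names (`output_files`, `missing_outputs`) and the `expe-out/out` prefix are hypothetical and only serve the illustration.

```python
from pathlib import Path

# Suffixes taken from the file overview above.
SUFFIXES = ["_jobs.csv", "_schedule.csv", "_schedule.trace", "_machine_states.csv"]


def output_files(prefix):
    """Return the expected output paths for a given path prefix."""
    return [Path(prefix + suffix) for suffix in SUFFIXES]


def missing_outputs(prefix):
    """Return the expected output files that do not exist on disk."""
    return [path for path in output_files(prefix) if not path.is_file()]


# Hypothetical prefix: replace it with the one given to Batsim.
for path in output_files("expe-out/out"):
    print(path)
```

Calling `missing_outputs` with your own prefix is a quick way to check that a simulation wrote everything you intend to analyze.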
Computing some statistics
Most Batsim output files are plain CSV and can therefore be loaded in any data analysis framework.
The following script outlines how to do a basic analysis in R without losing sanity, thanks to tidyverse. The conclusions are of course not amazing on this toy workload.
#!/usr/bin/env Rscript
library('tidyverse') # Use the tidyverse library.
theme_set(theme_bw()) # Cosmetics.

jobs = read_csv('out_jobs.csv') # Read the jobs file.

# Manually compute some metrics on each job.
# Slowdown is the turnaround time divided by the execution time (always >= 1).
jobs = jobs %>% mutate(slowdown = (finish_time - submission_time) /
                                  (finish_time - starting_time),
                       longer_than_one_minute = execution_time > 60)

# Manually compute aggregated metrics.
# Here, the mean waiting time/slowdown for jobs with small execution time.
metrics = jobs %>% filter(longer_than_one_minute == FALSE) %>%
    summarize(mean_waiting_time = mean(waiting_time),
              mean_slowdown = mean(slowdown))
print(metrics) # Print aggregated metrics.

# Visualize what you want...
# ggsave() is called on its own: recent ggplot2 versions reject `+ ggsave(...)`.

# Is there a link between jobs' waiting time and size?
p = ggplot(jobs) +
    geom_point(aes(y=waiting_time, x=requested_number_of_resources))
ggsave('plot_wt_size.pdf', p)

# Is this still true depending on job execution time?
p = ggplot(jobs) +
    geom_point(aes(y=waiting_time, x=requested_number_of_resources)) +
    facet_wrap(~longer_than_one_minute)
ggsave('plot_wt_size_exectime.pdf', p)

# Is there a link between job size and execution time?
p = ggplot(jobs) +
    geom_violin(aes(factor(requested_number_of_resources), execution_time))
ggsave('plot_exectime_size.pdf', p)
The script can be executed from the experiment output directory. It should print some metrics and generate several plots in the current directory.
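For readers who prefer Python, the aggregated metrics can also be sketched with only the standard library. The column names are those used by the R script. The sample rows are made up for illustration, and the helper names (`slowdown`, `short_job_means`) are hypothetical. Slowdown is taken here as turnaround time divided by execution time, which is an assumption about the intended definition.

```python
import csv
import io
from statistics import mean

# Made-up rows for illustration; a real analysis would open the jobs file instead,
# e.g. csv.DictReader(open("out_jobs.csv", newline="")).
SAMPLE = """job_id,submission_time,starting_time,finish_time,waiting_time,execution_time
1,0,0,30,0,30
2,0,30,50,30,20
3,10,50,200,40,150
"""


def slowdown(row):
    """Turnaround time divided by execution time (assumed definition)."""
    turnaround = float(row["finish_time"]) - float(row["submission_time"])
    execution = float(row["finish_time"]) - float(row["starting_time"])
    return turnaround / execution


def short_job_means(rows, threshold=60.0):
    """Mean waiting time and slowdown of jobs that ran for at most `threshold` seconds."""
    short = [r for r in rows if float(r["execution_time"]) <= threshold]
    return (mean(float(r["waiting_time"]) for r in short),
            mean(slowdown(r) for r in short))


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
mean_wait, mean_sd = short_job_means(rows)
print(f"mean waiting time: {mean_wait}, mean slowdown: {mean_sd}")
```

The 60-second threshold mirrors the `longer_than_one_minute` column of the R script. A job with a zero execution time would make `slowdown` divide by zero, so real data may need a guard.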
Todo
We may think of more interesting things to plot while remaining simple. This is not easy on this toy workload though…
Maybe include the plot in this document if it is interesting.
Visualizing Gantt charts
Evalys can use the PREFIX_jobs.csv output file to plot the Gantt chart of the jobs with the following script.
from evalys.jobset import JobSet
from evalys import visu
js = JobSet.from_csv("PREFIX_jobs.csv")
visu.gantt.plot_gantt(js)
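To make explicit what data a Gantt chart needs, the sketch below turns job rows into one rectangle per (job, machine) pair, using only the standard library. It assumes that the jobs file has `job_id` and `allocated_resources` columns, and that `allocated_resources` is a space-separated list of machine IDs and `low-high` ranges; check both assumptions against your own file. This is not how Evalys works internally: the helper names are hypothetical and the sample rows are made up.

```python
import csv
import io

# Made-up rows for illustration; the allocated_resources format is an assumption.
SAMPLE = """job_id,starting_time,finish_time,allocated_resources
1,0,30,0-2
2,30,50,1 3
"""


def parse_interval_set(text):
    """Expand a string such as "0-2 5" into the machine IDs [0, 1, 2, 5]."""
    ids = []
    for part in text.split():
        if "-" in part:
            low, high = part.split("-")
            ids.extend(range(int(low), int(high) + 1))
        else:
            ids.append(int(part))
    return ids


def gantt_rectangles(rows):
    """Yield one (job_id, machine, start, duration) tuple per machine a job used."""
    for row in rows:
        start = float(row["starting_time"])
        duration = float(row["finish_time"]) - start
        for machine in parse_interval_set(row["allocated_resources"]):
            yield (row["job_id"], machine, start, duration)


rectangles = list(gantt_rectangles(csv.DictReader(io.StringIO(SAMPLE))))
for rect in rectangles:
    print(rect)
```

Each tuple maps directly to a horizontal bar: the machine gives the row, the start gives the left edge and the duration gives the width.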
Todo
Introduce ViTE here and show an output example.