objectification of MC simcall achieved -- many tests still failing
* MC_RANDOM is now a regular simcall, made observable by its SimcallInspector
- remove it from simcall.in and remove all relevant elements
- Other simcalls that are visible from MC (ie, that are generated by
simcall.py and are handled in MC with big switch cases) are not
modified at all, even if the patch reindent them since this execution
path is now in a new conditionnal.
* SimcallInspectors are used everywhere where a specific handling was made for MC_RANDOM
- In MC_state_choose_request_for_process, we get the next value to
be picked by MC_random (or the next comm to be picked by waitany)
depending on the amount of times we consider this simcall (ie, on the
amount of decision forks we did on that transition while exploring the
state space.
This operation was not named before, it's now called is_pending()
because it returns whether we should do another decision fork from
that transition. Maybe is_branching() would be a better name.
Or consider() since its parameter is times_considered, and there
is one call location where its return value is ignored (when we
replay all transitions on the path leading to the state we want to
restore). But there is already a mc::ActorState::consider() where
the Checker express interest in a given simcall.
- The two other generic methods of SimcallInspectors are to_string()
to get a textual representation of the transtion (eg to display the
backtrace leading to the property violation), and dot_label() to
get a label that can be used in dot(1). This is useful to get a
visual representation of the state space.
- There is a visible() method too, but I guess it's useless: Only
visible simcall will be given an inspector. Or it will be more
complex, because differing model checking algorithms will be
interested in differing sets of simcalls (so visible() should be
Checker-dependend). Leave it as is for now.
- RandomSimcall is also implementing a get_value() method that
allows its implementation. At the end of the day, the connexion
between the actor side (out of the kernel) and the model-checker side
is made with this code:
auto observer = new simgrid::mc::RandomSimcall(SIMIX_process_self(), min, max);
return simgrid::kernel::actor::simcall([observer] { return observer->get_value(); }, observer);
The RandomSimcall object is really making the link between the two worlds.
- When we will convert the other simcalls, something will be needed
to handle transition dependency. Since MC_RANDOM is never dependent
with anything, this is left unhandled in SimcallInspector for now.
One problem at a time.
* SimcallInspectors live in the AppSide while these information are
needed in the ModelChecker side. So the network protocol was
extended to allow the Modelchecker to get this information thru RPC.
This is not satisfactory because it probably slows things down, even
if it's hard to say for sure because many tests just deadlock with it.
When the AppSide is dead, you'd better not ask for a RPC to collect
information about the simcalls that lead to the issue.
I extended the State::Transition object (that is used to both replay
a trace leading to a state I want to restore and to display the
backtrace that lead to a violation) to include the textual
representation of the transitions, to not do a RPC to the dead
application on violation.
But when the application abort(), I seem to have a stack corruption
on the app side or something. And debugging the verified application
is made very difficult by the Dwarf inspector that prevents the
verification from starting if the subprocess is not compiled in one
segment only (valgrind violates this rule). Plus valgrind has a
subprocess itself, probably defeating the checkpoint/restore
mechanism. Maybe I should try the Sanitizers.
* A better approach (if we fail to fix the curreny design) could be to
pre-compute on AppSide all information that can be extracted from
the inspectors, and copy it over to the CheckerSide once per
scheduling round. That would speed things up, and ensure that there
is no RPC after the death of the application. That should be done in
place of MC_process_refresh_simix_actor_dynar() that copies some
(less useful) information over.
21 files changed: