Advanced Gumby Experiments and Scenarios

In the previous tutorial, we showed how to run a simple experiment with Gumby. Gumby spawned two instances that ran independently of each other. Most of the time, however, you want more control over the actions performed during your experiment. In this tutorial, we will set up a more advanced experiment. Our experiment will spawn five instances that are time-synchronized with each other. After five seconds, each instance will write its ID to a file. When the experiment ends, we will sum up all these instance IDs and write the result to a file.

Configuration file

Our configuration file will look as follows:

experiment_name = synchronized_instances
experiment_time = 20
instances_to_run = 5
local_instance_cmd = "process_guard.py -c launch_scenario.py -n $INSTANCES_TO_RUN -t $EXPERIMENT_TIME -m $OUTPUT_DIR -o $OUTPUT_DIR "
post_process_cmd = post_process_write_ids.sh

# The scenario file to run after an instance has spawned.
scenario_file = write_ids.scenario

# The port of the synchronization server.
sync_port = __unique_port__

Most of the configuration options should be familiar at this point. The two new options, sync_port and scenario_file, are annotated with comments in the configuration file. The sections below explain their effect.

Scenario Files

Note that we specify a scenario file in our configuration file. A scenario file lists all events that occur during our experiment. Each event includes the time at which it should fire and a reference to a Python method. The scenario file is a convenient way of building an experiment and can easily be reused. For example, a scenario file can contain a particular workload that the system should process. The scenario file used in our experiment looks as follows:

&module docs.tutorials.simple_module.SimpleModule

@0:5 write_peer_id
@0:9 stop

The first line specifies which module to include. We will explain experiment modules later in this tutorial. For now, we will focus on the two events that are specified in this scenario file:

@0:5 write_peer_id
@0:9 stop

The above two events are scheduled to fire five and nine seconds after the experiment starts, respectively. The first event calls the write_peer_id method, which simply writes the ID of the instance, or peer, to a file. Specifically, each Gumby instance is assigned a unique ID, starting from 1. The write_peer_id method is defined in the SimpleModule class in the simple_module.py file. This Python code is imported in the first line of our scenario file. The stop method is implemented in the ExperimentModule class, which is the superclass of SimpleModule.

By default, an event will be executed by all peers. One can restrict which instances run a particular event. For example, the line below specifies that only peer 2 writes its peer ID:

@0:5 write_peer_id {2}
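
To make the structure of an event line concrete, the sketch below parses a line such as the one above into a delay, a callback name, and an optional set of peers. It is not Gumby's own parser: the names parse_event, should_run, and ScenarioEvent are hypothetical, and the sketch assumes that the timestamp has the form minutes:seconds and that the braces contain a comma-separated list of peer IDs.

```python
import re
from typing import NamedTuple, Optional, Set

# Hypothetical pattern for lines such as "@0:5 write_peer_id {2}".
# Assumption: the timestamp is minutes:seconds and the optional braces hold
# a comma-separated list of peer IDs.
EVENT_RE = re.compile(r"^@(\d+):(\d+)\s+(\w+)(?:\s+\{([\d,\s]+)\})?\s*$")


class ScenarioEvent(NamedTuple):
    delay_seconds: int
    callback_name: str
    peers: Optional[Set[int]]  # None means "all peers"


def parse_event(line: str) -> ScenarioEvent:
    """Parse one event line into a ScenarioEvent (illustrative only)."""
    match = EVENT_RE.match(line.strip())
    if match is None:
        raise ValueError("not an event line: %r" % line)
    minutes, seconds, name, peers = match.groups()
    peer_set = None
    if peers is not None:
        peer_set = {int(p) for p in peers.split(",") if p.strip()}
    return ScenarioEvent(int(minutes) * 60 + int(seconds), name, peer_set)


def should_run(event: ScenarioEvent, my_id: int) -> bool:
    """An event without a peer restriction is executed by every peer."""
    return event.peers is None or my_id in event.peers
```

With this sketch, an unrestricted line yields peers set to None, so should_run returns True for every peer, whereas the restricted line above only matches the peer with ID 2.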

Experiment Modules

Gumby enables experiment designers to provide their functionality in separate modules. In the experiment described in this tutorial, we import the SimpleModule class in our experiment, which has the following content:

from gumby.experiment import experiment_callback
from gumby.modules.experiment_module import static_module, ExperimentModule


@static_module
class SimpleModule(ExperimentModule):
    """
    A very simple experiment module that has a single callback.
    """

    @experiment_callback
    def write_peer_id(self):
        """
        Simply write my peer ID to a file.
        """
        with open("id.txt", "w") as id_file:
            id_file.write("%d" % self.my_id)

This file contains the definition of the SimpleModule class. Note that this class is annotated with the @static_module decorator, signaling to Gumby that this class is an experiment module. This decorator is required to correctly import module logic in a scenario file. The class contains a single method, namely write_peer_id. This method simply writes the ID of the peer to a file named id.txt. Note that the name of this method corresponds to the event specified in our scenario file, and this method is invoked when the event fires. The method is annotated with an @experiment_callback decorator. This is required to correctly connect the event in the scenario file to the logic in the module.

An experiment can also import multiple modules. Gumby automatically imports these modules at runtime and registers the events to the available callbacks. This allows reuse of experiment logic across different experiments.
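
The link between an event name and a decorated method can be illustrated with a small stand-alone sketch. It does not reproduce Gumby's implementation: experiment_callback_sketch, ModuleSketch, and collect_callbacks are hypothetical names, and the sketch only assumes that a decorator marks methods and that a registry later looks them up by name.

```python
def experiment_callback_sketch(func):
    """Mark a method as callable from a scenario file (illustrative only)."""
    func.is_callback = True
    return func


class ModuleSketch:
    """Stand-in for an experiment module; records IDs instead of writing files."""

    def __init__(self, my_id):
        self.my_id = my_id
        self.written = []

    @experiment_callback_sketch
    def write_peer_id(self):
        self.written.append(self.my_id)

    def helper(self):
        # Not decorated, so it is not exposed to the scenario file.
        pass


def collect_callbacks(module):
    """Return a mapping from event name to bound method for marked methods."""
    callbacks = {}
    for name in dir(module):
        attr = getattr(module, name)
        if callable(attr) and getattr(attr, "is_callback", False):
            callbacks[name] = attr
    return callbacks
```

In this sketch, an event named write_peer_id would be dispatched by looking up that name in the mapping returned by collect_callbacks, while helper stays invisible because it lacks the marker.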

Instance Synchronization

Each spawned instance independently executes the scenario file. This makes it important that all instances start at roughly the same time. Gumby includes a synchronization server that prepares peers for the experiment, assigns IDs, and makes sure that peers start executing the scenario file at roughly the same time. This synchronization process is coordinated by Gumby automatically. Note that, in contrast to the previous experiment, we do not set the experiment_server_cmd configuration option to blank. This ensures that Gumby will spawn a synchronization server. The sync_port option in the configuration file specifies the port of the synchronization server and indicates to which port the spawned clients should connect. A value of __unique_port__ indicates that Gumby will pick a random free port at runtime.
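
The bookkeeping behind this process can be sketched in a few lines. The class below is a simplified, in-process stand-in and not Gumby's synchronization server: SyncCoordinatorSketch and its methods are hypothetical, there is no networking, and the only assumptions are that IDs are handed out in connection order starting from 1 and that the experiment may start once the expected number of peers has connected.

```python
class SyncCoordinatorSketch:
    """Hands out peer IDs and reports when every expected peer has connected."""

    def __init__(self, expected_peers):
        self.expected_peers = expected_peers
        self.connected = 0

    def register(self):
        """Register one peer and return its ID (IDs start from 1)."""
        if self.connected >= self.expected_peers:
            raise RuntimeError("more peers connected than expected")
        self.connected += 1
        return self.connected

    def can_start(self):
        """The go signal is only given once all expected peers are present."""
        return self.connected == self.expected_peers
```

For an experiment with five instances, can_start stays False until the fifth call to register, which mirrors the requirement that no peer begins its scenario before the others are ready.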

Post-experiment Data Aggregation

We now focus on the post-experiment script. This post-experiment script is executed after the scenario file is finished and reads all written peer IDs and sums them. Note that in the configuration file, we specify to run the post_process_write_ids.sh bash script, which looks as follows:

#!/usr/bin/env bash
gumby/docs/tutorials/post_process_write_ids.py .

graph_process_guard_data.sh

This simple script first executes the post_process_write_ids.py file and then calls the graph_process_guard_data.sh script to plot the graphs. The content of the post_process_write_ids.py file is as follows:

#!/usr/bin/env python3
import os
import sys

from gumby.statsparser import StatisticsParser


class IDStatisticsParser(StatisticsParser):
    """
    Simply read all the id.txt files created by instances and sum up the numbers inside them.
    """

    def aggregate_peer_ids(self):
        peer_id_sum = 0
        for _, filename, _ in self.yield_files('id.txt'):
            with open(filename, "r") as peer_id_file:
                read_peer_id = int(peer_id_file.read())
                peer_id_sum += read_peer_id

        with open("sum_id.txt", "w") as sum_id_file:
            sum_id_file.write("%d" % peer_id_sum)

    def run(self):
        self.aggregate_peer_ids()


# cd to the output directory
os.chdir(os.environ['OUTPUT_DIR'])

parser = IDStatisticsParser(sys.argv[1])
parser.run()

This Python file defines the IDStatisticsParser class, which is a subclass of StatisticsParser. The latter class provides basic functionality to quickly aggregate data generated by peers. Of particular interest is the yield_files method, which returns an iterator over the files created by experiment peers that match a particular pattern. In the aggregate_peer_ids method, we iterate through all the files named id.txt, read them, and sum the integer values they contain. The result is then written to the sum_id.txt file.
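
For readers who want to see the aggregation step without the StatisticsParser helper, the following sketch performs a comparable sum using only the standard library. It is an illustration rather than a replacement: sum_peer_ids is a hypothetical function, and it assumes that each peer's id.txt file lives somewhere below the given output directory and contains a single integer.

```python
import os


def sum_peer_ids(output_dir, filename="id.txt"):
    """Walk output_dir and add up the integers stored in every matching file."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(output_dir):
        if filename in filenames:
            with open(os.path.join(dirpath, filename), "r") as id_file:
                total += int(id_file.read().strip())
    return total
```

Unlike the parser above, this sketch returns the sum instead of writing sum_id.txt, which makes it easy to check the value from an interactive session.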

Running the Experiment

You can run the experiment with the following command:

$ gumby/run.py gumby/docs/tutorials/synchronized_instances.conf

This will execute the experiment described above. You should see something similar to the log lines below:

INFO:ProcessRunner:[] ERR: 2021-07-17 17:55:23,644:INFO:1 of 5 expected subscribers connected.
INFO:ProcessRunner:[] ERR: 2021-07-17 17:55:26,639:INFO:2 of 5 expected subscribers connected.
INFO:ProcessRunner:[] ERR: 2021-07-17 17:55:27,645:INFO:4 of 5 expected subscribers connected.
INFO:ProcessRunner:[] ERR: 2021-07-17 17:55:27,646:INFO:All subscribers connected!
INFO:ProcessRunner:[] ERR: 2021-07-17 17:55:27,646:INFO:1 of 5 expected subscribers ready.
INFO:ProcessRunner:[] ERR: 2021-07-17 17:55:27,649:INFO:All subscribers are ready, pushing data!
INFO:ProcessRunner:[] ERR: 2021-07-17 17:55:27,649:INFO:Pushing a 359 bytes long json doc.
INFO:ProcessRunner:[] ERR: 2021-07-17 17:55:27,649:INFO:1 of 5 expected subscribers received the data.
INFO:ProcessRunner:[] ERR: 2021-07-17 17:55:27,649:INFO:Data sent to all subscribers, giving the go signal in 1.0 secs.
INFO:ProcessRunner:[] ERR: 2021-07-17 17:55:27,650:INFO:Starting the experiment!

These log lines indicate that the spawned instances are connecting to the synchronization server and that the experiment starts only after all instances have been assigned an ID and are synchronized.

A file named sum_id.txt should have been created in the experiment output directory. This file should contain the value 15, which corresponds to the sum of the IDs of all participating peers (1+2+3+4+5). Note that Gumby also creates sub-directories to store the files created by individual peers. Each directory name corresponds to a peer ID. These sub-directories should contain the id.txt file created by the write_peer_id method.

This tutorial covered more advanced concepts of Gumby and showed how one can set up advanced experiments using scenario files and experiment modules. In the next tutorial, we show how to deploy and execute the above experiment on the DAS5 supercomputer.