Running Distributed Experiments on the DAS5
===========================================

In the previous tutorials we devised experiments that spawn multiple instances on a single computer.
In this tutorial, we show how to run an experiment on the `DAS5 compute cluster `_.
The DAS5 is a Dutch nation-wide compute infrastructure that consists of multiple clusters, managed by different universities.
This tutorial assumes that the reader has access to the DAS5 head nodes.

The experiment we will run on the DAS5 is the same experiment that we described in the previous tutorial.
In this experiment, we spawn multiple (synchronized) instances, and each instance writes its ID to a file five seconds after the experiment starts.
The experiment is started on the DAS5 head node and, before the instances spawn, Gumby automatically reserves a certain number of compute nodes.
Each compute node then spawns a certain number of instances, depending on the experiment configuration.
When the experiment ends, all data generated by the instances is collected by the head node.

The configuration file for this DAS5 experiment looks as follows:

.. literalinclude:: synchronized_instances_das5.conf

The new configuration options are annotated with some explanation.
The configuration file includes a ``local_setup_cmd`` configuration option, which specifies a command that is executed before the experiment starts.
The ``das4_setup.sh`` script checks the user quota on the DAS5 and invokes the ``build_virtualenv.sh`` script, which prepares a virtual environment with various Python packages.
To use this virtual environment, the ``use_local_venv`` option is set.

Additionally, there are a few configuration options that are specific to DAS5 experiments.
The ``node_amount`` configuration option indicates how many DAS5 compute nodes are used.
The maximum number of compute nodes in each cluster can be found `here `_.
In our experiment, we spawn 16 instances and use 2 compute nodes.
Gumby automatically balances instances over compute nodes, so in our experiment each compute node hosts 8 instances.
The ``node_timeout`` configuration option indicates the timeout of the experiment.
To prevent premature termination of an experiment, we recommend setting this value a bit higher than the time of the latest event in the scenario file.

To run this experiment, execute the following command on one of the DAS5 head nodes:

.. code-block:: bash

    $ gumby/run.py gumby/docs/tutorials/synchronized_instances_das5.conf
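
The arithmetic behind the ``node_amount`` and ``node_timeout`` options discussed above can be illustrated with a small, standalone Python sketch.
It is not part of Gumby and does not reproduce Gumby's actual balancing code.
The function names ``balance_instances`` and ``suggested_node_timeout`` are hypothetical, and so are the even split, the 30-second safety margin and the assumption that the timeout is expressed in seconds.

.. code-block:: python

    def balance_instances(total_instances, node_amount):
        """Spread instances over nodes as evenly as possible.

        Hypothetical helper: assumes an even split in which any
        remainder is handed out one instance at a time to the first nodes.
        """
        if node_amount <= 0:
            raise ValueError("node_amount must be positive")
        base, remainder = divmod(total_instances, node_amount)
        return [base + 1 if i < remainder else base for i in range(node_amount)]


    def suggested_node_timeout(latest_event_time, margin=30):
        """Return a timeout slightly above the latest scenario event.

        Hypothetical helper: both arguments are assumed to be in seconds,
        and the default margin of 30 is an arbitrary illustrative choice.
        """
        return latest_event_time + margin


    # 16 instances on 2 compute nodes, as in this tutorial.
    print(balance_instances(16, 2))
    # The instances write their ID five seconds after the experiment starts.
    print(suggested_node_timeout(5))

For the experiment in this tutorial, the first call yields 8 instances per node, which matches the distribution described above.
Check the annotated configuration file for the unit that ``node_timeout`` actually expects before reusing the second helper.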