REANA is a reproducible research data analysis platform developed at CERN. It is considered for Analysis Preservation in PHENIX due to the following features:
REANA workflows can be represented as Directed Acyclic Graphs (DAGs), which is reflected in the YAML schema based on the Common Workflow Language (CWL). Each computational step of a workflow may require its own Docker container. Individual steps can, however, be as simple as a shell command writing a comment to a log file, in which case a dedicated container would be redundant.
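To make the step-and-container structure more concrete, the sketch below writes a minimal two-step serial workflow definition from the shell. A serial workflow is simply the linear special case of a DAG. The file name example-reana.yaml, the step names, the container image and the commands are all illustrative assumptions and are not taken from any PHENIX analysis.

```shell
# Write a minimal, illustrative REANA workflow definition.
# Every name below (file, steps, image, paths, parameter) is a hypothetical placeholder.
# The quoted 'EOF' keeps the shell from expanding ${greeting}; REANA substitutes it later.
cat > example-reana.yaml <<'EOF'
inputs:
  parameters:
    greeting: hello
workflow:
  type: serial
  specification:
    steps:
      - name: produce
        environment: 'docker.io/library/python:3.11-slim'
        commands:
          - mkdir -p results && echo "${greeting}" > results/message.txt
      - name: consume
        environment: 'docker.io/library/python:3.11-slim'
        commands:
          - wc -c results/message.txt > results/size.txt
outputs:
  files:
    - results/size.txt
EOF
```

Each step names the container image it runs in ("environment") and the shell commands to execute there; the second step depends on the first only through the file it leaves in the shared workspace.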
Execution of workflows in REANA requires a properly configured REANA cluster. One such cluster is available to CERN users, and there are instances at other institutions. A test instance is currently being evaluated at BNL; it is reachable only from the internal BNL network. Access to a REANA cluster is controlled by its administrators, who grant access tokens to qualified users. The user interacts with a cluster over its network interface (HTTPS), either through the Web GUI, which gives a quick overview of workflows in various stages of execution, or through the CLI client, which gives full access to all REANA functions. Because the client can be scripted, interaction with the system can also be automated.
To access a REANA cluster, the user must be issued an access token by the administrators (the procedure may be specific to each institution hosting a REANA facility and typically involves visiting the requisite Web page). The REANA client must also be installed on the user’s machine. It is a Python-based tool, so this is best done using the “virtual environment” mechanism:
# create new virtual environment
virtualenv ~/.virtualenvs/reana
source ~/.virtualenvs/reana/bin/activate
# install reana-client (may need sudo)
pip install reana-client
The “activate” step must be repeated whenever a new shell or window is opened for interacting with REANA. An SSH tunnel is required to access the REANA cluster at BNL. Assuming a token has been obtained and an SSH tunnel has been established on port 30443, a test session might look like this:
# set REANA environment variables for the client
export REANA_SERVER_URL=https://localhost:30443
export REANA_ACCESS_TOKEN=________ # user's REANA token
# clone and run a simple analysis example
git clone https://github.com/reanahub/reana-demo-root6-roofit
cd reana-demo-root6-roofit
reana-client run -w root6-roofit
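The SSH tunnel assumed in the session above could be opened along the following lines. This is only a sketch: the gateway name, the internal REANA host name and the remote port are placeholders that do not come from this document, and the real values must be obtained from the cluster administrators. The block only assembles and prints the command, so it is safe to run as is.

```shell
# Placeholders only: none of these host names are real.
GATEWAY="username@ssh-gateway.example.org"   # hypothetical SSH gateway
REANA_HOST="reana.example.org"               # hypothetical internal REANA host
REANA_PORT=443                               # assumed HTTPS port on that host
LOCAL_PORT=30443                             # local port used by REANA_SERVER_URL

# -N: do not run a remote command; -L: forward LOCAL_PORT to REANA_HOST:REANA_PORT
TUNNEL_CMD="ssh -N -L ${LOCAL_PORT}:${REANA_HOST}:${REANA_PORT} ${GATEWAY}"
echo "Run in a separate terminal: ${TUNNEL_CMD}"
```

Once the tunnel is open and the two environment variables are set, running reana-client ping is a quick way to check that the server can be reached with the given token.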
By default, reana-client run looks up the workflow definition in the file reana.yaml
found in the current folder. The -w (“workflow”) option simply defines the name by which
this workflow will be known to the system; the name can be anything. To specify a different
workflow definition file and a different name, one might use something like:
reana-client run -f my_workflow_file.yaml -w my_custom_workflow_name
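When submissions are scripted, it may help to check the prerequisites before calling the client. The helper below is a minimal sketch, not part of reana-client: the function name run_workflow and the REANA_CLIENT variable are assumptions introduced here. Setting REANA_CLIENT=echo turns the call into a dry run that prints the arguments instead of contacting a cluster.

```shell
# Command used to reach REANA; override with REANA_CLIENT=echo for a dry run.
REANA_CLIENT="${REANA_CLIENT:-reana-client}"

# run_workflow FILE NAME  (hypothetical helper, not provided by reana-client)
# Checks the environment and the workflow file, then submits the workflow.
run_workflow() {
    wf_file="$1"
    wf_name="$2"
    if [ -z "${REANA_SERVER_URL:-}" ] || [ -z "${REANA_ACCESS_TOKEN:-}" ]; then
        echo "REANA_SERVER_URL and REANA_ACCESS_TOKEN must both be set" >&2
        return 1
    fi
    if [ ! -f "$wf_file" ]; then
        echo "workflow file not found: $wf_file" >&2
        return 1
    fi
    "$REANA_CLIENT" run -f "$wf_file" -w "$wf_name"
}
```

A call would then look like: run_workflow my_workflow_file.yaml my_custom_workflow_name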
Progress of REANA workflows can be tracked in the Web-based GUI provided by each cluster or via the CLI, reana-client. Likewise, output files generated by the workflows (including the example above) are available for download both via the GUI and the CLI. If a workflow is no longer useful, it can be deleted from the REANA system:
reana-client delete -w my_custom_workflow_name
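The tracking and download steps mentioned above might be scripted as in the sketch below. The function name fetch_outputs and the REANA_CLIENT variable are assumptions introduced for illustration; the status, ls and download subcommands belong to reana-client, but exactly which files download retrieves when none are named may depend on the client version.

```shell
# Command used to reach REANA; override with REANA_CLIENT=echo for a dry run.
REANA_CLIENT="${REANA_CLIENT:-reana-client}"

# fetch_outputs NAME DIR  (hypothetical helper, not provided by reana-client)
# Shows the workflow status, lists its workspace files, then downloads outputs to DIR.
# Each step runs only if the previous one succeeded.
fetch_outputs() {
    wf_name="$1"
    out_dir="$2"
    "$REANA_CLIENT" status -w "$wf_name" &&
    "$REANA_CLIENT" ls -w "$wf_name" &&
    "$REANA_CLIENT" download -w "$wf_name" -o "$out_dir"
}
```

Deletion is deliberately left out of the helper, so that a workflow is only removed by an explicit command once its outputs have been inspected.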
The (many) commands that can be used with the client can be listed with the
--help option. Some of the more useful options, mostly overriding default
values, are listed here:
-w  name of the workflow
-t  access token
-f  file (default is "reana.yaml")
-o  path to the directory where the files are to be downloaded
One of the available options in the definition of a REANA workflow is the "directories"
entry in the "inputs" section. This option is not mandatory; it is a convenience for cases
when the contents of a whole directory should be staged to the workspace of a running REANA
process. It can have unintended consequences: an attempt to stage a massive AFS folder, or
a directory on some other file system that is large or has inherent latency, may generate
a lot of network traffic on the submitting host and make the whole process take
an unreasonably long time. Issues with storage quotas on the REANA cluster are also
possible. Caution must be exercised.
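One way to exercise that caution is to measure a directory before listing it as an input. The helper below is a sketch under stated assumptions: the function name stage_size_ok and the idea of a size limit chosen by the user are introduced here and are not REANA features.

```shell
# stage_size_ok DIR LIMIT_KB  (hypothetical helper, not provided by REANA)
# Returns 0 if DIR occupies at most LIMIT_KB kilobytes, 1 otherwise.
stage_size_ok() {
    dir="$1"
    limit_kb="$2"
    # du -sk prints the total size in 1024-byte units, followed by the path
    size_kb=$(du -sk "$dir" | awk '{print $1}')
    if [ "$size_kb" -gt "$limit_kb" ]; then
        echo "$dir is ${size_kb} kB, above the ${limit_kb} kB limit; not staging" >&2
        return 1
    fi
    return 0
}
```

Note that du has to traverse the whole tree, so on a slow network file system the check itself takes time, although far less than uploading the contents would.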