# LOFAR PILOT

This is a small pipeline runner script that wraps Common Workflow Language ([CWL](https://www.commonwl.org/)) pipelines with [toil](https://toil.readthedocs.io).
It is compatible with the [LINC](https://git.astron.nl/RD/LINC) and [VLBI](https://git.astron.nl/RD/VLBI-cwl/) pipelines.

*This is a work in progress. Issues should be reported to [Matthijs van der Wild](mailto:matthijs.van-der-wild@durham.ac.uk).*

## Assumptions

This script assumes the following:

* All relevant input data is available in either the `$HOME` directory or a directory henceforth called `$BINDDIR`. The targets of any links in these directories should be accessible from the compute nodes, as these directories will be mounted during the relevant jobs.
* This script will be used with the SLURM queuing system on COSMA5 with the following options: `-p cosma5 -A durham -t 72:00:00`. If these options are not appropriate, or if this script is to be run on another SLURM-run cluster, set `$TOIL_SLURM_ARGS` before running.
* `$CWL_SINGULARITY_CACHE` is set and the corresponding path contains (a link to) a singularity container `vlbi-cwl.sif`. If it isn't set, a suitable container can be specified as detailed below.

## Execution

The script can be run as follows:

```
sh pilot.sh [options] $BINDDIR
```

The following options are available:

* `-h` prints the script usage with all available options (optional).
* `-r` restarts a failed pipeline. Use this if the script was run before but the pipeline failed.
* `-c` allows the pipeline to use the specified container (optional, VLBI pipeline only).
* `-i` points to your input JSON file. This can be any appropriate JSON file, as long as it is located in either `$HOME` or `$BINDDIR`.
* `-p` is a path to the pipeline repository (LINC and VLBI pipeline only).
* `--scratch` is a path to local scratch storage where temporary data can be written (optional). **`--scratch` must be local to the compute node. Nonlocal scratch storage will likely cause the pipeline to fail.**
* `--outdir` is a path relative to which intermediate files and final data products will be written. It will be created if it does not exist. If not specified, `$BINDDIR` is used instead.
* `--batch_system` specifies the queuing system to be used. Defaults to `slurm`. Use `single_machine` to run on the local node.
* `` is the workflow file name without extension, e.g. `delay-calibration` or `concatenate-flag` for the VLBI pipeline or `HBA_calibrator` or `HBA_target` for LINC.

## Notes

* Upon successful pipeline completion the results directory contains the following:
  * the pipeline data products,
  * the statistics gathered by toil.
* Jobstore files and intermediate pipeline data products are stored in a `toil` directory in `$BINDDIR`.
* Jobstore files can be removed by running `toil clean $BINDDIR/toil/_job`.
* Toil may not clear temporary files after the pipeline has finished. These have to be removed by hand.
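
The points listed under Assumptions can be checked before submitting a run. The following is a minimal sketch of such a pre-flight check, not part of `pilot.sh` itself: the function name `check_pilot_env` and its return codes are hypothetical, and exporting the COSMA5 options when `$TOIL_SLURM_ARGS` is unset is an assumption made here purely to make the effective settings visible.

```sh
#!/bin/sh
# Pre-flight sketch: verify the documented assumptions before calling pilot.sh.
# check_pilot_env is a hypothetical helper; it is not provided by pilot.sh.

check_pilot_env() {
    binddir=$1

    # The directory holding the input data must exist.
    if [ ! -d "$binddir" ]; then
        echo "BINDDIR '$binddir' is not a directory" >&2
        return 1
    fi

    # Unless a container is passed with -c, the cache variable must be set.
    if [ -z "${CWL_SINGULARITY_CACHE:-}" ]; then
        echo "CWL_SINGULARITY_CACHE is not set; specify a container with -c" >&2
        return 2
    fi

    # -e follows symbolic links, so a valid link to the container also passes.
    if [ ! -e "$CWL_SINGULARITY_CACHE/vlbi-cwl.sif" ]; then
        echo "vlbi-cwl.sif not found in $CWL_SINGULARITY_CACHE" >&2
        return 3
    fi

    # Keep a user-supplied TOIL_SLURM_ARGS; otherwise fall back to the
    # COSMA5 options quoted in the Assumptions section (assumption: setting
    # them explicitly is equivalent to leaving the variable unset).
    : "${TOIL_SLURM_ARGS:=-p cosma5 -A durham -t 72:00:00}"
    export TOIL_SLURM_ARGS
    return 0
}
```

Sourcing this snippet and running `check_pilot_env "$BINDDIR"` returns zero only if all checks pass, after which the script can be started as described under Execution. On a cluster other than COSMA5, export `TOIL_SLURM_ARGS` with options suitable for that cluster before calling the function, so that the fallback is not applied.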