UnifyFS Walkthrough
Access the Tutorial Cluster
The tutorial cluster has one login node and four available compute nodes.
This assumes you have already requested an account; if not, request one here.
SSH into the cluster:
ssh -p 2223 username@13.215.163.223
password: P@ssw0rd
Preparation
In this tutorial, we will run two programs with and without UnifyFS. To keep things tidy, we create a separate run directory for each application, and a UnifyFS working directory to store UnifyFS logs and internal metadata.
cd ~/
mkdir -p ./runs/ior/unifyfs-workdir
mkdir -p ./runs/flashx/unifyfs-workdir
UnifyFS has already been installed on the cluster and is available as a loadable module; simply load it:
module load unifyfs
After loading the module, these two commands become available: unifyfs and unifyfs-ls.
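To confirm the commands are on your PATH (a quick sanity check):
which unifyfs unifyfs-ls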
Applications Used in This Tutorial
This tutorial uses IOR and FlashX to demonstrate UnifyFS.
👉 Good news:
You do not need to change application source code to use UnifyFS.
You only need executables that are linked with the UnifyFS client library.
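If you want to verify that a given executable is linked with the UnifyFS client library, ldd is a quick check (a sketch; ./my_app is a placeholder for your own dynamically linked binary):
# Look for libunifyfs_mpi_gotcha among the dynamic dependencies
ldd ./my_app | grep -i unifyfs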
Part 1 - IOR Benchmarking
Load the IOR benchmark:
module load ior
We use Slurm to manage the tutorial cluster. Allocate 2 nodes with 8 CPU tasks per node:
salloc -N2 --tasks-per-node=8
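To verify the allocation, a quick check (this should print two distinct hostnames):
srun -N2 --tasks-per-node=1 hostname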
1. Run IOR without UnifyFS
Note the write bandwidth reported at the end of each run.
cd ~/runs/ior/
# Run IOR using 8 processes on 1 node
srun -N1 --tasks-per-node=8 \
ior -a MPIIO -t 4m -b 16m -o ./testFile
# Run IOR using 16 processes on 2 nodes
srun -N2 --tasks-per-node=8 \
ior -a MPIIO -t 4m -b 16m -o ./testFile
IOR Parameters Explained:
-a MPIIO: use the MPI-IO backend
-t 4m: 4 MB transfer size
-b 16m: 16 MB block size per process
-o: output file name
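With these settings, each process writes its 16 MB block in four 4 MB transfers, so the 8-process run produces an aggregate file of 8 × 16 MB = 128 MB and the 16-process run produces 16 × 16 MB = 256 MB.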
2. Run IOR with UnifyFS
Now repeat the same experiment using UnifyFS and compare the write bandwidth.
Setup and start UnifyFS servers
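# UNIFYFS_LOG_DIR: directory where the UnifyFS servers write their log files
# UNIFYFS_LOGIO_SPILL_DIR: node-local directory where the servers spill write
# data once their shared-memory region fills up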
export UNIFYFS_LOG_DIR=`pwd`/unifyfs-workdir/
export UNIFYFS_LOGIO_SPILL_DIR=/mnt/ssd/$USER
# Create spill directory on each allocated node
srun -N $SLURM_NNODES --tasks-per-node=1 mkdir -p $UNIFYFS_LOGIO_SPILL_DIR
# This command starts one UnifyFS server daemon per compute node
# (may take a few seconds)
unifyfs start -d -S `pwd`/unifyfs-workdir/
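Optionally, confirm the servers are up (unifyfsd is the UnifyFS server daemon; this sketch assumes pgrep is available on the compute nodes):
# Each node should report one running unifyfsd process
srun -N $SLURM_NNODES --tasks-per-node=1 pgrep -l unifyfsd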
Run IOR with and without UnifyFS
Key idea: Files written under /unifyfs are intercepted by the UnifyFS client library and stored in UnifyFS-managed storage.
# One node
srun -N1 -n8 --tasks-per-node=8 \
ior -a MPIIO -t 4m -b 16m -o ./testFile
srun -N1 -n8 --tasks-per-node=8 \
ior-unifyfs -a MPIIO -t 4m -b 16m -o /unifyfs/testFile
# Two nodes
srun -N2 -n16 --tasks-per-node=8 \
ior -a MPIIO -t 4m -b 16m -o ./testFile
srun -N2 -n16 --tasks-per-node=8 \
ior-unifyfs -a MPIIO -t 4m -b 16m -o /unifyfs/testFile
Inspect files stored in UnifyFS
unifyfs-ls
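Note that UnifyFS storage is ephemeral: anything still under /unifyfs is discarded when the servers shut down. To keep a file, copy it out first, for example with the unifyfs-stage utility. The sketch below assumes a manifest format of one source/destination pair per line and uses a hypothetical output path:
# Hypothetical manifest entry: stage testFile out of UnifyFS before shutdown
echo "/unifyfs/testFile $HOME/runs/ior/testFile.staged" > stage.manifest
srun -N1 -n1 unifyfs-stage stage.manifest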
Shut down UnifyFS
unifyfs terminate
Part 2 - FlashX
We have built a FlashX 3D Sedov simulation and packaged it as a module.
This module provides two executables:
flashx: original Sedov simulation
flashx-unifyfs: same simulation linked with UnifyFS
module load flashx-sedov
This assumes you are still inside the Slurm allocation from the IOR experiment; if not, allocate two nodes again.
Switch to the FlashX run directory.
cd ~/runs/flashx
Start UnifyFS (same as before)
export UNIFYFS_LOG_DIR=`pwd`/unifyfs-workdir/
export UNIFYFS_LOGIO_SPILL_DIR=/mnt/ssd/$USER
srun -N $SLURM_NNODES --tasks-per-node=1 mkdir -p $UNIFYFS_LOGIO_SPILL_DIR
unifyfs start -d -S `pwd`/unifyfs-workdir/
FlashX Input File
Copy the prepared flash.par file, which FlashX reads as its input configuration:
cp /home/sca26-tut/share/flash.par ./
The key parameter is basenm (last line), which controls the output file location and base name.
#basenm = "/unifyfs/sedov_"
basenm = "sedov_"
Run FlashX without UnifyFS
# Run the 3D Sedov simulation (takes roughly 2 minutes)
export HDF5_DO_MPI_FILE_SYNC=FALSE
srun flashx
After the run completes, inspect sedov.log and look for the writeCheckpoint
messages. These lines report the time spent writing checkpoint files.
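For example:
# Print the checkpoint-write timing lines from the FlashX log
grep writeCheckpoint sedov.log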
Run FlashX with UnifyFS
Edit the last line of flash.par:
basenm = "/unifyfs/sedov_"
This configuration instructs FlashX to write its output files to the UnifyFS file system.
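Equivalently, a sed one-liner (assuming flash.par still matches the copy above):
sed -i 's|^basenm = "sedov_"|basenm = "/unifyfs/sedov_"|' flash.par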
Now run the UnifyFS-enabled executable:
srun flashx-unifyfs
Finally, compare the checkpoint write times with and without UnifyFS. In this setup, UnifyFS typically reduces the checkpoint write time by about 50% (from roughly 4 seconds to around 2 seconds), though exact numbers may vary.
Inspect files and shut down UnifyFS
unifyfs-ls
unifyfs terminate
Optional: How These Applications Were Built
IOR
Modify the IOR Makefile to link against UnifyFS:
LDFLAGS = -L${UNIFYFS_DIR}/lib -lunifyfs_mpi_gotcha
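The same pattern applies to any MPI application; a generic sketch (my_app is a placeholder, not part of the IOR build):
# Link an arbitrary MPI program against the UnifyFS gotcha client library
mpicc -o my_app my_app.c -L${UNIFYFS_DIR}/lib -lunifyfs_mpi_gotcha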
FlashX - 3D Sedov
./setup Sedov -auto -3d +parallelio -nxb=16 -nyb=16 -nzb=16
cd object
make -j 8
This produces an executable called flashx.
To build the UnifyFS-enabled version, modify Makefile.h and include:
-L${UNIFYFS_DIR}/lib -lunifyfs_mpi_gotcha
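For example, by appending the flags to the library variable your site's Makefile.h already defines (LIB_MPI below is an assumption; use whichever variable your configuration links MPI libraries with):
# In Makefile.h (variable name is site-specific)
LIB_MPI += -L${UNIFYFS_DIR}/lib -lunifyfs_mpi_gotcha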