Cloudmask Workflow
Cloudmesh cc comes with an example workflow that runs Cloudmask, which is a program that develops a model to classify sections of satellite images. Information regarding Cloudmask can be found at https://github.com/laszewsk/mlcommons/tree/main/benchmarks/cloudmask#readme
Running the Cloudmask Workflow on Rivanna
To execute the workflow on UVA’s HPC supercomputer, Rivanna, first ensure that your UVA Computing ID is set with the following command, replacing the X’s with your ID:
me@mycomputer $ cms set username=XXXXXX
We assume you have properly configured the UVA VPN by following the steps located at https://in.virginia.edu/vpn. The steps include installing a digital certificate and installing Cisco VPN; please follow them fully.
Additionally, you must have used ssh-copy-id XXXXXX@rivanna.hpc.virginia.edu
to automate the password login to Rivanna, as well as have
set up a proper ssh-agent
on your local computer, for your ssh-key.
We also assume that, if your local machine runs Windows,
that your Git Bash is set to use only LF line endings.
Then, clone the mlcommons
repository on your local machine
and run the workflow:
me@mycomputer $ cd ~/cm
me@mycomputer $ git clone --config core.autocrlf=false https://github.com/laszewsk/mlcommons.git
me@mycomputer $ cd mlcommons
me@mycomputer $ pytest -v -x --capture=no benchmarks/cloudmask/target/rivanna-cloudmesh-cc/rivanna/run_cloudmask_workflow.py
The workflow iterates through the five GPUs available on Rivanna— A100, V100, P100, RTX2080, and K80— and runs the program three times on each GPU. Each run trains the model with 10, 30, and 50 epochs for benchmarking.
Upon completing a run, the logs and benchmarks of the program can be found in the target folder:
me@mycomputer $ ssh rivanna
rivanna $ cd /scratch/$USER/mlcommons/benchmarks/cloudmask/target
Additionally, the generated .h5
model file can be found in the
home directory:
rivanna $ cd ~/sciml_bench/outputs/slstr_cloud/
The program may take a while to run if the resources on Rivanna are being used by other jobs.