Table of Contents

Transferring files to/from Caviness

The following sections use the wiki's documentation conventions.

Be careful about modifications you make to your startup files (e.g. .bash*). Commands that produce output during login, such as VALET or workgroup commands, may cause your file transfer command or application to fail. Log into the cluster with ssh to see what is printed during login, remove or guard any commands that produce output, and then try the transfer again. See computing environment startup and logout scripts for help.
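A common remedy is to run output-producing commands only when the shell is interactive. Below is a minimal sketch for a ~/.bashrc; the vpkg_require line is illustrative, so substitute whatever commands your own startup files actually run.

  # Only run commands that print output in interactive shells, so that
  # non-interactive sftp/scp/rsync sessions are not broken by unexpected output.
  if [[ $- == *i* ]]; then
      vpkg_require gcc    # illustrative VALET command
  fi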

Common clients for file transfer

You can move data to and from the cluster using the following supported clients:

Command-line clients include:
  • sftp: Recommended for interactive, command-line use.
  • hpn-sftp: Recommended for interactive, command-line use with at least a 1 Gbps network connection and high-speed disk drives at both source and destination.1)
  • scp: Recommended for batch script use.
  • hpn-scp: Recommended for batch script use with at least a 1 Gbps network connection and high-speed disk drives at both source and destination.2)
  • rsync: Most appropriate for synchronizing the file directories of two systems when only a small fraction of the files have changed since the last synchronization (see the example after this list).
  • Globus: Web browser based and recommended for 'fire and forget' high-performance data transfers between systems within and across organizations. See Globus for more details.
  • Rclone: A command-line program to sync files and directories to and from popular cloud storage services. See Rclone on Caviness for setting up a remote configuration for Google Drive.
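For example, a minimal rsync invocation to push a local directory to the cluster over SSH might look like the following (the directory name, username, and destination path are illustrative):

  rsync -avz -e ssh fuelcell/ traine@caviness.hpc.udel.edu:/work/it_css/projects/fuelcell/

Only files that differ between the two sides are transferred, which is what makes rsync efficient for repeated synchronization of mostly unchanged directories.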
Graphical-user-interface clients include:
  • winscp: Windows only
  • fetch: Mac OS X only
  • filezilla: Windows, Mac OS X, UNIX, Linux
  • cyberduck: Windows, Mac OS X (command-line version available for Linux)
  • Globus: Web browser based. See Globus for more details.

Specifying the -c arcfour cipher option for command-line clients should provide the best transfer speeds.
  • sftp -c arcfour
  • hpn-sftp -c arcfour
  • scp -c arcfour
  • hpn-scp -c arcfour
  • rsync -e 'ssh -c arcfour'
  • rsync -e 'hpn-ssh -c arcfour'

This option is not available in PuTTY's command-line mode unless you use a saved session with your encryption cipher selection policy set to Arcfour (SSH-2 only) at the top of the list.

If you edit files on a Windows desktop and then transfer them back to the cluster, you may find that a file becomes "corrupt" during the transfer. The symptoms are subtle because the file appears to be fine, but it in fact contains CRLF line terminators, which cause problems when the file is read on a Linux cluster and produce very strange errors. For example, a batch job script such as submit.qs that you have used before and know is correct will no longer work, or an ABAQUS input file like tissue.inp that has worked many times before produces an error like Abaqus Error: Command line option "input" must have a value.

Use the utility file to check for CRLF line terminators and dos2unix to fix the file, as shown below:

[traine@login01 ABAQUS]$ file tissue.inp
tissue.inp: ASCII text, with CRLF line terminators
[traine@login01 ABAQUS]$ dos2unix tissue.inp
dos2unix: converting file tissue.inp to UNIX format ...
[traine@login01 ABAQUS]$ file tissue.inp
tissue.inp: ASCII text
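If many transferred files are affected, they can all be converted in one pass with find; a sketch (the file pattern is illustrative):

  find . -type f -name '*.inp' -exec dos2unix {} +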

Copying files to the cluster

To copy a file over an SSH connection from a Mac/UNIX/Linux system to any of the cluster's filesystems, type the generic command

scp «options» «local_filename» «HPC_username»@«HPC_hostname»:«HPC_filename»

Begin the «HPC_filename» with a "/" to indicate the full path name. Otherwise the name is relative to your home directory on the HPC cluster.
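For example, a destination path without a leading "/" is relative to your home directory on the cluster. In the illustrative command below (assuming the project directory already exists under your home directory), results.dat is copied into ~/project on Caviness:

  scp -c arcfour results.dat traine@caviness.hpc.udel.edu:project/results.dat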

Use scp -r to copy an entire directory, for example…

  scp -c arcfour -r fuelcell traine@caviness.hpc.udel.edu:/work/it_css/projects

copies the fuelcell directory in your local current working directory into the /work/it_css/projects directory on the Caviness cluster. The /work/it_css/projects directory on the Caviness cluster must exist, and traine must have write access to it.

Copying files from the cluster

To copy a file over an SSH connection to a Mac/UNIX/Linux system from any of the cluster's filesystems, type the generic command

scp «options» «HPC_username»@«HPC_hostname»:«HPC_filename» «local_filename»

Begin the «HPC_filename» with a "/" to indicate the full path name. Otherwise, the name is relative to your home directory.
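For example, a source path without a leading "/" is relative to your home directory on the cluster (the path and username below are illustrative):

  scp -c arcfour traine@caviness.hpc.udel.edu:project/results.dat .

The final period copies the file into your local current working directory.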

Use scp -r to copy the entire directory.

For example,

  scp -c arcfour -r traine@caviness.hpc.udel.edu:/work/it_css/project/fuelcell  .

will copy the directory fuelcell on the Caviness cluster into a new fuelcell directory in your local system's current working directory. (Note the final period in the command.)

Copying files between clusters

You can use GUI applications to transfer small files between clusters by way of your PC; however, this is highly inefficient for large files because of the multiple transfers and slower disk speeds involved, and you do not benefit from the arcfour cipher.

The command-line tools work the same on any Unix cluster. To copy a file over an SSH connection, first log in to the cluster you want to copy from, and then use the scp command to copy files to the destination cluster. Use the generic commands

ssh «options» «HPC_username1»@«HPC_hostname1»
scp «options» «HPC_filename1» «HPC_username2»@«HPC_hostname2»:«HPC_filename2»

Log in to «HPC_hostname1», and in the scp command begin both «HPC_filename1» and «HPC_filename2» with a "/" to indicate full path names. The clusters will most likely have different full path names.

Use ssh -A to enable agent forwarding and scp -r to copy the entire directory.3)

For example,

  ssh -A traine@farber.hpc.udel.edu
  cd archive/it_css/project
  scp -c arcfour -r fuelcell traine@caviness.hpc.udel.edu:/work/it_css/project/fuelcell

will copy the directory fuelcell from Farber to a new fuelcell directory on Caviness.

1) , 2)
hpn- command-line variants are based on High Performance SSH/SCP - HPN-SSH, OpenSSH 6.1 with hpn13v14 patch.
3)
If you are using PuTTY, skip the ssh step and connect to the cluster you want to copy from.