

Transferring files to/from Mills

The following sections use the wiki's documentation conventions.

Be careful about modifications you make to your startup files (e.g., .bash*). Commands that produce output, such as VALET commands, may cause your file transfer command or application to fail. Log in to the cluster with ssh to see what is printed during login, remove any commands in your startup files that produce output, and try again.
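One common way to keep startup files quiet during file transfers is to print only when the shell is interactive. The following is a minimal sketch, assuming bash is your login shell; the helper name flags_are_interactive is hypothetical, and the echo line stands in for whatever output-producing command (a VALET command, for instance) your own startup file contains.

```shell
# Sketch for ~/.bashrc. The helper name is hypothetical, not part of bash or VALET.
# Returns success when the given shell option flags include "i" (interactive).
flags_are_interactive() {
    case "$1" in
        *i*) return 0 ;;
        *)   return 1 ;;
    esac
}

# "$-" holds the option flags of the current shell. File transfer clients
# start a non-interactive shell, so nothing inside this block runs for them.
if flags_are_interactive "$-"; then
    echo "Interactive login"
    # Place other commands that print to the terminal here.
fi
```

With this guard in place, an ssh login still shows the message, while scp, sftp and rsync sessions should see no extra output.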

You can move data to and from the cluster using the following supported clients:

Command-line clients include:
  • sftp: Recommended for interactive, command-line use.
  • hpn-sftp: Recommended for interactive, command-line use with at least a 1 Gbps network connection and high-speed disk drives at both source and destination. 1)
  • scp: Recommended for batch script use.
  • hpn-scp: Recommended for batch script use with at least a 1 Gbps network connection and high-speed disk drives at both source and destination. 2)
  • rsync: Most appropriate for synchronizing the file directories of two systems when only a small fraction of the files have been changed since the last synchronization.

Graphical-user-interface clients include:
  • winscp: Windows only
  • fetch: Mac OS X only
  • filezilla: Windows, Mac OS X, UNIX, Linux
  • cyberduck: Windows, Mac OS X (command line version for Linux)

Specifying the -c arcfour cipher option for command-line clients should provide the best transfer speeds.
  • sftp -c arcfour
  • hpn-sftp -c arcfour
  • scp -c arcfour
  • hpn-scp -c arcfour
  • rsync -e 'ssh -c arcfour'
  • rsync -e 'hpn-ssh -c arcfour'

This option is not available for the PuTTY command line mode unless you use a saved session with your encryption cipher selection policy set to Arcfour (SSH-2 only) at the top of the list.
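Whether arcfour is accepted depends on the SSH client and server versions, and newer OpenSSH releases may no longer include it. The following is a minimal sketch for checking your local client, assuming an OpenSSH version that supports the ssh -Q cipher query; the helper name cipher_option is hypothetical.

```shell
# Hypothetical helper: reads a cipher list (one name per line) on stdin and
# prints "-c arcfour" if arcfour is listed; otherwise it prints nothing.
cipher_option() {
    if grep -qx 'arcfour'; then
        echo "-c arcfour"
    fi
}

# Ask the local OpenSSH client which ciphers it supports.
# 2>/dev/null hides the error if ssh is missing or too old for -Q.
opt=$(ssh -Q cipher 2>/dev/null | cipher_option)
echo "cipher option: ${opt:-none, use the client default}"
```

The resulting value can then be passed unquoted to a client, for example scp $opt, so that an empty result simply falls back to the default cipher.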

If you edit files on a Windows desktop and then transfer them back to the cluster, you may find that a file becomes "corrupt" during the transfer. The symptoms are subtle: the file appears to be fine but in fact contains CRLF line terminators, which cause problems when the file is read on a Unix cluster and can generate very strange errors. For example, a batch job script such as submit.qs that you have used before and know is correct may no longer work, or an ABAQUS input file such as tissue.inp that has worked many times before may produce an error like Abaqus Error: Command line option "input" must have a value.

Use the file utility to check for CRLF line terminators and dos2unix to fix them, as shown below.

[traine@mills ABAQUS]$ file tissue.inp
tissue.inp: ASCII text, with CRLF line terminators
[traine@mills ABAQUS]$ dos2unix tissue.inp
dos2unix: converting file tissue.inp to UNIX format ...
[traine@mills ABAQUS]$ file tissue.inp
tissue.inp: ASCII text
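When many files were edited on Windows, checking them one at a time is tedious. The following is a minimal sketch that lists every text file under a directory that contains CRLF line endings, assuming a grep that supports the -I option (GNU and BSD grep do); the function name list_crlf_files is hypothetical.

```shell
# Hypothetical helper: print the names of text files under directory $1
# that contain at least one line ending in a carriage return (CRLF).
list_crlf_files() {
    cr=$(printf '\r')   # a literal carriage return character
    # -l prints only file names; -I skips binary files.
    find "$1" -type f -exec grep -lI "${cr}\$" {} +
}

# Example: check the current directory.
list_crlf_files .
```

Each file in the list can then be converted with dos2unix as in the session above.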

To copy a file over an SSH connection from a Mac/UNIX/Linux system to any of the cluster's filesystems, type the generic command

scp «options» «local_filename» «HPC_username»@«HPC_hostname»:«HPC_filename»

Begin the «HPC_filename» with a "/" to indicate the full path name. Otherwise the name is relative to your home directory on the HPC cluster.

Use scp -r to copy an entire directory. For example,

  scp -c arcfour -r fuelcell traine@mills.hpc.udel.edu:/archive/it_css/projects

copies the fuelcell directory in your local current working directory into the /archive/it_css/projects directory on the Mills cluster. The /archive/it_css/projects directory on the Mills cluster must exist, and traine must have write access to it.
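For batch scripts, it can help to check the local directory and preview the command before anything is transferred. The following is a minimal sketch that reuses the username, hostname and cipher option from the example above; the function name upload_dir and the RUN variable are hypothetical conveniences, not part of scp.

```shell
# Hypothetical helper: upload_dir LOCAL_DIR REMOTE_DIR
# Copies LOCAL_DIR into REMOTE_DIR on Mills. Set RUN=echo to print the
# scp command instead of running it.
upload_dir() {
    local_dir=$1
    remote_dir=$2
    if [ ! -d "$local_dir" ]; then
        echo "error: $local_dir is not a directory" >&2
        return 1
    fi
    # ${RUN:-} expands to nothing by default, so scp runs directly.
    ${RUN:-} scp -c arcfour -r "$local_dir" "traine@mills.hpc.udel.edu:$remote_dir"
}
```

Running RUN=echo upload_dir fuelcell /archive/it_css/projects prints the command for review; dropping RUN=echo performs the copy.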

To copy a file over an SSH connection to a Mac/UNIX/Linux system from any of the cluster's filesystems, type the generic command

scp «options» «HPC_username»@«HPC_hostname»:«HPC_filename» «local_filename»

Begin the «HPC_filename» with a "/" to indicate the full path name. Otherwise, the name is relative to your home directory.

Use scp -r to copy the entire directory.

For example,

  scp -c arcfour -r traine@mills.hpc.udel.edu:/archive/it_css/project/fuelcell  .

will copy the directory fuelcell on the Mills cluster into a new fuelcell directory in your local system's current working directory. (Note the final period in the command.)

You can use GUI applications to transfer small files to and from your PC as a way to move data between clusters. However, this is highly inefficient for large files because each file must be transferred more than once and desktop disks are slower, so you do not benefit from the arcfour cipher.

The command-line tools work the same way on any Unix cluster. To copy a file between clusters over an SSH connection, first log on to the cluster that holds the file (cluster1) and then use the scp command to copy it to the other cluster (cluster2). Use the generic commands

ssh «options» «HPC_username1»@«HPC_hostname1»
scp «options» «HPC_filename1» «HPC_username2»@«HPC_hostname2»:«HPC_filename2»

Log in to «HPC_hostname1» and, in the scp command, begin both «HPC_filename1» and «HPC_filename2» with a "/" to indicate the full path name. The clusters will most likely have different full path names.

Use ssh -A to enable agent forwarding and scp -r to copy the entire directory.3)

For example,

  ssh -A traine@mills.hpc.udel.edu
  cd /archive/it_css/project
  scp -c arcfour -r fuelcell traine@farber.hpc.udel.edu:/home/work/it_css/project/fuelcell

will copy the directory fuelcell from Mills to a new fuelcell directory on Farber.
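Agent forwarding only helps if an SSH agent on your local system already holds a key. The following is a minimal sketch for checking this before running ssh -A, assuming OpenSSH's ssh-add is installed locally; the function name agent_status_message is hypothetical.

```shell
# Hypothetical helper: translate the exit status of "ssh-add -l" into a message.
# 0 = agent has keys, 1 = agent running but empty, 2 = no agent reachable.
agent_status_message() {
    case "$1" in
        0) echo "agent has keys loaded; ssh -A can forward them" ;;
        1) echo "agent is running but holds no keys; run ssh-add first" ;;
        2) echo "no agent reachable; start ssh-agent before using ssh -A" ;;
        *) echo "unexpected status $1 from ssh-add" ;;
    esac
}

# Query the local agent and report what was found.
status=0
ssh-add -l >/dev/null 2>&1 || status=$?
agent_status_message "$status"
```

If no key is forwarded, the scp step on the first cluster will typically fall back to prompting for a password on the second cluster.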


1) , 2)
hpn- command-line variants are based on High Performance SSH/SCP - HPN-SSH, OpenSSH 6.1 with hpn13v14 patch.
3)
If you are using PuTTY, skip the ssh step and connect to the cluster you want to copy from.
  • abstract/mills/transfer.txt
  • Last modified: 2018-05-23 12:25
  • by anita