
Transferring files to/from DARWIN

The following sections use the wiki's documentation conventions.

Be careful about modifications you make to your startup files (e.g. .bash*). Commands that produce output, such as VALET or workgroup commands, may cause your file transfer command or application to fail. Log into the cluster with ssh to check what happens during login, then modify your startup files to remove any commands that produce output and try the transfer again. See computing environment startup and logout scripts for help.
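One common way to keep startup files quiet during non-interactive sessions such as scp or sftp is to produce output only when the shell is interactive. The sketch below assumes a bash-style ~/.bashrc. The function name is_interactive and the echoed message are hypothetical placeholders for whatever your own startup file prints.

```shell
# Sketch for ~/.bashrc: only produce output in interactive shells.
# is_interactive is a hypothetical helper name, not a cluster-provided command.
is_interactive() {
    case "$-" in
        *i*) return 0 ;;   # an "i" in $- means the shell is interactive
        *)   return 1 ;;   # scp, sftp and batch sessions end up here
    esac
}

if is_interactive; then
    # Placeholder for any command that prints to the terminal
    echo "startup: interactive session"
fi
```

With a guard like this, an ssh login still shows the message, while a file transfer session sees no output from the startup file.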

You can move data to and from the cluster using the following supported clients:

Command-line clients include:
  sftp: Recommended for interactive, command-line use.
  scp: Recommended for batch script use.
  rsync: Most appropriate for synchronizing the file directories of two systems when only a small fraction of the files have changed since the last synchronization.
  Globus: Web-browser based and recommended for 'fire and forget' high-performance data transfers between systems within and across organizations. See Globus for more details.
  Rclone: A command-line program to sync files and directories to and from popular cloud storage services.

Graphical-user-interface clients include:
  winscp: Windows only
  fetch: Mac OS X only
  filezilla: Windows, Mac OS X, UNIX, Linux
  cyberduck: Windows, Mac OS X (command-line version for Linux)
  Globus: Web browser. See Globus for more details.

If you edit files on a Windows desktop and then transfer them back to the cluster, you may find that a file becomes "corrupt" during the transfer. The symptoms are subtle: the file appears to be fine but in fact contains CRLF line terminators, which cause problems when the file is read on a Linux cluster and generate very strange errors. For example, a batch job submission file such as submit.qs that you have used before and know is correct may no longer work, or an ABAQUS input file such as tissue.inp that has worked many times before may produce an error like Abaqus Error: Command line option "input" must have a value.

Use the file utility to check for CRLF line terminators and dos2unix to fix them, as shown below:

[traine@login01 ABAQUS]$ file tissue.inp
tissue.inp: ASCII text, with CRLF line terminators
[traine@login01 ABAQUS]$ dos2unix tissue.inp
dos2unix: converting file tissue.inp to UNIX format ...
[traine@login01 ABAQUS]$ file tissue.inp
tissue.inp: ASCII text
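If dos2unix is not installed on the system where you are working, the same check and repair can be approximated with standard tools. The sketch below builds a small throwaway file with CRLF endings, counts the affected lines, and strips the carriage returns. The file name sample.inp and the helper names count_crlf_lines and strip_cr are hypothetical.

```shell
# Hypothetical helpers; a sketch using only grep, tr, mv and mktemp.
CR=$(printf '\r')

# Print the number of lines that contain a carriage return.
# Note: grep exits non-zero when the count is 0.
count_crlf_lines() {
    grep -c "$CR" "$1"
}

# Rewrite the file in place with every carriage return removed.
strip_cr() {
    tr -d '\r' < "$1" > "$1.tmp" && mv "$1.tmp" "$1"
}

# Demonstration on a throwaway file with Windows line endings
workdir=$(mktemp -d)
printf 'line one\r\nline two\r\n' > "$workdir/sample.inp"
count_crlf_lines "$workdir/sample.inp"           # count before the fix
strip_cr "$workdir/sample.inp"
count_crlf_lines "$workdir/sample.inp" || true   # count after the fix
```

Unlike dos2unix, tr -d '\r' removes every carriage return in the file, not only those that precede a newline, so it is only appropriate for plain text files.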

To copy a file over an SSH connection from a Mac/UNIX/Linux system to any of the cluster's filesystems, type the generic command

scp «options» «local_filename» «HPC_username»@«HPC_hostname»:«HPC_filename»

Begin the «HPC_filename» with a "/" to indicate the full path name. Otherwise the name is relative to your home directory on the HPC cluster.

Use scp -r to copy an entire directory. For example,

  scp -r fuelcell traine@darwin.hpc.udel.edu:/lustre/it_css/users/1201/projects

copies the fuelcell directory in your local current working directory into the /lustre/it_css/users/1201/projects directory on the DARWIN cluster. The /lustre/it_css/users/1201/projects directory on the DARWIN cluster must exist, and traine must have write access to it.
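The difference between an absolute and a home-relative remote path can be made explicit before anything is transferred. The sketch below only prints the scp commands rather than running them. remote_target is a hypothetical helper, and the user, host, and paths are taken from the example above.

```shell
# Hypothetical helper: build the user@host:path argument that scp expects.
remote_target() {
    printf '%s@%s:%s\n' "$1" "$2" "$3"
}

# Leading "/": an absolute path on the cluster
echo scp -r fuelcell "$(remote_target traine darwin.hpc.udel.edu /lustre/it_css/users/1201/projects)"

# No leading "/": resolved relative to traine's home directory on the cluster
echo scp -r fuelcell "$(remote_target traine darwin.hpc.udel.edu projects)"
```

Printing the command first is a cheap way to confirm the destination before copying a large directory.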

To copy a file over an SSH connection to a Mac/UNIX/Linux system from any of the cluster's file systems, type the generic command

scp «options» «HPC_username»@«HPC_hostname»:«HPC_filename» «local_filename»

Begin the «HPC_filename» with a "/" to indicate the full path name. Otherwise, the name is relative to your home directory.

Use scp -r to copy the entire directory.

For example,

  scp -r traine@darwin.hpc.udel.edu:/lustre/it_css/users/1201/projects/fuelcell  .

will copy the directory fuelcell on the DARWIN cluster into a new fuelcell directory in your local system's current working directory. (Note the final period in the command.)

You can use GUI applications to transfer small files to and from your PC as a way to transfer between clusters; however, this is highly inefficient for large files because of the multiple transfers and slower disk speeds. As a result, you do not benefit from the arcfour encoding.

The command-line tools work the same way on any Unix cluster. To copy a file over an SSH connection, first log on to the first cluster (cluster1) and then use the scp command to copy files to the second cluster. Use the generic commands

ssh «options» «HPC_username1»@«HPC_hostname1»
scp «options» «HPC_filename1» «HPC_username2»@«HPC_hostname2»:«HPC_filename2»

Log in to «HPC_hostname1» and, in the scp command, begin both «HPC_filename1» and «HPC_filename2» with a "/" to indicate the full path name. The clusters will most likely have different full path names.

Use ssh -A to enable agent forwarding and scp -r to copy the entire directory (see footnote 1).

For example,

  ssh -A traine@caviness.hpc.udel.edu
  cd archive/it_css/project
  scp -r fuelcell traine@darwin.hpc.udel.edu:/lustre/it_css/users/1201/projects/fuelcell

will copy the directory fuelcell from Caviness to a new fuelcell directory on DARWIN.
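Agent forwarding only helps if an SSH agent is actually reachable from the shell on the first cluster. A minimal pre-flight check, run on the first cluster before scp, might look like the sketch below. It relies on the standard SSH_AUTH_SOCK environment variable, and the function name agent_socket_present is hypothetical.

```shell
# Hypothetical helper: succeed only if SSH_AUTH_SOCK names an existing socket.
agent_socket_present() {
    [ -n "${SSH_AUTH_SOCK:-}" ] && [ -S "${SSH_AUTH_SOCK}" ]
}

if agent_socket_present; then
    echo "agent socket found; scp can try the forwarded keys"
else
    echo "no agent socket; reconnect with ssh -A if you rely on forwarded keys"
fi
```

The check only confirms that a socket exists. It does not verify that the agent holds a key the second cluster accepts.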


1)
If you are using PuTTY, skip the ssh step and connect to the cluster you want to copy from.
  • Last modified: 2024-04-08 17:01
  • by anita