====== Transferring files to/from DARWIN ======
//The following sections use the wiki's [[:#documentation-conventions|documentation conventions]].//
Be careful about modifications you make to your startup files (e.g. ''.bash*''). Commands that produce output, such as VALET or workgroup commands, may cause your file transfer command or application to fail. Log into the cluster with ''ssh'' to check what happens during login, remove any commands that produce output from your startup files, and try again. See computing environment [[abstract:darwin:app_dev:compute_env#startup-and-logout-scripts|startup and logout scripts]] for help.
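One way to confirm that the startup files are quiet is to run a command that prints nothing on the cluster and see whether any text comes back. The following is a minimal sketch, not a supported tool: ''check_quiet_login'' and the ''SSH_CMD'' override are hypothetical names introduced here for illustration, and the sketch assumes that ''ssh'' can reach the cluster without extra options.

```shell
# check_quiet_login: hypothetical helper, not part of any cluster tooling.
# Runs the no-op command "true" on the remote host, so any text captured on
# standard output must have been printed by the startup files.
# SSH_CMD is an assumed override that lets you substitute another command for ssh.
check_quiet_login() {
    login="$1"    # e.g. traine@darwin.hpc.udel.edu
    status=0
    noise=$(${SSH_CMD:-ssh} "$login" true) || status=$?
    if [ "$status" -ne 0 ]; then
        echo "could not run a command on $login (exit status $status)" >&2
        return 2
    fi
    if [ -n "$noise" ]; then
        printf 'startup files printed output:\n%s\n' "$noise"
        return 1
    fi
    echo "login is quiet"
}
```

For example, ''check_quiet_login traine@darwin.hpc.udel.edu'' prints the offending text if a startup file writes to standard output during a non-interactive login.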
===== Common clients for file transfer =====
You can move data to and from the cluster using the following supported clients:
^ Command-line clients include: ^^
| **sftp**| Recommended for interactive, command-line use. |
| **scp**| Recommended for batch script use. |
| **rsync**| Most appropriate for synchronizing the file directories of two systems when only a small fraction \\ of the files have been changed since the last synchronization. |
| **Globus**| [[https://www.globusonline.org/|Globus]] is web-browser based and recommended for 'fire and forget' high-performance data transfers between systems within and across organizations.\\ See [[software:globus:globus|Globus]] for more details.|
| **Rclone**| [[https://rclone.org/|Rclone]] is a command line program to sync files and directories to and from popular cloud storage services. |
^ Graphical-user-interface clients include: ||
| [[http://udeploy.udel.edu/software/winscp/|WinSCP]]| Windows only |
| [[http://udeploy.udel.edu/software/fetch-ftp-client/|Fetch]] | Mac OS X only |
| [[http://filezilla-project.org/download.php?type=client|FileZilla]] | Windows, Mac OS X, UNIX, Linux |
| [[http://cyberduck.ch/|Cyberduck]] | Windows, Mac OS X (command line version for Linux) |
| [[https://www.globusonline.org/|Globus]] | Web browser. See [[software:globus:globus|Globus]] for more details.|
For **Windows** users who edit files on a Windows desktop and then transfer them back to the cluster, a file may become "corrupt" during the transfer. The symptoms are subtle: the file appears to be fine but in fact contains ''CRLF'' line terminators, which cause problems when the file is read on a Linux cluster and generate very strange errors. For example, a batch job script such as ''submit.qs'' that you have used before and know is correct may no longer work, or an ABAQUS input file such as ''tissue.inp'' that has worked many times before may produce an error like ''Abaqus Error: Command line option "input" must have a value.''.
Use the ''file'' utility to check for ''CRLF'' line terminators and ''dos2unix'' to fix them, as in this example:
  [traine@login01 ABAQUS]$ file tissue.inp
  tissue.inp: ASCII text, with CRLF line terminators
  [traine@login01 ABAQUS]$ dos2unix tissue.inp
  dos2unix: converting file tissue.inp to UNIX format ...
  [traine@login01 ABAQUS]$ file tissue.inp
  tissue.inp: ASCII text
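When several files were edited on Windows, checking them one at a time becomes tedious. The following is a minimal sketch of a helper that lists every regular file in a directory that contains a carriage return. ''list_crlf_files'' is a hypothetical name introduced here, and the sketch only looks at the top level of the directory.

```shell
# list_crlf_files: hypothetical helper for illustration.
# Prints the name of each regular file directly under DIR that contains at
# least one carriage-return character (the "CR" in CRLF).
list_crlf_files() {
    dir="${1:-.}"
    cr=$(printf '\r')
    for f in "$dir"/*; do
        [ -f "$f" ] || continue
        if grep -q "$cr" "$f"; then
            printf '%s\n' "$f"
        fi
    done
}
```

Each file it reports can then be converted with ''dos2unix'' as in the session above. Binary files may also contain carriage returns, so review the list before converting anything.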
===== Copying files to the cluster =====
To copy a file over an SSH connection from a Mac/UNIX/Linux system to any of the cluster's filesystems, type
the generic command
''scp'' <<//options//>> <<//local_filename//>> <<//HPC_username//>>''@''<<//HPC_hostname//>>'':''<<//HPC_filename//>>
Begin the <<//HPC_filename//>> with a "/" to indicate the full path name. Otherwise, the name is relative to your
home directory on the HPC cluster.
Use ''scp -r'' to copy an entire directory. For example,
  scp -r fuelcell traine@darwin.hpc.udel.edu:/lustre/it_css/users/1201/projects
copies the ''fuelcell'' directory in your local current working directory into the ''/lustre/it_css/users/1201/projects'' directory on the DARWIN cluster. The ''/lustre/it_css/users/1201/projects'' directory on the DARWIN cluster must exist, and ''traine'' must have write access to it.
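The two preconditions above (the destination exists and is writable) can be checked before the copy starts. The following is a minimal sketch under stated assumptions: ''copy_dir_to_cluster'' and the ''RUN'' hook are hypothetical names introduced here, and the destination path is assumed to contain no single quotes.

```shell
# copy_dir_to_cluster: hypothetical helper for illustration.
# RUN is an assumed hook: leave it unset to execute the commands, or set
# RUN=echo to print them instead of running them.
copy_dir_to_cluster() {
    src="$1"      # local directory, e.g. fuelcell
    login="$2"    # e.g. traine@darwin.hpc.udel.edu
    dest="$3"     # existing remote directory (must not contain single quotes)
    if [ ! -d "$src" ]; then
        echo "no such local directory: $src" >&2
        return 1
    fi
    # The destination must already exist and be writable by the remote user.
    if ! ${RUN:-} ssh "$login" "test -d '$dest' && test -w '$dest'"; then
        echo "remote directory missing or not writable: $dest" >&2
        return 1
    fi
    ${RUN:-} scp -r "$src" "$login:$dest"
}
```

With ''RUN=echo'' set, calling ''copy_dir_to_cluster fuelcell traine@darwin.hpc.udel.edu /lustre/it_css/users/1201/projects'' prints the ''ssh'' and ''scp'' commands it would run, which is a convenient way to review them first.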
===== Copying files from the cluster =====
To copy a file over an SSH connection to a Mac/UNIX/Linux system from any of the cluster's filesystems, type
the generic command
''scp'' <<//options//>> <<//HPC_username//>>''@''<<//HPC_hostname//>>'':''<<//HPC_filename//>> <<//local_filename//>>
Begin the <<//HPC_filename//>> with a "/" to indicate the full path name. Otherwise, the name is relative to your
home directory.
Use ''scp -r'' to copy an entire directory.
For example,
  scp -r traine@darwin.hpc.udel.edu:/lustre/it_css/users/1201/projects/fuelcell .
will copy the directory ''fuelcell'' on the DARWIN cluster into a new ''fuelcell'' directory in your local system's current working directory. (Note the final period in the command.)
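If the same directory is copied repeatedly and only a few files change between copies, ''rsync'' (listed among the command-line clients above) may be a better fit than ''scp -r''. The following is a minimal sketch, assuming ''rsync'' is installed on both systems; ''pull_dir'' is a hypothetical name introduced here.

```shell
# pull_dir: hypothetical helper for illustration.
# Copies the contents of SRC into DEST, sending only files that differ.
# -a recurses and preserves permissions and timestamps; -v lists each file sent.
# A trailing slash on the source means "the contents of this directory", so
# both arguments are normalized to end in exactly one slash.
pull_dir() {
    src="$1"     # e.g. traine@darwin.hpc.udel.edu:/lustre/it_css/users/1201/projects/fuelcell
    dest="$2"    # e.g. ./fuelcell
    rsync -av "${src%/}/" "${dest%/}/"
}
```

Adding the ''-n'' option to the ''rsync'' call turns it into a dry run that lists what would be transferred without copying anything.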
===== Copying files between clusters =====
You can use GUI applications to transfer small files to and from your PC as a way to transfer between clusters; however, this is highly inefficient for large files due to multiple transfers and slower disk speeds. As a result, you do not benefit from the //arcfour// encoding.
The command-line tools work the same way on any Unix cluster.
To copy a file over an SSH connection, first log on to the first cluster (cluster1) and then use the ''scp'' command to copy files to the second cluster (cluster2). Use the generic commands
''ssh'' <<//options//>> <<//HPC_username1//>>''@''<<//HPC_hostname1//>>\\
''scp'' <<//options//>> <<//HPC_filename1//>> <<//HPC_username2//>>''@''<<//HPC_hostname2//>>'':''<<//HPC_filename2//>>
Log in to <<//HPC_hostname1//>> and in the ''scp'' command
begin both <<//HPC_filename1//>> and <<//HPC_filename2//>> with a "/" to indicate the full path name. The clusters will most likely have different full path names.
Use ''ssh -A'' to enable agent forwarding and ''scp -r'' to copy the entire directory.((If you are using PuTTY, skip the ''ssh'' step and connect to the cluster you want to copy from.))
For example,
  ssh -A traine@caviness.hpc.udel.edu
  cd archive/it_css/project
  scp -r fuelcell traine@darwin.hpc.udel.edu:/lustre/it_css/users/1201/projects/fuelcell
will copy the directory ''fuelcell'' from Caviness to a new ''fuelcell'' directory on DARWIN.
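Agent forwarding with ''ssh -A'' only helps if an SSH agent on your local system already holds a key. The following is a minimal sketch of a check to run locally before the ''ssh -A'' step; ''agent_has_keys'' is a hypothetical name introduced here, and the sketch relies on the documented exit statuses of ''ssh-add -l''.

```shell
# agent_has_keys: hypothetical helper for illustration.
# ssh-add -l exits with 0 when the agent lists keys, 1 when the agent holds
# no keys, and 2 when no agent can be contacted.
agent_has_keys() {
    status=0
    ssh-add -l >/dev/null 2>&1 || status=$?
    case "$status" in
        0) echo "agent has keys; ssh -A can forward them" ;;
        1) echo "agent is running but holds no keys; run ssh-add first"
           return 1 ;;
        *) echo "no usable agent found; start one with ssh-agent"
           return 1 ;;
    esac
}
```

If the check fails, the ''scp'' command on the first cluster will most likely fall back to prompting for a password on the second cluster.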