Managing Data

How do I transfer data between an HPC cluster and my local machine or another cluster?

 

HPC users can transfer data between their local machine and the clusters using either graphical file transfer software or command-line tools.


Available Software
Several file transfer programs are available to copy data between the HPC clusters and the user's computer.

- FileZilla is a multi-platform client commonly used to transfer data to and from the cluster.

- Cyberduck is another popular file transfer client for Mac or Windows computers.

- WinSCP is Windows-only file transfer software.

- Globus is another common solution, especially for larger transfers.

Data Transfer using Command Line

The commands rsync and scp are command-line tools to transfer data to and from the cluster, and they should be run on your computer, not on the cluster:

- To transfer something to an HPC cluster from your local computer:
     scp -r local-directory [USER]@[CLUSTER].hpc.ucdavis.edu:~/destination/

- To transfer something from a cluster to your local computer:
     scp -r [USER]@[CLUSTER].hpc.ucdavis.edu:~/[CLUSTER-DATA] local-directory
   

- To use rsync to transfer a file or directory from a cluster to your local computer:
     rsync -aP -e ssh [USER]@[CLUSTER].hpc.ucdavis.edu:~/[CLUSTER-DATA] .
 

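rsync has the advantage that if the connection is interrupted for any reason, you can press the up-arrow key and run the exact same command again. Files that were already copied are skipped, and because the -P option keeps partially transferred files, the interrupted file resumes where it stopped.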

See man scp and man rsync for more information.
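Because re-running the same rsync command is safe, the retry can be automated. The following is a minimal sketch, not an official tool: the function name pull_with_retry and the RSYNC and RETRY_DELAY variables are hypothetical names introduced here for illustration, and the bracketed values are the same placeholders used above.

```shell
# Hypothetical helper: re-run the same rsync command until it succeeds.
# Arguments: remote source, local destination, optional maximum attempts.
pull_with_retry() {
    local remote="$1"
    local dest="$2"
    local max_tries="${3:-5}"
    local try=1
    while [ "$try" -le "$max_tries" ]; do
        # -a preserves file attributes; -P keeps partial files and shows progress.
        # RSYNC is an assumed override for the rsync binary (default: rsync).
        if ${RSYNC:-rsync} -aP -e ssh "$remote" "$dest"; then
            return 0
        fi
        echo "Attempt $try failed; retrying in ${RETRY_DELAY:-10}s..." >&2
        sleep "${RETRY_DELAY:-10}"
        try=$((try + 1))
    done
    return 1
}
```

Run on your local computer, a call might look like this: pull_with_retry "[USER]@[CLUSTER].hpc.ucdavis.edu:~/[CLUSTER-DATA]" . 3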

If you are trying to copy data from a web server to your local machine or to another server, use one of the following commands:

*Note that you don't have to log into the cluster to run these commands, and the files shouldn't be password protected.*


scp -r [WEB-URL]:/home/[USER]/public_html/files path/to/the/directory/you/want/to/copy


scp -r [CLUSTER].hpc.ucdavis.edu:/home/[USER]/public_html/public_web_files /path/to/the/directory/you/want/to/copy/destination
 

Agent Forwarding

Agent forwarding is a Secure Shell (SSH) feature that lets a remote server use the SSH agent running on your local machine to authenticate onward connections, so your private key never has to be copied to the server.

This feature is particularly useful when HPC users need to access multiple clusters, or transfer data between clusters, without having to authenticate repeatedly.

Keep in mind:

- Users must have an account on each cluster to use agent forwarding.

- Users need to make sure the agent is enabled on their machine.

   - On Windows machines, enter the command ssh-agent in the command prompt to ensure it is enabled.

   - Ubuntu and macOS include a built-in agent, which can be verified using the command ssh-agent.

- If ssh-agent is not running, start it using:

  eval $(ssh-agent)

This command starts the ssh-agent and sets up the necessary environment variables.

- Once ssh-agent is running, add your SSH private key using the following command:
ssh-add path/to/your/private-key
 

- To verify that your private key has been added, use:
ssh-add -l

Once ssh-agent is running and your private key has been added, you can transfer data using the scp or rsync commands without re-entering your credentials for each connection.
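The steps above can be checked and put to use with a short sketch like the one below. It assumes the clusters permit agent forwarding. The function name agent_available and the placeholder [OTHER-CLUSTER] are hypothetical and introduced only for illustration. The -A option of ssh is the standard way to request agent forwarding for a single connection.

```shell
# Hypothetical check: is an ssh-agent socket visible to this shell?
agent_available() {
    [ -n "${SSH_AUTH_SOCK:-}" ] && [ -S "$SSH_AUTH_SOCK" ]
}

if agent_available; then
    echo "ssh-agent socket found at $SSH_AUTH_SOCK"
else
    echo "no ssh-agent visible in this shell; start one with: eval \$(ssh-agent)"
fi

# With a key loaded (ssh-add -l lists it), log in to the first cluster with
# forwarding enabled, then copy to a second cluster from there.
# All bracketed names are placeholders:
#   ssh -A [USER]@[CLUSTER].hpc.ucdavis.edu
#   rsync -aP -e ssh ~/[CLUSTER-DATA] [USER]@[OTHER-CLUSTER].hpc.ucdavis.edu:~/destination/
```

The two commented commands are left as comments because they need real account and cluster names. The second one is run on the first cluster after logging in.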

How do I transfer data from an HPC server to Box using SFTP?


Some users report that outgoing SFTP connections from the HPC clusters to Box appear to be blocked, but they are not. The limitation is on Box's side: Box supports FTPS, which is a different protocol from SFTP.

Here are the steps:
 

1. Go to https://ucdavis.account.box.com/login

2. Create an external password on Box.com: go to "Account Settings" and set up the password.

3. Create a folder on Box into which the files will be transferred.

4. Log into the cluster and change to the directory that contains the folder or files you want to copy.

5. Log into Box from the cluster location:

lftp -u <Box username which is just ucd email> ftps://ftp.box.com

6. Enter the external password for Box.com

7. Use the "mirror" command to copy a folder and its contents to Box. The -R flag reverses the direction (the equivalent of "put", sending files from the cluster to Box), and the -L flag (long form --dereference) transfers symbolic links as actual files:

mirror -RL --dereference <source> <target>

You can also use "get" to download and "put" to upload individual files:

get [OPTS] <rfile> [-o <lfile>]
put [OPTS] <lfile> [-o <rfile>]

Or use the help command to see the list of all available options.

8. Exit the Box session by typing exit.
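Steps 5 through 8 can also be combined into a single non-interactive call. This is a sketch under stated assumptions rather than a supported script: the function name box_upload and the LFTP override variable are hypothetical, and the Box username, source directory, and target folder are placeholders you supply. It relies on lftp's -e option, which runs the given commands after connecting. The trailing bye closes the session.

```shell
# Hypothetical wrapper around the interactive lftp steps.
# Arguments: Box username (UCD email), local source directory, Box target folder.
box_upload() {
    local box_user="$1"
    local src="$2"
    local target="$3"
    # -R uploads (reverse mirror); -L sends symlink targets as regular files.
    # No password is passed to -u, so lftp prompts for the Box external password.
    # LFTP is an assumed override for the lftp binary (default: lftp).
    ${LFTP:-lftp} -u "$box_user" \
        -e "mirror -RL \"$src\" \"$target\"; bye" \
        ftps://ftp.box.com
}
```

A call from the directory that holds the data might look like this: box_upload "<Box username>" "<source>" "<target>"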
 

How can I set files' access permissions for different groups and users?


If you want to set read, write, and execute permissions on your own files or on files in a shared directory, you can use the chmod command. You must be the owner of a file (or have root privileges) to change its permissions.
The chmod command defines or changes the permissions (modes) on files and restricts access to only those who are allowed it.

Example:
If you are the owner of a file called Confidential and want to change its permissions so that the user can read, write, and execute, group members can read and execute only, and others can only read, run the command below:

chmod u=rwx,g=rx,o=r Confidential
 

The command above changes the permissions of the file called Confidential so that the user can read (r), write (w), and execute (x) it, group members can only read (r) and execute (x) it, and others can only read (r) its content. The same command can be written using octal permission notation:

chmod 754 Confidential
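To see that the symbolic and octal forms are equivalent, the following sketch applies each one to a throwaway file and prints the resulting mode string. The temporary directory and the variable names are illustrative only. Each octal digit is the sum of read (4), write (2), and execute (1), so 7 = rwx, 5 = r-x, and 4 = r--.

```shell
# Work in a throwaway directory so no real files are modified.
demo_dir=$(mktemp -d)
touch "$demo_dir/Confidential"

# Symbolic notation: user rwx, group r-x, others r--.
chmod u=rwx,g=rx,o=r "$demo_dir/Confidential"
symbolic_mode=$(ls -l "$demo_dir/Confidential" | cut -c1-10)

# Reset to a different mode, then apply the octal equivalent.
chmod 600 "$demo_dir/Confidential"
chmod 754 "$demo_dir/Confidential"
octal_mode=$(ls -l "$demo_dir/Confidential" | cut -c1-10)

echo "symbolic: $symbolic_mode"
echo "octal:    $octal_mode"

rm -r "$demo_dir"
```

Both lines should show the same mode string, -rwxr-xr--, where the first character marks a regular file and the next nine are the user, group, and others permissions.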

The following tables show types of access restrictions and user restrictions:

[Image: User_Permissions_Chmod]

Read more about file and directory permissions at: https://help.ubuntu.com/community/FilePermissions