Horizon - Project storage (/data)
This page describes the "Horizon" /data mount, the dedicated project storage for the Research Computing supercomputers.
Cost
| Storage | Protocols | Purpose | Free Starting Amount | Cost per TB per year |
|---|---|---|---|---|
| Project-based Storage | NFS, SMB, Globus | Project file storage | 100GB | $50 |
Horizon Access
Project storage is available on both the Phoenix and Sol supercomputers under /data. This path is auto-mounted, so a project directory may not appear in the output of ls /data until it has been accessed, for example by running cd into it.
These filesystems are mounted on the clusters via NFS. By default, NFSv3 is used. If NFSv4 with support for file access control lists (FACLs) is required, please contact us.
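To see which NFS version a given project directory is currently mounted with, the mount options can be inspected. The following is a minimal sketch, assuming a Linux node where findmnt is installed; the helper names nfs_vers and nfs_vers_of_path are hypothetical and not part of any Research Computing tooling.

```shell
# Extract the NFS protocol version from a comma-separated mount-options string.
# (Hypothetical helper name, for illustration only.)
nfs_vers() {
    printf '%s\n' "$1" | tr ',' '\n' | sed -n 's/^vers=//p'
}

# Print the NFS version of the filesystem that contains a path.
# findmnt -T resolves the mount backing the given path; -n drops the header.
nfs_vers_of_path() {
    nfs_vers "$(findmnt -n -o OPTIONS -T "$1")"
}

# Example (replace <directory> with your project directory):
#   nfs_vers_of_path /data/<directory>
```

Because /data is auto-mounted, the directory should be accessed first (for example with cd) so that the NFS mount exists before it is queried.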
SMB access is available campus-wide over the Cisco VPN for shares that are explicitly configured for SMB.
Usage Guidelines
- Ensure that the data stored complies with our data policy and legal regulations. Horizon does not support storing sensitive data.
- Regular backups of critical data are recommended, even though our systems are designed for high reliability.
- For SMB access, follow standard security practices to maintain the integrity of your data.
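As one way to act on the backup recommendation above, critical files can be archived periodically to a location outside Horizon. This is a minimal sketch, not an official procedure: the function name backup_dir and the destination path in the example are hypothetical, and the destination should be storage you control that is not under /data.

```shell
# backup_dir SRC DEST_DIR
# Write a dated, compressed archive of SRC into DEST_DIR and print its path.
# (Hypothetical helper, for illustration only.)
backup_dir() {
    src="${1%/}"          # drop a trailing slash, if any
    dest_dir="$2"
    archive="$dest_dir/$(basename "$src")-$(date +%Y%m%d).tar.gz"
    # -C keeps paths inside the archive relative to the parent of SRC
    tar -czf "$archive" -C "$(dirname "$src")" "$(basename "$src")" || return 1
    echo "$archive"
}

# Example (both paths are placeholders):
#   backup_dir /data/<directory>/results /path/to/offsite/backups
```

The archive contents can be checked afterwards with tar -tzf on the printed path.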
Backup Policies
- Snapshots are taken at the array level, nightly.
  - Snapshots done on the array are resilient, but not redundant.
  - Snapshots are kept for a maximum of 14 days.
  - To recover data from a snapshot, see this guide.
- All data is kept on-site and is not replicated to secondary storage.
/data mounts not listing
/data serves many purposes, including community dataset storage, group-owned project storage, and shared class storage. In total, this amounts to several thousand directories.
However, if you list the contents of /data, you might see only a few directories. This is an efficiency feature: every directory is available at all times, but a directory does not appear in the listing until it is actively in use.
For example:
[rcsparky@sol-login01:/data]$ ls
amciilab bioxfel grp_rcadmins
[rcsparky@sol-login01:/data]$ cd datasets
[rcsparky@sol-login01:/data/datasets]$ ls
community
[rcsparky@sol-login01:/data/datasets]$ ls ..
amciilab bioxfel datasets grp_rcadmins
Note that this step is not required to use the data: all /data directories are available at all times on all nodes. It only affects whether a directory is enumerated in listings of /data.
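In scripts, this means a project directory should be checked by its full path rather than by searching the output of ls /data. A minimal sketch, assuming a POSIX shell; the function name project_dir_ready is hypothetical:

```shell
# project_dir_ready DIR
# Test a directory by accessing its full path, which is what triggers the
# automount, instead of looking for it in a listing of the parent directory.
# (Hypothetical helper, for illustration only.)
project_dir_ready() {
    dir="$1"
    # Resolving "$dir/." forces a lookup inside the directory itself
    if [ -d "$dir/." ]; then
        echo "available: $dir"
        return 0
    fi
    echo "not found: $dir" >&2
    return 1
}

# Example:
#   project_dir_ready /data/datasets
```

A check such as ls /data | grep datasets can report a false negative for a directory that has not been accessed recently, which is why the direct path test is preferred here.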
Web Portal
You may also navigate to /data from the web portal or another file browser. In all cases, typing the full path, /data/<directory> or /data/<directory>/<subdir2>/<subdir3...>, will work even if the directory is not currently listed.