File systems
Overview
Your home directory (the directory you start in when you log in) is intended for small files, such as scripts, notes, source code, and compiled libraries or programs. Your quota for this space is 10 GB.
If you need to read or write large datasets or many files, please use the ~/data or ~/scratch shortcuts in your home directory. These point to your directories within the /gpfs/data and /gpfs/scratch file systems, which are much larger and tuned for large data:
- Your scratch directory is temporary space with a 512 GB quota. It is not backed up, so use it only for program output that you do not need to keep for more than 30 days.
- Your data directory is backed up nightly, but has a 256 GB quota. This quota is allocated per Brown faculty member and shared by all members of a lab or research group. It can be increased by purchasing additional storage.
- There is also a per-user quota on the number of files in each file system. Home directories are limited to 200,000 files and scratch directories to 8,000,000 files. The limit for data directories varies according to your allocation (about 1000 files per GB, per group).
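To get a rough idea of how close a directory is to the limits listed above, you can count its files and add up their sizes. The following is a minimal sketch in Python, not a CCV-provided tool: the function names usage and data_file_limit are made up for illustration, and the totals are only an approximation of what the file system itself accounts for.

```python
import os
from pathlib import Path

# File-count limits quoted in the overview above.
# The data limit depends on the group's allocation, so it is computed instead.
FILE_LIMITS = {"home": 200_000, "scratch": 8_000_000}


def usage(root):
    """Return (file_count, total_bytes) for all files under root.

    Symbolic links to directories (such as ~/data and ~/scratch)
    are not followed, so each area is counted separately.
    """
    count = 0
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                total += os.lstat(path).st_size
            except OSError:
                continue  # file vanished or is unreadable; skip it
            count += 1
    return count, total


def data_file_limit(allocation_gb):
    """Approximate data file limit: about 1000 files per GB of allocation."""
    return allocation_gb * 1000


if __name__ == "__main__":
    home = Path.home()
    count, total = usage(home)
    print(f"{home}: {count} files of {FILE_LIMITS['home']} allowed")
    print(f"{home}: {total / 1e9:.2f} GB of 10 GB")  # assumes 1 GB = 10^9 bytes
```

Walking a large directory tree takes time and puts load on the file system, so a check like this is best run occasionally rather than inside a job loop.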
Full Details
CCV uses IBM's General Parallel File System (GPFS) for users' home directories, data storage, scratch/temporary space, and runtime libraries and executables. A separate GPFS file system exists for each of these uses, in order to provide tuned performance. These file systems are mounted as:
/gpfs/home – Home directories are located here. It is a relatively small file system on SAS disks and is optimized for small I/O operations (e.g., for compiling code). [Nightly backups]
/gpfs/data – Research data sets should be kept here. It is a large file system on SATA disks, and provides very high read bandwidth and moderate write bandwidth. In your home directory there is a link ~/data that points to your data directory. [Nightly backups]
/gpfs/scratch – Application output and temporary files should be written here. It is a large file system on SATA disks, and provides very high read and write bandwidth. In your home directory there is a link ~/scratch that points to your scratch directory. [No backups!]
Important: scratch is temporary storage and will be periodically purged by deleting files that are more than 4 weeks old. Any application output that you would like to save for long-term use must be copied from scratch to data!
/gpfs/runtime – CCV-managed libraries and executables are located here. These are read-only and accessible using the Modules system.
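Because files in scratch are purged once they are more than 4 weeks old, it can help to list the files that have reached that age before they disappear. The sketch below does this in Python. It assumes that a file's age is judged by its last modification time, which the text above does not specify, and the function name files_older_than is made up for illustration.

```python
import os
import time

PURGE_AGE_DAYS = 28  # "more than 4 weeks old", per the scratch policy above


def files_older_than(root, days=PURGE_AGE_DAYS, now=None):
    """Return a sorted list of files under root not modified in `days` days.

    Assumption: age is measured by modification time (st_mtime).
    The actual purge may use a different timestamp.
    """
    now = time.time() if now is None else now
    cutoff = now - days * 86400
    stale = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                mtime = os.lstat(path).st_mtime
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if mtime < cutoff:
                stale.append(path)
    return sorted(stale)


if __name__ == "__main__":
    scratch = os.path.expanduser("~/scratch")
    for path in files_older_than(scratch):
        print(path)
```

Any file this prints that you still need should be copied to your data directory.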
In general, your application should read any initial input data from data and write all output into scratch. Then, when the application has finished, move or copy data you would like to save from scratch to data.
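The read-from-data, write-to-scratch, copy-back pattern described above might look like the following in Python. This is a sketch under stated assumptions rather than a recommended implementation: run_job, the process callback, and the myproject paths are all hypothetical names chosen for illustration.

```python
import shutil
from pathlib import Path


def run_job(input_file, scratch_dir, data_dir, process):
    """Read input_file, write output into scratch_dir, then save it to data_dir.

    `process` is any function that maps one input line to one output line.
    Returns the path of the saved copy in data_dir.
    """
    input_file = Path(input_file)
    scratch_dir = Path(scratch_dir)
    data_dir = Path(data_dir)

    # 1. Write all output into scratch while the application runs.
    scratch_dir.mkdir(parents=True, exist_ok=True)
    output = scratch_dir / (input_file.stem + ".out")
    with open(input_file) as src, open(output, "w") as dst:
        for line in src:
            dst.write(process(line))

    # 2. When finished, copy the results worth keeping from scratch to data.
    data_dir.mkdir(parents=True, exist_ok=True)
    saved = data_dir / output.name
    shutil.copy2(output, saved)
    return saved


if __name__ == "__main__":
    home = Path.home()
    # Hypothetical project layout; substitute your own paths.
    saved = run_job(
        home / "data" / "myproject" / "input.txt",
        home / "scratch" / "myproject",
        home / "data" / "myproject" / "results",
        str.upper,
    )
    print(f"Saved {saved}")
```

Copying only the final results keeps the data directory within its quota, while intermediate files stay in scratch until they are purged.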
