
A New UNIX Hierarchical Storage Management Service - The hsm Command

Harry Renshall, IT/PDP

At the FOCUS meeting of 1 July 1999, Judy Richards presented an analysis of current file management and personal stage requirements. She discussed Hierarchical Storage Management (HSM), in which files are moved from a disk pool to a tape pool on a 'least recently used' basis. This is currently implemented at CERN using HPSS, the High Performance Storage System, an IBM service offering, but there is concern over the management and administration costs of this application. Its use is therefore limited to modest amounts of data and to straightforward applications such as Central Data Recording and large private user files. Access is via CERN interfaces, so that user data could be migrated transparently if the underlying storage manager were ever changed.

The main interfaces up to now have been the CERN staging commands, such as stagein, stageout and stagewrt, and the remote file copy command rfcp. Both will stay in use, but we have now packaged rfcp into a new command, called hsm, which hides the fact that the underlying storage manager is HPSS and will hence allow any future migration to another storage manager. The stage commands copy data between tape staging pools and the HSM filebase; the new hsm command copies data between local workstation or PC disks and the HSM filebase. Both sets of commands work on any files stored in the HSM filebase.

The basic functions of the hsm command approved by FOCUS are:

   hsm put     - copy a local file into the HSM filebase
   hsm get     - copy a stored file back to local disk
   hsm ls      - list (audit) stored files
   hsm delete  - delete a stored file
   hsm mkdir   - create a directory structure

The HSM service hence appears to users as a large back-end file store in which they have a personal home directory, currently of the form /hpss/, into which they can put local files (via hsm put), get them back (via hsm get), audit their stored files (via hsm ls), delete stored files (via hsm delete) and create directory structures (via hsm mkdir). The hsm command defaults to the home directory of the logged-in user. It is automatically available in the CERN ASIS tree and has a full man page (type "man hsm"). We have sized the resources in the back-end store assuming 500 to 1000 users, each storing 1 to 10 GB. Users must be registered in the CERN Computer Users database by their group administrator to use the HSM. Since the underlying software uses the DCE account registry, we have introduced a DCE service (at the same level as AFS or TAPES) into the XUSERREG command; group administrators should register the requested UNIX loginid in this service.
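As an illustration only (the real hsm command exists only on CERN systems), the shell sketch below mimics the put/get/ls/delete/mkdir interface described above against a temporary local directory standing in for the user's HSM home directory. The command names match the article; the file names and the stand-in function body are invented for the example:

```shell
# Stand-in "hsm" function: simulates the interface with plain cp/ls/rm/mkdir
# against a temporary directory. Not the real CERN command.
HSM_HOME=$(mktemp -d)   # stands in for the user's /hpss/ home directory

hsm() {
  cmd=$1; shift
  case $cmd in
    put)    cp "$1" "$HSM_HOME/${2:-$(basename "$1")}" ;;  # local -> store
    get)    cp "$HSM_HOME/$1" "${2:-.}" ;;                 # store -> local
    ls)     ls "$HSM_HOME/${1:-}" ;;                       # audit stored files
    delete) rm "$HSM_HOME/$1" ;;                           # remove from store
    mkdir)  mkdir -p "$HSM_HOME/$1" ;;                     # directory structure
  esac
}

echo "run 42 data" > run042.dat
hsm mkdir runs                        # create a directory in the store
hsm put run042.dat runs/run042.dat    # put a local file into the store
hsm ls runs                           # audit what is stored
hsm get runs/run042.dat copy.dat      # get the file back to local disk
hsm delete runs/run042.dat            # delete the stored copy
```

The local copies (run042.dat, copy.dat) survive the delete; only the stored copy is removed, as with the real service.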

The HSM service is intended for users with disk space requirements beyond those of AFS home directory or user project space, although performance can also be a reason to use it. Our current policy is to allow user AFS home directories to grow to a quota of up to 300 MB of what are typically small files (a few KB to several MB), held on well-backed-up, expensive disk space. There is an archive facility, called pubarch, which matches this type of data.

We have recently extended AFS project space to individual users, using cheaper but still backed-up disks for larger files (tens of MB) with quotas of hundreds of MB to a few GB. This space is suitable for interactive work with files up to a few tens of MB, as it uses the AFS caching mechanism, or for batch work on larger files where response is not critical. It is not, however, suitable for high-data-rate work with larger files, from tens of MB up to 2 GB. For this type of data we recommend using the hsm command to copy the file from the HSM to the local workstation disk and working with the local copy, which can be moved back to the HSM if modified. Note that, to the hsm command, AFS directories appear to be local disks. However, given the architecture of AFS it does not make sense to use the HSM filebase as an active back-end for AFS project space; users should rather try to obtain more project space. It is, on the other hand, reasonable to use the HSM as an archive for AFS project space data that is unlikely to be reused, since the permanent store of the HSM is cheap tape rather than expensive disk.
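The recommended cycle for large files (copy out, work locally, copy back only if modified) can be sketched as follows. Plain cp against temporary directories stands in for the CERN-internal hsm command, and the file names and the "analysis" step are invented for the example:

```shell
# Stand-ins for the HSM filebase and a local workstation disk.
STORE=$(mktemp -d)     # plays the role of the user's HSM home directory
SCRATCH=$(mktemp -d)   # plays the role of a local scratch disk

echo "raw event data" > "$STORE/run042.raw"    # a file already in the HSM

cp "$STORE/run042.raw" "$SCRATCH/run042.raw"   # step 1: hsm get to local disk
echo "calibration applied" >> "$SCRATCH/run042.raw"   # step 2: work on local copy
cp "$SCRATCH/run042.raw" "$STORE/run042.raw"   # step 3: hsm put back, since modified
```

If the local copy were read but not modified, step 3 would simply be skipped and the scratch copy deleted.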

We will be happy to discuss with group space administrators or individual users the appropriate storage to use in case of doubt.

For matters related to this article please contact the author.

Last Updated on Fri Sep 24 18:02:38 GMT+03:30 1999.
Copyright © CERN 1999 -- European Laboratory for Particle Physics