User-curated data¶
Namespace | CVMFS path | Read scope | Write scope |
---|---|---|---|
/igwn/cit/staging | N/A | read:/staging | write:/staging |
IGWN supports distributing user-curated data on OSDF as part of the /igwn/cit OSDF namespace.
These data are not curated.
These data are managed entirely by individual users, and are not curated or managed by the IGWN Computing team.
This namespace is primarily to support staging of user data for workflows where transferring directly to/from the Access Point may be prohibitively slow or may overwhelm that service.
Reading data from /igwn/cit¶
Data in the /igwn/cit namespace can be accessed using the OSDF client.
Downloading a file from /igwn/cit:
stashcp /igwn/cit/staging/duncan.macleod/hello.txt ./hello.txt
Reading data requires a valid SciToken.
Reading data from the /igwn/cit/staging namespace requires a valid SciToken with the read:/staging scope.
Authorised users can read data from any other user.
The read:/staging token scope grants read-only access to all data under the /igwn/cit/staging OSDF path, regardless of which user published it.
There is no support for publishing data under /igwn/cit while restricting access to a subset of IGWN members.
Data are not readable via CVMFS.
Unlike the other IGWN OSDF namespaces, data published into the /igwn/cit namespace are not made available via CVMFS.
They are only available using the OSDF client.
Reading data using a robot identity is currently unsupported.
Robot identities, such as those used by service systems or shared accounts, may be issued tokens with the read:/staging scope. However, such a token cannot currently be used to read data from the /igwn/cit/staging namespace.
The IGWN Computing group are actively attempting to address this limitation; for details, see computing/helpdesk#4805.
Publishing data into /igwn/cit¶
Publishing data from CIT¶
Data may be published into the /igwn/cit
OSDF namespace directly from any access point at the LDAS-CIT computing centre by copying the files to a directory you create under
/osdf/igwn/cit/staging/<USER>/
(where <USER>
should be replaced by your actual user name on that system). This can be done using the standard cp
command, or any similar utility.
These data will then be immediately visible to the OSDF client under the equivalent namespace path
/igwn/cit/staging/<USER>/...
(i.e. just remove the /osdf prefix).
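The prefix rule above can be sketched as a small shell helper. This is only an illustration: osdf_namespace_path is a hypothetical function name, not part of the OSDF client or any IGWN tooling, and it assumes the local mount point is exactly /osdf as described here.

```shell
# Map a path on the CIT /osdf mount to the matching OSDF namespace path.
# osdf_namespace_path is a hypothetical helper, not part of the OSDF client.
osdf_namespace_path() {
    local local_path="$1"
    case "$local_path" in
        /osdf/*)
            # Drop the leading /osdf and keep the rest of the path unchanged.
            printf '%s\n' "${local_path#/osdf}"
            ;;
        *)
            echo "error: '$local_path' is not under /osdf" >&2
            return 1
            ;;
    esac
}
```

The printed path is what would be passed to the OSDF client as the source when reading the file back from another site.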
Files copied into /osdf/igwn/cit/staging
cannot be modified.
For details see below.
Publishing data remotely¶
Data may also be published from any access point (not only at CIT) on the command line using the OSDF client, specifying the source via a file:// URL and the destination via an osdf:// URL:
stashcp file://</path/to/local/file> osdf:///igwn/cit/staging/<USER>/...
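When scripting uploads, it may help to build the destination URL in one place. The following is a minimal sketch under stated assumptions: osdf_staging_url is a hypothetical helper name, and the user name and relative path are whatever the caller supplies.

```shell
# Build the osdf:// destination URL for a file in a user's staging area.
# osdf_staging_url is a hypothetical helper; it only formats a string and
# does not contact any service or check that the path exists.
osdf_staging_url() {
    local user="$1" rel_path="$2"
    if [ -z "$user" ] || [ -z "$rel_path" ]; then
        echo "usage: osdf_staging_url USER RELATIVE_PATH" >&2
        return 1
    fi
    # Strip one leading slash so the URL has exactly one separator.
    rel_path="${rel_path#/}"
    printf 'osdf:///igwn/cit/staging/%s/%s\n' "$user" "$rel_path"
}
```

The resulting string can then be given as the second argument to the stashcp command shown above.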
Do not use stashcp directly in HTCondor workflows
The OSDF client should not be used directly in HTCondor workflows. Instead, OSDF URLs can be given as values in the transfer_output_remaps HTCondor submit option.
For full details see User-curated data.
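As a rough sketch of the remapping described above, the snippet below writes a minimal submit file from the shell. The file name example.sub, the executable job.sh, the output file result.txt and the user name albert.einstein are all placeholders chosen for illustration. Any token or credential settings your access point requires are deliberately left out, since they are not described here.

```shell
# Write a minimal HTCondor submit file that remaps a job output file to OSDF.
# job.sh, result.txt, example.sub and albert.einstein are placeholders.
# Credential/token settings are omitted; add whatever your access point needs.
cat > example.sub <<'EOF'
executable = job.sh
should_transfer_files = YES
when_to_transfer_output = ON_EXIT
transfer_output_files = result.txt
transfer_output_remaps = "result.txt = osdf:///igwn/cit/staging/albert.einstein/result.txt"
queue
EOF
```

With this form HTCondor's file transfer handles the upload when the job exits, so the job script itself never calls the OSDF client.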
Writing data requires a valid SciToken.
Writing data to the /igwn/cit/staging namespace requires a valid SciToken with the write:/staging scope.
An authorised user is only permitted to write to their own sub-path (/igwn/cit/staging/<USER>).
Files published into /igwn/cit/staging
cannot be modified.
For details see below.
Writing data remotely using a robot identity is currently unsupported by the IGWN token issuer.
Robot identities, such as those used by service systems or shared accounts, may be issued tokens with the write:/staging and read:/staging scopes. However, such a token cannot currently be used to read data from or write data to the /igwn/cit/staging namespace on access points that use the IGWN token issuer. Shared accounts (without the use of a robot identity) can nonetheless read and write data under the corresponding path in /igwn/cit/staging on access points that use an AP token issuer in HTCondor. For more details on these differences, please read the documentation.
The IGWN Computing group are actively attempting to address this limitation on the IGWN issuer; for details, see computing/helpdesk#4805.
Users of shared accounts can write data locally at CIT as detailed above, regardless of which token issuer HTCondor uses on that access point.
Published files cannot be modified¶
Files distributed through the OSDF are served via a global network of caches. Whenever your job runs, it will receive an input file from the nearest cache, which will not check whether there is a different version of the file at the origin server at CIT. If you change the contents of any file, it will be impossible to ensure that all jobs receive the updated version of that file.
Therefore, rather than changing the contents of a file, change its name or directory (or both), and submit a new job or workflow that points at this new path.
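One way to follow this advice is to embed a short content checksum in the published file name, so that changed contents always get a new path. The sketch below is an illustration only: versioned_name is a hypothetical helper, the 12-character digest length is an arbitrary choice, and it assumes the GNU coreutils sha256sum command is available.

```shell
# Print a file name that embeds a short checksum of the file's contents,
# e.g. config.ini -> config.<first 12 hex digits of SHA-256>.ini
# versioned_name is a hypothetical helper, not part of any IGWN tooling.
versioned_name() {
    local src="$1"
    local base ext digest
    if [ ! -r "$src" ]; then
        echo "error: cannot read '$src'" >&2
        return 1
    fi
    base="$(basename "$src")"
    # Keep only the first 12 hex characters of the SHA-256 digest.
    digest="$(sha256sum "$src" | cut -c1-12)"
    case "$base" in
        ?*.*)
            # The name has an extension: insert the digest before it.
            ext="${base##*.}"
            printf '%s.%s.%s\n' "${base%.*}" "$digest" "$ext"
            ;;
        *)
            # No extension: append the digest to the name.
            printf '%s.%s\n' "$base" "$digest"
            ;;
    esac
}
```

The printed name can be used as the destination file name when copying into the staging area, and the new job or workflow should then reference that name. A date or version number in the file name or directory would serve the same purpose.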