# Offline registration of instrument data
This guide assumes some familiarity with:
- Rucio client configuration
- IGWN Rucio kubernetes deployment model and the Rucio client utilities "kustomization" configuration
## Walkthrough
To illustrate offline batch registration, we'll run through an example where we register frame files for the calibrated KAGRA strain data "C20" for the post-O3 GEO600-KAGRA observing period known as O3GK.
### Check the data
In this example, the data exists on disk at the ICRR in Tokyo:
```
[ldr@pegasus-01 ~]$ find /gpfs/data/proc/C20/O3GK/K1 -name "*.gwf" | head -2
/gpfs/data/proc/C20/O3GK/K1/12714/K-K1_HOFT_C20-1271451680-32.gwf
/gpfs/data/proc/C20/O3GK/K1/12714/K-K1_HOFT_C20-1271406528-32.gwf
```
We have already configured a non-deterministic RSE at this end-point; its non-deterministic nature allows us to register datasets without imposing constraints on file paths at sites outside our control. We call this RSE `ICRR-STAGING`:
```
$ rucio-admin rse info ICRR-STAGING | grep 'deterministic\|site\|hostname\|prefix\|scheme\|port'
deterministic: False
site: Tokyo
hostname: kagra-ldr.icrr.u-tokyo.ac.jp
prefix: /data/proc
scheme: gsiftp
port: 2811
```
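For reference, an RSE along these lines is typically created with `rucio-admin`; the exact flags vary between client versions, so treat the following as a sketch and check `rucio-admin rse add --help` and `rucio-admin rse add-protocol --help` before use:

```bash
# Sketch only: create a non-deterministic RSE and attach a gsiftp protocol.
# Flag names should be verified against your rucio-admin client version.
rucio-admin rse add --non-deterministic ICRR-STAGING
rucio-admin rse add-protocol --hostname kagra-ldr.icrr.u-tokyo.ac.jp \
    --scheme gsiftp --port 2811 --prefix /data/proc ICRR-STAGING
```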
Confirm authenticated connectivity from the Nautilus k8s cluster using `gfal-ls` in a utilities pod:
```
$ kubectl exec registrar-client-utils-784ddf7c7-q7rsb -c gfal2-utils -- \
>     /usr/bin/gfal-ls -l gsiftp://kagra-ldr.icrr.u-tokyo.ac.jp:2811/data/proc
drwxr-xr-x 1 15002 1003 4096 Apr 18  2020 C00
drwxr-xr-x 1 15002 1003 4096 Mar 12  2020 C10
drwxr-xr-x 1 15002 1003 4096 Nov  2 07:00 C20
```
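Since the registrar will later compute checksums over this same protocol (visible in the job logs below), it can be worth spot-checking that remote checksumming works too. `gfal-sum` ships in the same gfal2-utils image; here we checksum one of the frame files found earlier:

```bash
# Compute the ADLER32 checksum of a single remote frame file
kubectl exec registrar-client-utils-784ddf7c7-q7rsb -c gfal2-utils -- \
    /usr/bin/gfal-sum \
    gsiftp://kagra-ldr.icrr.u-tokyo.ac.jp:2811/data/proc/C20/O3GK/K1/12714/K-K1_HOFT_C20-1271451680-32.gwf \
    ADLER32
```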
### Create the file list
Next, we need to enumerate the files we want to register. The `gwrucio_registrar` client accepts either an explicit list of physical file names (PFNs) or the ASCII dump from LDAS `diskcache` (or `pmdc`). Since this is a static dataset on a host to which we have login access, we use an explicit list of PFNs. From a host with POSIX access to the dataset, execute:
```bash
find /gpfs/data/proc/C20/O3GK/K1 -name "*.gwf" > O3GK:K1_HOFT_C20.pfns
```
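The job defined below stages this list in from `gsiftp://kagra-ldr.icrr.u-tokyo.ac.jp:2811/home/ldr/rucio/filelists`, so the list must end up under that directory on the end-point. Assuming the list was written on the end-point host itself, as above, that is just a move:

```bash
# The destination must match the staging URL configured in the job manifest below
mv O3GK:K1_HOFT_C20.pfns /home/ldr/rucio/filelists/
```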
This manual step may be eliminated in future, either by performing an equivalent operation remotely from the registrar client pod, or by using the cache dump produced by diskcache/pmdc under the control of a service manager or cron daemon. For offline registration, the cache dump is most useful for datasets which are updated regularly; PFN lists are useful for one-off registration jobs like this one, and for very large datasets which can be registered via multiple parallel jobs.
### Configure `gwrucio_registrar`

The registration client is configured via a simple `ini` file:

**`gwrucio_registrar` ini file**
```ini
###### Data discovery and registration configuration
[global]
; Placement of data in Rucio schema
scope=O3GK
dataset=K-K1_HOFT_C20

[data]
; Arguments used to extract PFNs from a diskcache dump
regexp=K1_HOFT_C20
minimum-gps=1000000000
maximum-gps=2000000000
extension=gwf

[common-metadata]
; This metadata will be attached to all files in the rucio database in this
; dataset. Each file's metadata will also include the frame "contents" (e.g.
; K1_HOFT_C20) and the GPS start/stop times.
ifo=K1
calibration=C20

################ K8s Job Configuration
[gwrucio]
; Name and location for remote access to the diskcache / file-list
diskcache=O3GK:K1_HOFT_C20.pfns
url=gsiftp://kagra-ldr.icrr.u-tokyo.ac.jp:2811/home/ldr/rucio/filelists
; Site for initial registration of data
rse=ICRR-STAGING
; gwrucio_registrar is a multi-threaded application
threads=8

[metadata]
; Used to label k8s jobs. Note k8s resource labels must obey RFC 1123.
name=o3gk-k1-hoft-c20
tier=jobs

[rucio]
; Rucio client configmap for e.g., Rucio server name and auth. Note k8s
; resource labels must obey RFC 1123.
configmap=ligo-rucio-cfg
configfile=ligo-rucio.cfg
```
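For intuition, the `[data]` filters above amount to something like the following shell pipeline when applied to a diskcache dump or PFN list (an illustration only, not how `gwrucio_registrar` is implemented):

```bash
# Keep paths matching the regexp and extension whose GPS start time
# (the second-to-last '-'-separated field) lies in [minimum-gps, maximum-gps]
grep 'K1_HOFT_C20' O3GK:K1_HOFT_C20.pfns | grep '\.gwf$' \
    | awk -F- '$(NF-1) >= 1000000000 && $(NF-1) <= 2000000000'
```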
### Kubernetes job configuration
Multiple registration jobs (offline) and deployments (online) are managed through kustomization overlays. The base kustomization file and the job manifest can be created automatically from the `ini` file above, using Jinja templates driven by a simple bash script. Using those templates, our configuration file yields a job directory:
```
O3GK-K1_HOFT_C20
├── kustomization.yaml
├── O3GK-K1_HOFT_C20.ini
└── o3gk-k1_hoft_c20.yaml
```

where

- `kustomization.yaml`: the kustomization "base" layer
- `O3GK-K1_HOFT_C20.ini`: a copy of the configuration file
- `o3gk-k1_hoft_c20.yaml`: the registration job manifest
The kustomization base generates a configMap from the `gwrucio_registrar` configuration file, which can then be mounted as a volume in the registration pod:
**Registration job base kustomization file** (generated from a bash script and Jinja templates)

```yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization

configMapGenerator:
  - files:
      - O3GK-K1_HOFT_C20.ini
    name: o3gk-k1-hoft-c20-ini

resources:
  - o3gk-k1_hoft_c20.yaml
```
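Before wiring the base into an overlay, you can sanity-check it by rendering it locally; `kubectl kustomize` prints the generated configMap and job manifest without applying anything:

```bash
# Render the base's manifests to stdout without touching the cluster
kubectl kustomize O3GK-K1_HOFT_C20/
```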
The registration job we create will first launch an initContainer to stage the file list from the end-point (or any location accessible with GFAL and our user certificate) into a shared volume in the pod. When the file list has been staged in, the actual `gwrucio_registrar` container spins up and registers the files in that list. The pod terminates on completion.
**Registration job manifest** (generated from a bash script and Jinja templates)

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: o3gk-k1-hoft-c20
  labels:
    tier: apps
spec:
  ttlSecondsAfterFinished: 3600
  template:
    spec:
      initContainers:
        # Stage in the file list / cache file
        - name: stage-data
          image: containers.ligo.org/rucio/igwn-rucio-client/gfal2-utils
          volumeMounts:
            - name: data
              mountPath: /data
            - name: proxy-volume
              mountPath: /tmp
          env:
            - name: X509_USER_PROXY
              value: "/tmp/x509up"
            - name: SRC
              value: "gsiftp://kagra-ldr.icrr.u-tokyo.ac.jp:2811/home/ldr/rucio/filelists/O3GK:K1_HOFT_C20.pfns"
            - name: DEST
              value: /data/O3GK:K1_HOFT_C20.pfns
      containers:
        - name: rucio-clients
          image: containers.ligo.org/rucio/igwn-rucio-client/rucio-clients
          command: ["/bin/bash", "-ce", "/docker-entrypoint.sh gwrucio_registrar ${CONFIG} -c ${DISKCACHE} -r ${RSE} -t ${THREADS}"]
          resources:
            limits:
              cpu: 1
              memory: 500Mi
              ephemeral-storage: 10Gi
            requests:
              cpu: 50m
              memory: 96Mi
              ephemeral-storage: 100Mi
          env:
            - name: X509_USER_PROXY
              value: "/tmp/x509up"
            - name: CONFIG
              value: "/opt/rucio/registrar/O3GK-K1_HOFT_C20.ini"
            - name: DISKCACHE
              value: "/data/O3GK:K1_HOFT_C20.pfns"
            - name: RSE
              value: "ICRR-STAGING"
            - name: THREADS
              value: "8"
          imagePullPolicy: Always
          volumeMounts:
            - name: rucio-cfg
              mountPath: /opt/rucio/etc
            - name: registrar-ini
              mountPath: /opt/rucio/registrar
            - name: sshkey
              readOnly: true
              mountPath: /root/.ssh
            - name: proxy-volume
              mountPath: /tmp
            - name: data
              mountPath: /data
      volumes:
        - name: rucio-cfg
          configMap:
            name: ligo-rucio-cfg
            items:
              - key: ligo-rucio.cfg
                path: rucio.cfg
        - name: registrar-ini
          configMap:
            name: o3gk-k1-hoft-c20-ini
            items:
              - key: O3GK-K1_HOFT_C20.ini
                path: O3GK-K1_HOFT_C20.ini
        - name: sshkey
          secret:
            secretName: client-ssh-key
        - name: proxy-volume
          secret:
            secretName: proxy-cert
            items:
              - key: x509up
                path: x509up
                mode: 0400
            optional: true
        - name: data
          emptyDir: {}
      restartPolicy: OnFailure
```
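The `stage-data` initContainer relies on its image entrypoint to perform the transfer; conceptually it does a GFAL copy of `${SRC}` to `${DEST}` along the lines of the sketch below (the actual entrypoint in the gfal2-utils image may differ):

```bash
# Sketch only: stage the file list from the end-point into the shared volume.
# SRC and DEST come from the initContainer's env block; -f overwrites any
# existing destination file.
gfal-copy -f "${SRC}" "file://${DEST}"
```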
### Add the job base to the overlay
To allow us to share configMaps (e.g. Rucio configuration files) and secrets (e.g. ssh keys) between registrar client instances, we add all deployments (online registration) and job bases (offline registration) to a single kustomization overlay. The job directory containing the kustomization base and job manifest is simply appended to the overlay's `bases` section. Deployments for online registration can be added as `resources`.
**Registration kustomization overlay.** This creates a kubernetes secret to store our rucio user ssh key and a configMap with the standard Rucio client configuration file. In this example, our offline job base is appended at the end of the file, after an online registration deployment.
```yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization

namePrefix: registrar-

secretGenerator:
  - files:
      - secrets/id_rsa
      - secrets/id_rsa.pub
    name: client-ssh-key

configMapGenerator:
  - files:
      - config/rucio.cfg
    name: rucio-cfg

resources:
  - online-registration-O3GK-GEO600.yaml

bases:
  - O3GK-K1_HOFT_C20
```
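For orientation, the overlay directory would then look roughly like this; the `config/` and `secrets/` paths are taken from the generators above, the rest is inferred:

```
.
├── kustomization.yaml                     # the overlay itself
├── config/
│   └── rucio.cfg
├── secrets/
│   ├── id_rsa
│   └── id_rsa.pub
├── online-registration-O3GK-GEO600.yaml   # online registration deployment
└── O3GK-K1_HOFT_C20/                      # offline job base from above
```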
### Deploy the job and register data
Deployment is simply a matter of applying the kustomization file to the appropriate namespace. The `kubectl` application will take care of creating all secrets, configMaps, deployments and jobs described in the kustomization overlay and base(s). Assuming the default namespace is appropriate and that the overlay kustomization resides in the current working directory, apply the kustomization with:
```
$ kubectl apply -k .
configmap/registrar-rucio-cfg-ffm79k22mc unchanged
secret/registrar-client-ssh-key-fm664m7d29 unchanged
deployment.apps/registrar-online-registration-O3GK-GEO600 configured
job.batch/registrar-o3gk-k1-hoft-c20 configured
```
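The job's pod name carries a random suffix; to find it, select on the `job-name` label that kubernetes attaches to job pods automatically:

```bash
# List the pods created by the job controller for this registration job
kubectl get pods --selector=job-name=registrar-o3gk-k1-hoft-c20
```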
Finally, monitor registration job progress by following the pod logs:
```
$ kubectl logs registrar-o3gk-k1-hoft-c20-nrjqc -f
2020-11-05 05:45:12,364 1 DEBUG Thread [3/8] : Computing checksums for file gsiftp://kagra-ldr.icrr.u-tokyo.ac.jp:2811/data/proc/C20/O3GK/K1/12714/K-K1_HOFT_C20-1271401152-32.gwf
2020-11-05 05:45:12,400 1 DEBUG Thread [5/8] : Checksums took 9s (gsiftp://kagra-ldr.icrr.u-tokyo.ac.jp:2811/data/proc/C20/O3GK/K1/12714/K-K1_HOFT_C20-1271418176-32.gwf)
2020-11-05 05:45:12,400 1 DEBUG Thread [5/8] : Working on file 17/4613
2020-11-05 05:45:12,510 1 DEBUG https://ligo-rucio.nautilus.optiputer.net:443 "GET /dids/O3GK/K-K1_HOFT_C20-1271416000-32.gwf/meta?plugin=DID_COLUMN HTTP/1.1" 404 132
2020-11-05 05:45:12,511 1 DEBUG Thread [5/8] : Computing checksums for file gsiftp://kagra-ldr.icrr.u-tokyo.ac.jp:2811/data/proc/C20/O3GK/K1/12714/K-K1_HOFT_C20-1271416000-32.gwf
```
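Once the job completes, a quick sanity check is to list the dataset contents with the standard Rucio client (from the utilities pod or any host with a configured client); the scope and dataset name come from the `[global]` section of the `ini` file:

```bash
# List the files attached to the newly registered dataset
rucio list-files O3GK:K-K1_HOFT_C20
```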