Pelican Docker Image

Run Pelican Server with Docker Image

This document explains how to run a Pelican server using Pelican docker image. If you are installing Pelican to use its client functionalities (download or upload an object), refer to previous sections to download and install a Pelican binary instead.

Before starting

Pelican builds separate images for each Pelican server components, e.g. origin, cache, director, and registry. Depending on which Pelican server you want to run, you need to select a different Docker image from the list below.

  • Pelican origin server: hub.opensciencegrid.org/pelican_platform/origin:latest
  • Pelican cache server: hub.opensciencegrid.org/pelican_platform/cache:latest
  • Pelican director server: hub.opensciencegrid.org/pelican_platform/director:latest
  • Pelican registry server: hub.opensciencegrid.org/pelican_platform/registry:latest

Note: the latest tag will pull the Pelican server image with the latest released version. You may pull an image with explicit Pelican version by passing the version as the tag (e.g. v7.6.4) instead. For a list of available tags, refer to Pelican Harbor repository (opens in a new tab).

Run Pelican server via Docker CLI

This section describes how to run various Pelican server images using the Docker CLI. If you haven't installed Docker engine, follow the documentation from Docker (opens in a new tab) to install and start the Docker engine first.

Run Pelican origin server

To run the latest pelican origin server, run the following command:

docker run -it -p 8444:8444 -p 8443:8443 -v /path/to/your/data/:/tmp/pelican --name=pelican-origin hub.opensciencegrid.org/pelican_platform/origin:latest serve -v /tmp/pelican:/foo/bar -f <federation>

Where:

  • docker run is a Docker CLI command that runs a new container from an image
  • -it (--interactive --tty) runs the container in interactive mode and uses a tty terminal
  • -p <host-port>:<container-port> (--publish) publishes a container’s port(s) to the host, allowing you to reach the container’s port via a host port. In this case, we can reach the container’s port 8444 via the host’s port 8444. Note that the admin website of Pelican servers run on port 8444 by default, and the objects transfer endpoint of the Pelican origin server runs on port 8443 by default.
  • -v <host-location>:<container-location> (--volume) binds mount a volume from the host location(s) to the containter's location(s). This allow you to share files in your host machine to the container. In this case, we bind /path/to/your/data/ on your host machine to /tmp/pelican in the container. You need to replace /path/to/your/data/ to the directory where your data to publish is located.
  • --name assigns a logical name to the container (e.g. pelican-origin). This allows you to refer to the container by name instead of by ID.
  • hub.opensciencegrid.org/pelican_platform/origin:latest is the image to run
  • serve is the command to run the Pelican binary
  • -v /tmp/pelican:/foo/bar is the Pelican argument to bind /tmp/pelican directory in the container as namespace /foo/bar in Pelican. You need to change /foo/bar to a meaningful path that can represent your data, e.g. /chtc/public-data. You may pass additional arguments to Pelican server by appending them after this argument.
  • -f <federation> sets the federation discovery URL, where <federation> is the URL to the federation the origin will be joining in. For instructions on how to find a federation to join, refer to Serve a Pelican Origin

Run Pelican cache server

To run the latest pelican cache server, run the following command:

docker run -it -p 8444:8444 -p 8442:8442 --name=pelican-cache hub.opensciencegrid.org/pelican_platform/cache:latest serve -f <federation>

Where most of the command overlaps with the one to run the Pelican origin server, with the following differences:

  • -p 8444:8444 -p 8442:8442 publishes port 8444 and 8442 from the container to the same port on the host machine. Note that the admin website of the Pelican server runs on port 8444 by default, and the objects transfer endpoint of the Pelican cache server runs on port 8442 by default.

Note that Pelican docker image currently does not support binding a directory on your host machine as the directory for Pelican cache to store cache files, i.e. using -v /host/machine:/run/pelican/cache/location. We will have an update once we officialy support this flow.

Run Pelican director server

To run the latest pelican director server, run the following command:

docker run -it -p 8444:8444 --name=pelican-director hub.opensciencegrid.org/pelican_platform/director:latest serve -f <federation>

Note that to successfully run a Pelican director server, additional configuration is required. Follow Serve a Pelican Federation for instructions.

Run Pelican registry server

To run the latest pelican registry server, run the following command:

docker run -it -p 8444:8444 --name=pelican-registry hub.opensciencegrid.org/pelican_platform/registry:latest serve -f <federation>

Note that to successfully run a Pelican registry server, additional configuration is required. Follow Serve a Pelican Federation for instructions.

Stop Pelican container

To stop the Pelican container, run the following command:

# The `docker ps` command shows the processes running in Docker
docker ps
 
# This will display a list of containers that looks like the following:
CONTAINER ID   IMAGE  COMMAND   CREATED  STATUS   PORTS    NAMES
0be1a304b5d7   hub.opensciencegrid.org/pelican_platform/director:latest   "/bin/sh"   1 hour ago   Up 1 hour   0.0.0.0:8444->8444/tcp   pelican-director
 
# To stop the pelican container run the command
# docker stop <container-ID> or use
# docker stop <container-name>, which is `pelican-director` as previously defined
docker stop pelican-director

Configure Pelican server in a container

There are two ways to configure a Pelican server running in a container. One is through the environment variables, the other is by passing a pelican.yaml configuration file to the container. This section includes instructions for both.

Configure via environment variables

Most Pelican configuration parameters described in parameters page can be passed to a Pelican server as environment variables. For a configuration parameter, the corresponding environment variable is PELICAN_<PARAMETER_NAME>, where PELICAN is the prefix, and <PARAMETER_NAME> is the name of the parameter, with dots . replaced by underscores _. For example, a Pelican configuration named Logging.Level has a corresponding environment variable named PELICAN_LOGGING_LEVEL.

For admins that use Pelican images built for OSDF (with osdf prefix to the image tag, such as osdf-origin). The environment variables are prefixed by OSDF instead of PELICAN. The example above should then be OSDF_LOGGING_LEVEL.

To pass environment variables to the Docker container, append -e flag to your docker run command. For example, if you want to pass PELICAN_LOGGING_LEVEL to the container, run:

docker run -it -e PELICAN_LOGGING_LEVEL=debug -p 8444:8444 -p 8443:8443 -v /path/to/your/data/:/tmp/pelican --name=pelican-origin hub.opensciencegrid.org/pelican_platform/origin:latest serve -v /tmp/pelican:/foo/bar -f <federation>

There are other ways to pass environment variables to a Docker container, see details in Docker documentation (opens in a new tab).

Some of Pelican configuration parameters with the object type can not be passed as environment variables, such as GeoIPOverrides and Origin.Exports. For these parameters, use a configuration file instead.

Configure via the configuration file

Pelican servers can also be configured via a configuration file in yaml. The default location that Pelican looks for a configuration file is /etc/pelican/pelican.yaml if Pelican runs as the root user (which is the case when the Pelican server is running in a container). If the Pelican server runs as a non-root user, the default location is ~/.config/pelican/pelican.yaml.

It is recommended that admins bind a Pelican configuration file on the host machine to the container running the Pelican server. Follow the steps below for instructions.

  1. Prepare a yaml file on the host machine:

    touch pelican.yaml
  2. Modify and save the pelican.yaml file on host machine with the following content:

    pelican.yaml
    Logging:
      Level: debug
  3. Bind the configuration file on the host machine to the container when running the Pelican server image:

    docker run -it -p 8444:8444 -p 8443:8443 -v /path/to/your/data/:/tmp/pelican -v $PWD/pelican.yaml:/etc/pelican/pelican.yaml --name=pelican-origin hub.opensciencegrid.org/pelican_platform/origin:latest serve -v /tmp/pelican:/foo/bar -f <federation>

    where

    • -v $PWD/pelican.yaml:/etc/pelican/pelican.yaml is the flag to bind the pelican.yaml file under current working directory $PWD to /etc/pelican/pelican.yaml directory inside the container
  4. You should note that your running Pelican origin server is logging debug level messages.