ODK Central: make data directory configurable

spwoodcock · January 24, 2024, 1:41pm

Error

I was following the guide to install ODK Central: https://docs.getodk.org/central-install-digital-ocean/

I always run the docker daemon as rootless, so I don't start the containers as root.

When I run docker compose up -d, I get an error:

Error response from daemon: error while creating mount source path '/data/transfer': mkdir /data: permission denied

Diagnosis

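With a rootless daemon, Docker runs as my unprivileged user. Non-root users do not have permission to create directories under /, so the daemon cannot create /data for the bind mount.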
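For anyone wanting to confirm the same diagnosis on their own machine, here is a minimal sketch. The helper name `can_create_path` is hypothetical and not part of Docker or Central. It finds the nearest existing ancestor of a path and checks whether the current user can write there.

```shell
# Hypothetical helper: report whether the current user could create a path.
# It walks up to the nearest existing ancestor and checks permissions there.
can_create_path() {
  path="$1"
  # Climb until we reach something that already exists.
  while [ ! -e "$path" ]; do
    path="$(dirname "$path")"
  done
  # The ancestor must be a directory we can write to and traverse.
  [ -d "$path" ] && [ -w "$path" ] && [ -x "$path" ]
}

if can_create_path /data/transfer; then
  echo "current user can create /data/transfer"
else
  echo "current user cannot create /data/transfer"
fi
```

Run as the same user that owns the rootless daemon, this should report whether the bind mount source could be created without sudo.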

Solution

I was about to make a PR to getodk/docs, but then I thought we should discuss here first to find a preferred solution.

Option 1: update central-install-digital-ocean.rst in getodk/docs.

After point 6 about adding ./files/allow-postgres14-upgrade, add point 7 to the list:

#. Create the data directory on your system:

   .. code-block:: bash

     $ sudo mkdir -p /data/transfer

   This may not be necessary if you are installing as the *root* user.

Option 2: make the data dir configurable in getodk/central

Something like:

docker-compose.yml

  service:
    build:
      context: .
      dockerfile: service.dockerfile
    depends_on:
      - secrets
      - postgres14
      - mail
      - pyxform
      - enketo
    volumes:
      - secrets:/etc/secrets
      # default to /data/transfer, so nothing changes for most
      - ${DATA_DIR:-/data/transfer}:/data/transfer

Ideally we could make the DATA_DIR default to a path outside of /, for example $PWD/data/transfer,
but this would affect users that already have data saved to /data/transfer.

Then the DATA_DIR variable would probably be added to .env.example to allow for configuration by the user.
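To illustrate the fallback, here is a small shell sketch. Compose's `${VAR:-default}` interpolation follows the same rule as the shell: the default applies when the variable is unset or empty. The function name `resolve_data_dir` and the example path under `$HOME` are made up for illustration.

```shell
# Mirrors the fallback used in the proposed volume line:
#   ${DATA_DIR:-/data/transfer}:/data/transfer
# Prints the host path that would be bind-mounted.
resolve_data_dir() {
  echo "${DATA_DIR:-/data/transfer}"
}

# Unset: the current default is kept, so existing installs are unaffected.
unset DATA_DIR
resolve_data_dir

# Empty: ":-" treats an empty value like an unset one.
DATA_DIR=""
resolve_data_dir

# Set (hypothetical path): the user's choice wins.
DATA_DIR="$HOME/central-data/transfer"
resolve_data_dir
```

In practice the user would presumably set DATA_DIR in their .env file rather than exporting it in the shell, but the resolution rule is the same.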

Question

Is this solution something that would be useful to the ODK team or other users?

Another option is to just advise users they need to install as root (although I'm not a fan of doing this for security reasons).

@Ivangayton

yanokwa · January 24, 2024, 7:21pm

Thanks for the work on this!

I don't remember why we put it in /data/transfer to begin with. Seems like we could also put it inside the ~/central install directory itself and .gitignore it? And add moving the existing folder as an upgrade step.
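A rough sketch of what that upgrade step might look like, assuming the old directory is /data/transfer and the new one lives under the install directory. The helper name `migrate_transfer_dir` and the destination path are hypothetical; this is not an existing Central script.

```shell
# Hypothetical helper: move an existing transfer directory to a new location.
# Both paths are arguments, so nothing here assumes where Central is installed.
migrate_transfer_dir() {
  old_dir="$1"
  new_dir="$2"

  # Nothing to migrate if the old directory does not exist.
  [ -d "$old_dir" ] || return 0

  # Refuse to overwrite an existing destination.
  if [ -e "$new_dir" ]; then
    echo "destination $new_dir already exists; not migrating" >&2
    return 1
  fi

  # Create the parent of the destination, then move the whole directory.
  mkdir -p "$(dirname "$new_dir")"
  mv "$old_dir" "$new_dir"
}
```

Usage would be something like `migrate_transfer_dir /data/transfer "$HOME/central/data/transfer"`, presumably with the containers stopped and with enough privileges to move the old directory.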

spwoodcock · January 25, 2024, 8:10am

Sounds like a good plan to me!

Should I make the updates to central and docs, or leave for you guys to discuss?

Also, do you think it's worth having an environment variable too, so the user can modify the dir?

${DATA_DIR:-./data/transfer}:/data/transfer

LN · February 6, 2024, 12:49am

I chatted about this briefly with the Central devs, and we believe that either there was some previous limitation around relative paths for named volumes, or there are some gotchas in certain environments. Unfortunately, we can't quickly recall which it is.

If you can confirm relatively confidently that binding named volumes to relative paths is common practice then we are open to a PR for the migration.

The security risks of running as root are limited to vulnerabilities discovered in Docker's sandboxing in this case, right? Here's info about the daemon attack surface and what rootless mode mitigates.

My general sense is that running Docker commands as root correctly captures the level of responsibility and risk that this implies. Rootless makes sense in contexts with multiple levels of administrators, some of which don't have root, for example.

spwoodcock · February 7, 2024, 10:28am

I think you could be right there - I also remember some strange behaviour for relative paths in docker compose. But I think that was compose V1 (the Python-based tool).

I suppose compose V2 (golang-based) has since fixed it, as I have not had an issue in a long time. It's also possible to bind mount using $PWD so the absolute path is used anyway:

volumes:
  - ${DATA_DIR:-${PWD}/data/transfer}:/data/transfer

As for security, I agree the risk is quite low for most use cases. Using a rootless daemon does reduce the potential attack surface however (if, like you say, a vulnerability is discovered), and isn't very complicated to configure, so I always default to that.

My other concern is using tools that by default run daemonless (rootless), like Podman. In theory a user could use the Central docker-compose.yml and run with Podman, but would be prevented from doing so with the current config.

EDIT: it looks like there is no need for $PWD, since compose V2 resolves relative paths to absolute ones: https://github.com/compose-spec/compose-go/pull/332