Configure pyodk via environment variables, as well as config file

Issue

  • pyodk currently has the option for configuration by a TOML file on the filesystem.
  • This is fine for scripting.
  • For integrating into applications, it would be much nicer to configure via environment variables.

Solution

  • Add code to pick up environment variables related to pyodk.

  • A config file should generally take precedence over these environment variables, if both are set.

  • The variables could be:

    PYODK_CENTRAL_URL=https://example.com
    PYODK_CENTRAL_USER=user@example.com
    PYODK_CENTRAL_PASS=yourpassword
    PYODK_DEFAULT_PROJECT_ID=1
    

Question

  • As requested, I didn't raise a PR for this, instead asking the community first.
  • Would it be worthwhile to add a small handler for environment variables in pyodk?
  • Note that I created a branch/PR on my fork with these changes already, with tests and docs. If this is approved on the forums, I could make the PR against the official repo.

I welcome this change.

I'd probably name the variables to more closely match the config file, but I could be convinced that these need _CENTRAL_.

PYODK_URL=https://example.com
PYODK_USER=user@example.com
PYODK_PASSWORD=yourpassword
PYODK_DEFAULT_PROJECT_ID=1

I think it's more typical that env vars override config files, so I'd prefer that behavior.

What do you mean by add a small handler?

I'm not sure of the wisdom of this, but could we also store the token in an env var?


Good point - thanks for the feedback!

To match the config file, would these vars work?

# the toml file variable is base_url
PYODK_BASE_URL=https://example.com
PYODK_USERNAME=user@example.com
PYODK_PASSWORD=yourpassword
PYODK_DEFAULT_PROJECT_ID=1

By small handler I basically just mean a little bit of code to handle the env vars - that is what I implemented in the PR.
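For illustration, a minimal sketch of what such a handler could look like, assuming the PYODK_* names proposed above and that env vars override values read from the config file. The function name `apply_env_overrides` and the plain-dict representation of the config are hypothetical, not pyodk's actual API.

```python
import os

# Hypothetical mapping from the env var names proposed in this thread
# to the keys used in the TOML config file.
ENV_TO_KEY = {
    "PYODK_BASE_URL": "base_url",
    "PYODK_USERNAME": "username",
    "PYODK_PASSWORD": "password",
    "PYODK_DEFAULT_PROJECT_ID": "default_project_id",
}


def apply_env_overrides(file_values, environ=None):
    """Return a copy of file_values with any set env vars layered on top."""
    environ = os.environ if environ is None else environ
    merged = dict(file_values)
    for env_name, key in ENV_TO_KEY.items():
        value = environ.get(env_name)
        if value is None or value == "":
            continue  # unset or empty: keep the config file value
        # Env vars are always strings; the project ID is an integer.
        merged[key] = int(value) if key == "default_project_id" else value
    return merged
```

Passing `environ` explicitly keeps the function easy to test. Swapping the precedence (config file wins) would only mean skipping keys that are already present in `file_values`.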

Interesting idea for the token too - the cache_path variable is in a lot of code though, so this would be quite a large refactor, perhaps best for another issue / v2?

Although doing that might have some complications. It's probably unlikely, but say someone is running pyodk in a service that has multiple replicas. It's easy to share the cache file between replicas and use the same token, but if this was set dynamically in the environment of a single replica, each replica would have a unique environment and wouldn't benefit from the shared session token. Not sure if this is a big deal though!

Something that would be very useful is passing the config directly to the Client class too.

Changing:

    def __init__(
        self,
        config_path: str | None = None,
        cache_path: str | None = None,
        project_id: int | None = None,
        session: Session | None = None,
        api_version: str | None = "v1",
    ) -> None:
        self.config: config.Config = config.read_config(config_path=config_path)

to

    def __init__(
        self,
        config_path: str | config.CentralConfig | None = None,
        cache_path: str | None = None,
        project_id: int | None = None,
        session: Session | None = None,
        api_version: str | None = "v1",
    ) -> None:
        if isinstance(config_path, config.CentralConfig):
            self.config: config.Config = config.Config(central=config_path)
        else:
            self.config: config.Config = config.read_config(config_path=config_path)

This way we could also do:

    new_config = config.CentralConfig(
        base_url="https://obj.config.com",
        username="user@obj.config.com",
        password="ConfigPassword",
    )
    # Pass the config object where a path would normally go.
    with Client(config_path=new_config) as client:
        submission = client.submissions.create(
            project_id=xxx,
            form_id=xxx,
            xml=xxx,
        )

The loading order would be:

  1. Directly passed CentralConfig object.
  2. Env vars.
  3. User specified config_path.
  4. Default config_path.
  5. Error.
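The loading order above could be sketched roughly as follows. This is only an illustration under stated assumptions: `CentralConfig` here is a reduced stand-in dataclass rather than pyodk's own class, `resolve_source` and its parameters are hypothetical, and the env var names are the ones proposed earlier in this thread.

```python
import os
from dataclasses import dataclass
from pathlib import Path


@dataclass
class CentralConfig:
    """Stand-in for pyodk's CentralConfig, reduced to three fields."""

    base_url: str
    username: str
    password: str


def resolve_source(config=None, config_path=None, environ=None, default_path=None):
    """Return (source_name, value) for the first configuration source found."""
    environ = os.environ if environ is None else environ
    # 1. Directly passed CentralConfig object.
    if isinstance(config, CentralConfig):
        return "object", config
    # 2. Env vars (used only when all three are set and non-empty).
    names = ("PYODK_BASE_URL", "PYODK_USERNAME", "PYODK_PASSWORD")
    if all(environ.get(name) for name in names):
        return "env", CentralConfig(*(environ[name] for name in names))
    # 3. User specified config_path (existence is checked when it is read).
    if config_path is not None:
        return "config_path", Path(config_path)
    # 4. Default config_path, if the file exists.
    if default_path is not None and Path(default_path).is_file():
        return "default_path", Path(default_path)
    # 5. Error.
    raise FileNotFoundError("No pyodk configuration found.")
```

Requiring all three env vars before using them is one possible design choice; merging partial env vars with file values would be an alternative.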

I mention this because I am using pyodk in an app that may have to talk to several different ODK Central servers, so I need to swap between them easily. Doing this via env vars or a config file isn't ideal.
