Data migration via the API

I am performing an ODK migration through the API and facing issues transferring users and roles. The API detects the users but not the roles. Any idea what might be causing this? Additionally, I’m unsure whether migrating passwords while preserving their hash is possible.

Any ideas?

import requests
from requests.auth import HTTPBasicAuth

# Old (source) environment configuration
OLD_ODK_URL = "https://old-odk-central.org/v1"
OLD_ADMIN_USER = "admin_old"
OLD_ADMIN_PASS = "old_password"

# New (destination) environment configuration
NEW_ODK_URL = "https://new-odk-central.org/v1"
NEW_ADMIN_USER = "admin_new"
NEW_ADMIN_PASS = "new_password"

# Temporary password for the new users
TEMP_PASSWORD = "Temporal1234"

def get_users():
    """ Gets the list of users from the old environment. """
    response = requests.get(f"{OLD_ODK_URL}/users", auth=HTTPBasicAuth(OLD_ADMIN_USER, OLD_ADMIN_PASS), verify=False)
    
    if response.status_code == 200:
        return response.json()
    else:
        print(f"❌ Error fetching users: {response.status_code} - {response.text}")
        return []

def create_user(user):
    """ Creates a user in the new environment without assigning roles. """
    user_data = {
        "email": user["email"],
        "displayName": user["displayName"],
        "password": TEMP_PASSWORD  # A temporary password is assigned
    }
    
    response = requests.post(f"{NEW_ODK_URL}/users", json=user_data, auth=HTTPBasicAuth(NEW_ADMIN_USER, NEW_ADMIN_PASS), verify=False)
    
    if response.status_code == 201:
        print(f"✅ User {user['email']} migrated successfully.")
    else:
        print(f"❌ Error migrating {user['email']}: {response.status_code} - {response.text}")

def migrate_users():
    """ Migrates all users, without roles. """
    users = get_users()

    if not users:
        print("⚠️ No users found in the old environment.")
        return

    for user in users:
        if user["email"] != OLD_ADMIN_USER:  # Skip migrating the admin user
            create_user(user)

if __name__ == "__main__":
    print("\n🔄 Migrating users without roles...")
    migrate_users()

You would need to look up roles and assignments and set those for each user. A role is something like "Administrator", "Project Manager", "Project Viewer", "Data Collector". You don't really need to know what each role is, but you do need to look up assignments, which connect user and roles.

The API docs describe listing all server-wide assignments and then assigning a server-wide role to a user. The /users API response includes an id which can be used as the actorId in that assignment call.
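For reference, the assignments response is a JSON array of {actorId, roleId} pairs. A small helper (a sketch, assuming that documented shape) can group them per user; note that one user can hold more than one site-wide role, so a list per actor is safer than a single value:

```python
from collections import defaultdict

def roles_by_actor(assignments):
    """Group site-wide role assignments by user: actorId -> [roleId, ...].

    Assumes the /v1/assignments response shape:
    [{"actorId": ..., "roleId": ...}, ...]. A user may hold several roles,
    so each actor maps to a list of role ids rather than a single one.
    """
    roles = defaultdict(list)
    for a in assignments:
        roles[a["actorId"]].append(a["roleId"])
    return dict(roles)
```

With that in hand you can loop over a user's roles when re-assigning them on the new server, instead of assuming at most one role per user.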

I was trying this out on your behalf and realized that, like the docs say, this is only the site-wide roles like Administrator.

def get_users():
    """ Gets the list of users from the old environment. """
    response = requests.get(f"{OLD_ODK_URL}/users", auth=HTTPBasicAuth(OLD_ADMIN_USER, OLD_ADMIN_PASS), verify=False)
    
    if response.status_code == 200:
        return response.json()
    else:
        print(f"❌ Error fetching users: {response.status_code} - {response.text}")
        return []

def get_assignments():
    """ Gets the site-wide role assignments from the old environment. """
    response = requests.get(f"{OLD_ODK_URL}/assignments", auth=HTTPBasicAuth(OLD_ADMIN_USER, OLD_ADMIN_PASS), verify=False)
    
    if response.status_code == 200:
        return response.json()
    else:
        print(f"❌ Error fetching assignments: {response.status_code} - {response.text}")
        return []

def create_user(user, user_role):
    """ Creates a user in the new environment and assigns its site-wide role. """
    user_data = {
        "email": user["email"],
        "displayName": user["displayName"],
        "password": TEMP_PASSWORD  # A temporary password is assigned
    }

    print(f"creating user {user['email']} {user['id']} {user_role}")

    response = requests.post(f"{NEW_ODK_URL}/users", json=user_data, auth=HTTPBasicAuth(NEW_ADMIN_USER, NEW_ADMIN_PASS), verify=False)

    if response.status_code != 201:
        print(f"❌ Error migrating {user['email']}: {response.status_code} - {response.text}")
        return

    print(f"✅ User {user['email']} migrated successfully.")
    new_user = response.json()  # the new server assigns its own id; don't reuse the old one

    if user_role:  # could be None if user does not have a site-wide role
        print(f"assigning role {user_role} to user {user['email']}")

        response = requests.post(f"{NEW_ODK_URL}/assignments/{user_role}/{new_user['id']}", auth=HTTPBasicAuth(NEW_ADMIN_USER, NEW_ADMIN_PASS), verify=False)
        
        if response.status_code in (200, 201):
            print(f"✅ User {user['email']} role assigned")
        else:
            print(f"❌ Error assigning role to {user['email']}: {response.status_code} - {response.text}")

def migrate_users():
    """ Migrates all users along with their site-wide roles. """
    users = get_users()

    assignments = get_assignments()  # site-wide assignments only
    user_roles = {a['actorId']: a['roleId'] for a in assignments}
    print("assignments", assignments, user_roles)

    if not users:
        print("⚠️ No users found in the old environment.")
        return

    for user in users:
        if user["email"] != OLD_ADMIN_USER:  # Skip migrating the admin user
            create_user(user, user_roles.get(user['id']))  # dict.get() returns None if the user has no role

if __name__ == "__main__":
    print("\n🔄 Migrating users and their roles...")
    migrate_users()

You may also need to get project-specific assignments and set those for each existing (user, role, project) combination. You might need to migrate form-specific assignments, too. Note that this depends on how you are copying over your project and form data because those project and form IDs might change. Maybe the site-wide roles are all you need.
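If you do need project-level roles, the same id-remapping concern applies: the project and actor ids on the new server won't match the old ones. A sketch of the bookkeeping (the two id-mapping dicts are hypothetical — you'd build them while copying projects and users):

```python
def remap_project_assignments(old_assignments, project_id_map, actor_id_map):
    """Translate project-level assignments from old-server ids to new-server ids.

    old_assignments: list of dicts like {"projectId": ..., "actorId": ..., "roleId": ...}
    project_id_map / actor_id_map: old id -> new id, built during migration.
    Assignments whose project or actor was not migrated are skipped.
    """
    remapped = []
    for a in old_assignments:
        new_project = project_id_map.get(a["projectId"])
        new_actor = actor_id_map.get(a["actorId"])
        if new_project is not None and new_actor is not None:
            remapped.append({"projectId": new_project,
                             "actorId": new_actor,
                             "roleId": a["roleId"]})
    return remapped
```

Each remapped entry would then be re-created on the new server via the per-project assignments endpoint (per the ODK Central API docs, GET /v1/projects/{id}/assignments to list and POST /v1/projects/{projectId}/assignments/{roleId}/{actorId} to assign — double-check against the docs for your Central version).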

As for the password hashes, that isn't possible with your current approach of migrating the data via the API. You would have to migrate the database data directly.


Thanks @ktuite for your reply, I'll take note of what you mentioned. Regarding the other method, which is transferring the data directly… is there a script that does that? I saw one that updates the version using the volume, but my data is on a specific host…?
Best

Hi @sowe1 ! :wave:

I think @ktuite was pointing you to this: Direct Backups via API. It allows you to back up the entire ODK Central system database - including users, projects, forms, submissions, etc - and restore everything to another server (as good as migrating) as necessary!

A couple of things to keep in mind:
  1. If you’re using the above backup and restore method for migration, you’ll need to handle the Enketo side separately. Here’s a useful article on that by @KagundaJM.
  2. Always make sure to have a full system backup (preferably snapshots) before experimenting with anything like this!

Another approach I personally prefer for internal migrations is using "snapshots". Just take a snapshot of the instance, restore it onto a new one, and we’re good to go!

Hope this helps! Great day! :sunflower: :smile:


Thanks @MinimalPotato for your response! I tried to follow the manual, but when I launch the first command from inside the host I get curl: (28) Failed to connect to odk.host.org port 443 after 135568 ms: Connection timed out
Any idea?

That might need slightly more debug info to diagnose :pray:

  • Is odk.host.org the URL shown in the error, or just an example you gave? If it's the actual URL, it suggests a misconfiguration. Do you have a value set for DOMAIN in the .env file?

  • Where are you running your commands from? Directly on the host machine, or inside the Docker container?


This is the command I run: curl -X POST -H "Content-Type: application/json" -d '{"passphrase": "pass"}' -k -u testapp@example.com https://odk-src.kagundajm.codes/v1/backup --output ~/odk-backups/odk-bk.zip

Is odk.host.org the URL shown in the error, or just an example you gave? If it's the actual URL, it suggests a misconfiguration. Do you have a value set for DOMAIN in the .env file?

  • Yes, it's just an example for security reasons. I’m using the same domain in the .env file.

Where are you running your commands from? Directly on the host machine, or inside the Docker container?

  • Both outside and inside the container. Inside, I'm using the domain (maybe I need to check with localhost or the container's IP).

I ran the same cURL command against a self-hosted ODK Central server with success:

curl -X POST -H "Content-Type: application/json" -d '{"passphrase": "pass"}' -k -u admin@xxx.org https://DOMAIN/v1/backup --output test.zip

I think the issue must be with the domain you are using. A timeout suggests the domain doesn't resolve to a server anywhere. Are you sure the domain exists, has an A record pointing to your server, and that the server is running and accepting connections?

You can debug the DNS entries using something like:

dig odk-src.kagundajm.codes
# or
nslookup odk-src.kagundajm.codes

The domain should resolve to an IP address.
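The same check can be scripted from Python using only the standard library (a quick sketch; the hostname is a placeholder):

```python
import socket

def resolves(hostname):
    """Return the IPv4 address the hostname resolves to, or None if DNS lookup fails."""
    try:
        return socket.gethostbyname(hostname)
    except socket.gaierror:
        return None

# e.g. resolves("odk-src.kagundajm.codes") should return an IP address string;
# None means the DNS lookup itself is failing.
```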

If you are doing this on the same machine as the Central server, with the default docker config, you should be able to also run the cURL command against http://localhost/v1/backup


Thanks @spwoodcock, it looks like at least the file generation issue is solved for now. However, I think it's generating it incorrectly because when I try to restore the ZIP, it says it can't be found, even though it's in the system. I've also tried opening the ZIP, but it won't let me.

odk@odk:/opt/odk-central/central$ docker compose exec service  node /usr/odk/lib/bin/restore.js odk-bk20230913-08AM.zip pass
[Error: ENOENT: no such file or directory, open 'odk-bk20230913-08AM.zip'] {
  errno: -2,
  code: 'ENOENT',
  syscall: 'open',
  path: 'odk-bk20230913-08AM.zip'
}
odk@odk:/opt/odk-central/central$ docker compose exec service  node /usr/odk/lib/bin/restore.js /tmp/backup-filename.zip pass
[Error: ENOENT: no such file or directory, open '/tmp/backup-filename.zip'] {
  errno: -2,
  code: 'ENOENT',
  syscall: 'open',
  path: '/tmp/backup-filename.zip'
}
odk@odk:/opt/odk-central/central$ cd ..
odk@odk:/opt/odk-central$ docker compose exec service  node /usr/odk/lib/bin/restore.js /tmp/backup-filename.zippass
no configuration file provided: not found
odk@odk:/opt/odk-central$ docker exec central-service-1  node /usr/odk/lib/bin/restore.js /tmp/backup-filename.zip pass
[Error: ENOENT: no such file or directory, open '/tmp/backup-filename.zip'] {
  errno: -2,
  code: 'ENOENT',
  syscall: 'open',
  path: '/tmp/backup-filename.zip'
}
odk@odk:/opt/odk-central$ ls /tmp/
backup-filename.zip                                                     systemd-private-2ea287c72e404d1495cf720e5fddc1ac-fwupd.service-sT5JBn                  systemd-private-2ea287c72e404d1495cf720e5fddc1ac-systemd-logind.service-CVr6IG     VMwareDnD
data                                                                    systemd-private-2ea287c72e404d1495cf720e5fddc1ac-ModemManager.service-PTYn77           systemd-private-2ea287c72e404d1495cf720e5fddc1ac-systemd-oomd.service-q8N1z3       vmware-root_1130-2722763433
dbus-hW7vP0xwR8                                                         systemd-private-2ea287c72e404d1495cf720e5fddc1ac-polkit.service-mrEaER                 systemd-private-2ea287c72e404d1495cf720e5fddc1ac-systemd-resolved.service-oQNToO   vmware-root_3285-4290754452
snap-private-tmp                                                        systemd-private-2ea287c72e404d1495cf720e5fddc1ac-power-profiles-daemon.service-TbNxdk  systemd-private-2ea287c72e404d1495cf720e5fddc1ac-systemd-timesyncd.service-OV4Aea
systemd-private-2ea287c72e404d1495cf720e5fddc1ac-colord.service-LGxbqZ  systemd-private-2ea287c72e404d1495cf720e5fddc1ac-switcheroo-control.service-iah2rx     systemd-private-2ea287c72e404d1495cf720e5fddc1ac-upower.service-nBy0ZW
odk@odk:/opt/odk-central$ 

I've noticed that the file is generated incorrectly. If I use localhost, I get this message:

<html>
<head><title>301 Moved Permanently</title></head>
<body bgcolor="white">
<center><h1>301 Moved Permanently</h1></center>
<hr><center>nginx/1.14.2</center>
</body>
</html>

Is this normal? Could there be an issue with the configuration? ODK seems to be working correctly.

And if I run it with the domain, it shows me this message:

{"message":"Completely unhandled exception: timeout exceeded when trying to connect","details":{"stack":["ConnectionError: timeout exceeded when trying to connect","at Object.createConnection (/usr/odk/node_modules/slonik/dist/src/factories/createConnection.js:54:23)"]}}

Hey @sowe1 ! :wave:

Hope you're doing well! As always, I just want to start by reminding you to make sure you have a backup (ideally a full system snapshot) of your main setup with all the data. I can't stress enough how important this is!

I feel like you might be missing a few steps from the guide - especially this part:

The first time ODK Central runs, it creates the /data/transfer folder on the host server and inside the central-service-1 Docker container. This folder is meant to act as a bridge between the host and the container, allowing data to be shared. Any changes made to /data/transfer in the container will also be reflected in /data/transfer on the host, and vice versa.

It looks like you're trying to reference the backup file from outside the container, but it hasn’t been copied (or moved) to the /data/transfer folder. Since this is the shared path between the container and the host, the container won’t be able to find the file unless it's inside this directory. That's likely why you're seeing the "file not found" error - because the file exists on the host but not inside the container.

That being said, we won’t be able to fix this properly (the above won't truly help) until we address the following issue:

I’m not confident, but I don’t think you can use localhost to create a backup anymore. ODK Central's NGINX config blocks access via localhost (unless you're using an older version of ODK Central); it only allows requests whose Host header matches the domain name set in .env. So before moving forward, we need to make sure we have a proper backup by fixing the issue below and using cURL (to make an API call) to create the backup:

Here are a few things I'd suggest checking:

  1. Firewall rules - Are there any inbound / outbound rules blocking access, either the external or internal firewall?
  2. Multiple services - Are you running other services on the same host as well? Could there be a port conflict, or maybe multiple ODK Central instances running on the same host?
  3. ODK Central status - Is the server you're trying to back up actually up and functional? Can you access the frontend and the data via frontend? Check the container statuses using "docker ps" on the host. It feels like something might be wrong with the Postgres container.

Hope this helps..

Great info above!!

Just want to add to it to say:

  • I think it is possible to connect to Central on localhost - I managed to from within a container at least:

  • As @MinimalPotato mentioned, you either need to mount the backup zip file inside the container (using volumes: in the compose.yaml), OR you can simply copy the file in:

    docker cp odk-bk20230913-08AM.zip central-service-1:/tmp/
    # then the original command, pointing at the copied file
    docker compose exec service node /usr/odk/lib/bin/restore.js /tmp/odk-bk20230913-08AM.zip pass
    
Here are a few things I'd suggest checking:

Firewall rules - Are there any inbound / outbound rules blocking access, either the external or internal firewall?

There's no firewall on this server. I know it should have iptables, but not for now.

Multiple services - Are you running other services on the same host as well? Could there be a port conflict, or maybe multiple ODK Central instances running on the same host?

Nope, only one ODK Central on this server, that's all.

ODK Central status - Is the server you're trying to back up actually up and functional?

Yes, it's a production environment and it works.

Can you access the frontend and the data via frontend? 

Yes

Check the container statuses using "docker ps" on the host. It feels like something might be wrong with the Postgres container.
docker compose ps 
WARN[0000] /home/ravamo/Documents/central2/docker-compose.yml: the attribute `version` is obsolete, it will be ignored, please remove it to avoid potential confusion 
NAME                                         IMAGE                                COMMAND                  SERVICE              CREATED        STATUS                  PORTS
8457edd02cb6_central2-mail-1                 itsissa/namshi-smtp:4.92-8.deb10u6   "/bin/entrypoint.sh …"   mail                 6 months ago   Up 25 hours             25/tcp
9317c592cbb9_central2-pyxform-1              ghcr.io/getodk/pyxform-http:v1.7.0   "gunicorn --bind 0.0…"   pyxform              6 months ago   Up 25 hours             
9defe75dacf7_central2-enketo_redis_cache-1   redis:5                              "docker-entrypoint.s…"   enketo_redis_cache   6 months ago   Up 25 hours             6379/tcp
central2-enketo-1                            central2-enketo                      "docker-entrypoint.s…"   enketo               6 months ago   Up 25 hours             8005/tcp
central2-nginx-1                             central2-nginx                       "/bin/bash /scripts/…"   nginx                6 months ago   Up 25 hours (healthy)   0.0.0.0:80->80/tcp, :::80->80/tcp, 0.0.0.0:443->443/tcp, :::443->443/tcp
central2-service-1                           central2-service                     "docker-entrypoint.s…"   service              6 months ago   Up 25 hours             8383/tcp
e8f31b215a1d_central2-postgres-1             postgres:9.6                         "docker-entrypoint.s…"   postgres             6 months ago   Up 25 hours             5432/tcp
f518a80975a0_central2-enketo_redis_main-1    redis:5                              "docker-entrypoint.s…"   enketo_redis_main    6 months ago   Up 25 hours             6379/tcp

Regarding the file, that was my mistake for not seeing it. However, if I place it inside the container and execute it, the problem is that it doesn't generate the ZIP because the database disconnects, as you can see. And to answer your initial question, yes, I have made a full backup of the server.

ODK version:

versions:
24ee74e5f974a518aa1cc8b06e7addb3be6b4690 (v1.3.3-2-g24ee74e)
5cc6fd79d112ce36d6298c61bb8817689c4c323b client (v1.3.2)
1d1a3a59969e61383da74119e405e67778b7a170 server (v1.3.3)

Best


Hi all,
My backup nightmare continues... any help would be greatly appreciated!
Well, I managed to fix the connection issue, and now generating the backup does create a file. When I check what type it is, I'm told:

 'Zip archive data, at least v2.0 to extract, compression method=deflate.' 

But when I try to load it, I get this error:

odk@odk:/opt/odk-central/central$ docker compose exec service  node /usr/odk/lib/bin/restore.js /tmp/data.zip
Error: end of central directory record signature not found
    at /usr/odk/node_modules/yauzl/index.js:187:14
    at /usr/odk/node_modules/yauzl/index.js:631:5
    at /usr/odk/node_modules/fd-slicer/index.js:32:7
    at FSReqCallback.wrapper [as oncomplete] (node:fs:671:5)

Is there a way to open the file? Am I doing something wrong? I'm following the manual, but the backup is not working properly, or something else is going on.

Any ideas? I'm feeling really blocked and can't seem to find a solution. I've tried to decipher the upgrade script (upgrade-postgres.sh), but basically it just does a backup, and that doesn't work for me. Is there any way to migrate the data in bulk?

2025/03/04 21:04:43 [error] 17#17: *33 connect() failed (111: Connection refused) while connecting to upstream, client: 10.20.30.11, server: 10.20.30.47, request: "POST /v1/sessions HTTP/1.1", upstream: "http://172.18.0.3:8383/v1/sessions", host: "10.20.30.47", referrer: "https://10.20.30.47/"

Now, if I create a user, when logging in it says this... What I did was create a backup from 9.6 and transfer it to 14, but it's still not working properly...

I've tried to do a backup between the two versions, and there are differences between both databases. Is there somewhere where I can see the changes? Or is there a script to perform the migration? @MinimalPotato

There is actually an open issue about this on the central-backend repo: https://github.com/getodk/central-backend/issues/595

It suggests the backup file is corrupt or invalid :grimacing:

Can you extract the zip onto your system and inspect the contents manually?
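One quick way to check whether the downloaded file is actually a valid zip before attempting a restore (a sketch using Python's standard library — an error page saved with --output still "exists" on disk but is not a zip, which is exactly the 'end of central directory record signature not found' symptom):

```python
import zipfile

def check_backup(path):
    """Return the list of member names if path is a readable zip, else None.

    zipfile.is_zipfile looks for the end-of-central-directory record, the
    same structure the restore script's yauzl error complains about.
    """
    if not zipfile.is_zipfile(path):
        return None
    with zipfile.ZipFile(path) as zf:
        return zf.namelist()
```

If this returns None (or an empty list), the problem is on the backup-creation side, not the restore side.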

On a side note, regarding the upgrade from Postgres 9.6 --> 14:

  1. I believe this is handled as part of the upgrade process for ODK Central: https://docs.getodk.org/central-upgrade
  2. I wouldn't imagine this will cause an issue with the backup zip though, as long as you backup and restore to the same version of Central.
  3. If you need to do a postgres version upgrade manually, there is a nice tool to do this very easily: https://github.com/pgautoupgrade/docker-pgautoupgrade (disclaimer: I help maintain it)

Hope this helps debugging!

Thanks @spwoodcock for your reply

Can you extract the zip onto your system and inspect the contents manually?

I tried it, but the zip is empty. It's frustrating...

I believe this is handled as part of the upgrade process for ODK Central: https://docs.getodk.org/central-upgrade

The truth is that this process hasn't worked for me... I don't know why, but it doesn't migrate the data... Is there really no way to move the tables from one system to another?

Yeah there is!

Just do a standard Postgres (database) backup and restore :smiley:

There are loads of resources online for doing that

Good morning, thanks for your comment and for pointing me in the right direction. I was already aware of that, but the issue is that the database structures are different between versions 9.6 and 10, so doing that is not enough.

Is there a way to migrate the data from 9.6 to 10?

If I set up a new ODK instance with the old database, I get this error:

service-1 | After attempting to automatically migrate the database, we have detected unapplied migrations, which suggests a problem with the database migration step. Please look in the console above this message for any errors and post what you find in the forum: https://forum.getodk.org/

Any suggestions? I've been trying to migrate from one ODK instance to another for a while now, and I don't think it should be this complicated.

@MinimalPotato or @ktuite Maybe you can shed some light on this because I’m completely lost with this problem. I can’t seem to find a solution and keep going around in circles.

Thanks for the extra context :+1:

I think a key consideration here is which version of ODK Central you are running.

Sounds like you might be on an older version.
There are upgrade notes that include the migration from Postgres 9.6 --> 10 here:

Probably best to start with that, then upgrade progressively (one version at a time) to the latest version of ODK Central (which includes all the latest fixes). Then, if you need to back up via the API at some point, it will probably work (although backing up via the API does sound like an XY problem to me).