Data persistence when upgrading/downgrading Central

  • Installed 1.0 over a running 0.9.
  • docker ps shows everything as healthy.
  • Tried to add a user (why are data and settings not made persistent in Docker?)

Error as below. Reverted to 0.9

root@ubuntu-server:~/central# docker-compose exec service odk-cmd --email my.name@myname.de user-create
prompt: password:  ********

error: insert into "actees" ("id", "species") values ($1, $2) returning * - relation "actees" does not exist
    at Parser.parseErrorMessage (/usr/odk/node_modules/pg-protocol/dist/parser.js:241:15)
    at Parser.handlePacket (/usr/odk/node_modules/pg-protocol/dist/parser.js:89:29)
    at Parser.parse (/usr/odk/node_modules/pg-protocol/dist/parser.js:41:38)
    at Socket.<anonymous> (/usr/odk/node_modules/pg-protocol/dist/index.js:8:42)
    at Socket.emit (events.js:203:13)
    at addChunk (_stream_readable.js:294:12)
    at readableAddChunk (_stream_readable.js:275:11)
    at Socket.Readable.push (_stream_readable.js:210:10)
    at TCP.onStreamRead (internal/stream_base_commons.js:166:17) {

Could you please expand on what this refers to? Which data and settings are reverting? Since you're creating users on the command line, does it mean that you're not seeing the users that you expect in frontend?

Is everything working now that you've reverted to 0.9? Are all the expected users there? Have you previously upgraded and downgraded Central? It looks like your database is in an unexpected state.

Reverting to 0.9 works fine - I restored a device image rather than reinstalling.

Re: loss of data. This has been a problem in earlier upgrades, but I put it aside because we are still in development mode and loading the backup from Google Drive restored everything. Since a direct restore from the database dump did not work - see separate thread - I thought that you had changed the database structure.

There must be something strange going on with the volumes: each time I upgraded - and I have done it a dozen times on development computers - a dozen unmapped Docker volumes are reported as "unused". The bind mounts are reused, central_transfer being the exception. I would prefer full binding of volumes to disk locations - I work a lot with Docker, but I have never been happy with volumes that are not physically visible on the host disk.

 volumes:
      - /var/lib/postgresql/data

I assume the reason this is not done for the postgres volume is to keep greater freedom for changes - that is fine in a product under development, but then a non-Google backup path for upgrades would be required.
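In the meantime, a manual dump is the obvious non-Google fallback. A minimal sketch, assuming the default "postgres" service and the "odk" database/user of a standard install (adjust the names if yours differ):

    # dump the database to a file on the host
    docker-compose exec -T postgres pg_dump -U odk -d odk > central-backup.sql

    # restore it later into a fresh database
    docker-compose exec -T postgres psql -U odk -d odk < central-backup.sql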


I did a clean reinstall, and the install error reported in the original post is gone.


Thanks for the update and I'm glad it's working. I hope to come back to some of your backup questions soon and will also give this thread more thought then.

For your developers, added after more testing: I have tried to make the database easier to keep persistent by mapping it to the host.

volumes:
      - /data/postgres:/var/lib/postgresql/data

This works at first sight: all folders are created, docker ps shows everything is fine, and tables and data are created on the host, but the database is not populated. This should not happen in a robust design; I suspect something in the scripts is hardwired.

so, the postgres container already maps all its contents to the host filesystem, it just chooses its own destination (or probably docker automatically assigns it). this is all postgres; we have not done any scripting or building of our own on top of what postgres provides.

so it was already persistent. what you've done here is to instead redirect it to a known directory on the host, rather than where the data already is (wherever docker decided to put it). this means that your data is still on your machine, but it's just not where you have now asked for it to be. if you find your old data and copy it to /var/lib/postgresql/data, it will appear as expected.
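for anyone hunting for that old data, something along these lines can work. the volume name is a placeholder, and the paths assume docker's default storage root:

    # list volumes no longer attached to any container (anonymous ones have hash names)
    docker volume ls -f dangling=true

    # see where docker keeps a candidate volume on the host
    docker volume inspect <volume-hash> --format '{{ .Mountpoint }}'

    # with the stack stopped, copy the old data into the newly mapped host directory
    docker-compose stop
    sudo cp -a /var/lib/docker/volumes/<volume-hash>/_data/. /data/postgres/
    docker-compose up -d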

in general, we do not consider any of the build scripts, including docker-compose.yml, to be user-editable. if you wish to edit them you can, but you may run into issues like this.


I lost all my data (test data, luckily) again after I had to rebuild ODK Central, and would therefore like to come back to this item.

ODK Central uses an anonymous volume, which is the oldest volume type in Docker:

   volumes:
      - /var/lib/postgresql/data

I do not agree with @issa that anonymous volumes are what "postgres" recommends, and they are definitely not common in the Docker sphere. Mounting the data on the host is standard nowadays, or at least using a named volume, and that is how all the other containers in the docker-compose file handle it.

Named volume:

      - odkcollect:/var/lib/postgresql/data

I strongly prefer mapping to a host directory, because it makes it easy to create database backups and to find the files that belong to the database.

Host mapping

      - /host/data:/var/lib/postgresql/data

I fully agree with your concept of using named volumes for ephemeral data (cache, ...) and host mapping for configurable parts, so I do not understand why the most important volume of all is mapped anonymously.
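To illustrate what I mean, a hypothetical compose fragment - not the actual Central file, and the service names and cache path are made up:

    services:
      postgres:
        volumes:
          - /data/postgres:/var/lib/postgresql/data   # host mapping: survives down and rebuilds
      service:
        volumes:
          - service_cache:/usr/odk/cache               # named volume: fine for ephemeral data

    volumes:
      service_cache: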

if you wish to edit them you can, but you may run into issues like this.

Playing with Docker Compose files is not what I love. However, I have to find a solution for the backup problem, because backing up to a Google server under US jurisdiction (or rather: under NSA surveillance) is illegal under European law. See, for example, here.

I have to correct myself: with a clean rebuild and the latest version from GitHub, using

volumes:
      - /data/postgres:/var/lib/postgresql/data

works without problems. I recommend making this the default setting, to avoid the lost-data problem when a rebuild does not reuse the anonymous volume.

I have not tried the named-volume solution, because I prefer the direct mapping. Using a relative path like ./files/postgresql for the mapping might be another option; it's up to you.
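That relative variant would look like this (same container path as above):

    volumes:
          - ./files/postgresql:/var/lib/postgresql/data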

As you point out earlier, using the database for backups could be problematic for the development of Central, as it might result in a world where people rely on a particular database structure, so I'm not sure having a specific host dir for the volume is the best solution here. I could go either way on this tbh - as long as the data is persisted between "up"/"down".

This seems to be the real pain here, right? Needing backups (to prevent having to re-set up the server) but not being able to use Google services (or any US-hosted service). In my head, a nice way forward would be to allow direct download of backup snapshots from Central (which are just zip files), so the user can choose where to store them. Does that sound like it would fit your needs? I'd be interested in writing that up as a Feature for the forum if so.

As a side note, as a temporary solution it does look like you are able to select a data storage region with GSuite: https://support.google.com/a/answer/7630496?hl=en. This won't help if your clients or you block/refuse Google services entirely, though, which I know is the case with a few EU NGOs and health services.


I had proposed this in a separate thread: don't upload to Google when the address is not filled in or valid. This would be perfect, not much work, and probably would not even require a change in the web interface. The response of @yanokwa was not encouraging.

First, for patient/care-home data no external site is allowed, with the exception of some very special security servers that are beyond our financial range.

For business data, your suggestion is interesting. As far as I understand, the latest ruling of the European Court of Justice implies that any company that has to obey NSA rules (to simplify) is no longer permitted, which causes quite a bit of trouble. Microsoft had tried to provide a solution by installing a server farm in Frankfurt that was completely managed by Deutsche Telekom - as far as I know, they closed it down after the first ruling of the court that allowed individual contracts between businesses. Maybe that is coming back now.

For Brits, after Brexit everything will be better anyway, no more European Court - just kidding.

Ah. Would you be able to link that thread?

Edit: never mind, it looks like there is a support discussion here and you created a feature discussion here

I noted that the discussion went a bit off track (and the title of the thread is not helpful):

  • Please consider using a named volume or better (my choice) a host link for the database.

An anonymous volume is safe against crashes and restarts, but it is not safe against rebuilds.

It seems unclear to me from the Docker docs (looking at Volumes and down) whether this is always the case. Just to get this straight in my head: were you running into data loss every time you upgraded? If so, it'd be interesting to see what steps you were using to upgrade. I personally haven't had this problem and assumed Docker was taking care of this for me, but it might be that I've managed to get around it by sheer luck - I'd been using docker-compose stop instead of down, which is what the Central docs had recommended.

It has happened twice to me now, and it has happened too many times with dozens of other projects, so I have become defensive.

docker-compose down, changes in docker-compose.yml, docker-compose up is not risky - I do it every day. A version update after git pull is the killer. Luckily, my data are only test data at the moment, and I can use the backups on Google Drive, but I fear the call from the clinic the day this happens in real use.

I admit that this is personal experience, but in theory I can understand it. The data are still there, but hidden in some volume with a cryptic name like asldkfqwerjlwqjerölwjerlk.
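For what it's worth, the defensive upgrade routine I have settled on, as a sketch - the service and database names assume a default install:

    # dump before touching anything
    docker-compose exec -T postgres pg_dump -U odk -d odk > pre-upgrade-$(date +%F).sql

    git pull
    docker-compose build
    docker-compose up -d   # up without a prior down lets compose carry anonymous volumes over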

We are currently exploring adding an endpoint (and button on the UI) to request a backup directly instead of it being uploaded anywhere. We hope it can be part of the next Central release and I'll update the feature thread as we have more details.

Are you absolutely sure you couldn't have done some kind of administrative task to explain the behavior? We'll certainly look more into it and if you have any documentation that could point to this being expected, that would be very helpful. I have not heard of or experienced this.

Hi, sorry for barging into this thread, but I think this is relevant.

If you check the Docker docs now, it is clearly stated that a "stop" just stops the containers without removing them, while a "down" results in the removal of all containers, networks, volumes, and images created by up.

I was burned by this very badly when I first installed Central 3 days ago. I wanted to expose the port of the PostgreSQL database so that an external application of mine could have access. There is no way of doing that with a "stop", as the network has to be redone. So I did a down-up, and to my surprise I couldn't log in! It turned out pgdata had been wiped.
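For reference, a sketch of what I was trying to do. The override-file mechanism is standard Compose, but the service name "postgres" is my assumption:

    # docker-compose.override.yml - merged automatically; match the version of the main file
    version: "3"
    services:
      postgres:
        ports:
          - "127.0.0.1:5432:5432"   # expose the database to the host, loopback only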

Fixed that by creating a volume, so another "down-up" was needed - only to find out that I had mistakenly put "pgdata" under the "central" folder, which is not ideal. Did another "down-up", and it worked. Retested, and it worked.

All good, right? No. Suddenly nginx showed "up (unhealthy)". Why? Because Letsencrypt couldn't get the certificates: during all these "down-up" cycles the certificates had to be reissued every time, so I ran over the rate limit, meaning I'm on a 7-day timeout unless I want to change the domain or subdomain.

I agree with dmenne here that all important data should be configured in a way that it does not get wiped by a simple "down". This is all the more relevant given that the original docker-compose is otherwise so well designed and the docs are so well written.

Thanks.


Thanks for jumping in on that! I was so disappointed when I read issa's response that I had more or less given up discussing this.

"this is all postgres; we have not done any scripting or building of our own on top of what postgres provides."

Not correct: only very basic postgres examples use anonymous mapping.


I rather feel I've dodged a bullet here, as my IT department requires me to use their dedicated huge, fast, and well-managed/backed-up postgres instance.

Since ODK Central stores all data, including media attachments, in the database, this conveniently centralises all growth in the database and keeps my server predictably small. I just need to modify the server config to use our own database and don't have to customise any of the Central code. I love the design decisions that led to this, and I see the clarity of having one (necessarily opinionated) deployment pathway. I got bitten by those constraints too (locked into Kubernetes until only recently).

Maybe using a separate postgres instance could be a way forward for you, Dieter? It wouldn't have to be dockerised, and might therefore be easier to access and back up, and less of a pain to persist.
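Roughly, pointing Central at an external database comes down to editing the service configuration before building. A sketch only - the host, user, password, and database values are placeholders, and the exact template and keys may differ between Central versions:

    {
      "default": {
        "database": {
          "host": "db.example.org",
          "user": "odk_central",
          "password": "change-me",
          "database": "odk_central"
        }
      }
    }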


@Florian_May your suggestion is great until you try to get Docker to connect to a DB running un-dockerized on localhost. So, as a basic installation, running pgsql inside Docker is a very good compromise.

In line with what @dmenne and @issa have been discussing, I would humbly suggest that the Core Team at least allow a configurable option, if not outright hardcode it, to mount separate volumes for storing persistent data - not only pgsql, but also the SSL certificate.

Also, since the codebase is always changing, I think the best course would be to keep the pgdata & ssl folders outside the "central" folder. For my system, I have it like this:
/data/odk
--- central
--- pgdata
--- ssl

For SSL, I also found out the hard way that the DH key needs to be stored in persistent storage as well.
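A sketch of what persisting those could look like in the nginx service - the container-side paths are assumptions, not taken from the actual Central file:

    services:
      nginx:
        volumes:
          - /data/odk/ssl/letsencrypt:/etc/letsencrypt   # issued certificates survive down-up
          - /data/odk/ssl/dhparam.pem:/etc/dhparam.pem   # persisted DH key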
