Backup api call fails after several minutes with ERR_STREAM_PREMATURE_CLOSE

Attila_Heidrich · March 27, 2023, 3:24pm

**1. I get an ERR_STREAM_PREMATURE_CLOSE error after 23 minutes when I try to download a backup from the server.
We need to move a site (v1.5.3) to a new environment. The URL will be different too.
Versions:
da62679fe138eb89b42fcc2d9f61ae9beca9d67b (v1.5.3-2-gda62679)
d9cb07fdceaa7df017ed2aee114db1b2b7e1a2d8 client (v1.5.3)
badb3912fdf4d5dca29bd4cd520b9d3b4788db6e server (v1.5.3)

The site runs ODK Central with docker-compose. I "found" it; it is not my installation.

**2. curl -o xxx.zip --user 'xxx@gmail.com:xxxword' -X POST -H "Content-Type:application/json" -d '{"passphrase": "anotherpass"}' https://xxx.yy/v1/backup

**3.
I tried this from a remote site and also locally on the server where the "service" container (and all the others) is running.
Once I realized that free space is required for the entire backup (which surprised me a lot), I thought it was a storage issue. However, we have since mounted an entire disk as /tmp of the service container, which also lets us watch the backup files as they are created.

**4.
client errors:

root@xxx:/mnt/sgcvol# curl ...
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 32093    0 32065    0    28     22      0 --:--:--  0:23:15 --:--:--     0
curl: (18) transfer closed with outstanding read data remaining

server errors:

::ffff:172.18.0.8 - - [27/Mar/2023:13:55:52 +0000] "POST /v1/backup HTTP/1.0" 200 -
Error [ERR_STREAM_PREMATURE_CLOSE]: Premature close
    at new NodeError (internal/errors.js:322:7)
    at ServerResponse.onclose (internal/streams/end-of-stream.js:121:38)
    at ServerResponse.emit (events.js:412:35)
    at ServerResponse.emit (domain.js:537:15)
    at Socket.onServerResponseClose (_http_server.js:221:44)
    at Socket.emit (events.js:412:35)
    at Socket.emit (domain.js:475:12)
    at TCP.<anonymous> (net.js:686:12)
    at TCP.callbackTrampoline (internal/async_hooks.js:130:17)

tmp dir after interrupted backup:

root@xxx:/mnt/sgcvol# ls -ltr tmp-60-Rd8D4uEORFqN/
total 15816620
-rw-r--r-- 1 root root       78686 Mar 27 14:05 toc.dat
-rw-r--r-- 1 root root      115144 Mar 27 14:05 2554.dat.gz
-rw-r--r-- 1 root root      123612 Mar 27 14:05 2544.dat.gz
-rw-r--r-- 1 root root       41987 Mar 27 14:05 2549.dat.gz
-rw-r--r-- 1 root root        1616 Mar 27 14:05 2570.dat.gz
-rw-r--r-- 1 root root         406 Mar 27 14:05 2561.dat.gz
-rw-r--r-- 1 root root        3682 Mar 27 14:05 2547.dat.gz
-rw-r--r-- 1 root root        2276 Mar 27 14:05 2545.dat.gz
-rw-r--r-- 1 root root        2325 Mar 27 14:05 2538.dat.gz
-rw-r--r-- 1 root root      544728 Mar 27 14:05 2565.dat.gz
-rw-r--r-- 1 root root          25 Mar 27 14:05 2574.dat.gz
-rw-r--r-- 1 root root          25 Mar 27 14:05 2573.dat.gz
-rw-r--r-- 1 root root          25 Mar 27 14:05 2571.dat.gz
-rw-r--r-- 1 root root          25 Mar 27 14:05 2568.dat.gz
-rw-r--r-- 1 root root          25 Mar 27 14:05 2567.dat.gz
-rw-r--r-- 1 root root         537 Mar 27 14:05 2560.dat.gz
-rw-r--r-- 1 root root         240 Mar 27 14:05 2558.dat.gz
-rw-r--r-- 1 root root          25 Mar 27 14:05 2556.dat.gz
-rw-r--r-- 1 root root          25 Mar 27 14:05 2555.dat.gz
-rw-r--r-- 1 root root         111 Mar 27 14:05 2551.dat.gz
-rw-r--r-- 1 root root         465 Mar 27 14:05 2548.dat.gz
-rw-r--r-- 1 root root         338 Mar 27 14:05 2542.dat.gz
-rw-r--r-- 1 root root          29 Mar 27 14:05 2540.dat.gz
-rw-r--r-- 1 root root      277190 Mar 27 14:05 2563.dat.gz
-rw-r--r-- 1 root root      660899 Mar 27 14:05 2550.dat.gz
-rw-r--r-- 1 root root 16194281472 Mar 27 14:38 2553.dat.gz
root@xxx:/mnt/sgcvol#

I may not be using the correct process. Please help me choose the correct way to transfer a site to a new location. I would prefer to avoid filesystem-level backups and Docker exports if possible, but I'm open to them if that is what it takes to get this task done.

Matthew_White · March 29, 2023, 6:27am

When you request a Direct Backup, Central does a few things:

1. Run pg_dump.
2. Encrypt the result with an optional passphrase.
3. Respond with (stream) the encrypted backup.

The error you describe arises from a Central limitation. When you request a Direct Backup, Central will wait a certain length of time for pg_dump to complete. After that, the request will time out. That length of time turns out to be about 23 minutes.

We could probably increase that length of time. However, if pg_dump isn't completing within 23 minutes, that probably means that the backup is fairly large. It may be better to seek an alternative rather than to attempt to transfer a backup of that size over the API.

In general, we recommend making a full system backup. Direct Backups back up the database, but a database backup by itself does not include sufficient information to re-establish the same Web Form links. See the documentation for more information.

With that caveat, another option is to run pg_dump on your own, not using the API server. Since the API server won't encrypt the result, you won't use Central to restore the backup. Instead, you would use Postgres tools to restore the backup (likely pg_restore). Central calls pg_dump with option -F d to output a directory-format archive, but you may find another format more convenient. Most Central installations won't have pg_dump available outside the Postgres container, but you could run pg_dump within the container. You could also install Postgres tools outside the container.
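
As a rough illustration of running pg_dump inside the Postgres container, here is a minimal sketch. It assumes the Compose service is named `postgres` and that the database and user are both `odk`; none of those names appear in this thread, so check them against your own docker-compose.yml. The output file name is also just a placeholder. The sketch uses the custom format (-F c) instead of the directory format (-F d) because a single-file archive can be written to stdout and redirected to the host. By default it only prints the command; set DRY_RUN=0 to actually run it.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Assumed names (not confirmed in this thread); override via environment.
PG_SERVICE="${PG_SERVICE:-postgres}"   # Compose service running Postgres
PG_USER="${PG_USER:-odk}"              # database user
PG_DB="${PG_DB:-odk}"                  # database name
OUT_FILE="${OUT_FILE:-central-db.dump}" # placeholder output file on the host
DRY_RUN="${DRY_RUN:-1}"                # 1 = print only, 0 = execute

# -T disables TTY allocation so the binary dump is not altered on its way out.
# -F c selects pg_dump's custom format, which goes to stdout as a single stream.
dump_cmd=(docker-compose exec -T "$PG_SERVICE" pg_dump -U "$PG_USER" -F c "$PG_DB")

if [ "$DRY_RUN" = "1" ]; then
  echo "would run: ${dump_cmd[*]} > $OUT_FILE"
else
  "${dump_cmd[@]}" > "$OUT_FILE"
  ls -l "$OUT_FILE"
fi
```

A file produced this way is a plain Postgres archive, so it would be restored on the new server with pg_restore and not through Central.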

It's critical to have an ongoing backup strategy in place so that you're completing backups on a regular basis, not just once. In particular, you'll want to have a backup in place before upgrading to v2023.2. That version of Central (the latest) upgrades Postgres to version 14.
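
One way to make backups recurring is a small script run from cron that writes a timestamped dump and prunes old ones. The sketch below is only an illustration under assumptions: the backup directory, the `central-*.dump` naming scheme, the retention count, and the `postgres`/`odk` names are all hypothetical and should be adapted. It runs in dry-run mode unless DRY_RUN=0 is set.

```shell
#!/usr/bin/env bash
set -euo pipefail

BACKUP_DIR="${BACKUP_DIR:-./central-backups}"  # hypothetical location
KEEP="${KEEP:-7}"                              # number of dumps to retain
DRY_RUN="${DRY_RUN:-1}"                        # 1 = print only, 0 = execute

# UTC timestamps in this layout sort chronologically as plain strings.
backup_name() {
  echo "central-$(date -u +%Y%m%dT%H%M%SZ).dump"
}

# List every dump except the newest $KEEP (newest first, then skip $KEEP lines).
backups_to_prune() {
  find "$BACKUP_DIR" -maxdepth 1 -type f -name 'central-*.dump' \
    | sort -r | tail -n +"$((KEEP + 1))"
}

mkdir -p "$BACKUP_DIR"
target="$BACKUP_DIR/$(backup_name)"

if [ "$DRY_RUN" = "1" ]; then
  echo "would write $target"
else
  # Assumed service, user, and database names; verify before use.
  docker-compose exec -T postgres pg_dump -U odk -F c odk > "$target"
fi

backups_to_prune | while IFS= read -r old; do
  if [ "$DRY_RUN" = "1" ]; then
    echo "would remove $old"
  else
    rm -- "$old"
  fi
done
```

Copying the resulting files off the server is still necessary; a dump that lives only on the same disk as the database does not protect against losing that disk.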

Attila_Heidrich · March 30, 2023, 12:00pm

Thanks for the quick answer!

I see nothing specific in the documentation on the "full system backup". What else should it cover exactly beside the database?

Matthew_White · March 30, 2023, 10:20pm

The main thing to back up besides the database is Enketo data. The documentation mentions:

"You will additionally need to have a backup of Enketo data to be able to restore existing Web Form links. At a minimum, you must back up Enketo's Redis store and the keys generated in the Enketo configuration."
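
A hedged sketch of what copying that Enketo data out of the containers might look like follows. Every name in it is an assumption that must be checked against your own docker-compose.yml: the service names `enketo_redis_main` and `enketo`, the Redis data directory `/data`, the key directory `/etc/secrets`, and the destination folder. It also assumes Redis listens on its default port inside the container. The script prints the commands it would run and executes them only with DRY_RUN=0.

```shell
#!/usr/bin/env bash
set -euo pipefail

# All of these are assumptions; verify them in docker-compose.yml.
REDIS_SERVICE="${REDIS_SERVICE:-enketo_redis_main}"
REDIS_DATA_DIR="${REDIS_DATA_DIR:-/data}"
KEYS_SERVICE="${KEYS_SERVICE:-enketo}"
KEYS_DIR="${KEYS_DIR:-/etc/secrets}"
DEST="${DEST:-./enketo-backup}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = print only, 0 = execute

# Print each command; execute it only when DRY_RUN is not 1.
run() {
  echo "+ $*"
  if [ "$DRY_RUN" != "1" ]; then "$@"; fi
}

# Resolve a Compose service to its container ID (a placeholder in dry-run mode).
container_id() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "<$1-container>"
  else
    docker-compose ps -q "$1"
  fi
}

backup_enketo() {
  run mkdir -p "$DEST"
  # SAVE makes Redis write its in-memory data to disk before the copy.
  run docker-compose exec -T "$REDIS_SERVICE" redis-cli save
  run docker cp "$(container_id "$REDIS_SERVICE"):$REDIS_DATA_DIR" "$DEST/redis-main"
  run docker cp "$(container_id "$KEYS_SERVICE"):$KEYS_DIR" "$DEST/secrets"
}

backup_enketo
```

Reading the dry-run output first is the point of the design: it shows exactly which containers and paths would be touched before anything is copied.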