Backup NOT to Google Drive

From what I read from the docs, a backup to Google Drive is the only available option. I am aware that it is encrypted, but for security reasons we run everything in an Intratnet, and backup up to a site with NSA reservation is not permitted.

Is there any alternative? Saving MySQL is no problem, but I assume that restoring might be messy in case the server must setup from scratch.

Hi @dmenne, we recently implemented a backup on a local server.
We will be sharing the process in few days.

2 Likes

You can use a local JSON server (for automated process), as well as a Briefcase-initiated backup (for a manual process).

1 Like

Thanks, I had forgotten that Briefcase would do it. And waiting for @bertrand_rukundo's solution!

There is a bit of imprecise language in the discussion so far because I think folks are likely taking about different servers.

@dmenne asked about Central, so to be precise...

Central is backed by PostgreSQL, not MySQL. If you don't want to use Google Drive to store backups, you can backup up the database as you see fit. Restore is as simple as restoring the database.

Briefcase can pull all submissions from Central and you can use that as backup. Restore isn't great because pushing will only work if submissions are associated with a single version. We are working on fixing this issue, but no ETA on this.

You can also use the API to backup and restore data, but you'll have to write your own code to interact with that API.

1 Like

Sorry, clearly it's PostgresSQL - we are testing too many alternatives and I mixed up.

Using the API is no problem for me, because we anyway create an sqlite-warehouse with relational data of the forms using ruODK. However, reading the docs I do not understand how I would create a full image of the database for easy restore from it. I find it strange that an API for backup is directly linked to Google Drive - or did I miss the existence of some "get the packed database to a given file"?

1 Like

Hi @dmenne,
I just checked with our IT admin, it seems they just run the whole system daily backup (application server and database server) using VEEAM (https://www.veeam.com/)
Sorry if that doesn't help much.

2 Likes

Thanks. As I understand @yanokwa, the full state is saved with a database backup. That should be fine, in addition to some weekly full server backup.

1 Like

This came about because we know a lot of users of ODK tools have very limited technical capacity and we want to make sure that anyone has access to a straightforward off-premise backup option. If you have some technical know-how and/or are using a web provider that offers a backup solution, I agree that there are better options!

Using a database backup is a recipe for disaster. I tried it today to port from a 3 week old installation to a new one, and it failed because of some changes in database structure. Nothing really serious is lost now, but I do not see some easy method of cloning a database. Backups should be done by backing up the whole system, do not backup the database.

I do not have the slightest idea why an open-source product puts priority on offering a google-drive backup ONLY. It would be very easy to replace the google drive by a direct download/upload. Did google pay for this privilege?

@yanokwa:

You can also use the API to backup and restore data, but you'll have to write your own code to interact with that API.

Well, if you mean downloading all forms, user, submission and uploading these again: it could work. What is documented as a backup API is only configuration of a google drive upload. No way following European Law.

Sorry to be grumpy.

We a very small team with limited resources. We are providing millions of people, yourself included, software and support for free. If our offering is not to you liking, your best option for improvement is to contribute.

For example, you could file a detailed bug report about what failed with the database backups and how we could reproduce it. Or if, as you say, “It would be very easy to replace the google drive by a direct download/upload”, you could show us how easy it is by sending in a pull request. Or if you aren't a developer you could pay someone to add said feature to ODK.

I look forward to your contributions to Central.

1 Like

Keep a copy of the zip-file in `data/transfer when upload is not successful. This avoids having to create additional gui elements: if google is not configured, backup is only kept local. And it is up to the sysadmin to copy the backup to some safe place.

backup.js:

  // run the pgdump and encrypt it into a zipfile.
  const [ tmpdirPath, tmpdirRm ] = await tmpdir();
  await pgdump(tmpdirPath);
  const tmpfilePath = await tmpfile();
  await encryptToArchive(tmpdirPath, tmpfilePath, configValue.keys);
  tmpdirRm();

I did not report the database errors, because it is your valid choice to change the format - that's what the API is good for. If you want, here it is - I cloned a database created 4 weeks ago.


postgres_1            | LOG:  database system was shut down at 2020-08-08 09:56:38 UTC                                                                               
postgres_1            | FATAL:  the database system is starting up                                                                                                   
postgres_1            | LOG:  MultiXact member wraparound protections are now enabled                                                                                
postgres_1            | LOG:  autovacuum launcher started                                                                                                            
postgres_1            | LOG:  database system is ready to accept connections                                                                                         
postgres_1            | ERROR:  relation "public.actees" does not exist                                                                                              
postgres_1            | STATEMENT:  COPY public.actees (id, species) FROM stdin;                                                                                     
postgres_1            |                                                                                                                                              
postgres_1            | ERROR:  relation "public.actors" does not exist                                                                                              
postgres_1            | STATEMENT:  COPY public.actors (id, type, "acteeId", "displayName", meta, "createdAt", "updatedAt", "deletedAt", "expiresAt") FROM stdin;    
postgres_1            |                                                                                                                                              
postgres_1            | ERROR:  relation "public.roles" does not exist                                                                                               
postgres_1            | STATEMENT:  COPY public.roles (id, name, system, "createdAt", "updatedAt", verbs) FROM stdin;                                                
postgres_1            |                                                                                                                                              
postgres_1            | ERROR:  relation "public.assignments" does not exist                                                                                         
postgres_1            | STATEMENT:  COPY public.assignments ("actorId", "roleId", "acteeId") FROM stdin;                                                             
postgres_1            |                                                                                                                                              
service               | wait-for-it.sh: waiting 15 seconds for postgres:5432                                                                                         
service               | wait-for-it.sh: postgres:5432 is available after 0 seconds                                                                                   
service               | generating local service configuration..                                                                                                     
service               | running migrations..                                                                                                                         
service               | starting cron..                                                                                                                              
service               | using 4 worker(s) based on available memory (3024876)..                                                                                      
service               | starting server.                                                                                                                             
postgres_1            | ERROR:  relation "public.audits" does not exist                                                                                              
postgres_1            | STATEMENT:  COPY public.audits ("actorId", action, "acteeId", details, "loggedAt", claimed, processed, "lastFailure", failures, id) FROM stdi
;                                                                                                                                                                    
postgres_1            |                                                                                                                                              
postgres_1            | ERROR:  relation "public.blobs" does not exist                                                                                               
postgres_1            | STATEMENT:  COPY public.blobs (id, sha, content, "contentType", md5) FROM stdin;                                                             
postgres_1            |                                                                                                                                              
postgres_1            | ERROR:  relation "public.client_audits" does not exist                                                                                       
postgres_1            | STATEMENT:  COPY public.client_audits ("blobId", event, node, start, "end", latitude, longitude, accuracy, "old-value", "new-value", remainde
) FROM stdin;                                                                                                                                                        
postgres_1            |                                                                                                                                              
postgres_1            | ERROR:  relation "public.config" does not exist                                                                                              
postgres_1            | STATEMENT:  COPY public.config (key, value, "setAt") FROM stdin;                                                                             
postgres_1            |                                                                                                                                              
postgres_1            | ERROR:  relation "public.keys" does not exist                                                                                                
postgres_1            | STATEMENT:  COPY public.keys (id, public, private, managed, hint, "createdAt") FROM stdin;                                                   
postgres_1            |
1 Like

I did quite a few translations to German

1 Like

Cloning worked for me using the backup on Google Drive and a restore. However, this was a test database, and despite best encryption used in the Google-detour, it is not a legally accepted method for clinical data sets.