1. What is the issue? Please be detailed.
Since a while now I’m struggling to find a way to keep the database from growing to sizes that are hardly manageable. I have postponed any action for a while, but now the DB size is about 120GB and I really need to find a solution.
Among the other options I’m exploring the possibility to move the attachments to an S3 compatible storage
but this solution is not free of doubts -that I have outlined in the post linked here above- and anyway it would add a layer of complexity that I would prefer to avoid, unless really necessary.
Since a while I have then started to look into deleting submissions or deleting submissions attachments. There are 2 threads that I have carefully read
and where I also left some comments and already received answers, but to be honest I’m still not 100% clarified in this matter so I’m looking for some final answer/confirmation (or not) and this will hopefully help others facing the same problem.
- Using the “Clearing a submission attachment” API method https://docs.getodk.org/central-api-submission-management/#clearing-a-submission-attachment does not help in reducing the database size ODK Central - How to delete a submission - #39 by manghig If this is true/confirmed then I guess that the method description should state that clearly
- This comment by @ktuite Deleting media from Central submissions - does it reduce database size - #2 by ktuite seems to imply that there is a API method to delete submissions attachments which I can’t find in docs. I’m assuming that it would be something different from the “clear submission attachment” one and assuming that it really exists, is also unclear if that would trigger the database size to shrink
- Deleting a submission entirely https://docs.getodk.org/central-api-submission-management/#deleting-a-submission seems to be the only clear way to get the database to shrink in size. It is still unclear to me if after the 30 days retention period is necessary to manually issue a
VACUUMorVACUUM FULLcommands inside the database or if database size will shrink automatically - API docs refer that “Purging deleted Submissions can also be triggered from the command line” but I can’t find here https://docs.getodk.org/central-command-line/ how that command would look like. In this other comment by @ktuite ODK Central - How to delete a submission - #19 by ktuite there is a reference to the command
docker compose exec service node lib/bin/purge-forms.js --helpbut this does not seems to work on recent Central versions (see here below). If this command still exists, would this help avoid wait the 30 days retention period, and would it trigger the database to shrink in size?
node:internal/modules/cjs/loader:1386
throw err;
^
Error: Cannot find module '/usr/odk/lib/bin/purge-forms.js'
at Function._resolveFilename (node:internal/modules/cjs/loader:1383:15)
at defaultResolveImpl (node:internal/modules/cjs/loader:1025:19)
at resolveForCJSWithHooks (node:internal/modules/cjs/loader:1030:22)
at Function._load (node:internal/modules/cjs/loader:1192:37)
at TracingChannel.traceSync (node:diagnostics_channel:328:14)
at wrapModuleLoad (node:internal/modules/cjs/loader:237:24)
at Function.executeUserEntryPoint [as runMain] (node:internal/modules/run_main:171:5)
at node:internal/main/run_main_module:36:49 {
code: 'MODULE_NOT_FOUND',
requireStack:
}
Node.js v22.21.0
- In the threads I liked above there are also valuable comments by other users and by @yanokwa but there are also statements I really struggle to understand. For example ODK Central - How to delete a submission - #34 by yanokwa it ends with “So in summary, if you delete data in Central, it eventually frees up space for re-use inside the database, but it may not free up space for re-use by the operating system” but I really can’t understand what this means. On a “normal” server the space that a DB can use to store data is the free space on the disk where the PostgreSQL data folder is in… so what is the difference?
Sorry for the long post and thanks in advance