Hi @ktuite and @Lindsay_Stevens_Au,
Thanks for the tips! Very useful. In the meantime I managed to solve the issue by fetching the OData feed page by page with requests.get(), wrapped in this Python class:
"""Class to get ODK submissions in pages"""
import requests
from requests.auth import HTTPBasicAuth


class ODKCentralClient:
    def __init__(self, base_url, default_project_id, table_name, username, password, page_size=200):
        """
        Initialize the client with ODK Central credentials and settings.
        :param base_url: Base URL of the ODK Central server (e.g. https://your-odk-server)
        :param default_project_id: ID of the project to access
        :param table_name: OData table to read (e.g. Submissions, or a repeat table)
        :param username: Username for Basic Auth
        :param password: Password for Basic Auth
        :param page_size: Number of submissions per page (default 200)
        """
        self.base_url = base_url.rstrip('/')
        self.default_project_id = default_project_id
        self.auth = HTTPBasicAuth(username, password)
        self.page_size = page_size
        self.table_name = table_name

    def _build_endpoint(self, form_id):
        """
        Build the OData submissions endpoint URL with page size limit.
        :param form_id: The form ID (not form name)
        :return: Full URL string
        """
        return (f"{self.base_url}/v1/projects/{self.default_project_id}/forms/"
                f"{form_id}.svc/{self.table_name}?$top={self.page_size}")

    def get_all_submissions(self, form_id):
        """
        Fetch all submissions for a form, handling pagination.
        :param form_id: The form ID to fetch submissions from
        :return: List of all submissions (each submission is a dict)
        """
        endpoint = self._build_endpoint(form_id)
        all_submissions = []
        page_number = 1
        # Keep requesting pages until the server stops returning a next link
        while endpoint:
            response = requests.get(endpoint, auth=self.auth)
            if response.status_code != 200:
                raise Exception(f"Failed to fetch data. Status code: {response.status_code}, Response: {response.text}")
            data = response.json()
            current_page_submissions = data.get('value', [])
            print(f"Page {page_number} has {len(current_page_submissions)} submissions.")
            all_submissions.extend(current_page_submissions)
            endpoint = data.get('@odata.nextLink')
            # A relative next link needs the server base URL in front of it
            if endpoint and not endpoint.startswith('http'):
                endpoint = f"{self.base_url}{endpoint}"
            page_number += 1
        print(f"Total submissions fetched: {len(all_submissions)}")
        return all_submissions
This Python class seems to work well. I don’t get the memory message anymore.
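For anyone who wants to check the paging logic without a server, here is a minimal sketch of the same loop using only the standard library. The names `fetch_all_pages`, `fetch_json` and `FAKE_PAGES` are made up for illustration, and the URLs and `$skiptoken` value in the fake data are placeholders, not real server output.

```python
from urllib.parse import urljoin


def fetch_all_pages(fetch_json, first_url):
    """Collect every row by following '@odata.nextLink' until it is absent.

    fetch_json is a hypothetical callable: it takes a URL and returns the
    decoded JSON body as a dict.
    """
    rows, url, pages = [], first_url, 0
    while url:
        data = fetch_json(url)
        rows.extend(data.get("value", []))
        pages += 1
        next_link = data.get("@odata.nextLink")
        # urljoin keeps absolute links as they are and resolves relative ones
        url = urljoin(url, next_link) if next_link else None
    return rows, pages


# Fake two-page "server" (placeholder data) standing in for real HTTP calls
FAKE_PAGES = {
    "https://example.org/v1/x.svc/Submissions?$top=2": {
        "value": [{"id": 1}, {"id": 2}],
        "@odata.nextLink": "/v1/x.svc/Submissions?$top=2&$skiptoken=abc",
    },
    "https://example.org/v1/x.svc/Submissions?$top=2&$skiptoken=abc": {
        "value": [{"id": 3}],
    },
}

rows, pages = fetch_all_pages(
    FAKE_PAGES.__getitem__, "https://example.org/v1/x.svc/Submissions?$top=2"
)
print(f"{len(rows)} rows in {pages} pages")
```

Against a real server, `fetch_json` could be something like `lambda u: requests.get(u, auth=auth).json()`, with a status check added as in the class above.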
The only issue I have now is with some specific endpoints, for example one with this table name:
“Submissions.group_existing_site.group_add_photos.repeat_additional_photos”
For this table I get the following authentication error:
Traceback (most recent call last):
  File "/Users/name/Documents/Python_scripts/GetODK/download_monitoring_data_v4.py", line 324, in <module>
    json_additional_photos = client.get_all_submissions(form_id)
                             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/name/Documents/Python_scripts/GetODK/download_monitoring_data_v4.py", line 150, in get_all_submissions
    raise Exception(f"Failed to fetch data. Status code: {response.status_code}, Response: {response.text}")
Exception: Failed to fetch data. Status code: 401, Response: {"message":"Could not authenticate with the provided credentials.","code":401.2}
Strangely, this table name causes no problem when I request the data through PyODK.
Other endpoints, including other repeats, also work fine with the class above. Only this specific one fails, and only with requests.get(). Very strange.
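One way to narrow this down might be to print the exact URL requested for the failing table and compare it with the URL of a table that works. The sketch below is not a diagnosis of the 401, only a helper for that comparison. `build_odata_url`, the project ID `1` and the form ID `my_form` are placeholders, and percent-encoding the path segments is an addition that the class above does not do.

```python
from urllib.parse import quote


def build_odata_url(base_url, project_id, form_id, table_name, top):
    """Build the OData table URL, percent-encoding the form ID and table name."""
    return (
        f"{base_url.rstrip('/')}/v1/projects/{project_id}/forms/"
        f"{quote(str(form_id), safe='')}.svc/{quote(table_name, safe='')}?$top={top}"
    )


# Placeholder server, project ID and form ID; the table name is the failing one
url = build_odata_url(
    "https://your-odk-server/",
    1,
    "my_form",
    "Submissions.group_existing_site.group_add_photos.repeat_additional_photos",
    200,
)
print(url)
```

Dots and underscores are left as they are by `quote`, so for this table name the encoded URL is the same as the plain f-string version. Logging the URL of each request, including the ones taken from '@odata.nextLink', would also show whether the 401 comes from the first page or from a later one.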