Download Audit of Edits on Central Using R & ruODK

Current versions of ODK Central allow editing on the server. Many studies, including clinical trials, require data team to keep an audit trail, but there's no obvious way to download a simple human readable audit document that shows key information.
This post is aimed to start a discussion about how to do this. The code below are here for concept only and I hope that community members will improve on this example.

In the simplest form, an audit trail for a clinical trial requires something that looks like this

Name of Enumerator Datetime Record Field Old Value New Value Reason for Change
Janna 2022-05-06 12:34 uuid:b5fb085d-aec1-42ef-81ba-5c7f13b6bbe1 satellites/village Oulton Broad Lowestoft Subject Moved Home
Chrissy 2022-05-04 15:21 uuid:b5fb085d-aec1-42ef-81ba-5c7f13b6bbe1 name.subject Kevinne Kevin Spelling mistake
Chrissy 2022-05-04 15:21 uuid:05365efe-5c6e-4980-97ca-a01ed55622022 household/hh.number KS1234 KS3212 Subject Moved House

ODK Central stores all the information needed to create the audit file in some JSON files and the Central API can be used to get to these, but you have to push a couple of different files together to get to an audit that makes sense.

First, you need to list all the versions of a form. This file has all the key information about who made changes and when they did so

https://odkcentral.docs.apiary.io/#reference/submissions/submission-versions/listing-versions

Then you need to get a list of all the specific differences (i.e. the changes that were made to the form)
https://odkcentral.docs.apiary.io/#reference/submissions/submission-versions/getting-changes-between-versions

Below there's an example of how to make a basic audit from these, using code cannibalised from ruODK.
Hopefully @Florian_May and friends can do something a bit tidier. Personally I just can't get my head around working with JSON so there's some outstanding issues (especially with repeats) but this does the basic job.

library(ruODK)
library(tidyverse)
library(jsonlite)


########################################################################################
#connect to server
########################################################################################

ruODK::ru_setup(
  url = "YOURSERVER.URL",
  pid = "NUMBER",
  fid = "NAMEOFFORM",
  un = Sys.getenv("USERNAME"),
  pw = Sys.getenv("PASSWORD"),
  tz = "Europe/London",
  
  verbose = TRUE
)

###############################################################################
# CREATE A NEW FUNCTION THAT PULLS THE DIFFS AND THE VERSIONS
# This is basically one of the ruODK commands rewritten
###############################################################################

odata_diffs_get<-function (uuid,form,comdiff,table = "Submissions", skip = NULL, top = NULL, count = FALSE,
                           wkt = FALSE, filter = NULL, parse = TRUE, download = TRUE,
                           orders = c("YmdHMS", "YmdHMSz", "Ymd HMS", "Ymd HMSz", "Ymd",
                                      "ymd"), local_dir = "media", pid = get_default_pid(),
                           fid = get_default_fid(), url = get_default_url(), un = get_default_un(),
                           pw = get_default_pw(), odkc_version = get_default_odkc_version(),
                           tz = get_default_tz(), retries = get_retries(), verbose = get_ru_verbose())
{
  ru_msg_info("Downloading submissions...", verbose = verbose)
  qry <- list(`$count` = ifelse(count == FALSE, "false", "true"),
              `$wkt` = ifelse(wkt == FALSE, "false", "true"))
  if (!is.null(skip)) {
    qry$`$skip` <- as.integer(skip)
    "Skipping first {as.integer(skip)} records" %>% glue::glue() %>%
      ru_msg_info(verbose = verbose)
  }
  if (!is.null(top)) {
    qry$`$top` <- as.integer(top)
    "Limiting to max {as.integer(top)} records" %>% glue::glue() %>%
      ru_msg_info(verbose = verbose)
  }
  if (odkc_version >= 1.1 && !is.null(filter) && filter !=
      "") {
    qry$`$filter` <- as.character(filter)
    "Filtering records with {as.character(filter)}" %>% glue::glue() %>%
      ru_msg_info(verbose = verbose)
  }
  
  
  qry <- qry[qry != ""]
  
  
  if(comdiff=="diffs"){
    sub <- httr::RETRY("GET", httr::modify_url(url, path = glue::glue("v1/projects/{pid}/forms/",form,"/submissions/",uuid,"/diffs")),
                       query = qry, times = retries, httr::add_headers(Accept = "application/json"),
                       httr::authenticate(un, pw)) %>% httr::content(.)
    sub<- as_tibble(enframe(unlist(sub,recursive = F))) %>%
      unnest_wider(col = value) %>%
      mutate(uuid = str_sub(name,start = 1,end = 41))
  }
  
  
  if(comdiff=="versions"){
    sub <- httr::RETRY("GET", httr::modify_url(url, path = glue::glue("v1/projects/{pid}/forms/",form,"/submissions/",uuid,"/versions")),
                       query = qry, times = retries, httr::add_headers(Accept = "application/json"),
                       httr::authenticate(un, pw)) %>% httr::content(.)
    sub<- as_tibble(enframe(unlist(sub,recursive = F)))
    sub<- sub %>% mutate(nn=rep(seq_len(nrow(sub)/7), each = 7)) %>%
      pivot_wider(id_cols = nn,names_from = name,values_from = value) %>%
      mutate(instanceId=as.character(instanceId)) %>%
      rename("uuid"="instanceId")
    
    
  }
  
  if(comdiff=="comments"){
    sub <- httr::RETRY("GET", httr::modify_url(url, path = glue::glue("v1/projects/{pid}/forms/",form,"/submissions/",uuid,"/comments")),
                       query = qry, times = retries, httr::add_headers(Accept = "application/json"),
                       httr::authenticate(un, pw)) %>% httr::content(.)
    sub<- as_tibble(enframe(unlist(sub,recursive = F)))
    
  }
  
  
  if(comdiff=="submittors"){
    sub <- httr::RETRY("GET", httr::modify_url(url, path = glue::glue("v1/projects/{pid}/forms/",form,"/submissions/",uuid,"/submitters")),
                       query = qry, times = retries, httr::add_headers(Accept = "application/json"),
                       httr::authenticate(un, pw)) %>% httr::content(.)
  }
  
  
  
  sub
  
}

###############################################################################
# CREATE A NEW FUNCTION THAT CREATES AN AUDIT BY COMBINING THE DIFF AND VERSION DATA WITH USER IDs. 
###############################################################################

do_audit<-function(currentuuid,formid){
  #get versions
  df7<-(odata_diffs_get(uuid = currentuuid,form = formid,comdiff="versions")) %>%
    mutate(across(everything(), as.character))
  
  #get changes for each uuid
  df4<-(odata_diffs_get(uuid = df7$uuid[nrow(df7)],form = formid,comdiff="diffs")) %>%
    mutate(across(everything(), as.character))
  
  audit<-full_join(df7,df4)
  
  users<-user_list() %>% rename(submitterId=id) %>% select(-created_at,-updated_at,-deleted_at) %>%   mutate(across(everything(), as.character))
  
  audit<-left_join(audit,users) %>%
    mutate(across(everything(), as.character))
  
  audit
}

###############################################################################
# DOWNLOAD SUBMISSIONS
###############################################################################

# get form schema
fq<-ruODK::form_schema()

df<-ruODK::odata_submission_get() %>%
  ruODK::odata_submission_rectangle(names_sep = NULL) %>%
  ruODK::handle_ru_datetimes(form_schema = fq)

###############################################################################
# GET THE FIRST RECORD
###############################################################################

df2<-do_audit(df$id[1],formid = "FORMNAME")

###############################################################################
# ITERATE THROUGH ALL SUBMISSIONS AND BIND AUDITS TOGETHER
###############################################################################

for(i in 2:nrow(df)){
  message(i)
  df2<-bind_rows(df2, do_audit(df$id[i],formid = "FORMNAME"))
}

|nn |submitterId|createdAt               |instanceName|uuid                                     |current|deviceId|userAgent                                                                                                                         |name                                      |old                                      |new                                                                                                                                          |path                                                           |email                  |type|display_name           |
|---|-----------|------------------------|------------|-----------------------------------------|-------|--------|----------------------------------------------------------------------------------------------------------------------------------|------------------------------------------|-----------------------------------------|---------------------------------------------------------------------------------------------------------------------------------------------|---------------------------------------------------------------|-----------------------|----|-----------------------|
|1  |19         |2022-06-09T07:45:43.402Z|NULL        |uuid:b5fb085d-aec1-42ef-81ba-5c7f13b6bbe1|TRUE   |NULL    |NULL                                                                                                                              |uuid:b5fb085d-aec1-42ef-81ba-5c7f13b6bbe11|2                                        |3                                                                                                                                            |list("satellites", "r_satellite_number")                       |chrissy.roberts@x.ac.uk|user|Chrissy h. Roberts (SA)|
|1  |19         |2022-06-09T07:45:43.402Z|NULL        |uuid:b5fb085d-aec1-42ef-81ba-5c7f13b6bbe1|TRUE   |NULL    |NULL                                                                                                                              |uuid:b5fb085d-aec1-42ef-81ba-5c7f13b6bbe12|2                                        |3                                                                                                                                            |list("satellites", "villages_count")                           |chrissy.roberts@x.ac.uk|user|Chrissy h. Roberts (SA)|
|1  |19         |2022-06-09T07:45:43.402Z|NULL        |uuid:b5fb085d-aec1-42ef-81ba-5c7f13b6bbe1|TRUE   |NULL    |NULL                                                                                                                              |uuid:b5fb085d-aec1-42ef-81ba-5c7f13b6bbe13|NA                                       |list(namenumber = "3", name = "BBB-Add_new_repeat")                                                                                          |list("satellites", list("villages", 2))                        |chrissy.roberts@x.ac.uk|user|Chrissy h. Roberts (SA)|
|1  |19         |2022-06-09T07:45:43.402Z|NULL        |uuid:b5fb085d-aec1-42ef-81ba-5c7f13b6bbe1|TRUE   |NULL    |NULL                                                                                                                              |uuid:b5fb085d-aec1-42ef-81ba-5c7f13b6bbe14|2                                        |3                                                                                                                                            |list("satellites", "satellite_data_count")                     |chrissy.roberts@x.ac.uk|user|Chrissy h. Roberts (SA)|
|1  |19         |2022-06-09T07:45:43.402Z|NULL        |uuid:b5fb085d-aec1-42ef-81ba-5c7f13b6bbe1|TRUE   |NULL    |NULL                                                                                                                              |uuid:b5fb085d-aec1-42ef-81ba-5c7f13b6bbe15|NA                                       |list(name_from_list = "BBB-Add_new_repeat", r_sat_province = "26", r_sat_territoire = "", r_sat_village = "Added repeat", r_no_sat_x = "400")|list("satellites", list("satellite_data", 2))                  |chrissy.roberts@x.ac.uk|user|Chrissy h. Roberts (SA)|
|1  |19         |2022-06-09T07:45:43.402Z|NULL        |uuid:b5fb085d-aec1-42ef-81ba-5c7f13b6bbe1|TRUE   |NULL    |NULL                                                                                                                              |uuid:b5fb085d-aec1-42ef-81ba-5c7f13b6bbe16|uuid:1a9c9d17-c2e7-4471-90ee-f91340d48b1b|uuid:b5fb085d-aec1-42ef-81ba-5c7f13b6bbe1                                                                                                    |list("meta", "instanceID")                                     |chrissy.roberts@x.ac.uk|user|Chrissy h. Roberts (SA)|
|1  |19         |2022-06-09T07:45:43.402Z|NULL        |uuid:b5fb085d-aec1-42ef-81ba-5c7f13b6bbe1|TRUE   |NULL    |NULL                                                                                                                              |uuid:b5fb085d-aec1-42ef-81ba-5c7f13b6bbe17|NA                                       |uuid:1a9c9d17-c2e7-4471-90ee-f91340d48b1b                                                                                                    |list("meta", "deprecatedID")                                   |chrissy.roberts@x.ac.uk|user|Chrissy h. Roberts (SA)|
|2  |19         |2022-06-09T07:38:12.441Z|NULL        |uuid:1a9c9d17-c2e7-4471-90ee-f91340d48b1b|FALSE  |NULL    |Enketo/3.1.0 Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/102.0.0.0 Safari/537.36|NA                                        |NA                                       |NA                                                                                                                                           |NA                                                             |chrissy.roberts@x.ac.uk|user|Chrissy h. Roberts (SA)|
|1  |19         |2022-06-09T07:42:47.123Z|NULL        |uuid:05365efe-5c6e-4980-97ca-a01ed5562202|TRUE   |NULL    |NULL                                                                                                                              |uuid:05365efe-5c6e-4980-97ca-a01ed55622021|AAA                                      |AAA_edit                                                                                                                                     |list("satellites", list("villages", 0), "name")                |chrissy.roberts@x.ac.uk|user|Chrissy h. Roberts (SA)|
|1  |19         |2022-06-09T07:42:47.123Z|NULL        |uuid:05365efe-5c6e-4980-97ca-a01ed5562202|TRUE   |NULL    |NULL                                                                                                                              |uuid:05365efe-5c6e-4980-97ca-a01ed55622022|AAA                                      |AAA_edit                                                                                                                                     |list("satellites", list("satellite_data", 0), "name_from_list")|chrissy.roberts@x.ac.uk|user|Chrissy h. Roberts (SA)|
|1  |19         |2022-06-09T07:42:47.123Z|NULL        |uuid:05365efe-5c6e-4980-97ca-a01ed5562202|TRUE   |NULL    |NULL                                                                                                                              |uuid:05365efe-5c6e-4980-97ca-a01ed55622023|uuid:ea6a1113-0285-400b-bec9-f41fb5ee69b9|uuid:05365efe-5c6e-4980-97ca-a01ed5562202                                                                                                    |list("meta", "instanceID")                                     |chrissy.roberts@x.ac.uk|user|Chrissy h. Roberts (SA)|
|1  |19         |2022-06-09T07:42:47.123Z|NULL        |uuid:05365efe-5c6e-4980-97ca-a01ed5562202|TRUE   |NULL    |NULL                                                                                                                              |uuid:05365efe-5c6e-4980-97ca-a01ed55622024|NA                                       |uuid:ea6a1113-0285-400b-bec9-f41fb5ee69b9                                                                                                    |list("meta", "deprecatedID")                                   |chrissy.roberts@x.ac.uk|user|Chrissy h. Roberts (SA)|
|2  |19         |2022-06-09T07:37:27.163Z|NULL        |uuid:ea6a1113-0285-400b-bec9-f41fb5ee69b9|FALSE  |NULL    |Enketo/3.1.0 Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/102.0.0.0 Safari/537.36|NA                                        |NA                                       |NA                                                                                                                                           |NA                                                             |chrissy.roberts@x.ac.uk|user|Chrissy h. Roberts (SA)|
|1  |19         |2022-06-09T07:37:07.962Z|NULL        |uuid:94cd77d9-66bf-4307-89ea-f1e271574057|TRUE   |NULL    |Enketo/3.1.0 Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/102.0.0.0 Safari/537.36|NA                                        |NA                                       |NA                                                                                                                                           |NA                                                             |chrissy.roberts@x.ac.uk|user|Chrissy h. Roberts (SA)|

Oustanding issues are

  • When repeats exist, all changes to a repeat are listed in a single entry in the old and new columns. These would need unnesting, potentially in to new rows.
  • Paths (i.e. field names that exist within repeats, groups etc) need concatenating or unnesting
  • Need to figure out how to add 'reason for change' field. Possibly just as simple as changing a value in a field on the form
  • My code is fairly clunky as I don't have a good handle on how to properly use Purrr and JSONlite in R.
  • This is really slow - needs an improved version that does whole table, rather than looping through each instance
6 Likes

Thanks for sharing this here, @chrissyhroberts!
I've added a link to this post at the ruODK issue tracking all as yet unimplemented API endpoints relating to submissions: https://github.com/ropensci/ruODK/issues/124

I don't have much spare bandwidth for this, but I'd be happy to review and collaborate on pull requests for an implementation of the relevant endpoints (get form changes, get comments) or even a worked example as a vignette.

The audit log supplied by Central contains only form changes done in Collect, but not those from Central, both for an individual submission, and with slight differences for the entire form.
It would be nice to have all form audits (form behaviour, changes, comments, both in Collect and Central) on one tap. Until someone funds this feature, ruODK can probably merge these different audit logs somehow.

1 Like