Basic Data Analysis with R (Scripts, Markdown & Dashboard)

During the @TAB meeting on 2021-05-12 we discussed how to perform basic analysis in R, free software for statistical analysis, visualisation and report generation.

R is incredibly powerful, and new users should take a look at these resources for a more in-depth view of how to get started.

You'll need to install a copy of R and the RStudio GUI in order to use the examples below.

You'll also need this toy ODK data set in CSV format.
data.csv (570.1 KB)

What people want to do with their ODK data is pretty varied, but in the first instance most people will want some summaries and a couple of bar charts.

Here I provide a basic example of how to do this in R.

There are two scripts that you need to download, both in this zip file:

R_Analysis_Scripts.zip (2.1 KB)

The easiest way to work with these is to open RStudio, then click the File menu and select New Project. Save the project in a new folder, then open that folder. Paste the data set and the scripts you downloaded above into this folder, then click the .Rproj file to open the project.

In the RStudio File menu you can then use Open File to load the scripts.

To run them you simply press the Source button.

Basic Analysis

For most people, the Basic.Data.Summaries.in.R file provides a useful starting point. This script will:

  • Open the data set
  • Select some useful variables
  • Save a summary table to the R project folder
  • Save another summary, split by a user defined variable
  • Save a barchart

Armed with a bit of basic knowledge from the training resources above, you'll be up and running with your own analysis pretty quickly.
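To give a feel for the five steps listed above, here is a minimal sketch in base R. It is not the contents of Basic.Data.Summaries.in.R. The column names (age, sex, district) and the toy values are made up for illustration, and the files are written to a temporary folder rather than the project folder so the sketch runs anywhere.

```r
# Toy stand-in for the ODK export; these column names are hypothetical.
toy <- data.frame(
  age      = c(23, 35, 41, 29, 52, 38),
  sex      = c("F", "M", "F", "F", "M", "M"),
  district = c("North", "North", "South", "South", "South", "North"),
  stringsAsFactors = FALSE
)
out_dir  <- tempdir()  # in a real project you would use the project folder
csv_path <- file.path(out_dir, "toy_data.csv")
write.csv(toy, csv_path, row.names = FALSE)

# 1. Open the data set
df <- read.csv(csv_path, stringsAsFactors = FALSE)

# 2. Select some useful variables
df <- df[, c("age", "district")]

# 3. Save a summary table
overall <- data.frame(
  n        = nrow(df),
  mean_age = mean(df$age),
  min_age  = min(df$age),
  max_age  = max(df$age)
)
write.csv(overall, file.path(out_dir, "summary_overall.csv"), row.names = FALSE)

# 4. Save another summary, split by a user-defined variable
group_var <- "district"
by_group  <- aggregate(df$age, by = list(df[[group_var]]), FUN = mean)
names(by_group) <- c(group_var, "mean_age")
write.csv(by_group, file.path(out_dir, "summary_by_group.csv"), row.names = FALSE)

# 5. Save a bar chart of record counts per group
counts   <- table(df[[group_var]])
png_path <- file.path(out_dir, "barchart.png")
png(png_path, width = 800, height = 600)
barplot(counts, main = "Records per group", xlab = group_var, ylab = "Number of records")
invisible(dev.off())
```

To adapt this to your own data, point read.csv at your file and change the selected columns and group_var to variables that exist in your form.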

R Markdown

Many users want to share a summary report with their stakeholders. R Markdown is a nice way to make reports in HTML or PDF format. You can also put these online as a form of basic dashboard.

The Basic.Data.Summaries.in.Rmd script does the same things as the basic analysis script, but makes a PDF report. The key difference is that you can dress up the report with explanatory text and show the code that was used to create the tables and charts.

Simply open the .Rmd file in RStudio and click Knit to create the PDF in your R project folder.

R Markdown is very versatile and can export a whole bunch of different formats.
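The Knit button calls rmarkdown::render() behind the scenes, so you can do the same thing from the console or from a script. The sketch below writes a tiny, made-up .Rmd file to a temporary folder and renders it. It uses html_document rather than pdf_document because PDF output also needs a LaTeX installation, and it skips the render step if rmarkdown or pandoc is missing. The file name and its contents are placeholders, not the script from the zip file.

```r
# Write a minimal R Markdown file: a YAML header, some text, and inline R code.
rmd_path <- file.path(tempdir(), "toy_report.Rmd")
writeLines(c(
  "---",
  "title: \"Toy summary report\"",
  "output: html_document",
  "---",
  "",
  "This paragraph is ordinary narrative text.",
  "",
  "The mean age in the toy data is `r mean(c(23, 35, 41))`."
), rmd_path)

# Render only if the rmarkdown package and pandoc are both available.
if (requireNamespace("rmarkdown", quietly = TRUE) && rmarkdown::pandoc_available()) {
  out_file <- rmarkdown::render(rmd_path, output_format = "html_document", quiet = TRUE)
} else {
  out_file <- NA_character_
  message("rmarkdown or pandoc not available; skipping the render step")
}

# For the report in this post, the equivalent call from the project folder is:
# rmarkdown::render("Basic.Data.Summaries.in.Rmd")
```

Rendering from a script like this is handy if you want to regenerate the report on a schedule rather than clicking Knit each time.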

Dashboards

There are many ways to make interactive dashboards in R. Some are very sophisticated, but are also a bit harder to use. Perhaps the simplest way is to use R Markdown to create an HTML page or site.

Open the Basic.Data.Summaries.in.Rmd file and change line 3
from
output: pdf_document

to

output: rmdformats::readthedown

This will create an attractive html report that can be served from pretty much any website (I generally use github.io sites for this, see example here).
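The readthedown format comes from the rmdformats package, which needs to be installed first (install.packages("rmdformats")). If you'd rather not edit the .Rmd file, you can also override the output format when rendering. The sketch below assumes a placeholder .Rmd written to a temporary folder, and it skips rendering if rmarkdown, rmdformats or pandoc is not available.

```r
# A placeholder report whose YAML header still asks for PDF output.
rmd_path <- file.path(tempdir(), "toy_dashboard.Rmd")
writeLines(c(
  "---",
  "title: \"Toy dashboard\"",
  "output: pdf_document",
  "---",
  "",
  "Records summarised: `r length(c(23, 35, 41))`."
), rmd_path)

# Check that everything needed for the readthedown format is present.
can_render <- requireNamespace("rmarkdown", quietly = TRUE) &&
  requireNamespace("rmdformats", quietly = TRUE) &&
  rmarkdown::pandoc_available()

if (can_render) {
  # Passing a format here takes precedence over the output line in the header.
  html_file <- rmarkdown::render(
    rmd_path,
    output_format = rmdformats::readthedown(),
    quiet = TRUE
  )
} else {
  html_file <- NA_character_
  message("rmarkdown, rmdformats or pandoc not available; skipping the render step")
}
```

The resulting .html file is what you would upload to your website.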

Hopefully this post will provide a starting point for people who have so far used Excel for their analysis. Using a professional package for statistical analysis can seem daunting and does come with a pretty steep learning curve, but it is worth it in the long run.

Example output files
Example_Outputs.zip (1.0 MB)

Great!

Will check this out.

Paul

Great resources, thanks for posting these!

To get data directly from ODK Central into R, you can use the R package ruODK.

Installing R, RStudio (or any other IDE), ruODK and its dependencies can be a hassle on corpo-bricked Windows machines. urODK, the companion package to ruODK, provides a pre-built RStudio Server at Binder.

ruODK provides a template RMarkdown workbook with the basic "download and parse my data" workflow:

rmarkdown::draft("my_example.Rmd", "odata", package="ruODK")

The workbook contains instructions to configure ODK Central credentials, to download and parse your own data, and some initial insight and visualisations as a starting point for your own analysis.

In addition to the resources posted by @chrissyhroberts, R for applied epidemiology and public health looks useful.

Spatial data needs some special (spatial?) steps to visualise. Follow the worked examples of the ruODK vignette "Spatial data" to go from the three ODK types (geopoint, geotrace, geoshape) to native spatial objects in R.

If you write a paper or publication using ruODK, you can find the citation for ruODK here and for ODK in this thread.
