Geo: Using the Mapbox SDK for Android

Hey @LN, thanks for laying that all out so clearly!

It pretty much all matches what I've been thinking except for one adjustment: I'm working on taking @langstonsmith's last branch and getting the Mapbox implementation up to feature parity with the Google and OSMdroid implementations, so that it can be a complete replacement for OSMdroid. I've run into a few tricky bits with that, which I'm discussing with @langstonsmith over on the other forum post. These mostly have to do with the drawing and manipulating of markers (symbols) on the map; I've filed a few issues on the Mapbox Annotation plugin.


Unfortunately there is no way to restrict an API key to a specific app @LN - we currently only have URL restricted tokens for web applications.


Hi all, catching up on this thread as I’m interested and want to help bring this forward. Thanks for all the work @zestyping @LN.

And just so we're on the same page, we're actually talking about MVT MBTiles (Mapbox Vector Tiles stored within a SQLite container), right?

Yes, I think this is correct based on the convo above and direction. Many MVT tools either produce an mbtiles file or a directory of MVT files (.mvt or .pbf). Most of our work at HOT (what we’ve used externally or produced internally) relies on vector tiles stored in mbtiles in the pbf format with gzip compression.

I'm particularly interested in your feedback on that proposal. Note that I'm suggesting one online basemap from a fixed list and one offline basemap that is user-specified.

I think this is related to this point/feedback here but one factor is the use cases for how data is loaded onto the device.

Through our work with OpenMapKit, there have been two main workflows for getting data onto the phones. This is mostly just on the user-specified basemap:

  1. Through a download endpoint. Via OpenMapKit Server, ODK/OMK android users can point their app to the OMK Server endpoint and download the data they need for their survey (similar to the ODK Collect Get Blank Form) -- this data is an mbtiles file and an osm file. This isn’t always used in a low-bandwidth environment given data costs and bandwidth but is a workflow for connecting the data you need to a survey form that is currently used.
  2. Hand loading data onto the phone. The standard method of connecting the phone to the computer and uploading the files to the correct directory.

Okay, the mbtiles file makes sense.

What about the osm file? Is that something that's necessary to support, or can we support only GeoJSON for the selectable-geometry layer and expect deployments to convert their osm files to GeoJSON?

I'm wondering if that conversion would involve data loss that makes it harder (or impossible) to make use of geometry edits.


Hi @LN!

Please forgive me for over-explaining stuff I'm quite sure you already know; I'm not just answering you but hopefully laying out some of the details for other readers of the thread!

File format

Yes, I'm thinking of MBTiles files with protocol buffer blobs in the tiles table. So yes, Mapbox Vector Tiles format in a SQLite container; these have a .mbtiles extension, and the blobs have a .pbf type. As per the spec here.

The online tile-server equivalent of vector tiles has a .../z/x/y.mvt naming schema. That spec is here.
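
For readers who have not opened one of these files, here is a minimal Python sketch of pulling a single tile out of an MBTiles container. It assumes the `tiles` table layout from the MBTiles spec (`zoom_level`, `tile_column`, `tile_row`, `tile_data`), rows stored in the TMS scheme (so the row number is flipped relative to the z/x/y used by online tile servers), and gzip-compressed pbf blobs. The function name `read_tile` is made up for illustration and is not part of any SDK.

```python
import gzip
import sqlite3
from typing import Optional


def read_tile(conn: sqlite3.Connection, z: int, x: int, y: int) -> Optional[bytes]:
    """Return the raw tile bytes for XYZ coordinates (z, x, y), or None."""
    # MBTiles stores rows in the TMS scheme, so flip y before querying.
    tms_row = (2 ** z - 1) - y
    row = conn.execute(
        "SELECT tile_data FROM tiles "
        "WHERE zoom_level = ? AND tile_column = ? AND tile_row = ?",
        (z, x, tms_row),
    ).fetchone()
    if row is None:
        return None
    data = bytes(row[0])
    # Vector tiles are usually gzip-compressed; gzip data starts with 1f 8b.
    if data[:2] == b"\x1f\x8b":
        data = gzip.decompress(data)
    return data
```

With a real file you would pass in something like `sqlite3.connect("area.mbtiles")` (file name hypothetical). The returned bytes are still a protocol buffer and need an MVT decoder before you can read the geometry.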


I agree that it would be good to change the way users specify layers (and that "Mapping SDK" probably doesn't mean much to the average field data collector and should be removed). However, I'd like to specify in a bit more detail what I'm thinking in terms of the choices and how I'd see them stacking into three discrete layers:

  • Bottom: Opaque base layer (offline opaque MBTiles or an online tileserver)
  • Middle: non-editable semi-transparent layer (offline Vector MBTiles or online MVT from a tileserver, perhaps even static GeoJSON)
  • Top: editable layer (GeoJSON or other, to be implemented later)

Base Layer

Sometimes an actual picture (raster) is the only thing you can get that allows you to see what's on the ground. When dealing with an unmapped area, a new refugee camp, or a newly-damaged area from a natural disaster, there's little chance that you'll have vectors representing the current features on the ground, but you might be able to get an aerial image, even quite soon after a natural disaster. So that capability should always be there (as it is now with OSMDroid, albeit with a few rough edges in the implementation).

It should be possible to make this an online or offline layer without much practical difference. The current online Google or OSM layers being used by OSMDroid are rasters (well, the OSM ones certainly are, and I think the Google ones are as well). Once the functionality for a given online raster layer is built, it's relatively straightforward to enable any online raster layer (here is an example of a general way to access image tiles from any online raster tile server—albeit in Python, but the math and URL manipulations are all the same).
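
To make "the math and URL manipulations" concrete, here is a small Python sketch of the standard Web Mercator "slippy map" tile calculation followed by a URL template fill. This is not the linked example, just the usual formula. The `{z}/{x}/{y}` placeholder style and the `tiles.example.org` host in the note underneath are assumptions for illustration, since real servers vary in their template syntax.

```python
import math


def lonlat_to_tile(lon, lat, zoom):
    """Convert WGS84 lon/lat in degrees to XYZ tile indices at a zoom level."""
    n = 2 ** zoom
    x = int((lon + 180.0) / 360.0 * n)
    lat_rad = math.radians(lat)
    y = int((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    # Clamp so that lon = 180 or extreme latitudes stay inside the grid.
    return min(max(x, 0), n - 1), min(max(y, 0), n - 1)


def tile_url(template, zoom, x, y):
    """Fill a {z}/{x}/{y} URL template for one tile."""
    return template.format(z=zoom, x=x, y=y)
```

For example, `tile_url("https://tiles.example.org/{z}/{x}/{y}.png", 1, *lonlat_to_tile(0, 0, 1))` gives the address of the tile whose top-left corner sits at longitude 0, latitude 0 on that (hypothetical) server.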

Semi-Transparent Layer

This may be standalone, as opposed to overlaid on top of a vector layer. Standalone (and opaque) is the usual style of the vector tiles in the MapBox SDK; the vector tiles are simply a way to generate a map view locally from much less data-intensive vector geometry instead of downloading a bunch of images (which is what the current OSM layer in ODK does). However, a huge advantage of vector tiles is they can overlay a raster with transparency between features. So you can have a layer with, for example, a bunch of road traces, but also see the satellite/aerial imagery of the raster layer beneath. Really useful for incomplete maps; you can see what's been already mapped, but you can also see the raw imagery beneath what hasn't been done yet.

Vector tiles can be offline using the MBTiles format with tile type "pbf" instead of JPEG or PNG. They can also be online, using (among other potential sources) the MapBox vector tile server that they're (very generously!) offering ODK (that's where the .../z/x/y.mvt schema comes in; instead of grabbing the tiles out of a SQLite container, the application grabs them from an HTTP tile address).
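
As a rough illustration of how an application could tell an offline vector container from a raster one, the sketch below reads the `format` entry from the MBTiles `metadata` table (a name/value table defined by the MBTiles spec). The function names are hypothetical, and files produced by older tools may omit the `format` row, in which case this returns `None`.

```python
import sqlite3
from typing import Optional


def mbtiles_format(conn: sqlite3.Connection) -> Optional[str]:
    """Return the 'format' metadata value (e.g. 'pbf', 'png', 'jpg') or None."""
    row = conn.execute(
        "SELECT value FROM metadata WHERE name = 'format'"
    ).fetchone()
    return row[0] if row else None


def is_vector(conn: sqlite3.Connection) -> bool:
    """True when the container declares vector (pbf) tiles."""
    return mbtiles_format(conn) == "pbf"
```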

Editable Layer

This should, I propose, be on top and always render over the raster or tiled-vector layers. That's simply because it's the thing the mapper is probably working on! The space between features should, of course, be transparent.


Note that I'm only proposing one layer of each type for now. As much as it might be tempting to allow multiple vector tile layers so people can choose what they want to see on the map—and maybe that's a discussion for the future—if we initially only allow a single vector tile layer, this still allows the survey creator to give the mapper anything they choose. The vector tile layer can contain, or not, roads, buildings, fire hydrants, etc. So for the moment I don't see the need to allow stacking multiple discrete vector tile layers. One is enough.

I do think it's worth letting people decide for either of the two bottom layers whether they want offline or online. There's no reason not to allow someone to use a high-resolution offline drone image layer as their raster background for a specific area, while still having the online MapBox/OSM layer on top of it. Or, I suppose, vice versa; an online raster layer with an offline vector tile overlay (though I can't really think of a situation where that would be tremendously useful given the greater file size of rasters).


Thanks for laying out all these details, @Ivangayton! Never hurts to have clearer information out in the open so that everybody is on the same page.

There seems to be a pretty clear consensus that, as @LN suggested, the user-facing settings should not be focused on a developer-facing concept like the choice of SDK.

I also think having three layers makes sense. @LN's proposal is a shift from internal implementation details toward user concepts, and I see @Ivangayton's proposal as a further step in that direction.

Let's present the layers in terms of the purpose they're used for; that makes sense. I'll try to propose some user-facing terminology, then:

  1. Base layer: A fully opaque layer. Could be a map with authoritative or general-purpose information; could be satellite or aerial imagery. Because the layer is fully opaque, it only makes sense to have one of these. (I'm suggesting "base layer" rather than "base map" so there's no confusion about what a "map" is: there are 3 layers and they go together to make a map.)

  2. Reference layer: An optional layer of additional information that is more specific to the task at hand, mostly transparent, drawn on top of the base layer. The purpose of the layer is to help you find your way around or to provide related information, but it is not the thing you are manipulating or collecting data about; hence the name "reference".

  3. Content layer: An optional layer of geometry drawn on top of the reference layer. The name "content" is intended to imply that this is the geometry you are working with—either you're collecting data about it, or you're editing the geometry.

You wouldn't be restricted to online, or offline, or vector, or raster, in any particular position among these layers—with the one exception that the content layer must be a vector layer.

So the ultimate list of supported formats would look like this:

  1. Base layer:

    • OSM online raster tiles
    • Mapbox online vector base map
    • Google Maps online vector base map
    • Any offline mbtiles file (raster or vector)
    • Any online mbtiles file (specify a URL)
    • Any online raster tile service (specify a URL template)
    • Any online vector tile service (specify a URL template)
  2. Reference layer:

    • Any offline mbtiles file (raster or vector)
    • Any online mbtiles file (specify a URL)
    • Any online raster tile service (specify a URL template)
    • Any online vector tile service (specify a URL template)
    • GeoJSON
  3. Content layer:

    • GeoJSON

From a user perspective, my question is: Would this make sense to users? My hypothesis is that presenting it this way, and minimizing the restrictions on which formats are allowed in each slot, means less extra stuff that users will need to understand.

From an implementation perspective:

  • The "specify a URL" options are not ones we've really been talking about so far. I would prefer to do them as part of a second round; they are only listed for completeness.
  • Although this does allow a wider range of possibilities than the proposals mentioned previously, I don't think it's actually harder to do with the Mapbox SDK. We'd be supporting the same set of formats and facing the same issues; all that's new is the possibility of two sets of raster tiles or two sets of vector tiles, which is not any harder than one set of each.
  • If Google is selected for the base layer, then the other layers are disabled. Or maybe we allow semitransparent raster tiles, but that's all.
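
To make the proposed rules a bit more concrete, here is a rough Python sketch of the three slots and the restrictions mentioned so far. It is only an illustration of the proposal, not how Collect is or will be implemented; every identifier (`LayerStack`, `validate`, the kind names such as `"osm_raster"`) is invented here, and it takes the stricter reading of the Google rule (all other layers disabled).

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical identifiers for the source kinds listed above.
BASE_KINDS = {"osm_raster", "mapbox_vector", "google", "mbtiles_file",
              "mbtiles_url", "raster_url_template", "vector_url_template"}
REFERENCE_KINDS = {"mbtiles_file", "mbtiles_url", "raster_url_template",
                   "vector_url_template", "geojson"}
CONTENT_KINDS = {"geojson"}  # the content layer must be a vector layer


@dataclass
class LayerStack:
    base: str
    reference: Optional[str] = None  # optional layer
    content: Optional[str] = None    # optional layer


def validate(stack: LayerStack) -> List[str]:
    """Return a list of problems; an empty list means the stack is allowed."""
    problems = []
    if stack.base not in BASE_KINDS:
        problems.append(f"unsupported base layer: {stack.base}")
    if stack.reference is not None and stack.reference not in REFERENCE_KINDS:
        problems.append(f"unsupported reference layer: {stack.reference}")
    if stack.content is not None and stack.content not in CONTENT_KINDS:
        problems.append(f"unsupported content layer: {stack.content}")
    # Proposed restriction: with Google as the base, other layers are disabled.
    if stack.base == "google" and (stack.reference or stack.content):
        problems.append("Google base layer cannot be combined with other layers")
    return problems
```

For instance, `validate(LayerStack("osm_raster", "mbtiles_file", "geojson"))` returns an empty list, while `validate(LayerStack("google", "geojson"))` reports the Google restriction.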

Thoughts on this?


On the contrary, I so appreciate the thoroughness of your response! It's all hugely helpful to have laid out so clearly. I wasn't so sure about the functional distinction between the two bottom layers and your explanation clears that up.

@zestyping, I really like what you're proposing. I'm a little unsure about the "Reference layer" term because I had to read your description to understand where the name comes from but I can't come up with a better alternative right now.


I agree, and I also agree with the use cases and ideas around the base and semi-transparent layers from @Ivangayton. Just to add, though, on the user side: I almost think of the reference and base layers as one. Your base layer could include satellite imagery but also the semi-transparent road or boundary information. This might not change the implementation of how there are "layers", but it might make things simpler on the user side to think in terms of "what do I want to show as the map" versus "what do I want to edit".

Thanks @zestyping. This was just the use case for the OMK app to download an OSM file onto the phone for editing. But I don't think it's something that necessarily needs to be supported right now. That OSM file was just an XML file and so I think focusing support on geojson only probably makes sense for now. And I don't think the conversion will involve any data loss or make it harder to use. :+1:

One clarifying question around how a user thinks about the layers:

@zestyping - are you saying that the content layer needs to be defined as a separate layer (and either have data loaded, or create data from scratch)? My question for this is around the use case of wanting to just bring a single vector data source onto the phone that can be used for the base map but also then edited. I think I understand the need for a content layer from a user perspective, but it seems like we're requiring users to duplicate the data we need to get on the phone. A use case here is that the same vector layer you use for your base map (or reference layer) is the same layer you want to use as your content layer.


@zestyping - are you saying that the content layer needs to be defined as a separate layer (and either have data loaded, or create data from scratch)?

Yes, that was what I had in mind.

It never occurred to me that you might want to use one vector layer as your base layer and also edit that layer. I assumed that one would always want to have at least some kind of (non-editable) base or reference information under the geometry being edited (if there's nothing under it to refer to, how would you know where to move the vertices?)

Could I trouble you to walk me through an example of a situation where one would want to do that?

Would it satisfy this use case if you could choose "none" for the base layer and "none" for the reference layer, and have only a content layer showing?



Hi @smit1678,

I think one big reason to separate the Reference and (editable) Content layers is that they will usually have different requirements and uses. It'll usually make sense to use different file formats for the two.

Requirements for each layer

We'll often want a large Reference layer with lots of stuff in it (i.e. the entire OSM feature set for a given area of interest, with a generous buffer around it).

The Content layer that an individual enumerator is editing will usually be much smaller. Perhaps it'll be a small part of the overall AOI, and perhaps it'll be only a single feature type (for example just the buildings), but there will be relatively few cases where someone is sent out into the field with a mandate to edit everything in a large area.

So generally the Reference layer requires compact file size and efficient loading, but not editability, while the Content layer requires straightforward editability, but is likely to be a smaller dataset/file.

Implementation choices for each layer

MVT MBTiles will shine for the Reference layer; they are small, compact, locally rendered, and efficient. However, they are a horrible choice for editing (for lots of reasons I alluded to in my long post above).

For the Content layer, GeoJSON is a great choice because it's very straightforward to edit. However, we don't want to try loading an entire basemap of a large area in GeoJSON; even if the file size is manageable, the loading/rendering speed will be terrible (because there's no internal tiling; the whole file has to be loaded to display any part of it or any low-zoom rendering).

To sum up: I think in the vast majority of cases users will be better served by a large, fast Reference layer and a smaller, easily edited Content layer, probably consisting of vector tiles and GeoJSON respectively (maybe GeoPackage as another option later; that'll be more compact and higher-performance than GeoJSON, just harder for non-GIS users to generate).


Thanks @zestyping @Ivangayton. Hopefully I'm not getting us too far off topic; I'm mindful of keeping this convo in line with the original scope so that it stays productive. @zestyping, let us know how and if we need to separate some threads so we're getting you helpful information.

if there's nothing under it to refer to, how would you know where to move the vertices?

I think this is where we need to clarify the first use case. From my point of view, a very simple use case is not about making changes to vertices or drawing geometries on the phone, but about interacting with existing data and linking survey data to it (which is the core OpenMapKit use case and is connected to some of the original use cases and the purpose of starting the switch on this thread). I acknowledge that this isn’t everything we know is possible, but it seems like the simplest and most basic use case for vector tiles plus the Mapbox SDK that goes beyond just a new basemap. I think this is where I’m getting confused about the difference between base, reference, and content layers. If we’re already moving on to editing geometries, then what you’re doing, and to what data, needs to be clear to the user. So I agree: if you’re editing vertices, you need a layer behind them from a different source so you know how to move them.

I think one big reason to separate the Reference and (editable) Content layers is that they will usually have different requirements and uses. It'll usually make sense to use different file formats for the two.

Could I trouble you to walk me through an example of a situation where one would want to do that?

I don’t disagree with the potential need to have two different layers with two different formats. We do have the challenge that there are a lot of use cases that create different requirements.

From what I understand with vector tiles and the Mapbox SDK, we can then do things like show highlighted buildings when you select them: There is a Mapbox Demo app on the Play Store that has this example that I think is interesting to help see and test.

If we take a very simple use case (and I acknowledge that I’m biased towards an OSM use case), you have a map view of OpenStreetMap data and you interact with that map to select the feature to survey. The Mapbox example above is that simple use case: you bring OSM data onto the phone, it’s shown as your “base map”, but that same view is also the data you want to interact with. You select it to pick the building you want to survey.

For the Content layer, GeoJSON is a great choice because it's very straightforward to edit. However, we don't want to try loading an entire basemap of a large area in GeoJSON; even if the file size is manageable, the loading/rendering speed will be terrible (because there's no internal tiling; the whole file has to be loaded to display any part of it or any low-zoom rendering).

Totally agreed @Ivangayton and I think this is how the Mapbox SDK thinks — geojson as a layer on top for additional styling and interaction options. In looking at OpenStreetMap’s iD editor for example, they offer vector tiles support for editing (and just convert the vector tiles data into geojson within the application). A user doesn’t necessarily need to think about the geojson conversion (the application handles it).

Here is a first example file containing all the parcels we "manage". In this use case all I want is to add those polygons over the map to know if I am located inside or outside the sites we own.
It also fits a second use case: finding hard-to-locate land parcels (e.g. small parcels in marshes). (1.3 MB)

Here are some "managing units" on which we run a long-term (30-year) survey with ODK forms (I hope for the coming 30 years :smile: ). (23.5 KB)
And the last one, some species locations from the same sites, to add over the map to check if they are still there. (16.6 KB)


Fantastic, thank you @mathieubossaert! I'm going to start looking into this. I'll open a new thread to talk about this in more detail.

When we release a version that uses the Mapbox SDK, we'll also be updating the Settings screens to let users select the base layer they want to see.

Here's the proposal: A new Settings screen for Maps

Please have a look and discuss there! Thanks.

Hi @zestyping,
we have modernized our old 2016 forms (thanks to @Jeanb) and we now use Mapbox as the default map tool.
It is amazing how much faster it is than the OpenStreetMap SDK at locating us on the map... I don't know how or why, but that's a fact.
But for the moment, with Collect 1.22.4, I can't use any offline raster mbtiles file, even if it is listed and set as the base layer...
Is it a known and normal limitation due to the early use of Mapbox? Is it a bug? Is it a mistake on my side (misconfiguration)?
Thanks a lot. Let me know if you prefer that I create a new issue for that question.

@mathieubossaert, great timing, as always! We have been working on Mapbox support in stages and it did not include offline layer support in v1.22.

ODK Collect v1.23.0-beta.3 has just been released and thanks to @zestyping it includes offline layer support for both raster and vector mbtiles! Mapping settings have also moved to their own screen in General Settings.

Please try it out while it is still in beta and let us know what you think. The release should be out next week unless any major problems are found during the beta.

Very glad to hear that! The Mapbox library comes with native code for rendering the maps quickly. This is why the size of Collect has increased a bit but we do think the results are worth it.


Hi @LN :slight_smile: I just tried this beta version.
It's great. The custom mbtiles overlays the Mapbox map. Even if I am outside my map I can check my position and plot a point. I'll test more thoroughly tomorrow, but it seems to work fine.
When I was talking about speed, it was more about GPS fixing, which is close to immediate.
Thanks for the work. Special thanks to @zestyping !



Got it. The OSM implementation only uses GPS for location fixes whereas Mapbox uses the fused location provider when available which also uses wifi, cell tower and other information to get quick updates.

@mathieubossaert (and anyone else trying out offline tiles over Mapbox basemaps), offline raster layers are currently displayed with 50% transparency when using a Mapbox basemap whereas they are displayed with no manipulation with other basemaps. Which do you prefer?

I'm also interested in what @Ivangayton has to say. You described the "reference layer" as a "semi-transparent layer" here. I took that to mean that someone creating this layer will generally want to either use transparent PNG rasters or vectors with a style that doesn't fill everything in. Did you instead mean that ODK should take whatever file the user provides and render it semi-transparently?