Geo: Using the Mapbox SDK for Android

Hi @Marena,

Please forgive me if I've not understood this in your response, but are you saying that MapBox is happy for @zestyping and the ODK community to proceed with the use of the MapBox SDK, understanding that this may be for using their own custom vector tiles in some cases, without a MapBox account or token?

I think most of us are delighted to see the MapBox tiles as the default datasource, but as far as I understand the idea of using the MapBox SDK is not to have "the fall-back to OSMdroid" anymore for custom or user-specific vector tiles, but rather to fall back to the MapBox SDK without necessarily having an access token or using the MapBox online data to render such tiles.

This would mean that people forking ODK would end up with a version of the MapBox SDK, albeit without an access token for the MapBox data but able to render arbitrary vector tiles. At the moment @zestyping is doing this by using a blank access token; is this acceptable from MapBox's perspective? No one in this community wants to make use of the MapBox SDK in a way that MapBox isn't comfortable with, but there's a strong desire for the ODK app itself to be a fully functional stand-alone tool, even for people who for whatever reason might fork it and not have a MapBox account (without, of course, expecting to have free access to anyone's tiles or data, only to be able to use their own tiles and data).

There are a number of libraries that can render vector tiles, but I think the desire here is to use the MapBox SDK both as a way to access the MapBox tiles from the official Play Store ODK Collect app, but also as a standalone component to render tiles (and we can't help but notice that it's also capable of rendering GeoJSON, which makes it a rather strong candidate for the likely next step you've mentioned, which is interacting with user's own feature layers, which of course should not be dependent on an access token).

Sorry for the rather obsessive querying on these details; I think we really need to get this exactly right to ensure no one misunderstands or is sad later!

1 Like

Hi @mathieubossaert,

That's my most critical use-case as well! I can't wait to visit a building, road, tent, tree, or village in the field, click on it, and fill out the form triggered by it.

That said, I also can't wait to draw a new building and add a missing segment of road.

The only reason I suggest MVT first is because I'm pretty sure it's a quick win, and it'll probably be easier to implement interaction with features after we've got some kind of vector layer rendering.

I agree with you that GeoPackage and GeoJSON are the best options for user-generated layers; as I mentioned I lean toward GeoJSON to make it easier for non-GIS people. A person can use something like geojson.io to create a really basic layer without having any GIS knowledge or tools.

It may be that some sophisticated users want to put really detailed layers in, which could result in performance issues using GeoJSON. For this, GeoPackage would probably work better, but I suspect that such a benefit to users of heavy datasets won't be worth the added burden on those wishing to use light ones quickly and easily.

Maybe there's a chance that GeoPackage support could come later as an added functionality if a lot of us hit the limit of GeoJSON performance!

2 Likes

Hi @Ivangayton - thank you for asking to confirm, it is good to be certain about these things!

I have confirmed with several colleagues today (thanks @langstonsmith) that yes, it is acceptable within Mapbox ToS to use the Mapbox SDK without a Mapbox token. The Mapbox SDK is open source, so if you develop with it in such a way that does not require a Mapbox token that is fine.

That approach does limit how much we can be involved and supportive (if the implementation starts to fork away from our SDK it is hard to guarantee what we can support). But we would not see that as a violation of our terms.

If the best situation for the ODK community is to use a version of the Mapbox SDK without a token, but in such a way that creates the option for people to more easily use Mapbox basemap tiles in cases where people want to, that's :+1: by us.

(Perhaps it would be a good idea to include info for developers that fork ODK that it contains a modified version of the Mapbox SDK - and that if developers want the full-featured, updated Mapbox SDK they should return to the source for that. I wouldn't want to see a developer thinking that they were using the full Mapbox SDK without realizing that it was a modified version.)

4 Likes

Thank you so much, @Marena! That's wonderful news!

That's the confirmation we need to move ahead with this plan, and I feel much better knowing that it's okay with you and your colleagues at Mapbox. I really appreciate the time you've taken to do the extra checking into that.

It's certainly my intention to avoid forking the SDK for as long as we can—it makes everyone's life easier. Right now, there is no need to modify the SDK at all; we can achieve what we want using it exactly as is (just without an access token). To my thinking, the only situation in which it would really be necessary to fork the Mapbox SDK is if all four of the following are true:

  • Mapbox releases a new version of the SDK in which the current code path that allows a blank access token is removed.

  • In that new version, there is no other way to instantiate a MapView through public API calls.

  • We want to upgrade to using that version in ODK Collect because it does something great and new that we need.

  • We still want to use the Mapbox SDK for the fallback situation (as opposed to implementing the fallback with Google Maps or reintroducing OSMdroid or some other mapping SDK).

If we did ever fork the SDK, it would absolutely make sense to document that clearly and point developers back at the original Mapbox SDK.

4 Likes

Thanks @Marena, indeed this is great news; many thanks to you and to your colleagues at MapBox.

Thanks for the clarity, and most of all thanks for creating and sharing such great software (the SDK), standards (the MVT format itself, as well as the other open formats Mapbox has created), and for giving us access to your tileservers and data.

1 Like

Thanks everyone, this is exciting all around! @Ivangayton makes a strong case for supporting MVT basemaps as a first step. And just so we're on the same page, we're actually talking about MVT MBTiles (Mapbox Vector Tiles stored within a SQLite container), right?

Here is a summary of what I think is going to happen in the short term:

  • @Marena and @yanokwa will coordinate in private to get the API key that Mapbox has generously offered into the build process with the documentation she describes in this comment.
  • @langstonsmith will finish his effort to get online Mapbox basemaps in Collect as described in Adding Mapbox vector tile basemaps - #36 by zestyping. This support will only work on the release build which will have the API key mentioned above. Alternately, those who fork can add their own API key as described in @Marena's documentation.
  • @zestyping will work off of the same branch and add support for offline vector mbtiles. This will work without requiring an API key.
  • I will be ready to code review as needed.

Did I get all that right?

@Marena, is there any way that you can restrict an API key to a specific Android app package name? I'm guessing probably not but thought I'd ask because that could make it very easy to use the special API key. It wouldn't need to be protected because it wouldn't work in any app but the officially released Collect. That's what Google has done (but of course they're entirely Android-centric).

As part of this work, I think it makes sense to rethink how users specify the various layers that should be shown, as @zestyping alluded to in his original post. Currently, there's a Mapping section in user interface settings with two preferences: Mapping SDK and Basemap. I'd like to propose that we remove "Mapping SDK" since that is not something a user should care about and instead have the following preferences:

  • Online basemap. Options will include the Google basemaps, OSM basemaps, Mapbox basemaps and none. If a selection is made here, that will determine which SDK is used behind the scenes.
  • Offline basemap. These can be rasters or vectors. If no online basemap was selected, the Mapbox SDK will be used. If a user selects a Google online basemap and a vector offline basemap, some kind of error message will be shown.
  • (Eventually) Editable vector layer (wording to be determined)
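
As a rough illustration of the proposal above, the selection logic could look something like the sketch below. Every name here (`SDK_FOR_ONLINE_BASEMAP`, `choose_sdk`, the string keys) is hypothetical and not taken from Collect. Rendering OSM raster tiles with the Mapbox SDK is an assumption based on the goal of replacing OSMdroid.

```python
from typing import Optional

# Hypothetical mapping from the "Online basemap" preference to the SDK used
# behind the scenes. Assumption: OSM tiles are rendered by the Mapbox SDK.
SDK_FOR_ONLINE_BASEMAP = {
    "google": "google",
    "osm": "mapbox",
    "mapbox": "mapbox",
    None: "mapbox",  # no online basemap selected: the Mapbox SDK is used
}


def choose_sdk(online: Optional[str], offline_kind: Optional[str]) -> str:
    """Pick the SDK for a pair of preferences.

    online:       "google", "osm", "mapbox", or None
    offline_kind: "raster", "vector", or None
    Raises ValueError for combinations the proposal treats as errors.
    """
    if online not in SDK_FOR_ONLINE_BASEMAP:
        raise ValueError(f"Unknown online basemap: {online!r}")
    if offline_kind not in (None, "raster", "vector"):
        raise ValueError(f"Unknown offline basemap kind: {offline_kind!r}")

    sdk = SDK_FOR_ONLINE_BASEMAP[online]
    # Per the proposal: Google online basemap + vector offline basemap
    # should produce an error message.
    if sdk == "google" and offline_kind == "vector":
        raise ValueError(
            "A Google online basemap cannot be combined with an offline vector basemap"
        )
    return sdk
```

The point of the sketch is that the user never picks an SDK directly; it falls out of the basemap choices, and only the one unsupported combination needs an error message.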

@Ivangayton, I'm particularly interested in your feedback on that proposal. Note that I'm suggesting one online basemap from a fixed list and one offline basemap that is user-specified.

We can start thinking about selectable and editable GeoJSON support in parallel but that will be considered a separate feature or more likely two (select may be a pretty low bar). Sample data files and desired behavior from @mathieubossaert and @Ivangayton will help guide that. No urgency, so it can wait until @mathieubossaert can breathe again!

1 Like

A great teaser for promising new features!

Hey @LN, thanks for laying that all out so clearly!

It pretty much all matches what I've been thinking except for one adjustment: I'm working on taking @langstonsmith's last branch and getting the Mapbox implementation up to feature parity with the Google and OSMdroid implementations, so that it can be a complete replacement for OSMdroid. I've run into a few tricky bits with that, which I'm discussing with @langstonsmith over on the other forum post. These mostly have to do with the drawing and manipulating of markers (symbols) on the map; I've filed a few issues on the Mapbox Annotation plugin.

1 Like

Unfortunately there is no way to restrict an API key to a specific app @LN - we currently only have URL restricted tokens for web applications.

1 Like

Hi all, catching up on this thread as I’m interested and want to help bring this forward. Thanks for all the work @zestyping @LN.

And just so we're on the same page, we're actually talking about MVT MBTiles (Mapbox Vector Tiles stored within a SQLite container), right?

Yes, I think this is correct based on the convo above and direction. Many MVT tools either produce an mbtiles file or a directory of MVT files (.mvt or .pbf). Most of our work at HOT (what we’ve used externally or produced internally) relies on vector tiles stored in mbtiles in the pbf format with gzip compression.

I'm particularly interested in your feedback on that proposal. Note that I'm suggesting one online basemap from a fixed list and one offline basemap that is user-specified.

I think this is related to this point/feedback here but one factor is the use cases for how data is loaded onto the device.

Through our work with OpenMapKit, there have been two main workflows for getting data onto the phones. This is mostly just on the user-specified basemap:

  1. Through a download endpoint. Via OpenMapKit Server, ODK/OMK android users can point their app to the OMK Server endpoint and download the data they need for their survey (similar to the ODK Collect Get Blank Form) -- this data is an mbtiles file and an osm file. This isn’t always used in a low-bandwidth environment given data costs and bandwidth but is a workflow for connecting the data you need to a survey form that is currently used.
  2. Hand loading data onto the phone. The standard method of connecting the phone to the computer and uploading the files to the correct directory.
1 Like

Okay, the mbtiles file makes sense.

What about the osm file? Is that something that's necessary to support, or can we support only GeoJSON for the selectable-geometry layer and expect deployments to convert their osm files to GeoJSON?

I'm wondering if that conversion would involve data loss that makes it harder (or impossible) to make use of geometry edits.

1 Like

Hi @LN!

Please forgive me for over-explaining stuff I'm quite sure you already know; I'm not just answering you but hopefully laying out some of the details for other readers of the thread!

File format

Yes, I'm thinking of MBTiles files with protobuf blobs in the tiles table. So yes, Mapbox Vector Tiles format in a SQLite container; these have a .mbtiles extension, and the blobs have a .pbf type. As per the spec here.
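
For readers who haven't opened one of these files: a minimal Python sketch of pulling a single tile out of an MBTiles container, assuming the standard `tiles` table from the MBTiles spec. The function name `read_tile` is made up for illustration. The gzip check reflects the usual practice of storing vector tiles gzip-compressed.

```python
import gzip
import sqlite3
from typing import Optional


def read_tile(db: sqlite3.Connection, z: int, x: int, y: int) -> Optional[bytes]:
    """Return the raw tile bytes for an XYZ tile address, or None if absent."""
    # MBTiles stores rows in TMS order, so the y coordinate is flipped
    # relative to the z/x/y addressing used by most online tile servers.
    tms_y = (1 << z) - 1 - y
    row = db.execute(
        "SELECT tile_data FROM tiles "
        "WHERE zoom_level = ? AND tile_column = ? AND tile_row = ?",
        (z, x, tms_y),
    ).fetchone()
    if row is None:
        return None
    blob = row[0]
    # gzip streams start with the magic bytes 1f 8b; decompress if present.
    if blob[:2] == b"\x1f\x8b":
        blob = gzip.decompress(blob)
    return blob
```

The returned bytes are still an encoded protobuf; decoding them into features needs a vector tile parser, which is out of scope here.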

The online tile-server equivalent of vector tiles has a .../z/x/y.mvt name schema. That spec is here.

Layers

I agree that it would be good to change the way users specify layers (and that "Mapping SDK" probably doesn't mean much to the average field data collector and should be removed). However, I'd like to specify in a bit more detail what I'm thinking in terms of the choices and how I'd see them stacking into three discrete layers:

  • Bottom: Opaque base layer (offline opaque MBTiles or an online tileserver)
  • Middle: non-editable semi-transparent layer (offline Vector MBTiles or online MVT from a tileserver, perhaps even static GeoJSON)
  • Top: editable layer (GeoJSON or other, to be implemented later)

Base Layer

Sometimes an actual picture (raster) is the only thing you can get that allows you to see what's on the ground. When dealing with an unmapped area, a new refugee camp, or a newly-damaged area from a natural disaster, there's little chance that you'll have vectors representing the current features on the ground, but you might be able to get an aerial image, even quite soon after a natural disaster. So that capability should always be there (as it is now with OSMDroid, albeit with a few rough edges in the implementation).

It should be possible to make this an online or offline layer without much practical difference. The current online Google or OSM layers being used by OSMDroid are rasters (well, the OSM ones certainly are, and I think the Google ones are as well). Once the functionality for a given online raster layer is built, it's relatively straightforward to enable any online raster layer (here is an example of a general way to access image tiles from any online raster tile server—albeit in Python, but the math and URL manipulations are all the same).
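
To make the "math and URL manipulations" concrete, here is a small sketch of the usual Web Mercator tile addressing, assuming a server that follows the common `{z}/{x}/{y}` convention. The function names and the example URL template are placeholders, not a real service.

```python
import math
from typing import Tuple


def lonlat_to_tile(lon: float, lat: float, zoom: int) -> Tuple[int, int]:
    """Convert a WGS84 longitude/latitude to XYZ tile indices at a zoom level."""
    n = 1 << zoom  # number of tiles along each axis at this zoom
    x = int((lon + 180.0) / 360.0 * n)
    lat_rad = math.radians(lat)
    y = int((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    # Clamp so points on the edge of the map still land on a valid tile.
    return min(max(x, 0), n - 1), min(max(y, 0), n - 1)


def tile_url(template: str, z: int, x: int, y: int) -> str:
    """Fill a URL template such as 'https://tiles.example.org/{z}/{x}/{y}.png'."""
    return template.format(z=z, x=x, y=y)
```

Switching to a different raster tile server is then mostly a matter of changing the template string, which is why supporting arbitrary online raster layers is cheap once one works.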

Semi-Transparent Layer

This may be standalone, as opposed to overlaid on top of a raster layer. Standalone (and opaque) is the usual style of the vector tiles in the MapBox SDK; the vector tiles are simply a way to generate a map view locally from much less data-intensive vector geometry instead of downloading a bunch of images (which is what the current OSM layer in ODK does). However, a huge advantage of vector tiles is that they can overlay a raster with transparency between features. So you can have a layer with, for example, a bunch of road traces, but also see the satellite/aerial imagery of the raster layer beneath. Really useful for incomplete maps; you can see what's been already mapped, but you can also see the raw imagery beneath what hasn't been done yet.

Vector tiles can be offline using the MBTiles format with tile type "pbf" instead of JPEG or PNG. They can also be online, using (among other potential sources) the MapBox vector tile server that they're (very generously!) offering ODK (that's where the .../z/x/y.mvt schema comes in; instead of grabbing the tiles out of a SQLite container, the application grabs them from an HTTP tile address).
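
One way an app could tell a raster MBTiles file from a vector one is to look at the `format` row in the file's `metadata` table. The sketch below assumes the file follows the MBTiles spec; `tile_kind` and its return values are hypothetical names, and older files without a `format` row simply come back as "unknown".

```python
import sqlite3


def tile_kind(db: sqlite3.Connection) -> str:
    """Classify an MBTiles database as 'vector', 'raster', or 'unknown'."""
    row = db.execute(
        "SELECT value FROM metadata WHERE name = 'format'"
    ).fetchone()
    fmt = row[0].lower() if row and row[0] else None
    if fmt == "pbf":
        return "vector"  # Mapbox Vector Tiles
    if fmt in ("jpg", "jpeg", "png", "webp"):
        return "raster"  # image tiles
    return "unknown"     # missing or unrecognised format value
```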

Editable Layer

This should, I propose, be on top and always render overlaying the raster or tiled-vector layers. That's simply because it's the thing the mapper is probably working on! The space between features should, of course, be transparent.

Choices

Note that I'm only proposing one layer of each type for now. As much as it might be tempting to allow multiple vector tile layers so people can choose what they want to see on the map—and maybe that's a discussion for the future—if we initially only allow a single vector tile layer, this still allows the survey creator to give the mapper anything they choose. The vector tile layer can contain, or not, roads, buildings, fire hydrants, etc. So for the moment I don't see the need to allow stacking multiple discrete vector tile layers. One is enough.

I do think it's worth letting people decide for either of the two bottom layers whether they want offline or online. There's no reason not to allow someone to use a high-resolution offline drone image layer as their raster background for a specific area, while still having the online MapBox/OSM layer on top of it. Or, I suppose, vice versa; an online raster layer with an offline vector tile overlay (though I can't really think of a situation where that would be tremendously useful given the greater file size of rasters).

1 Like

Thanks for laying out all these details, @Ivangayton! Never hurts to have clearer information out in the open so that everybody is on the same page.

There seems to be a pretty clear consensus that, as @LN suggested, the user-facing settings should not be focused on a developer-facing concept like the choice of SDK.

I also think having three layers makes sense. @LN's proposal is a shift from internal implementation details toward user concepts, and I see @Ivangayton's proposal as a further step in that direction.

Let's present the layers in terms of the purpose they're used for; that makes sense. I'll try to propose some user-facing terminology, then:

  1. Base layer: A fully opaque layer. Could be a map with authoritative or general-purpose information; could be satellite or aerial imagery. Because the layer is fully opaque, it only makes sense to have one of these. (I'm suggesting "base layer" rather than "base map" so there's no confusion about what a "map" is: there are 3 layers and they go together to make a map.)

  2. Reference layer: An optional layer of additional information that is more specific to the task at hand, mostly transparent, drawn on top of the base layer. The purpose of the layer is to help you find your way around or to provide related information, but it is not the thing you are manipulating or collecting data about; hence the name "reference".

  3. Content layer: An optional layer of geometry drawn on top of the reference layer. The name "content" is intended to imply that this is the geometry you are working with—either you're collecting data about it, or you're editing the geometry.

You wouldn't be restricted to online, or offline, or vector, or raster, in any particular position among these layers—with the one exception that the content layer must be a vector layer.

So the ultimate list of supported formats would look like this:

  1. Base layer:

    • OSM online raster tiles
    • Mapbox online vector base map
    • Google Maps online vector base map
    • Any offline mbtiles file (raster or vector)
    • Any online mbtiles file (specify a URL)
    • Any online raster tile service (specify a URL template)
    • Any online vector tile service (specify a URL template)
  2. Reference layer:

    • Any offline mbtiles file (raster or vector)
    • Any online mbtiles file (specify a URL)
    • Any online raster tile service (specify a URL template)
    • Any online vector tile service (specify a URL template)
    • GeoJSON
  3. Content layer:

    • GeoJSON

From a user perspective, my question is: Would this make sense to users? My hypothesis is that presenting it this way, and minimizing the restrictions on which formats are allowed in each slot, means less extra stuff that users will need to understand.

From an implementation perspective:

  • The "specify a URL" options are not ones we've really been talking about so far. I would prefer to do them as part of a second round; they are only listed for completeness.
  • Although this does allow a wider range of possibilities than the proposals mentioned previously, I don't think it's actually harder to do with the Mapbox SDK. We'd be supporting the same set of formats and facing the same issues; all that's new is the possibility of two sets of raster tiles or two sets of vector tiles, which is not any harder than one set of each.
  • If Google is selected for the base layer, then the other layers are disabled. Or maybe we allow semitransparent raster tiles, but that's all.
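
To make the slot rules above easier to reason about, here is a sketch of how they could be written down as data plus one validation function. The source-type identifiers, `ALLOWED`, and `validate_layers` are all invented for illustration. Allowing "none" in every slot is an assumption. The "maybe semitransparent raster tiles over Google" alternative is deliberately not modelled.

```python
from typing import List, Optional

# Hypothetical identifiers for the source types listed in the proposal.
_TILE_SOURCES = {
    "offline_mbtiles",
    "online_mbtiles",
    "online_raster_tiles",
    "online_vector_tiles",
}

ALLOWED = {
    "base": _TILE_SOURCES
    | {"osm_online_raster", "mapbox_online_vector", "google_online_vector"},
    "reference": _TILE_SOURCES | {"geojson"},
    "content": {"geojson"},  # the content layer must be a vector layer
}


def validate_layers(
    base: Optional[str] = None,
    reference: Optional[str] = None,
    content: Optional[str] = None,
) -> List[str]:
    """Return a list of human-readable problems; an empty list means valid."""
    errors = []
    chosen = {"base": base, "reference": reference, "content": content}
    for slot, source in chosen.items():
        if source is not None and source not in ALLOWED[slot]:
            errors.append(f"{source} is not supported as the {slot} layer")
    # With a Google base layer, the other layers are disabled.
    if base == "google_online_vector" and (reference or content):
        errors.append("reference and content layers are disabled with a Google base layer")
    return errors
```

For example, `validate_layers("offline_mbtiles", "geojson", "geojson")` returns an empty list, while putting GeoJSON in the base slot reports a problem.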

Thoughts on this?

1 Like

On the contrary, I so appreciate the thoroughness of your response! It's all hugely helpful to have laid out so clearly. I wasn't so sure about the functional distinction between the two bottom layers and your explanation clears that up.

@zestyping, I really like what you're proposing. I'm a little unsure about the "Reference layer" term because I had to read your description to understand where the name comes from but I can't come up with a better alternative right now.

Agreed.

I agree, and I also agree with the use cases and ideas around the base and semi-transparent layers from @Ivangayton. Just to add though, on the user side, I almost think of the reference and base layer as one. Your base layer could include satellite imagery but also the semi-transparent road or boundary information. This might not change the implementation of how there are "layers", but it might be simpler for the user to think in terms of "what do I want to show as the map" versus "what do I want to edit".

Thanks @zestyping. This was just the use case for the OMK app to download an OSM file onto the phone for editing. But I don't think it's something that necessarily needs to be supported right now. That OSM file was just an XML file and so I think focusing support on geojson only probably makes sense for now. And I don't think the conversion will involve any data loss or make it harder to use. :+1:

One clarifying question around how a user thinks about the layers:

@zestyping - are you saying that the content layer needs to be defined as a separate layer (and either have data loaded, or create data from scratch)? My question for this is around the use case of wanting to just bring a single vector data source onto the phone that can be used for the base map but also then edited. I think I understand the need for a content layer from a user perspective, but it seems like we're requiring users to duplicate the data we need to get on the phone. A use case here is that the same vector layer you use for your base map (or reference layer) is the same layer you want to use as your content layer.

2 Likes

@zestyping - are you saying that the content layer needs to be defined as a separate layer (and either have data loaded, or create data from scratch)?

Yes, that was what I had in mind.

It never occurred to me that you might want to use one vector layer as your base layer and also edit that layer. I assumed that one would always want to have at least some kind of (non-editable) base or reference information under the geometry being edited (if there's nothing under it to refer to, how would you know where to move the vertices?)

Could I trouble you to walk me through an example of a situation where one would want to do that?

Would it satisfy this use case if you could choose "none" for the base layer and "none" for the reference layer, and have only a content layer showing?

Thanks!

1 Like

Hi @smit1678,

I think one big reason to separate the Reference and (editable) Content layers is that they will usually have different requirements and uses. It'll usually make sense to use different file formats for the two.

Requirements for each layer

We'll often want a large Reference layer with lots of stuff in it (i.e. the entire OSM feature set for a given area of interest, with a generous buffer around it).

The Content layer that an individual enumerator is editing will usually be much smaller. Perhaps it'll be a small part of the overall AOI, and perhaps it'll be only a single feature type (for example just the buildings), but there will be relatively few cases where someone is sent out into the field with a mandate to edit everything in a large area.

So generally the Reference layer requires compact file size and efficient loading, but not editability, while Content layer requires straightforward editability, but is likely to be a smaller dataset/file.

Implementation choices for each layer

MVT MBTiles will shine for the Reference layer; they are small, compact, locally rendered, and efficient. However, they are a horrible choice for editing (for lots of reasons I alluded to in my long post above).

For the Content layer, GeoJSON is a great choice because it's very straightforward to edit. However, we don't want to try loading an entire basemap of a large area in GeoJSON; even if the file size is manageable, the loading/rendering speed will be terrible (because there's no internal tiling; the whole file has to be loaded to display any part of it or any low-zoom rendering).

To sum up: I think in the vast majority of cases users will be better served by a large, fast Reference layer and a smaller, easily edited Content layer, probably consisting of vector tiles and GeoJSON respectively (maybe GeoPackage as another option later; that'll be more compact and higher-performance than GeoJSON, just harder for non-GIS users to generate).

1 Like

Thanks @zestyping @Ivangayton. Hopefully I'm not taking us too far off topic; I'm mindful of keeping this convo in line with the original scope so that it stays productive. @zestyping, let us know how and if we need to separate some threads so we're getting you helpful information.

if there's nothing under it to refer to, how would you know where to move the vertices?

I think this is where we need to clarify the first use case. From my point of view, a very simple use case is not about making changes to vertices or drawing geometries on the phone, but interacting with existing data and linking survey data to it (which is the core OpenMapKit use case and is connected to some of the original use cases and purpose of starting the switch on this thread). I acknowledge that this isn’t everything we know is possible, but it seems like the simplest and most basic use case for vector tiles plus the Mapbox SDK that goes beyond just a new basemap. I think this is where I’m getting confused about the difference between base, reference, and content layers. If we’re already moving on to editing geometries, then what you’re doing, and to what data, needs to be clear to the user. So I agree that if you’re editing vertices, you need a layer behind it that comes from a different source so you know how to move them.

I think one big reason to separate the Reference and (editable) Content layers is that they will usually have different requirements and uses. It'll usually make sense to use different file formats for the two.

Could I trouble you to walk me through an example of a situation where one would want to do that?

I don’t disagree with the potential need to have two different layers with two different formats. We do have the challenge that there are a lot of use cases that create different requirements.

From what I understand with vector tiles and the Mapbox SDK, we can then do things like show highlighted buildings when you select them: https://docs.mapbox.com/android/maps/examples/select-a-building/. There is a Mapbox Demo app on the Play Store that has this example that I think is interesting to help see and test.

If we take a very simple use case (and I acknowledge that I’m biased towards an OSM use case), you have a map view of OpenStreetMap data and you interact with that map to select the feature to survey. The Mapbox example above is that simple use case: you bring OSM data onto the phone, it’s shown as your “base map”, but that same view is also the data you want to interact with. You select it to pick the building you want to survey.

For the Content layer, GeoJSON is a great choice because it's very straightforward to edit. However, we don't want to try loading an entire basemap of a large area in GeoJSON; even if the file size is manageable, the loading/rendering speed will be terrible (because there's no internal tiling; the whole file has to be loaded to display any part of it or any low-zoom rendering).

Totally agreed @Ivangayton and I think this is how the Mapbox SDK thinks — geojson as a layer on top for additional styling and interaction options. In looking at OpenStreetMap’s iD editor for example, they offer vector tiles support for editing (and just convert the vector tiles data into geojson within the application). A user doesn’t necessarily need to think about the geojson conversion (the application handles it).

Here is a first example file containing all the parcels we "manage". In this use case all I want is to add those polygons over the map to know whether I am located inside or outside the sites we own.
It also fits a second use case: finding some hard-to-locate land parcels (e.g. small parcels in marshes).
sites_cenlr.geojson.zip (1.3 MB)

Here are some "managing units" on which we run long-term (30 years) surveys with ODK forms (I hope for the coming 30 years :smile:).
ug_garrigues.geojson.zip (23.5 KB)
And the last one: some species locations from the same sites, to add over the map to check whether they are still there.
plant_species_location.geojson.zip (16.6 KB)
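
For the first use case above (knowing whether you are standing inside or outside a site), the underlying check is a point-in-polygon test against the GeoJSON features. Below is a minimal sketch using plain ray casting on longitude/latitude as if they were planar, which is a reasonable assumption for parcel-sized polygons. The function names are hypothetical, and nothing here reflects how Collect or the Mapbox SDK would actually implement it.

```python
from typing import Any, Dict, List, Sequence


def _in_ring(lon: float, lat: float, ring: Sequence[Sequence[float]]) -> bool:
    """Ray-casting test: is the point inside a single closed ring?"""
    inside = False
    j = len(ring) - 1
    for i in range(len(ring)):
        xi, yi = ring[i][0], ring[i][1]
        xj, yj = ring[j][0], ring[j][1]
        # Count edges that a horizontal ray from the point crosses.
        if (yi > lat) != (yj > lat) and lon < (xj - xi) * (lat - yi) / (yj - yi) + xi:
            inside = not inside
        j = i
    return inside


def _in_polygon(lon: float, lat: float, rings: Sequence) -> bool:
    """GeoJSON polygon: first ring is the exterior, the rest are holes."""
    if not rings or not _in_ring(lon, lat, rings[0]):
        return False
    return not any(_in_ring(lon, lat, hole) for hole in rings[1:])


def features_containing(geojson: Dict[str, Any], lon: float, lat: float) -> List[Dict[str, Any]]:
    """Return the Polygon/MultiPolygon features that contain the point."""
    hits = []
    for feature in geojson.get("features", []):
        geom = feature.get("geometry") or {}
        if geom.get("type") == "Polygon":
            polygons = [geom["coordinates"]]
        elif geom.get("type") == "MultiPolygon":
            polygons = geom["coordinates"]
        else:
            continue  # points and lines cannot contain a location
        if any(_in_polygon(lon, lat, rings) for rings in polygons):
            hits.append(feature)
    return hits
```

Note that this scans every feature in the file for each query, which is fine for files of this size but is the same "no internal tiling" limitation of GeoJSON discussed earlier in the thread.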

2 Likes