Add a places layer #19

jake-low · 2025-09-08T21:12:10Z

This PR adds a "Places" layer containing OSM features tagged place=*, which represent countries, states, provinces, cities, towns, villages, neighborhoods, etc. Features in this layer are all Point geometries: they represent the approximate center of the feature.

I added this layer partly because it's useful to me, but also because it's simple and therefore a good testbed for two new schema ideas:

The population column is an integer. This is the first foray into parsing OSM's string-valued tags into other data types.
The name column contains the value of the name tag in OSM, which (usually) represents the name in the primary local language. This layer also has a names column, a Map<str, str> holding every OSM tag whose key starts with name:, e.g. name:left, name:fr, and even name:etymology:wikidata. Each key in the map is the tag key with the name: prefix dropped, and each value is the OSM tag value. So for example tags.names.es contains the value of name:es (the Spanish-language name), if available. I also added alt_names and official_names maps to complement the alt_name and official_name columns.
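To illustrate the first idea, here is a minimal sketch of turning a string-valued population tag into an integer. The function name parse_population and the cleanup rules (dropping commas and spaces, rejecting anything that is not plain ASCII digits) are assumptions made for this example; the PR's actual parsing rules are not shown in this thread.

```python
def parse_population(raw):
    """Return the population as an int, or None if it can't be parsed.

    Hypothetical helper: the cleanup rules below are assumptions for
    illustration, not the rules used by the PR.
    """
    if raw is None:
        return None
    # Drop common thousands separators before checking the value.
    cleaned = raw.strip().replace(",", "").replace(" ", "")
    # Accept only plain ASCII digits; anything else becomes None.
    if cleaned.isascii() and cleaned.isdigit():
        return int(cleaned)
    return None


print(parse_population("1,234"))      # well-formed value with a separator
print(parse_population("about 500"))  # free-text value, not parseable
```

Returning None rather than raising keeps one malformed tag from aborting a whole extract, at the cost of silently dropping values that a human could still read.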

1ec5 · 2025-09-08T21:31:58Z

src/places.py

+    Most place features are mapped as nodes, but some are mapped as areas
+    (typically neighborhoods, islands, etc). In these cases we include the
+    area's centroid in the output dataset.


There’s a potential for duplicate features if the same place is mapped as both a place point and a boundary relation and both are tagged with place=*. One approach is to limit place=hamlet/village/town/city to points only and avoid mapping a place point for territories that have no logical center, such as place=state. This approach is favored in some regions like the U.S. but not yet fully implemented.

/ref shortbread-tiles/shortbread-docs#86

jake-low · 2025-09-08

Thanks, that thread and the links therein are useful references.

I'm currently including place areas only if they are not tagged boundary=*, to avoid creating duplicate rows in the output. Ideally I'd do something more sophisticated (checking if the boundary relation has a label member, for example) but this is beyond what I can easily do with pyosmium.
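The rule described above can be sketched as a small predicate. This is only an illustration over plain dicts: the function name include_place_area is hypothetical, and the real code works on pyosmium objects rather than dicts.

```python
def include_place_area(tags):
    """Keep a place=* area only if it is not also tagged boundary=*.

    Hypothetical sketch: `tags` is a plain dict of OSM tag keys to values.
    """
    return "place" in tags and "boundary" not in tags


island = {"place": "island", "name": "Example Island"}
state = {"place": "state", "boundary": "administrative", "name": "Example State"}

print(include_place_area(island))  # area with no boundary tag
print(include_place_area(state))   # boundary relation, likely duplicated by a place node
```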

It seems like a reasonable suggestion to just omit certain place=* elements. The main use case I have in mind for this layer is as a dataset of populated human settlements, so omitting place=country/state/province which aren't settlements in the anthropological sense would be fine. I wonder if anyone would miss place=archipelago/island/islet if they were also omitted from this layer. Personally I would not, and having at most one row in the output for each human settlement seems like it's important enough to warrant some trade-offs.

1ec5 · 2025-09-09

place=archipelago/island/islet sound like good candidates for a different layer about landforms. Similarly, place=ocean would be a good candidate for a water layer, and place=square for a layer about pedestrian infrastructure or perhaps public spaces.

jake-low · 2025-09-09T06:16:01Z

I renamed the layer to settlements, and changed it to only include place=* values that represent human settlements (city, town, etc) or parts of settlements (borough, neighbourhood, etc). It now also only includes places mapped as nodes (rather than creating centroid points for places mapped as areas).
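A rough sketch of the resulting filter, assuming plain dicts for tags and a string for the element type. The name is_settlement and the exact set of place=* values are assumptions; the set below only lists values mentioned in this thread, and the merged code may include others.

```python
# Assumed set of place=* values; only those named in this thread are listed.
SETTLEMENT_PLACE_VALUES = frozenset(
    {"city", "town", "village", "hamlet", "borough", "neighbourhood"}
)


def is_settlement(element_type, tags):
    """True for nodes whose place=* value is a settlement or part of one."""
    return element_type == "node" and tags.get("place") in SETTLEMENT_PLACE_VALUES


print(is_settlement("node", {"place": "town"}))    # settlement mapped as a node
print(is_settlement("way", {"place": "town"}))     # areas are no longer included
print(is_settlement("node", {"place": "island"}))  # not a settlement value
```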

ianthetechie

Just a few minor nits. I like the way you're factoring out the map-style tags, the approach to alternate names, and the new name. Place is such an overloaded term; this captures it better I think!

ianthetechie · 2025-09-18T04:21:07Z

src/settlements.py

+def tags_with_prefix(prefix, tags):
+    """
+    Returns a dict of all tags with the given prefix string; keys in the
+    dict will have the prefix dropped.
+    """
+    prefix_len = len(prefix)
+    return {k[prefix_len:]: v for (k, v) in tags if k.startswith(prefix)}


I like this helper function! Since we'll probably need some variation of it elsewhere, maybe we should put it in another module (helpers? sounds cliche but I'm not very creative at the moment :P) so others can call it as a top-level function?

jake-low · 2025-09-18

Good call, thanks! I wrote this thinking it'd be used in other layers too but forgot to move it to a helpers module.
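For reference, a usage sketch of the helper as a top-level function. The function body is copied from the diff above; the sample tags are made up, and the input is assumed to be an iterable of (key, value) pairs, since that is how the comprehension unpacks it.

```python
def tags_with_prefix(prefix, tags):
    """
    Returns a dict of all tags with the given prefix string; keys in the
    dict will have the prefix dropped.
    """
    prefix_len = len(prefix)
    return {k[prefix_len:]: v for (k, v) in tags if k.startswith(prefix)}


# Made-up sample tags, given as (key, value) pairs.
tags = [
    ("name", "Example"),
    ("name:fr", "Exemple"),
    ("name:etymology:wikidata", "Q0"),
    ("alt_name:en", "Sample"),
]

names = tags_with_prefix("name:", tags)
alt_names = tags_with_prefix("alt_name:", tags)
print(names)      # {'fr': 'Exemple', 'etymology:wikidata': 'Q0'}
print(alt_names)  # {'en': 'Sample'}
```

If the tags are held in a plain dict instead, pass tags.items(), because iterating a dict directly yields only its keys.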

Co-authored-by: Ian Wagner <ian.wagner@stadiamaps.com>

jake-low · 2025-09-18T06:02:50Z

Thank you both for the valuable feedback! I think this is ready to merge now. Definitely open to iterating on the schema further (and that goes for all of the other layers too), but by merging this we can get it built and available for people to try out, and also the helpers.py module will be available for other layers like Ian's boundaries PR.

Add a places layer

e17251d

1ec5 reviewed Sep 8, 2025

jake-low mentioned this pull request Sep 8, 2025

First pass at a boundary layer #18

Merged

jake-low marked this pull request as draft September 8, 2025 22:09

jake-low added 2 commits September 8, 2025 23:12

Exclute some place=* values from places layer

e1f5d17

Rename places layer to settlements

4ee6ba6

ianthetechie reviewed Sep 18, 2025

jake-low and others added 2 commits September 17, 2025 22:48

Handle really big settlements

90ff0db

Co-authored-by: Ian Wagner <ian.wagner@stadiamaps.com>

Move tags_with_prefix() to a helpers.py module

fa8892b

jake-low marked this pull request as ready for review September 18, 2025 05:56

jake-low merged commit bd52999 into main Sep 18, 2025

jake-low mentioned this pull request Sep 26, 2025

Consider flattening table schemas #25

Open
