migrate 48 fields to type=measurement #1988

Draft
k-yle wants to merge 1 commit into main from kh/measurement

Collaborator

@k-yle k-yle commented Mar 5, 2026

Closes ideditor/schema-builder#15

This is a showcase of how the proposed measurement field could work.

The preview will be broken, and the CI will fail until ideditor/schema-builder#198 is merged & released.

My observations:

  • charge can't be supported, because it allows {currency} and {currency}/{time}
  • fire_hydrant:diameter can't be supported, because it allows unitless values, as well as m and in
  • incline can't be supported, because it uses %, which is dimensionless/unitless rather than dimension=angle. This tag also frequently uses non-numeric values such as up/down/yes/steep

@k-yle k-yle added the waiting-for-schema-builder-change and waitfor-other labels Mar 5, 2026
@k-yle k-yle marked this pull request as draft March 5, 2026 13:57
Comment thread data/fields/ele.json Outdated
Comment thread data/fields/ele.json
Comment thread data/fields/maxheight.json Outdated
Comment thread data/fields/maxspeed.json Outdated
""
],
"mile-per-hour": [
"mph"

Collaborator

maybe it would make sense to restrict it to areas where mph is actually used?

Collaborator Author

currently iD offers both km/h and mph everywhere in the world, but the default option is chosen by country-coder.

So I think it would still be the responsibility of downstream apps to decide how they choose the default unit.

For example, some StreetComplete changesets contain a BCP47 extension like -u-mu-celsius, suggesting that SC already knows the user's unit preferences.

Contributor

This is a good argument for keeping usage in the schema. I don’t think it makes sense to keep bloating country-coder with redundant measurement properties by usage.

Contributor

is the gain from magic matches from usage really worth specifying it over and over again, plus the extra complexity?

In theory, dimension and usage mean that this PR shouldn’t need to repeat "inch": ["in"] in each preset that accepts inches.

If we omit dimension and usage, then id-tagging-schema will need to enumerate not only the allowable units but also the default unit by country. That would be much more verbose. The only exception is the combination of length and vehicle, because rapideditor/country-coder#30 implemented roadheight as a stopgap. It was tricky to implement. I don’t look forward to doing that a few more times over for the differences in ideditor/schema-builder#15 (comment).

Collaborator

My main worry is that relying on a magically provided "best" unit may yield units that are not preferred for OSM tagging, in a way that will not be noticeable and that may change as the upstream package changes.

Would there at least be a way to see the outcome of that automatching?

Collaborator

I am less worried about actual bugs and more about cases where CLDR reasonably lists X first while Y should be used for OSM tagging.

Contributor

@1ec5 1ec5 Mar 12, 2026

This PR and ideditor/schema-builder#198 use units to enumerate the allowable units. Another possible interpretation is that they restrict the units from an external source such as CLDR to just the enumerated units. However, this PR does not address the default choice of unit when a user goes to tag something that hasn’t been tagged yet. It leaves the choice up to the client implementation. In order for the client implementation to choose correctly, it could:

  1. Use country-coder to find the preferred unit by coordinate. This library is not specifically about OSM tagging conventions and has generally declined to duplicate CLDR locale data. It made an exception for roadheight but probably would not attempt to replicate CLDR’s entire unit preference table. It’s just too difficult to manage in a single GeoJSON file.
  2. Use country-coder to find the country code, then look up the country code in CLDR based on dimension and usage. On Android and iOS, this second step would use a simple method call built into the platform’s standard library. For now, a Web application needs to use a library provided by Unicode, but a proposal to standardize dimension and usage lookups is on the standards track behind another proposal for unit conversion.
  3. Use country-coder to find the country code, then look up the country code in a hard-coded table specific to that client. Every client would need to implement this lookup table separately.

Only a hard-coded heuristic could possibly account for an idiosyncratic choice by a local OSM community to tag in a unit that differs from real-world custom. Let’s say the Canadian community decides to tag maxspeed=* in kilometers per hour on railways, despite miles per hour being standard in the real world. In this case, option 2 would require an override, but it would be relatively rare and easier to maintain than a full implementation of option 3.

Nothing is magical about this part of CLDR. In principle, id-tagging-schema could define a full lookup table for option 3 so that clients wouldn’t have to maintain it themselves. If so, each field would still need to specify a dimension and usage, referring to a centralized unit-territory file under data/. Otherwise, these unit-territory preferences would bloat each field redundantly.
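
A minimal sketch of the second approach (country code first, then a preference lookup by dimension and usage) combined with a small override table could look like the following. Everything in it is an assumption made for illustration: `PREFERENCES` is a hand-written stand-in whose values mirror the hypothetical railway scenario above rather than real CLDR data, and `OSM_OVERRIDES` and `default_unit` are invented names, not part of country-coder, CLDR, or either PR.

```python
# Stand-in for a CLDR-style preference lookup: (dimension, usage) -> region -> unit.
# "001" (the world) is the fallback region. The values are made up to mirror
# the hypothetical scenario above; they are NOT real CLDR data.
PREFERENCES = {
    ("speed", "default"): {
        "001": "kilometer-per-hour",
        "CA": "mile-per-hour",
    },
}

# Hypothetical overrides for local OSM tagging conventions: (key, region) -> unit.
OSM_OVERRIDES = {
    ("maxspeed", "CA"): "kilometer-per-hour",
}


def default_unit(key, dimension, usage, region):
    """Pick the unit to preselect when a feature has not been tagged yet."""
    override = OSM_OVERRIDES.get((key, region))
    if override is not None:
        return override
    by_region = PREFERENCES[(dimension, usage)]
    return by_region.get(region, by_region["001"])


print(default_unit("maxspeed", "speed", "default", "CA"))  # override wins
print(default_unit("maxspeed", "speed", "default", "FR"))  # falls back to 001
```

The override table stays small because it only lists the places where OSM convention and the generic preference data disagree; every other key and region falls through to the shared lookup.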

Collaborator Author

@k-yle k-yle Mar 19, 2026

There's no harm in adding a field for usage, but I want to clarify some things:

#1988 (comment)
This is a good argument for keeping usage in the schema. I don’t think it makes sense to keep bloating country-coder with redundant measurement properties by usage.

The problem is, CLDR's usage data is very limited. So even if we add it, the regional defaults often wouldn’t make sense. For example:

  1. dimension=length + usage=default says that region 001 (the world) uses kilometers to measure length. CLDR has no usage for the circumference of a tree, depth of the ocean, elevation, track gauge of a railway, height of a building, etc. None of these examples would use kilometers.
  2. In some cases, OSM tags map to multiple CLDR usages. For example, minheight is both person-height and ??? (there is no usage value that makes sense for building height).

In theory, dimension and usage mean that this PR shouldn’t need to repeat "inch": ["in"] in each preset that accepts inches.

#1988 (comment)
These unit symbols are getting repeated all over. Can we centralize them in a new JSON file under data/? It shouldn’t ever be the case that some unit symbol is acceptable in one key but not another.

Sure. But even if we define a mapping of CLDR units --> OSM units in a single global file, each field will still need to define:

  1. the plausible units for that tag. For maxheight: meter, centimeter, foot-and-inch, yard. This is to exclude illogical values like nanometer and light-year.
  2. the unit which does not require a suffix in the OSM tag. For maxheight: meter
  3. some way to indicate whether the default value can be determined from CLDR data, to work around the issue described above. For tags where CLDR's data is not sufficient, I guess data consumers have to use OSM's default unit and not support region-specific defaults? :/

Contributor

I was under the impression that the person-height usage was intended to mean “things of a human scale”, the kind of thing that gets measured in meter-and-centimeter in many metric countries and feet and inches in some metric countries (but not vehicle dimensions that are only ever in meters). In other words, I think they only split out a new usage when needed for a regional distinction.

My suggestion of a central units file is based on the fact that we have a single units article on the wiki. Some individual tagging schemes specify which units are acceptable or implied, but no tagging scheme comes with its own unique unit symbol. The individual field JSONs can continue to specify acceptable units that reference entries in the central units JSON.

Collaborator Author

i see, then I think #1988 (comment) (specifically ideditor/schema-builder@1611122) addresses most of these issues.

  • units defined in a single global file.
  • each field must define a usage
  • each field can optionally further limit the allowed units
  • each field can define the unit which is implied by default

both PRs updated
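
As a rough sketch of how a data consumer might combine these four rules: the snippet below resolves the units a field accepts from a global table plus the field's own restrictions. The property names (`units`, `impliedUnit`), the shape of `GLOBAL_UNITS`, the `usage` values, and the `allowed_units` helper are all assumptions for illustration; the actual schema is whatever the schema-builder PR defines.

```python
# Hypothetical contents of the single global units file:
# dimension -> unit name -> OSM tag suffix.
GLOBAL_UNITS = {
    "length": {
        "meter": "m",
        "centimeter": "cm",
        "millimeter": "mm",
        "foot": "'",
        "inch": "in",
        "yard": "yd",
    },
}

# Hypothetical field definitions. "usage" is carried along but not consulted
# here, because this sketch only resolves the list of allowed units.
ELE = {
    "key": "ele",
    "type": "measurement",
    "measurement": {
        "dimension": "length",
        "usage": "default",
        "units": ["meter"],        # strict field: further limits the list
        "impliedUnit": "meter",    # unit that needs no suffix in the tag
    },
}
MAXWIDTH = {
    "key": "maxwidth",
    "type": "measurement",
    "measurement": {
        "dimension": "length",
        "usage": "default",
        "impliedUnit": "meter",
    },
}


def allowed_units(field):
    """Return {unit: suffix} for a field; the implied unit gets an empty suffix."""
    spec = field["measurement"]
    units = dict(GLOBAL_UNITS[spec["dimension"]])
    if "units" in spec:
        unknown = [u for u in spec["units"] if u not in units]
        if unknown:
            raise ValueError(f"units not in the global file: {unknown}")
        units = {u: units[u] for u in spec["units"]}
    implied = spec.get("impliedUnit")
    if implied in units:
        units[implied] = ""
    return units


print(allowed_units(ELE))
print(allowed_units(MAXWIDTH))
```

Rejecting units that are missing from the global table keeps the property described later in the thread: a unit that is not defined globally cannot be used by any field.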

Comment thread data/fields/diameter.json Outdated
Comment thread data/fields/ele.json Outdated
Comment thread data/fields/ele_node.json Outdated
Contributor

I’m surprised we aren’t already reusing this field for rollercoasters (“You must be yea tall to ride”). I think the usage would differ in that case.

Collaborator Author

yeah, if we add this field to amusement park features in the future, we will need to split it into 2 JSON files. so no action required right now imo

Comment thread data/fields/maxheight.json Outdated
Comment thread data/fields/ele.json
@matkoniecz
Collaborator

(somehow I cannot reply inline)

This is a good argument for keeping usage in the schema. I don’t think it makes sense to keep bloating country-coder with redundant measurement properties by usage.

is the gain from magic matches from usage really worth specifying it over and over again, plus the extra complexity?

Contributor

@1ec5 1ec5 left a comment

Some more fields to migrate over:

"key": "distance",
"type": "text",
"key": "frequency",
"type": "combo",
"key": "frequency",
"type": "combo",
"key": "incline",
"type": "combo",
"key": "gauge",
"type": "combo",
"key": "voltage",
"type": "combo",
"key": "voltage",
"type": "combo",
"key": "maxweight",
"type": "combo",
"key": "maxweight",
"type": "combo",
"key": "maxaxleload",
"type": "combo",

(somehow I cannot reply inline)

(It’s because I responded to another comment as part of a larger review. You can click on the permalink for my comment to navigate to the original thread.)

Comment thread data/fields/maxwidth.json Outdated
Comment on lines +7 to +26
"units": {
  "meter": [
    ""
  ],
  "centimeter": [
    "cm"
  ],
  "millimeter": [
    "mm"
  ],
  "foot": [
    "'"
  ],
  "inch": [
    "in"
  ],
  "yard": [
    "yd"
  ]
}

Contributor

These unit symbols are getting repeated all over. Can we centralize them in a new JSON file under data/? It shouldn’t ever be the case that some unit symbol is acceptable in one key but not another. (Some duration fields like maxstay=* use calendar units. They need a field type other than measurement but could potentially be powered by CLDR too.) We could go further and centralize most of these units specifications. The list of allowable units is generally determined by the dimension, not the individual key. units would only be necessary for overriding the usual list with some key-specific quirks, like ele only accepting meters.

Collaborator Author

done, unit suffixes are defined in a new file now.

If a unit is not defined in the global unit.json file, then it can't be used in any fields. for example we wouldn't include lightyears or imperial teaspoons since no OSM tag would ever want those values.

fields that have strict rules like ele can still further constrain the list of allowed units.
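
For illustration, here is one way a consumer could use suffix lists of this shape to split a raw tag value into a number and a unit. The `UNITS` table copies the suffixes from the maxwidth hunk above, where the empty string marks the unit that needs no suffix; `parse_measurement` is a hypothetical helper, not something either PR provides, and it deliberately ignores compound values such as feet plus inches.

```python
import re

# Suffix lists in the shape of the maxwidth hunk above; "" marks the unit
# that is written without a suffix in the OSM tag.
UNITS = {
    "meter": [""],
    "centimeter": ["cm"],
    "millimeter": ["mm"],
    "foot": ["'"],
    "inch": ["in"],
    "yard": ["yd"],
}

# A non-negative decimal number, optional whitespace, then an optional suffix.
NUMBER = re.compile(r"^\s*([0-9]+(?:\.[0-9]+)?)\s*(.*?)\s*$")


def parse_measurement(value, units=UNITS):
    """Return (number, unit name), or None if the value is not recognized."""
    match = NUMBER.match(value)
    if match is None:
        return None
    number, suffix = match.groups()
    for unit, suffixes in units.items():
        if suffix in suffixes:
            return float(number), unit
    return None


print(parse_measurement("3.5"))           # no suffix, so the unit is meter
print(parse_measurement("350 cm"))
print(parse_measurement("2 lightyears"))  # unit not in the table
```

Passing a narrower `units` table for a strict field such as ele would make the same function reject suffixes that the field does not allow.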

@k-yle
Collaborator Author

k-yle commented Mar 19, 2026

#1988 (review)
Some more fields to migrate over:
[...]

Added all of them except incline, which won't work because:

  1. % is dimensionless / unitless, it's not dimension=angle.
  2. this tag frequently uses non-numeric values such as up/down/yes/steep
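
To make the two problems concrete, a hypothetical classifier for incline values might look like this. The `classify_incline` name and the three categories are assumptions for the sketch; the keyword set contains only the non-numeric values named above.

```python
import re

# A signed decimal number followed by a percent sign, e.g. "10%" or "-5.5 %".
PERCENT = re.compile(r"^-?[0-9]+(?:\.[0-9]+)?\s*%$")

# The non-numeric values named above.
KEYWORDS = {"up", "down", "yes", "steep"}


def classify_incline(value):
    """Sort an incline value into 'percent', 'keyword', or 'other'."""
    value = value.strip()
    if PERCENT.match(value):
        return "percent"   # a dimensionless ratio, so not dimension=angle
    if value in KEYWORDS:
        return "keyword"   # no number at all
    return "other"


for raw in ["10%", "-5.5 %", "up", "steep"]:
    print(raw, "->", classify_incline(raw))
```

Neither the percent values nor the keywords fit a field that expects a number with a unit from a single dimension, which is why incline stays out of this migration.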

@k-yle k-yle changed the title from "migrate 39 fields to type=measurement" to "migrate 48 fields to type=measurement" Mar 19, 2026
  },
  "minValue": 0,
- "label": "Duration (minutes)",
+ "label": "{duration}",

Collaborator

I am not sure whether we want to remove the hint.

Are we assuming that iD and others will add it back, somehow, using impliedUnit?

Collaborator

but then we have maxspeed where we have impliedUnit but it should be pushed far less strongly

Comment thread data/fields/maxspeed.json
"type": "measurement",
"measurement": {
"dimension": "speed",
"usage": "default",

Collaborator

maybe we should not need to state usage=default explicitly, and instead fall back to default when usage is omitted?

"type": "measurement",
"measurement": {
"dimension": "power",
"usage": "default"

Collaborator

is there a reason not to specify MW as the preferred unit here and in similar fields? Is a placeholder enough?

I am not sure how it will work in iD in practice. I guess it can be tweaked later?

"type": "measurement",
"measurement": {
"dimension": "volume",
"usage": "fluid",

Collaborator

what is the benefit of specifying it here? (I am still confused by that entire usage thingy, and am trying to educate myself via https://cldr.unicode.org/ )

Comment thread .vscode/settings.json
Collaborator Author

@k-yle k-yle Apr 7, 2026

i just learned that railway:position and railway:position:exact use mi:12.3 instead of 12.3 mi 😭

and that tag uses a "fake" unit called pkm in argentina. i guess we can't support that field

Labels

waitfor-other, waiting-for-schema-builder-change

Development

Successfully merging this pull request may close these issues.

Add measurement field type

3 participants