Conversation
    ""
  ],
  "mile-per-hour": [
    "mph"
There was a problem hiding this comment.
Maybe it would make sense to restrict it to areas where mph is actually used?
There was a problem hiding this comment.
currently iD offers both km/h and mph everywhere in the world, but the default option is chosen by country-coder.
So I think it would still be the responsibility of downstream apps to decide how they choose the default unit.
For example, some StreetComplete changesets contain a BCP47 extension like -u-mu-celsius, suggesting that SC already knows the user's unit preferences.
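A minimal sketch of how a client could read that preference out of a language tag, assuming the tag follows the usual BCP47 layout in which the keys of the -u- extension come after the `u` singleton. The function name `measurement_unit_override` is made up for this illustration; it is not part of StreetComplete, iD, or country-coder.

```python
def measurement_unit_override(language_tag):
    """Return the value of the `mu` key in the tag's -u- extension, or None."""
    subtags = language_tag.lower().split("-")
    if "u" not in subtags:
        return None
    # The extension runs from the "u" singleton to the next singleton
    # (any other one-character subtag) or to the end of the tag.
    extension = []
    for subtag in subtags[subtags.index("u") + 1:]:
        if len(subtag) == 1:
            break
        extension.append(subtag)
    # Keys are two characters long and values are longer, so a value can
    # never be mistaken for the "mu" key when scanning adjacent pairs.
    for key, value in zip(extension, extension[1:]):
        if key == "mu":
            return value
    return None


print(measurement_unit_override("en-US-u-mu-celsius"))
```

A tag without the extension simply yields None, in which case the client still has to pick a default some other way.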
There was a problem hiding this comment.
This is a good argument for keeping usage in the schema. I don’t think it makes sense to keep bloating country-coder with redundant measurement properties by usage.
There was a problem hiding this comment.
Is the gain from magic matches from usage really worth specifying it over and over again, plus the extra complexity?
In theory, dimension and usage mean that this PR shouldn’t need to repeat "inch": ["in"] in each preset that accepts inches.
If we omit dimension and usage, then id-tagging-schema will need to enumerate not only the allowable units but also the default unit by country. That would be much more verbose. The only exception is the combination of length and vehicle, because rapideditor/country-coder#30 implemented roadheight as a stopgap. It was tricky to implement. I don’t look forward to doing that a few more times over for the differences in ideditor/schema-builder#15 (comment).
There was a problem hiding this comment.
My main worry is that relying on a magically provided "best" unit may yield units that are not preferred for OSM tagging, in a way that will not be noticeable and may change as the upstream package changes.
Would there at least be a way to see the outcome of that automatching?
There was a problem hiding this comment.
I am worried less about actual bugs and more about cases where CLDR reasonably lists X first while Y should be used for OSM tagging.
There was a problem hiding this comment.
This PR and ideditor/schema-builder#198 use units to enumerate the allowable units. Another possible interpretation is that they restrict the units from an external source such as CLDR to just the enumerated units. However, this PR does not address the default choice of unit when a user goes to tag something that hasn’t been tagged yet. It leaves the choice up to the client implementation. In order for the client implementation to choose correctly, it could:
- (A) Use country-coder to find the preferred unit by coordinate. This library is not specifically about OSM tagging conventions and has generally declined to duplicate CLDR locale data. It made an exception for roadheight but probably would not attempt to replicate CLDR’s entire unit preference table. It’s just too difficult to manage in a single GeoJSON file.
- (B) Use country-coder to find the country code, then look up the country code in CLDR based on dimension and usage. On Android and iOS, this second step would use a simple method call built into the platform’s standard library. For now, a Web application needs to use a library provided by Unicode, but a proposal to standardize dimension and usage lookups is on the standards track behind another proposal for unit conversion.
- (C) Use country-coder to find the country code, then look up the country code in a hard-coded table specific to that client. Every client would need to implement this lookup table separately.
Only a hard-coded heuristic could possibly account for an idiosyncratic choice by a local OSM community to tag in a unit that differs from real-world custom. Let’s say the Canadian community decides to tag maxspeed=* in kilometers per hour on railways, despite miles per hour being standard in the real world. In this case, option (B) would require an override, but it would be relatively rare and easier to maintain than a full implementation of (C).
Nothing is magical about this part of CLDR. In principle, id-tagging-schema could define a full lookup table for (C) so that clients wouldn’t have to maintain it themselves. If so, each field would still need to specify a dimension and usage, referring to a centralized unit-territory file under data/. Otherwise, these unit-territory preferences would bloat each field redundantly.
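To make the override idea concrete, here is a rough sketch of a lookup keyed by dimension, usage and territory, with a small table of OSM-specific exceptions checked first. Everything in it is an assumption for illustration: the table entries are placeholders rather than CLDR data, "rail" is a made-up usage that only mirrors the hypothetical Canadian example, and `maxspeed_rail`, `PREFERENCES`, `OVERRIDES` and `default_unit` are invented names.

```python
# Illustrative stand-in for a unit preference table. The entries are
# placeholders for this sketch, not copied from CLDR.
# (dimension, usage) -> {territory: unit}; "001" is the world fallback.
PREFERENCES = {
    ("speed", "default"): {
        "001": "kilometer-per-hour",
        "US": "mile-per-hour",
    },
    # "rail" is a made-up usage, present only to mirror the hypothetical.
    ("speed", "rail"): {
        "001": "kilometer-per-hour",
        "CA": "mile-per-hour",
    },
}

# Hypothetical OSM-specific overrides: (field, territory) -> unit.
OVERRIDES = {
    ("maxspeed_rail", "CA"): "kilometer-per-hour",
}


def default_unit(field, dimension, usage, territory):
    """Pick the default unit for a field in a territory, or None if unknown."""
    # A community-specific override wins over the general table.
    override = OVERRIDES.get((field, territory))
    if override is not None:
        return override
    # Fall back to the "default" usage when the specific usage is unknown.
    by_territory = PREFERENCES.get((dimension, usage))
    if by_territory is None:
        by_territory = PREFERENCES.get((dimension, "default"), {})
    # Fall back to the world entry when the territory has no entry.
    return by_territory.get(territory, by_territory.get("001"))


print(default_unit("maxspeed", "speed", "default", "US"))
print(default_unit("maxspeed_rail", "speed", "rail", "CA"))
```

The override table stays small because it only lists the places where OSM practice departs from the general table, which is the maintenance argument made above.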
There was a problem hiding this comment.
There's no harm in adding a field for usage, but i want to clarify some things:
#1988 (comment)
This is a good argument for keeping usage in the schema. I don’t think it makes sense to keep bloating country-coder with redundant measurement properties by usage.
The problem is, CLDR's usage data is very limited. So even if we add it, the regional defaults often wouldn’t make sense. For example:
- dimension=length + usage=default says that 001 uses kilometres to measure length. CLDR has no usage for the circumference of a tree, depth of the ocean, elevation, track-gauge of a railway, height of a building, etc. All of these examples would not use kilometers.
- In some cases, OSM tags map to multiple CLDR usages. For example, minheight is both person-height and ??? (there is no usage value that makes sense for building height)
In theory, dimension and usage mean that this PR shouldn’t need to repeat "inch": ["in"] in each preset that accepts inches.
#1988 (comment)
These unit symbols are getting repeated all over. Can we centralize them in a new JSON file under data/? It shouldn’t ever be the case that some unit symbol is acceptable in one key but not another.
Sure. But even if we define a mapping of CLDR units --> OSM units in a single global file, each field will still need to define:
- the plausible units for that tag. For maxheight: meter, centimeter, foot-and-inch, yard. This is to exclude illogical values like nanometer and light-year.
- the unit which does not require a suffix in the OSM tag. For maxheight: meter
- some way to indicate whether the default value can be determined from CLDR data, to work around the issue described above. For tags where CLDR's data is not sufficient, I guess data consumers have to use OSM's default unit and not support region-specific defaults? :/
There was a problem hiding this comment.
I was under the impression that the person-height usage was intended to mean “things of a human scale”, the kind of thing that gets measured in meter-and-centimeter in many metric countries and feet and inches in some metric countries (but not vehicle dimensions that are only ever in meters). In other words, I think they only split out a new usage when needed for a regional distinction.
My suggestion of a central units file is based on the fact that we have a single units article on the wiki. Some individual tagging schemes specify which units are acceptable or implied, but no tagging scheme comes with its own unique unit symbol. The individual field JSONs can continue to specify acceptable units that reference entries in the central units JSON.
There was a problem hiding this comment.
i see, then I think #1988 (comment) (specifically ideditor/schema-builder@1611122) addresses most of these issues.
- units defined in a single global file.
- each field must define a usage
- each field can optionally further limit the allowed units
- each field can define the unit which is implied by default
both PRs updated
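As a rough illustration of how those rules could fit together, the sketch below resolves the units a field accepts from a central table. The layout is an assumption, not the actual schema-builder format: `GLOBAL_UNITS`, its grouping by dimension, the "m" and "km/h" suffixes, and `resolve_units` are all hypothetical.

```python
# Hypothetical central units table: dimension -> unit -> accepted suffixes.
GLOBAL_UNITS = {
    "length": {
        "meter": ["m"],
        "centimeter": ["cm"],
        "millimeter": ["mm"],
        "foot": ["'"],
        "inch": ["in"],
        "yard": ["yd"],
    },
    "speed": {
        "kilometer-per-hour": ["km/h"],
        "mile-per-hour": ["mph"],
    },
}


def resolve_units(measurement):
    """Return {unit: suffixes} for the units a field accepts."""
    known = GLOBAL_UNITS[measurement["dimension"]]
    # Without an explicit list, every unit of the dimension is allowed.
    allowed = measurement.get("units", list(known))
    unknown = [unit for unit in allowed if unit not in known]
    if unknown:
        raise ValueError(f"units missing from the central table: {unknown}")
    implied = measurement.get("impliedUnit")
    if implied is not None and implied not in allowed:
        raise ValueError(f"impliedUnit {implied!r} is not an allowed unit")
    return {unit: known[unit] for unit in allowed}


# A field with a key-specific restriction, in the spirit of ele.
ele_like = {
    "dimension": "length",
    "usage": "default",
    "units": ["meter"],
    "impliedUnit": "meter",
}
print(resolve_units(ele_like))
```

Rejecting unknown units at build time keeps a typo in one field from silently introducing a unit that no other field recognises.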
There was a problem hiding this comment.
I’m surprised we aren’t already reusing this field for rollercoasters (“You must be yea tall to ride”). I think the usage would differ in that case.
There was a problem hiding this comment.
yeah, if we add this field to amusement park features in the future, we will need to split it into 2 JSON files. so no action required right now imo
(somehow I cannot reply inline)
Is the gain from magic matches from usage really worth specifying it over and over again, plus the extra complexity?
1ec5
left a comment
There was a problem hiding this comment.
Some more fields to migrate over:
- id-tagging-schema/data/fields/distance.json (lines 2 to 3 in 93113ec)
- id-tagging-schema/data/fields/frequency.json (lines 2 to 3 in 93113ec)
- id-tagging-schema/data/fields/frequency_electrified.json (lines 2 to 3 in 93113ec)
- id-tagging-schema/data/fields/incline.json (lines 2 to 3 in 93113ec)
- id-tagging-schema/data/fields/gauge.json (lines 2 to 3 in 93113ec)
- id-tagging-schema/data/fields/voltage.json (lines 2 to 3 in 93113ec)
- id-tagging-schema/data/fields/voltage_electrified.json (lines 2 to 3 in 93113ec)
- id-tagging-schema/data/fields/maxweight.json (lines 2 to 3 in 93113ec)
- id-tagging-schema/data/fields/maxweight_bridge.json (lines 2 to 3 in 93113ec)
- id-tagging-schema/data/fields/maxaxleload_bridge.json (lines 2 to 3 in 93113ec)
(somehow I cannot reply inline)
(It’s because I responded to another comment as part of a larger review. You can click on the permalink for my comment to navigate to the original thread.)
"units": {
  "meter": [
    ""
  ],
  "centimeter": [
    "cm"
  ],
  "millimeter": [
    "mm"
  ],
  "foot": [
    "'"
  ],
  "inch": [
    "in"
  ],
  "yard": [
    "yd"
  ]
}
There was a problem hiding this comment.
These unit symbols are getting repeated all over. Can we centralize them in a new JSON file under data/? It shouldn’t ever be the case that some unit symbol is acceptable in one key but not another. (Some duration fields like maxstay=* use calendar units. They need a field type other than measurement but could potentially be powered by CLDR too.) We could go further and centralize most of these units specifications. The list of allowable units is generally determined by the dimension, not the individual key. units would only be necessary for overriding the usual list with some key-specific quirks, like ele only accepting meters.
There was a problem hiding this comment.
done, unit suffixes are defined in a new file now.
If a unit is not defined in the global unit.json file, then it can't be used in any fields. for example we wouldn't include lightyears or imperial teaspoons since no OSM tag would ever want those values.
fields that have strict rules like ele can still further constrain the list of allowed units.
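A small sketch of what this could mean for a data consumer reading tag values, assuming suffix-style values such as "4 cm" and a bare number for the implied unit. `UNIT_SUFFIXES` and `parse_measurement` are hypothetical names, and the table is an illustrative subset rather than the contents of the real units file.

```python
import re

# Illustrative subset of a central unit table: unit name -> OSM suffix.
UNIT_SUFFIXES = {
    "meter": "m",
    "centimeter": "cm",
    "millimeter": "mm",
    "inch": "in",
    "yard": "yd",
}

# A number, optional whitespace, then an optional suffix without spaces.
VALUE_PATTERN = re.compile(r"\s*(-?\d+(?:\.\d+)?)\s*(\S*)\s*")


def parse_measurement(raw, allowed_units, implied_unit):
    """Return (number, unit) for a tag value, or None if it is not accepted."""
    match = VALUE_PATTERN.fullmatch(raw)
    if match is None:
        return None
    number, suffix = float(match.group(1)), match.group(2)
    # A bare number is read in the field's implied unit.
    if suffix == "":
        return number, implied_unit
    # Only units the field allows are considered, so a strict field
    # rejects suffixes that other fields would accept.
    for unit in allowed_units:
        if UNIT_SUFFIXES.get(unit) == suffix:
            return number, unit
    return None


print(parse_measurement("4 cm", ["meter", "centimeter"], "meter"))
print(parse_measurement("100 yd", ["meter"], "meter"))
```

The second call returns None because the field only allows meters, which is the kind of key-specific restriction described for ele.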
Added all of them except
  },
  "minValue": 0,
- "label": "Duration (minutes)",
+ "label": "{duration}",
There was a problem hiding this comment.
I am not sure whether we want to remove the hint.
Are we assuming that iD and other editors will add it back, somehow, using impliedUnit?
There was a problem hiding this comment.
but then we have maxspeed where we have impliedUnit but it should be pushed far less strongly
"type": "measurement",
"measurement": {
  "dimension": "speed",
  "usage": "default",
There was a problem hiding this comment.
Maybe we should not need to state default explicitly, and instead fall back to default as the value when usage is omitted?
"type": "measurement",
"measurement": {
  "dimension": "power",
  "usage": "default"
There was a problem hiding this comment.
Is there a reason not to specify MW as the preferred unit here and in similar fields? Is the placeholder enough?
I am not sure how it will work in iD in practice, I guess it can be tweaked later?
"type": "measurement",
"measurement": {
  "dimension": "volume",
  "usage": "fluid",
There was a problem hiding this comment.
What is the benefit of specifying it here? (I am still confused by that entire usage thingy, and trying to educate myself via https://cldr.unicode.org/ )
There was a problem hiding this comment.
i just learned that railway:position and railway:position:exact use mi:12.3 instead of 12.3 mi 😭
and that tag uses a "fake" unit called pkm in argentina. i guess we can't support that field
Closes ideditor/schema-builder#15
This is a showcase of how the proposed measurement field could work.
The preview will be broken, and the CI will fail until ideditor/schema-builder#198 is merged & released.
My observations:
- charge can't be supported, because it allows {currency} and {currency}/{time}
- fire_hydrant:diameter can't be supported, because it allows unitless values, as well as m and in
- incline can't be supported, because it uses % which is dimensionless / unitless, it's not dimension=angle. Also, this tag frequently uses non-numeric values such as up/down/yes/steep