As mentioned during the 2024-04-10 task force call, there is interest in providing summary statistics (or, more broadly, descriptive statistics) in Croissant format. I'm focusing on summary statistics (mean, max, min, median, mode, standard deviation, etc.) because they are already well defined in the Data Documentation Initiative (DDI) format.
For example, for a dataset with a variable called "stars" that indicates the number of stars on GitHub, the summary statistics in DDI can be represented like this:
<var ID="v30256083" name="stars" intrvl="discrete">
  <location fileid="f6867331"/>
  <labl level="variable">stars</labl>
  <sumStat type="medn">4.0</sumStat>
  <sumStat type="mean">38.71014492753635</sumStat>
  <sumStat type="mode">.</sumStat>
  <sumStat type="vald">138.0</sumStat>
  <sumStat type="max">732.0</sumStat>
  <sumStat type="invd">0.0</sumStat>
  <sumStat type="min">0.0</sumStat>
  <sumStat type="stdev">110.13079171235681</sumStat>
  <varFormat type="numeric"/>
</var>
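To make the shape of this data concrete, here is a minimal Python sketch that reads the `sumStat` elements from the snippet above into a plain dictionary, which is roughly the information that would need a home in Croissant. The function name `read_sum_stats` is hypothetical, the snippet is parsed without an XML namespace (a full DDI codebook file would normally declare one), and treating the "." value for `mode` as "not available" is an assumption on my part.

```python
import xml.etree.ElementTree as ET

# The DDI <var> element from the example above, without a namespace.
DDI_VAR = """<var ID="v30256083" name="stars" intrvl="discrete">
  <location fileid="f6867331"/>
  <labl level="variable">stars</labl>
  <sumStat type="medn">4.0</sumStat>
  <sumStat type="mean">38.71014492753635</sumStat>
  <sumStat type="mode">.</sumStat>
  <sumStat type="vald">138.0</sumStat>
  <sumStat type="max">732.0</sumStat>
  <sumStat type="invd">0.0</sumStat>
  <sumStat type="min">0.0</sumStat>
  <sumStat type="stdev">110.13079171235681</sumStat>
  <varFormat type="numeric"/>
</var>"""


def read_sum_stats(var_xml: str) -> dict:
    """Collect the <sumStat> values of a single DDI <var> element (hypothetical helper)."""
    var = ET.fromstring(var_xml)
    stats = {}
    for node in var.findall("sumStat"):
        text = (node.text or "").strip()
        try:
            stats[node.get("type")] = float(text)
        except ValueError:
            # Assumption: a non-numeric value such as "." means "not available".
            stats[node.get("type")] = None
    return {"name": var.get("name"), "stats": stats}


if __name__ == "__main__":
    print(read_sum_stats(DDI_VAR))
```

The resulting dictionary is keyed by the DDI `type` attribute (`medn`, `mean`, `stdev`, and so on), so any Croissant representation would also have to decide whether to keep these DDI names or map them to its own vocabulary.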
The question is, where can I put summary statistics in Croissant?
Update: This issue seems related (mentions statistics):