📊 Add child labor aggregates from ILO-UNICEF #5902

Draft
paarriagadap wants to merge 35 commits into master from data-childlabor-ilounicef

Contributor

@paarriagadap paarriagadap commented Apr 8, 2026

Summary

New dataset from the ILO-UNICEF 2024 Global Estimates of Child Labour report. Extracts data from both the statistical annex tables (pages 54–58) and chart labels throughout the report (pages 8, 9, 30, 34, 44) using pdfplumber.
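The description only names the chart-label technique ("word position analysis"), so here is a rough sketch of what that step can look like. It assumes word dictionaries shaped like the ones pdfplumber's `Page.extract_words()` returns (keys `text`, `x0`, `top`). The function name `group_words_into_rows`, the tolerance, and the sample values are hypothetical and not taken from this PR or the report.

```python
from typing import Any, Dict, List


def group_words_into_rows(words: List[Dict[str, Any]], y_tolerance: float = 3.0) -> List[List[str]]:
    """Group words into rows when their `top` coordinate is close to the row's first word."""
    rows: List[List[Dict[str, Any]]] = []
    # Visit words top to bottom, then left to right.
    for word in sorted(words, key=lambda w: (w["top"], w["x0"])):
        if rows and abs(word["top"] - rows[-1][0]["top"]) <= y_tolerance:
            rows[-1].append(word)
        else:
            rows.append([word])
    # Within each row, order the labels by horizontal position.
    return [[str(w["text"]) for w in sorted(row, key=lambda w: w["x0"])] for row in rows]


# Made-up word positions for illustration only (not values from the report).
sample_words = [
    {"text": "10.0", "x0": 120.0, "top": 200.4},
    {"text": "2016", "x0": 60.0, "top": 200.0},
    {"text": "2020", "x0": 60.0, "top": 215.0},
    {"text": "12.5", "x0": 120.0, "top": 215.2},
]
rows = group_words_into_rows(sample_words)
print(rows)
```

In a real snapshot script the input would presumably come from something like `pdfplumber.open(path).pages[i].extract_words()`, with the grouped rows then copied into constants.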

Data sources

| Source | Pages | Content |
| --- | --- | --- |
| Statistical annex | 54–55 | Child labour by region × sex × age (2024) |
| Statistical annex | 56–57 | Hazardous work by region × sex × age (2024) |
| Statistical annex | 58 | Trends 2016–2024 by region, sex, age, income |
| Chart data | 9 | Global trends 2000–2012, regional/age trends 2008–2012 |
| Chart data | 8 | Household chores shares, not-in-school 5-14/15-17 |
| Chart data | 30 | Child labour share by sex 2000–2012 |
| Chart data | 34 | Sector distribution by SDG region |
| Chart data | 44 | Not-in-school shares by ILO region |

Output tables

child_labor — main table, index: [country, year, sex, age]

  • share_child_labor, number_child_labor — child labour prevalence
  • share_hazardous_work, number_hazardous_work — hazardous work prevalence
  • share_child_labor_not_in_school, number_child_labor_not_in_school — not attending school (child labour)
  • share_hazardous_work_not_in_school, number_hazardous_work_not_in_school — not attending school (hazardous work)
  • share_child_labor_incl_household_chores — including household chores (≥21h/week)
  • Years: 2000–2024 (sparse for earlier years)
  • Countries: World, ILO/SDG/UNICEF regions, income groups
  • Sex: total, boys, girls
  • Age: 5-11, 12-14, 15-17, 5-14, 5-17
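The age list includes a combined 5-14 bracket next to 5-11 and 12-14. The PR does not show how that bracket is computed, so the following is only one plausible approach, stated as an assumption: sum the numbers of the two narrower brackets and recover each bracket's implied population from `number / (share / 100)` to get a population-weighted share. The function name `combine_brackets` and the input values are hypothetical.

```python
from typing import Tuple


def combine_brackets(number_a: float, share_a: float, number_b: float, share_b: float) -> Tuple[float, float]:
    """Combine two disjoint age brackets into one (number, share in percent).

    Assumption: share = 100 * number / population, so the population of each
    bracket can be backed out from its number and share.
    """
    population_a = number_a / (share_a / 100)
    population_b = number_b / (share_b / 100)
    number = number_a + number_b
    share = 100 * number / (population_a + population_b)
    return number, share


# Made-up inputs for illustration only (not values from the report).
number_5_14, share_5_14 = combine_brackets(
    number_a=50.0, share_a=10.0,  # e.g. ages 5-11
    number_b=30.0, share_b=15.0,  # e.g. ages 12-14
)
print(number_5_14, round(share_5_14, 2))
```

Averaging the two shares directly would ignore the different bracket sizes, which is why the sketch weights by implied population.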

sector — sector distribution, index: [country, year, sector, sex, age]

  • share_child_labor, number_child_labor, share_hazardous_work, number_hazardous_work
  • Sectors: Agriculture, Industry, Services (rescaled to sum to 100%)
  • Countries: World + 5 SDG regions
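The sector shares are described as "rescaled to sum to 100%". A minimal sketch of such a rescaling, assuming simple proportional scaling within each group (the PR does not state the exact rule), could look like this. The function name `rescale_shares` and the input values are hypothetical.

```python
from typing import Dict


def rescale_shares(shares: Dict[str, float]) -> Dict[str, float]:
    """Scale shares proportionally so that they sum to 100."""
    total = sum(shares.values())
    if total <= 0:
        raise ValueError("Shares must sum to a positive value before rescaling.")
    return {sector: 100 * value / total for sector, value in shares.items()}


# Made-up rounded shares that sum to 101 (not values from the report).
raw_shares = {"Agriculture": 60.0, "Industry": 15.0, "Services": 26.0}
rescaled = rescale_shares(raw_shares)
print(rescaled)
```

Proportional scaling keeps the relative sizes of the sectors unchanged while removing rounding drift from chart labels.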

Pipeline

  • Snapshot: PDF extraction with pdfplumber (table extraction for annex, word position analysis for chart labels). All chart-derived data defined as constants in the snapshot script.
  • Meadow: Numeric conversion (comma-formatted strings → float), column indexing.
  • Garden: Wide → long reshaping, merging child labour + hazardous work, concat with trends, 5-14 bracket computation, not-in-school/household chores column joins, sector rescaling, country harmonization.
  • Grapher: Standard long-to-wide conversion.
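As an illustration of the Meadow and Garden steps above, here is a small dependency-free sketch of converting comma-formatted strings to floats and reshaping a wide row into long records. The actual steps presumably operate on table objects; the helper names `parse_number` and `wide_to_long`, the treatment of empty or dash cells as missing, and the sample row are all assumptions for illustration.

```python
from typing import Any, Dict, List, Optional


def parse_number(raw: str) -> Optional[float]:
    """Convert a comma-formatted string such as '1,234.5' to a float."""
    cleaned = raw.strip().replace(",", "")
    # Assumption: empty cells and dashes mark missing values.
    if cleaned in {"", "-", "–"}:
        return None
    return float(cleaned)


def wide_to_long(rows: List[Dict[str, str]], id_keys: List[str], var_name: str, value_name: str) -> List[Dict[str, Any]]:
    """Turn each non-id column of a wide row into its own long record."""
    long_rows: List[Dict[str, Any]] = []
    for row in rows:
        ids = {key: row[key] for key in id_keys}
        for column, raw in row.items():
            if column in id_keys:
                continue
            long_rows.append({**ids, var_name: column, value_name: parse_number(raw)})
    return long_rows


# Made-up wide row for illustration only (not values from the report).
wide_rows = [{"country": "World", "year": "2024", "boys": "1,234.5", "girls": "-"}]
long_rows = wide_to_long(wide_rows, id_keys=["country", "year"], var_name="sex", value_name="number_child_labor")
print(long_rows)
```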

@codex review

🤖 Generated with Claude Code

Contributor

owidbot commented Apr 8, 2026

Quick links (staging server):

Site Dev Site Preview Admin Wizard Docs

Login: ssh owid@staging-site-data-childlabor-ilounicef

chart-diff: ✅
  • 10/10 reviewed charts
  • Modified: 0/0
  • New: 10/10
  • Rejected: 0
  • Data changes: 0
  • Metadata changes: 0
data-diff: ❌ Found differences
= Dataset garden/un/2026-02-03/ilostat
  = Table regions
  = Table ilostat
    ~ Column employment_by_sex_and_status_in_employment (changed metadata)
-       -     This data comes from the ILO Modelled Estimates series. The [The International Labour Organization (ILO)](#dod:ilo) combines countries' own reported estimates with statistically modeled estimates when observations are missing. This improves comparability across countries and over time and allows the ILO to calculate regional and global aggregates for every year. You can read more about how the ILO produces these estimates in the [Modelled Estimates documentation](https://ilostat.ilo.org/methods/concepts-and-definitions/ilo-modelled-estimates/).
+       +     This data comes from the ILO Modelled Estimates series. The [International Labour Organization (ILO)](#dod:ilo) combines countries' own reported estimates with statistically modeled estimates when observations are missing. This improves comparability across countries and over time and allows the ILO to calculate regional and global aggregates for every year. You can read more about how the ILO produces these estimates in the [Modelled Estimates documentation](https://ilostat.ilo.org/methods/concepts-and-definitions/ilo-modelled-estimates/).
    ~ Column labour_force_by_sex_and_age (changed metadata)
-       -     This data comes from the ILO Modelled Estimates series. The [The International Labour Organization (ILO)](#dod:ilo) combines countries' own reported estimates with statistically modeled estimates when observations are missing. This improves comparability across countries and over time and allows the ILO to calculate regional and global aggregates for every year. You can read more about how the ILO produces these estimates in the [Modelled Estimates documentation](https://ilostat.ilo.org/methods/concepts-and-definitions/ilo-modelled-estimates/).
+       +     This data comes from the ILO Modelled Estimates series. The [International Labour Organization (ILO)](#dod:ilo) combines countries' own reported estimates with statistically modeled estimates when observations are missing. This improves comparability across countries and over time and allows the ILO to calculate regional and global aggregates for every year. You can read more about how the ILO produces these estimates in the [Modelled Estimates documentation](https://ilostat.ilo.org/methods/concepts-and-definitions/ilo-modelled-estimates/).
    ~ Column labour_force_participation_rate_by_sex_and_age (changed metadata)
-       -     This data comes from the ILO Modelled Estimates series. The [The International Labour Organization (ILO)](#dod:ilo) combines countries' own reported estimates with statistically modeled estimates when observations are missing. This improves comparability across countries and over time and allows the ILO to calculate regional and global aggregates for every year. You can read more about how the ILO produces these estimates in the [Modelled Estimates documentation](https://ilostat.ilo.org/methods/concepts-and-definitions/ilo-modelled-estimates/).
+       +     This data comes from the ILO Modelled Estimates series. The [International Labour Organization (ILO)](#dod:ilo) combines countries' own reported estimates with statistically modeled estimates when observations are missing. This improves comparability across countries and over time and allows the ILO to calculate regional and global aggregates for every year. You can read more about how the ILO produces these estimates in the [Modelled Estimates documentation](https://ilostat.ilo.org/methods/concepts-and-definitions/ilo-modelled-estimates/).
    ~ Column obs_status_labour_force_participation_rate_by_sex_and_age (changed metadata)
-       -     This data comes from the ILO Modelled Estimates series. The [The International Labour Organization (ILO)](#dod:ilo) combines countries' own reported estimates with statistically modeled estimates when observations are missing. This improves comparability across countries and over time and allows the ILO to calculate regional and global aggregates for every year. You can read more about how the ILO produces these estimates in the [Modelled Estimates documentation](https://ilostat.ilo.org/methods/concepts-and-definitions/ilo-modelled-estimates/).
+       +     This data comes from the ILO Modelled Estimates series. The [International Labour Organization (ILO)](#dod:ilo) combines countries' own reported estimates with statistically modeled estimates when observations are missing. This improves comparability across countries and over time and allows the ILO to calculate regional and global aggregates for every year. You can read more about how the ILO produces these estimates in the [Modelled Estimates documentation](https://ilostat.ilo.org/methods/concepts-and-definitions/ilo-modelled-estimates/).
    ~ Column obs_status_unemployment_rate_by_sex_and_age (changed metadata)
-       -     This data comes from the ILO Modelled Estimates series. The [The International Labour Organization (ILO)](#dod:ilo) combines countries' own reported estimates with statistically modeled estimates when observations are missing. This improves comparability across countries and over time and allows the ILO to calculate regional and global aggregates for every year. You can read more about how the ILO produces these estimates in the [Modelled Estimates documentation](https://ilostat.ilo.org/methods/concepts-and-definitions/ilo-modelled-estimates/).
+       +     This data comes from the ILO Modelled Estimates series. The [International Labour Organization (ILO)](#dod:ilo) combines countries' own reported estimates with statistically modeled estimates when observations are missing. This improves comparability across countries and over time and allows the ILO to calculate regional and global aggregates for every year. You can read more about how the ILO produces these estimates in the [Modelled Estimates documentation](https://ilostat.ilo.org/methods/concepts-and-definitions/ilo-modelled-estimates/).
    ~ Column share_employment_by_sex_and_status_in_employment (changed metadata)
-       -     This data comes from the ILO Modelled Estimates series. The [The International Labour Organization (ILO)](#dod:ilo) combines countries' own reported estimates with statistically modeled estimates when observations are missing. This improves comparability across countries and over time and allows the ILO to calculate regional and global aggregates for every year. You can read more about how the ILO produces these estimates in the [Modelled Estimates documentation](https://ilostat.ilo.org/methods/concepts-and-definitions/ilo-modelled-estimates/).
+       +     This data comes from the ILO Modelled Estimates series. The [International Labour Organization (ILO)](#dod:ilo) combines countries' own reported estimates with statistically modeled estimates when observations are missing. This improves comparability across countries and over time and allows the ILO to calculate regional and global aggregates for every year. You can read more about how the ILO produces these estimates in the [Modelled Estimates documentation](https://ilostat.ilo.org/methods/concepts-and-definitions/ilo-modelled-estimates/).
    ~ Column unemployment_rate_by_sex_and_age (changed metadata)
-       -     This data comes from the ILO Modelled Estimates series. The [The International Labour Organization (ILO)](#dod:ilo) combines countries' own reported estimates with statistically modeled estimates when observations are missing. This improves comparability across countries and over time and allows the ILO to calculate regional and global aggregates for every year. You can read more about how the ILO produces these estimates in the [Modelled Estimates documentation](https://ilostat.ilo.org/methods/concepts-and-definitions/ilo-modelled-estimates/).
+       +     This data comes from the ILO Modelled Estimates series. The [International Labour Organization (ILO)](#dod:ilo) combines countries' own reported estimates with statistically modeled estimates when observations are missing. This improves comparability across countries and over time and allows the ILO to calculate regional and global aggregates for every year. You can read more about how the ILO produces these estimates in the [Modelled Estimates documentation](https://ilostat.ilo.org/methods/concepts-and-definitions/ilo-modelled-estimates/).
+ Dataset garden/un/2026-04-08/child_labor_report
+ + Table child_labor
+   + Column number_child_labor
+   + Column share_child_labor
+   + Column number_hazardous_work
+   + Column share_hazardous_work
+   + Column number_child_labor_not_in_school
+   + Column share_child_labor_not_in_school
+   + Column number_hazardous_work_not_in_school
+   + Column share_hazardous_work_not_in_school
+   + Column share_child_labor_incl_household_chores
+ + Table sector
+   + Column number_child_labor
+   + Column share_child_labor
+   + Column number_hazardous_work
+   + Column share_hazardous_work


Legend: +New  ~Modified  -Removed  =Identical  Details
Hint: Run this locally with etl diff REMOTE data/ --include yourdataset --verbose --snippet

Automatically updated datasets matching excess_mortality|covid|fluid|flunet|country_profile|garden/ihme_gbd/2019/gbd_risk are not included

Edited: 2026-04-14 16:56:30 UTC
Execution time: 5.33 seconds

paarriagadap and others added 27 commits April 8, 2026 15:32
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…nize categories

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…into main table

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…stant

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…YEAR

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…rs by 1000

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…nds table

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…age group

- Snapshot: add page 8 chart data (household chores, not-in-school shares) to trends CSV
- Garden: extract special rows, compute 5-14 bracket, merge as columns

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…ig 14

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@paarriagadap
Contributor Author

Hi @veronikasamborska1994! This one was clearly guided by Claude, because the data is extracted from many places in the report. I don't think checking every detail of the code is needed.

3 participants