mlscores/USAGE.md at main · johnsamuelwrites/mlscores

Usage Examples

Basic Usage

Generate multilinguality scores for a single Wikidata item (Q5) in English:

python3 -m mlscores Q5 -l en

Generate multilinguality scores for a single Wikidata item (Q5) in multiple languages (English, French, and Spanish):

python3 -m mlscores Q5 -l en fr es

Multiple Identifiers

Generate multilinguality scores for multiple Wikidata items (Q5, Q10, and Q15) in a single language (English):

python3 -m mlscores Q5 Q10 Q15 -l en

Generate multilinguality scores for multiple Wikidata items (Q5, Q10, and Q15) in multiple languages (English, French, and Spanish):

python3 -m mlscores Q5 Q10 Q15 -l en fr es

Missing Translations

Generate a list of properties and property values missing translations in multiple languages (English, French, and Spanish) for Wikidata item Q5:

python3 -m mlscores Q5 -l en fr es -m

Generate a list of properties and property values missing translations in multiple languages (English, French, and Spanish) for multiple Wikidata items (Q5, Q10, and Q15):

python3 -m mlscores Q5 Q10 Q15 -l en fr es -m

Output Formats

mlscores supports three output formats: table (default), json, and csv.

Table Output (Default)

python3 -m mlscores Q5 -l en fr es

JSON Output

Output JSON to console:

python3 -m mlscores Q5 -l en fr es -f json

Output JSON to file:

python3 -m mlscores Q5 -l en fr es -f json -o results.json

Example JSON output:

[
  {
    "item_id": "Q5",
    "property_label_percentages": {
      "en": 99.5,
      "fr": 96.2,
      "es": 92.1
    },
    "value_label_percentages": {
      "en": 98.8,
      "fr": 95.5,
      "es": 91.0
    },
    "combined_percentages": {
      "en": 99.1,
      "fr": 95.8,
      "es": 91.5
    }
  }
]
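The JSON output can be post-processed with the Python standard library. The following is a minimal sketch, assuming the output has exactly the shape shown above; `rank_languages` is a hypothetical helper name used for illustration and is not part of mlscores.

```python
import json

# Sample copied from the example output above; in practice, load results.json instead.
SAMPLE = """
[
  {
    "item_id": "Q5",
    "property_label_percentages": {"en": 99.5, "fr": 96.2, "es": 92.1},
    "value_label_percentages": {"en": 98.8, "fr": 95.5, "es": 91.0},
    "combined_percentages": {"en": 99.1, "fr": 95.8, "es": 91.5}
  }
]
"""


def rank_languages(item):
    """Return (language, combined score) pairs, best-covered language first."""
    scores = item["combined_percentages"]
    return sorted(scores.items(), key=lambda pair: pair[1], reverse=True)


results = json.loads(SAMPLE)
for item in results:
    print(item["item_id"], rank_languages(item))
```

To work from a saved file, replace `json.loads(SAMPLE)` with `json.load(open("results.json", encoding="utf-8"))`.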

CSV Output

Output CSV to console:

python3 -m mlscores Q5 -l en fr es -f csv

Output CSV to file:

python3 -m mlscores Q5 -l en fr es -f csv -o results.csv

Example CSV output:

item_id,type,en,fr,es
Q5,property_labels,99.5,96.2,92.1
Q5,value_labels,98.8,95.5,91.0
Q5,combined,99.1,95.8,91.5
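The CSV layout above (one row per item and score type, one column per language) can be read with Python's `csv` module. This is a sketch under the assumption that the header row always begins with `item_id,type`; `load_scores` is a hypothetical helper, not something mlscores provides.

```python
import csv
import io

# Sample copied from the example output above; in practice, open results.csv instead.
SAMPLE = """item_id,type,en,fr,es
Q5,property_labels,99.5,96.2,92.1
Q5,value_labels,98.8,95.5,91.0
Q5,combined,99.1,95.8,91.5
"""


def load_scores(handle):
    """Map (item_id, type) to a {language: percentage} dictionary."""
    scores = {}
    for row in csv.DictReader(handle):
        key = (row.pop("item_id"), row.pop("type"))
        # Every remaining column is a language code with a numeric score.
        scores[key] = {lang: float(value) for lang, value in row.items()}
    return scores


scores = load_scores(io.StringIO(SAMPLE))
print(scores[("Q5", "combined")])
```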

Including Missing Translation Details in JSON

When using JSON format with the -m flag, missing translations are included:

python3 -m mlscores Q5 -l en fr es -f json -m

Example output:

[
  {
    "item_id": "Q5",
    "property_label_percentages": {"en": 99.5, "fr": 96.2, "es": 92.1},
    "value_label_percentages": {"en": 98.8, "fr": 95.5, "es": 91.0},
    "combined_percentages": {"en": 99.1, "fr": 95.8, "es": 91.5},
    "missing_property_translations": {
      "fr": ["P11955", "P12596"],
      "es": ["P5806", "P9495", "P11955"]
    },
    "missing_value_translations": {
      "fr": ["Q12345"],
      "es": ["Q12345", "Q67890"]
    }
  }
]
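One way to use the missing-translation lists is to count how many of the requested languages lack a label for each identifier, so that the most widely missing ones can be translated first. The sketch below assumes the field names shown in the example output; `missing_counts` is a hypothetical helper.

```python
import json
from collections import Counter

# Trimmed copy of the example output above (score fields omitted for brevity).
SAMPLE = """
[
  {
    "item_id": "Q5",
    "missing_property_translations": {
      "fr": ["P11955", "P12596"],
      "es": ["P5806", "P9495", "P11955"]
    },
    "missing_value_translations": {
      "fr": ["Q12345"],
      "es": ["Q12345", "Q67890"]
    }
  }
]
"""


def missing_counts(item, field):
    """Count, per identifier, the number of languages in which it is missing."""
    counts = Counter()
    for ids in item.get(field, {}).values():
        counts.update(ids)
    return counts


item = json.loads(SAMPLE)[0]
property_counts = missing_counts(item, "missing_property_translations")
value_counts = missing_counts(item, "missing_value_translations")
print(property_counts.most_common(1))
print(value_counts.most_common(1))
```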

Special Cases

Generate multilinguality scores for a Wikidata property (e.g., P31):

python3 -m mlscores P31 -l en fr es

Generate multilinguality scores for a Wikidata item with a special identifier (e.g., Q2012):

python3 -m mlscores Q2012

Web Interface

mlscores includes a web interface for interactive use and REST API access.

Prerequisites

Install web dependencies before using the web interface:

# Using pip with extras
pip install ".[web]"

# Or using requirements file
pip install -r requirements-web.txt

Starting the Web Server

Start with default settings (localhost:8000):

python3 -m mlscores --web

Start with custom host and port:

python3 -m mlscores --web --host 0.0.0.0 --port 5000

Browser-Only WASM UI (Pyodide)

A browser-only interface is available in addition to the FastAPI-backed UI:

Local URL (served by FastAPI static files): http://127.0.0.1:8000/static/wasm/index.html
Static file path: mlscores/web/static/wasm/index.html
This mode executes scoring in-browser via Pyodide and directly queries a SPARQL endpoint.

This is suitable for static hosting scenarios such as GitHub Pages, subject to endpoint CORS policies.

Accessing the Web Interface

Once the server is running:

Interactive UI: Open http://127.0.0.1:8000/ in your browser
API Documentation (Swagger): http://127.0.0.1:8000/api/docs
API Documentation (ReDoc): http://127.0.0.1:8000/api/redoc

REST API Endpoints

Health Check

curl http://127.0.0.1:8000/api/health

Get Scores for a Single Item

curl "http://127.0.0.1:8000/api/scores/Q5?languages=en,fr,es"

Get Scores for Multiple Items

curl -X POST http://127.0.0.1:8000/api/scores \
  -H "Content-Type: application/json" \
  -d '{"item_ids": ["Q5", "Q10"], "languages": ["en", "fr", "es"], "include_missing": true}'
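The same two score requests can be prepared from Python with only the standard library. This is a sketch, assuming the server runs at the default address and accepts the query parameter and JSON fields used in the curl examples above; `single_item_url` and `batch_request` are hypothetical helper names. The requests are built but not sent, so the code runs without a server.

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import Request

BASE_URL = "http://127.0.0.1:8000"  # default host and port of the web server


def single_item_url(item_id, languages):
    """Build the GET URL for one item, with languages joined by commas."""
    query = urlencode({"languages": ",".join(languages)}, safe=",")
    return f"{BASE_URL}/api/scores/{quote(item_id)}?{query}"


def batch_request(item_ids, languages, include_missing=False):
    """Build (but do not send) the POST request for several items."""
    payload = {
        "item_ids": item_ids,
        "languages": languages,
        "include_missing": include_missing,
    }
    return Request(
        f"{BASE_URL}/api/scores",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


url = single_item_url("Q5", ["en", "fr", "es"])
request = batch_request(["Q5", "Q10"], ["en", "fr", "es"], include_missing=True)
print(url)
print(request.get_method(), request.full_url)
```

With the server running, `urllib.request.urlopen(request)` sends the prepared request and returns the response.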

Batch Processing

For processing many items, you can combine shell scripting with mlscores:

# Process a list of items and save to JSON
for item in Q5 Q10 Q15 Q20; do
  python3 -m mlscores "$item" -l en fr es -f json -o "${item}_scores.json"
done

Or process all items in a single command:

python3 -m mlscores Q5 Q10 Q15 Q20 -l en fr es -f json -o all_scores.json
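A middle ground between one run per item and one run for everything is to process the items in small groups. The sketch below only assembles and prints the commands, using the flags documented above; `build_command`, `chunked`, the group size, and the `scores_N.json` file names are assumptions made for illustration.

```python
import shlex
import subprocess
import sys


def build_command(item_ids, languages, output_path):
    """Assemble the argument list for one mlscores run with JSON output."""
    return [
        sys.executable, "-m", "mlscores",
        *item_ids,
        "-l", *languages,
        "-f", "json",
        "-o", output_path,
    ]


def chunked(items, size):
    """Yield consecutive slices of `items` with at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def run_all(commands):
    """Execute each command in turn; requires mlscores to be installed."""
    for command in commands:
        subprocess.run(command, check=True)


items = ["Q5", "Q10", "Q15", "Q20"]
commands = [
    build_command(chunk, ["en", "fr", "es"], f"scores_{index}.json")
    for index, chunk in enumerate(chunked(items, 2))
]
for command in commands:
    print(shlex.join(command))  # dry run: show what would be executed
```

Calling `run_all(commands)` executes the runs one after another, which also keeps requests to the query service sequential.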

Tips

Performance: Processing items with many properties may take longer due to SPARQL query complexity. Results are cached to improve performance on repeated queries.
Language Codes: Use standard ISO 639-1 language codes (e.g., en, fr, es, de, zh, ja).
Rate Limiting: The Wikidata Query Service has rate limits. If you encounter errors, try reducing the number of concurrent requests.
Output Files: When using -o with an existing file, the file will be overwritten.
