Investigating how the generative AI model Sora depicts human attractiveness and whether it reflects societal stereotypes.
This repository contains the code, analyses, and report for the course 02445: Project in Statistical Evaluation of Artificial Intelligence and Data at DTU (June 2025).
Authors:
- Valdemar Stamm Kristensen (s244742)
- Frederik Lysholm Jønsson (s245362)
- William Hoffmann Hyldig (s245176)
- Gustav Christensen (s246089)
Study line: Artificial Intelligence and Data
We explored potential biases in Sora’s AI-generated images of men and women, focusing on how the model interprets attractiveness based on prompt wording.
By generating a balanced dataset with prompts such as “attractive man/woman”, “unattractive man/woman”, and “man/woman”, we analyzed whether skin tone, hair color, age, and other visual traits systematically varied across groups.
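The balanced prompt design can be sketched as a simple descriptor × subject grid. The exact prompt strings and the per-prompt split below are assumptions inferred from the six categories mentioned above:

```python
from itertools import product

# Descriptor x subject grid; the empty descriptor yields the neutral prompt.
descriptors = ["attractive", "unattractive", ""]
subjects = ["man", "woman"]

# Six prompt variants; with 972 portraits total, a balanced split would give
# 162 faces (18 3x3 grids) per prompt -- an assumed, not documented, split.
prompts = [f"{d} {s}".strip() for d, s in product(descriptors, subjects)]
# → ['attractive man', 'attractive woman', 'unattractive man',
#    'unattractive woman', 'man', 'woman']
```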
Limitations:
- Some labels (e.g., age, hair length, hair health) involved subjective judgment and manual annotation.
- Faces generated in the same 3×3 grid may not be fully independent samples.
- We conducted many statistical tests without multiple-testing correction, increasing the risk of false positives.
- Sample size was estimated assuming ANOVA, but the actual tests were non-parametric because normality was rejected.
Methods:
- Dataset generation: 972 portraits from Sora via a balanced prompt design.
- Feature extraction: skin/hair luminance from RGB values; categorical labels (glasses, beard, hijab, hairstyle, etc.).
- Preprocessing: cropping, luminance calculation, manual + GPT-4.1 annotations for age.
- Statistical analysis:
  - Shapiro–Wilk and Levene tests → normality and homogeneity of variance (both rejected).
  - Kruskal–Wallis and Mann–Whitney U tests for skin/hair luminance.
  - Chi-squared and Fisher's exact tests for categorical attributes.
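The pipeline above can be sketched with SciPy. All data below is illustrative (the real analysis lives in the notebooks and processed CSVs), and the Rec. 709 luminance weights are an assumption about how luminance was computed:

```python
import numpy as np
from scipy import stats

def luminance(r, g, b):
    # Standard Rec. 709 weights (assumed; the report's exact formula may differ).
    return 0.2126 * r + 0.7152 * g + 0.0722 * b

rng = np.random.default_rng(0)
# Illustrative skin-luminance samples for three prompt groups (not real data).
attractive = rng.normal(180, 15, size=30)
neutral = rng.normal(170, 15, size=30)
unattractive = rng.normal(160, 15, size=30)

# Normality and homogeneity of variance (both were rejected in the project).
_, p_shapiro = stats.shapiro(attractive)
_, p_levene = stats.levene(attractive, neutral, unattractive)

# Non-parametric comparisons of luminance across groups.
_, p_kruskal = stats.kruskal(attractive, neutral, unattractive)
_, p_mannwhitney = stats.mannwhitneyu(attractive, unattractive)

# Categorical attribute, e.g. glasses present/absent per group (made-up counts).
table = np.array([[2, 28], [15, 15]])
chi2, p_chi2, dof, _ = stats.chi2_contingency(table)
_, p_fisher = stats.fisher_exact(table)
```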
Skin & Hair Biases:
- Attractive women → lighter skin.
- Attractive men → darker hair, medium-length styles.
- Unattractive categories → older age groups, lighter hair, absence of minority representation.
Categorical Biases:
- Glasses nearly absent in “attractive” groups but common in “unattractive”.
- Beards most frequent in attractive men, least in unattractive men.
- Hijabs underrepresented in “attractive” and entirely absent in “unattractive”.
Overall:
Sora consistently associates attractiveness with youth, lighter female skin, darker male hair, and the absence of accessories or cultural markers.
Repository structure:
Final/
- CheckDataTypes.ipynb → Data type checks.
- DataCleaning.ipynb → Preprocessing and handling missing/subjective data.
- Extract-RGB-Script.ipynb → RGB extraction for skin/hair.
- RGB-to-Luminans.ipynb → Luminance calculations.
- SampleSize.ipynb → ANOVA-based sample size estimation.
- StatisticAnalysis.ipynb → Statistical tests and results.
- Plots/ → Visualizations (violin plots, mosaic plots, correlation plots).
- Csv-files/ → Processed datasets.
- Other/ → Supporting scripts and annotations.
README.md → Project overview.
Conclusions:
- Generative models like Sora risk reproducing and amplifying beauty stereotypes.
- Future research could:
- Treat image grids as dependent units.
- Apply corrections for multiple testing.
- Compare results across other generative models.
- Explore bias mitigation strategies (e.g., prompt engineering, diverse training data).
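As a concrete example of the multiple-testing correction suggested above, a Holm–Bonferroni step-down procedure could be applied to the collected p-values. This is a pure-Python sketch; the p-values below are made up:

```python
def holm_bonferroni(pvals, alpha=0.05):
    """Holm step-down: return a reject/keep decision per p-value."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    reject = [False] * m
    for rank, i in enumerate(order):
        # Compare the rank-th smallest p-value against alpha / (m - rank).
        if pvals[i] <= alpha / (m - rank):
            reject[i] = True
        else:
            break  # once one test fails, all larger p-values are kept
    return reject

# Hypothetical p-values from several group comparisons.
decisions = holm_bonferroni([0.001, 0.04, 0.03], alpha=0.05)
# → [True, False, False]
```

Holm's method controls the family-wise error rate while being uniformly more powerful than plain Bonferroni, which makes it a reasonable default here.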
References:
- Introduction to Machine Learning and Data Mining (Herlau, Schmidt & Mørup, DTU, 2023)
- Introduction to Statistics at DTU (Brockhoff et al., 2024)
- DTU 02445 course slides on model evaluation and bias
- scikit-learn documentation
- OpenAI documentation on GPT-4.1 and Sora