Dataset from the paper "Visual question answering for cultural heritage", P. Bongini, F. Becattini, A. D. Bagdanov, A. Del Bimbo, 2020.
The Artpedia dataset contains a collection of 2,930 paintings, each associated with a variable number of textual descriptions collected from Wikipedia. Each sentence is labelled either as a visual sentence, if it describes the visual content of the artwork, or as a contextual sentence, if it does not. Contextual sentences can describe the historical context of the artwork, its author, its artistic influences, or the place where the painting is exhibited. The dataset contains a total of 28,212 sentences: 9,173 labelled as visual sentences and the remaining 19,039 as contextual sentences. Artpedia can be downloaded from: https://aimagelab-legacy.ing.unimore.it/imagelab/page.asp?IdPage=35
We manually annotated a subset of images with both visual and contextual question-answer pairs, based on the available images and descriptions.
To access the questions and answers of ArtpediaVQA, just download the JSON file.
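Once downloaded, the file can be inspected with a few lines of Python. The snippet below is only a minimal sketch: the file name `artpediavqa.json` is a placeholder, and because the layout of the annotations is not documented here, the helper `describe` (a hypothetical name) makes no assumption about field names and simply reports which keys appear in the entries.

```python
import json
from collections import Counter
from pathlib import Path


def load_json(path):
    """Read a JSON file and return the parsed Python object."""
    with Path(path).open(encoding="utf-8") as f:
        return json.load(f)


def describe(data):
    """Summarise the top-level structure without assuming a schema.

    Returns the container type, the number of entries, and a count of
    how often each key occurs across the entries that are dictionaries.
    """
    summary = {"type": type(data).__name__, "size": None, "keys": Counter()}
    if isinstance(data, dict):
        summary["size"] = len(data)
        entries = data.values()
    elif isinstance(data, list):
        summary["size"] = len(data)
        entries = data
    else:
        entries = []
    for entry in entries:
        if isinstance(entry, dict):
            summary["keys"].update(entry.keys())
    return summary


if __name__ == "__main__":
    # Placeholder file name (an assumption): use the actual name of the
    # JSON file you downloaded.
    path = Path("artpediavqa.json")
    if path.exists():
        print(describe(load_json(path)))
```

Looking at the reported keys first is a cautious way to find out how questions, answers, and painting identifiers are stored before writing any task-specific code.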
If you use ArtpediaVQA or find it useful, please cite the following paper:
@inproceedings{bongini2020visual,
title={Visual question answering for cultural heritage},
author={Bongini, Pietro and Becattini, Federico and Bagdanov, Andrew D and Del Bimbo, Alberto},
booktitle={IOP Conference Series: Materials Science and Engineering},
volume={949},
number={1},
pages={012074},
year={2020},
organization={IOP Publishing}
}
You might also be interested in checking out these related papers for additional benchmarks and results:
@inproceedings{bongini2022gpt,
title={Is GPT-3 all you need for visual question answering in cultural heritage?},
author={Bongini, Pietro and Becattini, Federico and Del Bimbo, Alberto},
booktitle={European Conference on Computer Vision},
pages={268--281},
year={2022},
organization={Springer}
}
@article{becattini2023viscounth,
title={VISCOUNTH: a large-scale multilingual visual question answering dataset for cultural heritage},
author={Becattini, Federico and Bongini, Pietro and Bulla, Luana and Del Bimbo, Alberto and Marinucci, Ludovica and Mongiov{\`\i}, Misael and Presutti, Valentina},
journal={ACM Transactions on Multimedia Computing, Communications and Applications},
volume={19},
number={6},
pages={1--20},
year={2023},
publisher={ACM New York, NY}
}