Data Sets

In contrast to existing datasets, which offer very few video resources and limited accessibility due to copyright constraints, LIRIS-ACCEDE consists of videos with large content diversity annotated along affective dimensions. All excerpts are shared under Creative Commons licenses and can thus be freely distributed without copyright issues. The dataset (the video clips, annotations, features and protocols) is publicly available.
The LaFin: Large-scale Flickr interestingness dataset (hereafter “the Dataset”) is a collection of Flickr image IDs corresponding to about 123k Flickr images, equally balanced between interesting and non-interesting images, together with their metadata. In addition to the images, their binary labels, and associated metadata, some precomputed features are provided: CNN features, semantic features derived from image captioning, and Word2Vec representations of Flickr tags. It is intended to be used for analyzing socially-driven image interestingness and...
Recent VR/AR applications require a way to evaluate the Quality of Experience (QoE), which can be described in terms of comfort, acceptability, realism, and ease of use. Assessing all these dimensions requires taking the user's senses into account and, for A/V content, vision in particular. Understanding how users watch a 360° image, how they scan the content, where they look and when is thus necessary to...
Automatic extraction of face tracks is a key component of systems that analyze people in audio-visual content such as TV programs and movies. Because annotated content of this type is scarce, popular algorithms for extracting face tracks have not been fully assessed in the literature. To help fill this gap, we introduce a new dataset based on the full audio-visual person annotation of a feature movie. Thanks to this dataset, state-of-the-art tracking metrics...
The VSD benchmark is a collection of ground-truth files based on the extraction of violent events in movies and web videos, together with high-level audio and video concepts. It is intended to be used for assessing the quality of methods for the detection of violent scenes and/or the recognition of some high-level, violence-related concepts in movies and web videos. The data was produced by Technicolor for the 2012 subset and by Fudan University and the Ho Chi Minh University...
The Interestingness Dataset is a collection of movie excerpts and key-frames and their corresponding ground-truth files based on the classification into interesting and non-interesting samples. It is intended to be used for assessing the quality of methods for predicting the interestingness of multimedia content. The data was produced by the organizers of the MediaEval 2016 and MediaEval 2017 Predicting Interestingness Tasks and was used in the context of this benchmark. A detailed description of the benchmark can...
The automatic recognition of human emotions is of great interest in the context of multimedia applications and brain-computer interfaces. While users’ emotions can be assessed based on questionnaires, the results may be biased because the answers could be influenced by social expectations. More objective measures of emotions can be obtained by studying the users' physiological responses. The present database has been constructed in particular to evaluate the usefulness of electroencephalography (EEG) for emotion recognition in the context...
We provide a set of synchronized Light-Field video sequences captured by a 4x4 camera rig at 30 fps. Each camera has a resolution of 2048x1088 pixels and a 12 mm lens. The Field Of View (FOV) is 50° x 37°. For each Light-Field video sequence we provide the captured images after color homogenization and demosaicking, as well as the pseudo-rectified images. Images are provided in PNG format. Calibration parameters are also provided. Please see [1]...
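To illustrate, the rig specs above can be turned into a minimal indexing sketch. This is not the dataset's official API: the file-naming scheme below is a hypothetical placeholder, and only the published numbers (4x4 layout, 2048x1088 pixels, 50° x 37° FOV, 30 fps) are taken from the description.

```python
# Sketch: index the 16 views of the 4x4 rig and derive the per-pixel
# angular resolution from the published specs. The directory/file naming
# is an assumption for illustration only.

WIDTH, HEIGHT = 2048, 1088   # per-camera resolution (pixels)
FOV_H, FOV_V = 50.0, 37.0    # field of view (degrees)
ROWS, COLS = 4, 4            # camera rig layout
FPS = 30

def view_filename(seq, row, col, frame):
    """Hypothetical naming: one PNG per camera per frame."""
    return f"{seq}/cam_{row}{col}/frame_{frame:05d}.png"

# Approximate angular resolution per pixel, in degrees.
deg_per_px_h = FOV_H / WIDTH
deg_per_px_v = FOV_V / HEIGHT

# 4x4 grid of view paths for frame 0 of one sequence.
grid = [[view_filename("seq01", r, c, 0) for c in range(COLS)]
        for r in range(ROWS)]
```

Keeping the views in a row/column grid mirrors the rig geometry, which is convenient when sweeping over adjacent viewpoints (e.g. for view interpolation or disparity estimation).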
The Movie Memorability Database and related software is a collection of movie excerpts and corresponding ground-truth files based on the measurement of long-term memory performance when recognizing small movie excerpts from weeks to years after having viewed them. It is accompanied by audio and video features extracted from the movie excerpts. It is intended to be used for assessing the quality of methods for predicting the memorability of multimedia content. A detailed description of the...
The VideoMem or Video Memorability Database is a collection of soundless video excerpts and their corresponding ground-truth memorability files. The memorability scores are computed from measurements of short-term and long-term memory performance when recognizing small video excerpts: a few minutes after viewing for the short-term case, and 24 to 72 hours later for the long-term case. It is accompanied by video features extracted from the video excerpts. It is intended to be...