The delivered package contains the dataset used in [1], intended for the tasks of video memorability understanding and prediction. It is composed of:
- A list of 660 short movie excerpts extracted from 100 Hollywood-like movies;
- The corresponding ground truth for the 660 movie excerpts, i.e., for each excerpt, a long-term memorability score, its type (neutral vs. typical), and the number of annotations;
- Extracted audio and video features that were used in [1].
It is accompanied by the original movie excerpts that were used to build the ground truth and to extract the audio-visual features. Please note that these excerpts remain the property of their legitimate owners and that no license is granted on them. They are provided and may be used exclusively under Article L.122-5 3° a) of the French Code of Intellectual Property or, where applicable, under the "fair use" doctrine or its equivalent.
List of movies
The complete list of movies is provided in the file movie_list.txt.
Ground truth
The ground truth is provided in the file ground-truth.xlsx. For each video sequence, it consists of (see the loading sketch after this list):
- The corresponding movie’s title
- The start and the end times (in seconds) of the sequence (obtained after a manual segmentation of the movie)
- The sequence’s name
- Its type, neutral vs. typical (annotated 1 for neutral and 0 for typical in the .xlsx file). A neutral video is a part of a movie that contains no element that would enable someone to easily guess that the video belongs to a particular movie. The list of such undesirable elements includes, but is not limited to, recognizable famous actors, typical music, style, etc. Typical videos are simply defined as non-neutral videos. See [1] for a complete definition.
- The number of annotations collected for the sequence
- Its long-term memorability score
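As an illustration, the ground truth can be loaded with pandas; the column names below are assumptions based on the fields listed above and should be adjusted to the actual header row of ground-truth.xlsx.

    import pandas as pd

    # Load the ground truth; requires the openpyxl (or xlrd) engine for .xlsx.
    gt = pd.read_excel("ground-truth.xlsx")

    # Hypothetical column names -- adjust to the actual header row.
    for _, row in gt.iterrows():
        label = "neutral" if row["type"] == 1 else "typical"
        print(row["sequence name"], row["memorability score"], label)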
Features
The set of audio and video features used to train the model presented in [1] is provided together with the data:
- C3D features
- AudioSet embeddings
- Image captions
- SentiBank visual sentiment concepts
- Arousal and valence scores
Python scripts are also provided to read the different features along with the sequences' names. Each feature is described below.
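As a minimal sketch of how a feature might be paired with a sequence for training, assuming one NumPy file per sequence (the actual layout is defined by the provided reader scripts, which should be preferred):

    import os
    import numpy as np

    def load_feature(feature_dir, sequence_name):
        """Load one pre-extracted feature vector for a given sequence."""
        # Hypothetical layout: <feature_dir>/<sequence name>.npy
        return np.load(os.path.join(feature_dir, sequence_name + ".npy"))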
C3D
C3D is a feature for generic video analysis [2]. It is obtained by training a deep 3D convolutional network on a large annotated video dataset. We used the 4096-dimensional output of the fully connected layers of the 3D CNN as a feature vector for training and evaluating the model for memorability prediction. We used the source code and the pre-trained models from the following location: https://github.com/facebook/C3D
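As an illustration of how such a feature vector can be used, the sketch below regresses memorability scores from 4096-dimensional C3D features with a support-vector regressor; this is a stand-in learner, not necessarily the model of [1], and the arrays are placeholders for the real features and ground truth.

    import numpy as np
    from sklearn.model_selection import train_test_split
    from sklearn.svm import SVR

    X = np.random.rand(660, 4096)  # placeholder for the real C3D features
    y = np.random.rand(660)        # placeholder for the memorability scores

    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)
    reg = SVR(kernel="rbf").fit(X_tr, y_tr)
    print("R^2 on held-out excerpts:", reg.score(X_te, y_te))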
AudioSet
AudioSet [3] consists of an expanding ontology of 632 audio event classes and a collection of 2,084,320 human-labeled 10-second sound clips drawn from YouTube videos. We used the code provided by Google to extract the 128-dimensional embeddings for the audio tracks of the video segments in our dataset. The ontology of the events can be found here: https://github.com/audioset/ontology
The code and models can be found here: https://github.com/tensorflow/models/tree/master/research/audioset
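The extractor produces one 128-dimensional embedding per roughly one second of audio. One simple way to obtain a fixed-size clip-level descriptor, sketched below, is to average these frame embeddings; the pooling choice is an assumption on our part, not prescribed by the extraction code.

    import numpy as np

    def pool_audio_embeddings(embeddings):
        """Average per-second 128-d embeddings into one clip descriptor."""
        embeddings = np.asarray(embeddings)
        assert embeddings.ndim == 2 and embeddings.shape[1] == 128
        return embeddings.mean(axis=0)

    clip_feature = pool_audio_embeddings(np.random.rand(10, 128))  # placeholder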
Image captions
We extracted image captions for frames sampled every second of the video segment. For each word in an image caption, we computed the word embedding using a word2vec model. We used the following model to compute the image captions: https://github.com/karpathy/neuraltalk2, which implements the work in [4]. For the word2vec model, we used Python's Natural Language Toolkit (NLTK).
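A minimal sketch of the caption-embedding step, assuming a pre-trained word2vec model in the standard binary format loaded with gensim (one possible toolkit among others; the model path is an assumption):

    import numpy as np
    from gensim.models import KeyedVectors

    # Hypothetical path to a pre-trained word2vec model in binary format.
    model = KeyedVectors.load_word2vec_format(
        "GoogleNews-vectors-negative300.bin", binary=True)

    def caption_embedding(caption):
        """Mean word2vec vector over the in-vocabulary words of a caption."""
        vectors = [model[w] for w in caption.lower().split() if w in model]
        return np.mean(vectors, axis=0) if vectors else np.zeros(model.vector_size)

    vec = caption_embedding("a man riding a horse on a beach")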
SentiBank
To obtain sentiment from the visual signal, we used the SentiBank visual concept detector [5]. SentiBank is a set of 1,200 trained visual concept detectors providing a mid-level representation of sentiment, associated training images acquired from Flickr, and a benchmark containing 603 photo tweets covering a diverse set of 21 topics. We picked the top-50 visual concepts with the highest confidence and, for each of these 50 visual concepts, extracted the word2vec embeddings and averaged the vectors across the video. We used the code and data available here: http://www.ee.columbia.edu/ln/dvmm/vso/download/sentibank.html
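One plausible reading of this step, sketched below: average the 1,200 detector confidences over the frames of a video, keep the 50 highest-scoring concepts, embed the words of each adjective-noun pair (e.g. "beautiful_sky"), and average the resulting vectors. The score matrix and the embed_word helper are placeholders.

    import numpy as np

    def sentibank_feature(scores, concept_names, embed_word, top_k=50):
        """scores: (n_frames, 1200) SentiBank confidences per frame."""
        mean_scores = np.asarray(scores).mean(axis=0)   # pool over frames
        top = np.argsort(mean_scores)[::-1][:top_k]     # strongest concepts
        vectors = [embed_word(w)                        # e.g. "beautiful_sky"
                   for i in top
                   for w in concept_names[i].split("_")]
        return np.mean(vectors, axis=0)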
Affect (arousal and valence)
To extract emotion-related information, we computed arousal and valence scores from the audio-visual signal. For arousal, which represents the level of excitement in the video, we used the shot-change frequency, the energy in the audio signal, and the motion activity. For valence, which represents the positive/negative emotion in the video, we used the HSV histogram of each frame in the video. We used an implementation based on Hanjalic & Xu, 2005 [6].
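Two of the low-level ingredients are easy to illustrate: short-term audio energy (one component of arousal) and a per-frame hue histogram in HSV space (one component of valence). The sketch below uses standard NumPy/OpenCV calls and does not reproduce the exact weighting scheme of Hanjalic & Xu [6].

    import cv2
    import numpy as np

    def audio_energy(samples, frame_len=1024):
        """Mean squared amplitude per non-overlapping audio frame."""
        samples = np.asarray(samples, dtype=np.float64)
        n = len(samples) // frame_len * frame_len
        return (samples[:n].reshape(-1, frame_len) ** 2).mean(axis=1)

    def hue_histogram(frame_bgr, bins=16):
        """Normalized hue histogram of one frame (OpenCV hue range is 0-179)."""
        hsv = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2HSV)
        hist = cv2.calcHist([hsv], [0], None, [bins], [0, 180])
        return (hist / hist.sum()).ravel()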
References
[1] Cohendet, R., Yadati, K., Duong, N. Q., & Demarty, C.-H. (2018). Annotating, understanding, and predicting long-term video memorability. In Proceedings of the ICMR 2018 Conference, Yokohama, Japan, June 11-14, 2018.
[2] Tran, D., Bourdev, L. D., Fergus, R., Torresani, L., & Paluri, M. (2014). C3D: Generic features for video analysis. CoRR, abs/1412.0767.
[3] Gemmeke, J. F., Ellis, D. P., Freedman, D., Jansen, A., Lawrence, W., Moore, R. C., ... & Ritter, M. (2017). Audio Set: An ontology and human-labeled dataset for audio events. In Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2017) (pp. 776-780). IEEE.
[4] Vinyals, O., Toshev, A., Bengio, S., & Erhan, D. (2015). Show and tell: A neural image caption generator. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2015).
[5] Borth, D., Ji, R., Chen, T., Breuel, T., & Chang, S.-F. (2013). Large-scale visual sentiment ontology and detectors using adjective noun pairs. In Proceedings of the 21st ACM International Conference on Multimedia (pp. 223-232). ACM.
[6] Hanjalic, A., & Xu, L.-Q. (2005). Affective video content representation and modeling. IEEE Transactions on Multimedia, 7(1), 143-154.