Experiment

The head-mounted display (HMD) Oculus-DK2 was used for this test. It has a frame refresh rate of 75 Hz, a resolution of 960×1080 per eye and a total viewing angle of 100° in both the horizontal and vertical directions. The gyroscopic sensors within the device transmit orientation data at a rate equal to the device frame refresh rate. A small eye-tracking camera from SensoMotoric Instruments (SMI) was integrated into the device and transmitted binocular eye-tracking data at 60 Hz.

The software setup consisted of a custom-built Unity application along with the Oculus-DK2 driver version 2.0. The software checked the calibration accuracy every two minutes and re-calibrated each time if necessary.

A total of 17 observers aged 25-52 participated in the test, of which 9 were experts who used a VR headset every day at work. Before the test started, all observers explicitly answered a questionnaire asking for their expertise and age.

Observers were tested for visual acuity using the Snellen test, and their dominant eye was determined using the cardboard technique. Data from the dominant eye was used for calibration and all further analysis. To maintain a natural (free-viewing) gaze pattern, subjects viewed the scenes normally without having to provide explicit quantitative judgments. They were instructed to watch each scene as naturally as possible, using a combination of head and eye movements. Observers were also free to stop the test at any time if they felt fatigued or experienced vertigo. Five training images were shown before the actual test began.

A total of 60 stimuli were shown to the observers in sequence. Each stimulus lasted 25 seconds, with a 5-second gray screen between consecutive stimuli. Every two minutes, a calibration check was performed to verify the accuracy of the eye tracker. The test itself lasted about 35 minutes, and the observers had a 5-minute pause at the halfway point of the experiment. The observers were seated comfortably in a swivel chair and were free to rotate a full 360 degrees and to move the chair within the room if necessary. The viewing position of each 360° image was reset to the equirectangular image center at the start of each presentation, irrespective of the observer's orientation, to ensure that all observers started from the same position in the panorama.
 

Detailed description of database contents

The dataset contains 60 images (360°), illustrated in the figure below, along with eye-tracking data. The images cover four different classes:

  • Indoor/Outdoor natural scenes
  • Scenes containing human faces
  • Sports scenes
  • Computer graphics contents

 

 

Figure: Three sample images from each of the five classes used for the test. (Top Row): Cityscapes - outdoor scenes, (Second Row): Small rooms - indoor scenes, (Third Row): Scenes containing human faces, (Fourth Row): Great halls - indoor scenes, (Bottom Row): Naturescapes - outdoor scenes.

 

The eye-tracking data are provided in three forms (in the respective sub-folders):

1 - Scan-path Data

It includes 40 images with the associated scan-path data from 48 observers (each of whom observed each image for a total of 25 seconds). The scan-paths are composed of individual fixations extracted from a combination of the raw head and eye movement data. The data for each image is stored in a text file named "SP<ImageNumber>.txt". Each line contains four values indicating, respectively, the fixation number, the fixation time, the X position (equirectangular) and the Y position (equirectangular). The fixation number increments serially for a given observer and resets to 1 when the data of the next observer begins, i.e. after all fixations of the previous observer have been listed. The fixation time is given in seconds, and the X and Y positions in pixels of the corresponding equirectangular image.
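
As an illustration, the following Python sketch parses one of these scan-path files and groups the fixations per observer by detecting the reset of the fixation counter. The field separator and the example file name are assumptions; the provided Matlab/C++ parsers remain the reference implementation.

    import re

    def load_scanpaths(path):
        """Parse an SP<ImageNumber>.txt file into per-observer scan-paths.

        Returns a list with one entry per observer, each being a list of
        (fixation_number, time_s, x_px, y_px) tuples. Fields are assumed
        to be separated by commas and/or whitespace.
        """
        observers, current = [], []
        with open(path) as f:
            for line in f:
                fields = re.split(r"[,\s]+", line.strip())
                if len(fields) < 4:
                    continue  # skip blank or malformed lines
                idx = int(float(fields[0]))
                t, x, y = (float(v) for v in fields[1:4])
                if idx == 1 and current:
                    # The fixation number resets to 1 when a new observer begins.
                    observers.append(current)
                    current = []
                current.append((idx, t, x, y))
        if current:
            observers.append(current)
        return observers

    # Example (hypothetical file name):
    # scanpaths = load_scanpaths("SP1.txt")
    # print(len(scanpaths), "observers;", len(scanpaths[0]), "fixations for the first")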

A simple illustration of the scan-path data from 3 different observers can be found below. The image shows the scan-path of each observer in a different color. Each circle marks a fixated location, and the number next to it indicates the order of the fixation.

2 - Head-Motion-based Saliency Maps

We provide a total of 20 images, each with an associated saliency map computed from the head movement data (yaw, pitch, roll) of 48 observers, each of whom watched the image for 25 seconds. The data is stored in a binary file "SH<ImageNumber>.bin" containing double-precision values (8 bytes each), one saliency value per pixel. The values are stored row-wise across the image pixels. Saliency values are non-negative, and their sum over all pixels equals one.
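
As an illustration, the following Python sketch reads such a binary saliency map. The byte order (little-endian) and the assumption that the map has the same resolution as the corresponding equirectangular image are not stated explicitly in this description; the provided Matlab/C++ parsers remain the reference implementation.

    import numpy as np

    def load_saliency_map(path, width, height):
        """Read a row-wise binary saliency map of 8-byte doubles.

        width and height are assumed to match the corresponding
        equirectangular image; little-endian byte order is assumed.
        """
        data = np.fromfile(path, dtype="<f8")
        assert data.size == width * height, "unexpected file size"
        sal = data.reshape(height, width)  # row-wise (C-order) storage
        # Sanity checks from the format description: non-negative, sums to one.
        assert sal.min() >= 0.0 and np.isclose(sal.sum(), 1.0)
        return sal

    # Example (hypothetical file name and resolution):
    # sal = load_saliency_map("SH1.bin", width=..., height=...)

The same reader applies to the head+eye saliency files described below, which share this format.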

A simple illustration of the saliency map for each of the images can be found below. The more saturated red regions indicate regions of frequent attention, while the more saturated blue regions were attended relatively sparsely.

 

3 - Head+Eye-Motion-based Saliency Maps

We provide a total of 40 images, each with an associated saliency map computed from the head and eye movement data (yaw, pitch, roll, plus X-gaze and Y-gaze) of 48 observers, each of whom watched the image for 25 seconds. The data is stored in a binary file "SHE_<ImageNumber>.bin" containing double-precision values (8 bytes each), one saliency value per pixel, in the same format as the head-motion maps above: the values are stored row-wise, are non-negative, and sum to one over the image.

A simple illustration of the saliency map for each of the images can be found below, using the same color convention: more saturated red regions indicate frequent attention, while more saturated blue regions were attended relatively sparsely.
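
As an illustration of this color convention, the following Python sketch renders a loaded saliency map as a red/blue heatmap, optionally blended over its equirectangular image. It reuses the load_saliency_map sketch above and relies on matplotlib; the file names in the commented example are hypothetical.

    import matplotlib.pyplot as plt

    def show_saliency(sal, img=None, alpha=0.6):
        """Display a saliency map as a heatmap (red = high, blue = low),
        optionally overlaid on its equirectangular image."""
        if img is not None:
            plt.imshow(img)
            plt.imshow(sal, cmap="jet", alpha=alpha)
        else:
            plt.imshow(sal, cmap="jet")
        plt.axis("off")
        plt.show()

    # Example (hypothetical file names):
    # sal = load_saliency_map("SHE_1.bin", width=..., height=...)
    # img = plt.imread("P1.jpg")
    # show_saliency(sal, img)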

Detailed description of source code provided

Matlab and C++ functions to parse the data are provided in the respective sub-folders.

 

About

Introduction

Recent VR/AR applications require a way to evaluate the Quality of Experience (QoE), which can be described in terms of comfort, acceptability, realism and ease of use. To assess all of these dimensions, it is necessary to take into account the user's senses and in particular, for audiovisual content, vision. Understanding how users watch a 360° image, how they scan the content, and where and when they look is thus necessary to develop appropriate rendering devices and to create good content for consumers. The goal of this database is to help answer those questions and to support the design of good models.

The dataset contains 60 images (360°), along with eye-tracking data provided as scan-paths and saliency maps, collected from 48 different observers. The data was produced by Technicolor and by the University of Nantes (LS2N laboratory) in 2017 and has been described in several publications.
 

Acknowledgments

We would like to thank the different users who participated in this data collection.
The creation of this benchmark has also been supported, in part, by PROVISION, a Marie-Curie ITN of the European Commission.
 

Citing the Database

Please cite the following papers in your publications making use of the Salient360! database:

Yashas Rai, Patrick Le Callet and Philippe Guillotel. “Which saliency weighting for omni directional image quality assessment?”. In Proceedings of the IEEE Ninth International Conference on Quality of Multimedia Experience (QoMEX’17). Erfurt, Germany, pp. 1-6, June 2017.

Yashas Rai, Jesús Gutiérrez, and Patrick Le Callet. “A Dataset of Head and Eye Movements for 360 Degree Images”. In Proceedings of the 8th ACM on Multimedia Systems Conference (MMSys'17). ACM, New York, NY, USA, pp. 205-210, June 2017.

Download

To download the database, we ask you to provide your name, email address and affiliation and to fill in and sign the EULA (End User License Agreement) form available here, by which you agree to the terms of use described below.

Then send an email to salient360management@interdigital.com asking for the database, with the signed EULA attached and the information requested above. You will receive instructions on how to download the dataset at the email address you provide.

We may store the data you supply in order to contact you later about benchmark-related matters. The data will not be used in any other way.

 

Terms of use

1. Commercial use

The user may only use the dataset for academic research. The user may not use the database for any commercial purposes. Commercial purposes include, but are not limited to:

  • proving the efficiency of commercial systems,
  • training or testing of commercial systems,
  • using screenshots of data from the database in advertisements,
  • selling data from the database,
  • creating military applications.

 

2. Distribution

The user may not distribute the dataset or portions thereof in any way, with the exception of using small portions of data for the exclusive purpose of clarifying academic publications or presentations. Note that publications will have to comply with the terms stated in article 4.

 

3. Access

The user may only use the database after this End User License Agreement (EULA) has been signed and returned to the dataset administrators by email. The signed EULA should be returned in digital format by attaching it to the email requesting access to the dataset. Upon receipt of the EULA, information to access the dataset will be issued. The user may not grant anyone else access to the database by giving out this information.

 

4. Publications

Publications include not only papers, but also presentations for conferences or educational purposes. All documents and papers that report on research that uses the Salient360! dataset will cite the following papers:

 

Yashas Rai, Patrick Le Callet and Philippe Guillotel. “Which saliency weighting for omni directional image quality assessment?”. In Proceedings of the IEEE Ninth International Conference on Quality of Multimedia Experience (QoMEX’17). Erfurt, Germany, pp. 1-6, June 2017.

Yashas Rai, Jesús Gutiérrez, and Patrick Le Callet. “A Dataset of Head and Eye Movements for 360 Degree Images”. In Proceedings of the 8th ACM on Multimedia Systems Conference (MMSys'17). ACM, New York, NY, USA, pp. 205-210, June 2017.

 

5. Warranty

The database comes without any warranty. University of Nantes and Technicolor R&D France cannot be held accountable for any damage (physical, financial or otherwise) caused by the use of the database.