Image retrieval in large image databases is an important problem that drives a number of applications. Yet the use of supervised approaches to this problem has been limited by the lack of large labeled datasets for training. In this paper, we therefore introduce two new datasets composed of images extracted from publicly available videos from the Cable News Network (CNN). The proposed datasets are particularly suited to supervised learning for image retrieval and are larger than any other existing dataset of a similar nature. The datasets are further provided with a set of pre-computed, state-of-the-art image feature vectors, as well as baseline results. To facilitate research on this topic, we also detail a generic, supervised learning formulation for image retrieval and a related stochastic solver.
The CNN News Footage Dataset: Enabling Supervision in Image Retrieval