Dissertation, Columbia University, 2009, 184 pp.
Modern Methods for Search in Multimedia Data.
This thesis investigates a number of advanced directions and techniques in multimedia search, with a focus on search over visual content and its associated multimedia information. This topic is of interest because multimedia databases are rapidly growing in size and availability, and users increasingly need methods for indexing and accessing these collections in a variety of applications, including Web image search, personal photo collections, and biomedical applications, among others.
Multimedia search refers to retrieval over databases containing multimedia documents. The design principle is to leverage the diverse cues contained in these data sets to index the semantic visual content of the documents and to make them accessible through simple query interfaces. The goal of this thesis is to develop a general framework for conducting these semantic visual searches and to explore new cues that can be leveraged to enhance retrieval within this framework.
A promising aspect of multimedia retrieval is that multimedia documents contain a richness of relevant cues from a variety of sources. A problem emerges in deciding how to use each of these cues when executing a query: some cues may be more powerful than others, and these relative strengths may change from query to query. Recently, systems using classes of queries with similar optimal weightings have been proposed; however, the definition of the classes is left up to system designers and is subject to human error. We propose a framework for automatically discovering query-adaptive multimodal search methods. We develop and test this framework using a set of search cues and propose a new machine learning-based model for adapting the usage of each of the available search cues depending upon the type of query provided by the user. We evaluate the method against a large standardized video search test set and find that automatically discovered query classes can significantly outperform hand-defined classes.
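To make the mechanism concrete, the following is a minimal Python sketch of query-class discovery and class-dependent fusion. All names are hypothetical, and scikit-learn's KMeans stands in for the clustering step; the thesis's actual learning procedure is more involved.

    # Sketch of query-class-dependent fusion (hypothetical names).
    # Training queries are clustered by their performance-optimal cue
    # weights; a new query is assigned to the nearest class, and its
    # per-cue scores are fused with that class's weights.
    import numpy as np
    from sklearn.cluster import KMeans

    def discover_query_classes(optimal_weights, n_classes=4):
        """Cluster training queries by their optimal per-cue weights.

        optimal_weights: (n_queries, n_cues) array found by tuning each
        training query individually. The cluster centroids serve here,
        for simplicity, as the per-class fusion weights.
        """
        return KMeans(n_clusters=n_classes, n_init=10).fit(optimal_weights)

    def fuse_scores(query_vector, cue_scores, classes):
        """Fuse per-cue document scores with the query's class weights.

        query_vector: the query's position in the clustering space
        (assumed here to be its estimated optimal weight vector).
        cue_scores: (n_docs, n_cues) matrix of normalized cue scores.
        """
        class_id = classes.predict(query_vector.reshape(1, -1))[0]
        weights = classes.cluster_centers_[class_id]
        return cue_scores @ weights  # one fused score per document

Discovering the classes from data, rather than defining them by hand, is what removes the human-error component noted above.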
While multiple cues can give some insight into the content of an image, many existing search methods are subject to serious flaws. Searching the text around an image or piece of video can be helpful, but the text may not reflect the visual content. Querying with image examples can be powerful, but users are unlikely to adopt such a model of interaction. To address these problems, we examine the new direction of utilizing pre-defined, pre-trained visual concept detectors (such as "person" or "boat") to automatically describe the semantic content of the images in the search set. Textual search queries are then mapped into this space of semantic visual concepts, essentially allowing the user to employ a preferred method of interaction (typing in text keywords) to search against semantic visual content. We test this system against a standardized video search set. We find that larger concept lexicons logically improve retrieval performance, but with severely diminishing returns. We also propose an approach for leveraging many visual concepts by mining the co-occurrence of these concepts in some initial search results and find that this process can significantly increase retrieval performance.
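The following Python sketch illustrates the two steps just described: scoring documents against a text query through a concept lexicon, then reranking by mining concept co-occurrence in the top initial results. The naive exact-match mapping and all names are assumptions for illustration, not the thesis's actual query-to-concept mapping.

    # Sketch of concept-based text search with co-occurrence mining.
    import numpy as np

    def concept_scores(query_terms, lexicon, detector_scores):
        """Score documents against a text query via a concept lexicon.

        lexicon: list of concept names, e.g. ["person", "boat", ...].
        detector_scores: (n_docs, n_concepts) detector confidences.
        """
        # Naive mapping: keep concepts whose names appear in the query.
        matched = [i for i, c in enumerate(lexicon) if c in query_terms]
        if not matched:
            return np.zeros(detector_scores.shape[0])
        return detector_scores[:, matched].mean(axis=1)

    def cooccurrence_rerank(initial, detector_scores, top_k=100, alpha=0.5):
        """Boost concepts frequent in the top initial results.

        initial: (n_docs,) initial scores, assumed normalized to [0, 1].
        """
        top = np.argsort(-initial)[:top_k]
        # Treat the top results as pseudo-relevant and weight each
        # concept by its average confidence within that set.
        concept_weights = detector_scores[top].mean(axis=0)
        mined = detector_scores @ concept_weights
        mined /= mined.max() + 1e-9  # keep the score scales comparable
        return alpha * initial + (1 - alpha) * mined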
We observe that many traditional multimedia search systems are blind to structural cues in datasets authored by multiple contributors. Specifically, we find that many images in the news or on the Web are copied, manipulated, and reused. We propose that the most frequently copied images are inherently more "interesting" than others and that highly manipulated images can be of particular interest, representing drifts in ideological perspective. We use these cues to improve search and summarization. We develop a system for reranking image search results based on the number of times that images are reused within the initial search results and find that this reranking can significantly improve the accuracy of the returned list of images, especially for queries of popular named entities. We further develop a system to characterize the types of edits present between two copies of an image and infer cues about the image's edit history. Across a set of copies, these cues give rise to a sort of "family tree" for the image. We find that this method can identify the most-original and most-manipulated images within these sets, which may be useful for summarization.
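A minimal sketch of the reuse-based reranking follows, assuming a near-duplicate matcher is available (any copy-detection method could be plugged in; the counting and fusion below are illustrative simplifications):

    # Sketch of reranking image results by reuse within the result set.
    import itertools

    def rerank_by_reuse(results, is_near_duplicate, alpha=0.5):
        """results: list of (image_id, baseline_score), best first,
        with baseline scores assumed normalized to [0, 1].
        is_near_duplicate: pairwise matcher (assumed), id x id -> bool.
        Images copied many times inside the result set are promoted.
        """
        ids = [img for img, _ in results]
        copies = {img: 0 for img in ids}
        # O(n^2) pairwise matching over the initial result set.
        for a, b in itertools.combinations(ids, 2):
            if is_near_duplicate(a, b):
                copies[a] += 1
                copies[b] += 1
        n = max(1, len(ids) - 1)
        return sorted(
            results,
            key=lambda r: alpha * r[1] + (1 - alpha) * copies[r[0]] / n,
            reverse=True,
        )

The edit-history step would build on the same pairwise comparisons, replacing the boolean matcher with a classifier of edit types (e.g., cropping or overlay) whose directed relations assemble the "family tree" described above.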
The specific significant contributions of this thesis are as follows. (1) The first system to use a machine learning-based approach to discover classes of queries to be used for query-adaptive search, a process which we show to outperform humans conducting the same task. (2) An in-depth investigation of using visual concept lexicons to rank visual media against textual keywords, a promising approach which provides a keyword-based interface to users but indexes media based solely on its visual content. (3) A system that utilizes authors' image reuse behaviors (specifically, duplication) to enhance Web image retrieval. (4) The first system to attempt to recover the manipulation histories of images for the purposes of summarization and exploration.
Introduction.
Query-class-dependent Models for Multimodal Search.
Leveraging Concept Lexicons and Detectors for Semantic Visual Search.
Improved Search through Mining Multimedia Reuse Patterns.
Making Sense of Iconic Content in Search Results: Tracing Image Manipulation Histories.
Conclusions and Future Work.