Springer, 1993. 223 pp.
In this book I summarize my studies in music recognition aimed at developing a computer system for automatic notation of performed music. The performance of such a system is supposed to be similar to that of speech recognition systems: acoustical data at the input and a printed music score at the output.
In this essay I develop an approach to pattern recognition which is entitled artificial perception. It is based on self-organization of the input data in order to segregate patterns before they are identified by artificial intelligence methods. The performance of the related model is similar to distinguishing objects in an abstract painting without explicitly recognizing them.
In this approach I try to follow nature rather than to invent a new technical device. The model incorporates the correlativity of perception, which rests on two fundamental principles of perception, the grouping principle and the simplicity principle, acting in very tight interaction.
The grouping principle is understood as the capacity to discover similar configurations of stimuli and to form high-level configurations from them. This is equivalent to describing information in terms of generative elements and their transformations.
The simplicity principle is modeled by finding the least complex representations of data that are possible. The complexity of data is understood in the sense of Kolmogorov, i.e., as the amount of memory storage required for the data representation.
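As a rough illustration of this notion of complexity (a sketch of mine, not the book's formalism), one can compare two representations of the same sequence by the amount of memory each requires: stored verbatim, element by element, or stored as a generative pattern plus a repetition count.

seq = [2, 7, 1, 2, 7, 1, 2, 7, 1, 2, 7, 1]

pattern, reps = [2, 7, 1], 4
assert pattern * reps == seq            # the generative description reproduces the data

verbatim_cost = len(seq)                # 12 memory units, one per stored element
generative_cost = len(pattern) + 1      # 3 pattern elements + 1 repetition count = 4 units

print(verbatim_cost, generative_cost)   # 12 vs 4: the generative representation is simpler

The representation occupying less memory is taken as the less complex, and hence preferred, description of the data.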
The tight interdependence between these two principles corresponds to finding generative elements and their transformations with regard to the complexity of the total representation of data. This interdependence justifies the term "correlativity", which is more than relativity of perception.
The model of correlative perception is applied to voice separation (chord recognition) and rhythm/tempo tracking.
Chord spectra are described in terms of generative spectra and their transformations. The generative spectrum corresponds to a tone spectrum which is repeated several times in the chord spectrum. The transformations of the generative spectrum are its translations along the log2-scaled frequency axis; these translations correspond to the intervals between the chord tones. Therefore, a chord is understood as an acoustical contour drawn by a tone spectral pattern in the frequency domain.
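To make the construction concrete, here is a minimal sketch of my own in Python/NumPy (the function names and the harmonic tone model are assumptions, not the book's code): a chord spectrum is obtained by translating one generative tone spectrum along the log2-scaled frequency axis, each translation being an interval measured in semitones.

import numpy as np

def tone_spectrum(f0, n_partials=6):
    # partial frequencies of a harmonic tone, expressed in log2(Hz)
    return np.log2(f0 * np.arange(1, n_partials + 1))

def chord_spectrum(f0, intervals_semitones):
    # union of the generative spectrum translated by each interval
    base = tone_spectrum(f0)
    shifts = np.array(intervals_semitones) / 12.0   # semitones -> log2 units
    return np.sort(np.concatenate([base + s for s in shifts]))

# C major triad on C4: root, major third (+4 semitones), perfect fifth (+7 semitones)
print(chord_spectrum(261.63, [0, 4, 7]))

On the log2 axis an interval of k semitones is a fixed shift of k/12, so the same spectral pattern recurs at the position of every chord tone, which is exactly the "acoustical contour" described above.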
Time events are also described in terms of generative rhythmic patterns. A series of time events is represented as a repetition of a few rhythmic patterns which are distorted by musical elaboration and by tempo fluctuations associated with the tempo curve. The interdependence between tempo and rhythm is overcome by minimizing the total complexity of the representation, i.e., the total amount of memory needed for storing the rhythmic patterns and the tempo curve.
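The trade-off can be illustrated with a toy comparison (my own sketch, not the book's algorithm): the same onset series is explained either by two literally different rhythmic patterns at a constant tempo, or by one generative pattern plus a tempo correction, and the reading with the smaller total memory cost is preferred.

def cost(patterns, tempo_curve):
    # memory: one unit per stored pattern element plus one per tempo change
    return sum(len(p) for p in patterns) + len(tempo_curve)

# Observed inter-onset ratios: [1, 1, 2] played twice, the second time 10% slower.
# Reading A: two distinct patterns, constant tempo.
reading_a = cost(patterns=[[1, 1, 2], [1.1, 1.1, 2.2]], tempo_curve=[])
# Reading B: one generative pattern repeated, one tempo correction.
reading_b = cost(patterns=[[1, 1, 2]], tempo_curve=[1.1])

print(reading_a, reading_b)   # 6 vs 4: the rhythm-plus-tempo reading is simpler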
The model also explains the function of interval hearing, certain statements of music theory, and some phenomena in rhythm perception.
Generally speaking, I investigate hierarchical representations of data. In particular, I pose the following questions:
(a) Why a hierarchy?
(b) Which hierarchy? and
(c) How does the hierarchy correspond to reality?
From the standpoint of the model, the answers to these questions are, respectively:
(a) A hierarchy makes a data representation compact, which is desirable in most cases;
(b) consequently, a better hierarchy is one which requires less memory for the related data representation; and
(c) under certain assumptions such a hierarchy reveals perception patterns and the causal relationships in their generation, making the first step towards a semantic description of the data.
One can see that the main distinction of this approach is finding optimal representations of data instead of directly recognizing patterns. In a sense, the analysis of patterns is replaced by the synthesis of data representations. Since self-organization is used instead of learning, the threshold criteria used in most pattern recognition models are avoided.
The correspondence between music perception and the performance of the model, together with the diversity of its applications, can hardly be regarded as a mere coincidence. It gives the impression that the model really does simulate certain mechanisms of perception. The related model can probably also be applied to speech recognition, computer vision, and even the simulation of abstract thinking. All of this is a subject for discussion.
Introduction
Correlativity of Perception
Substantiating the Model
Implementing the Model
Experiments on Chord Recognition
Applications to Rhythm Recognition
Applications to Music Theory
General Discussion
Conclusions