
Speech Parametrization
▶ Speech Analysis
Speech Processing
Speech processing is a technology that operates on the
stream of speech.
▶ Speaker Recognition, Standardization
Speech Production
LAURA DOCIO-FERNANDEZ, CARMEN GARCIA-MATEO
University of Vigo, Vigo, Spain
Synonyms
Speech system; Sound generation
Definition
Speech production is the process of uttering articulated
sounds or words, i.e., how humans generate meaningful
speech. It is a complex feedback process that also
involves hearing, perception, and information processing
in the nervous system and the brain.
Speaking is in essence the by-product of a necessary
bodily process, the expulsion from the lungs of air
charged with carbon dioxide after it has fulfilled its
function in respiration. Most of the time one breathes
out silently, but it is possible, by contracting and relaxing
the vocal tract, to change the characteristics of the
air expelled from the lungs.
Introduction
Speech is one of the most natural forms of communication
for human beings. Researchers in speech
technology are working on developing systems with
the ability to understand speech and speak with a
human being.
Human–computer interaction is a discipline con-
cerned with the design, evaluation, and implementation
of the most natural interactive computing systems for
human use [1]. Computers with this kind of ability are
gradually becoming a reality today, through the success
of speech synthesis, speech recognition, and other
related speech technologies. However, in order to give
them functions that are much closer to those of human
beings, one must learn more about the mechanisms by
which speech is produced and perceived, and develop
speech information processing technologies that make
use of these functions.
Progress in advanced computer speech interfaces is
limited in part by incomplete knowledge of the
physics of speech production. For computer-generated
speech output, this means limitations in
the naturalness and intelligibility of synthetic speech.
The generation of human speech involves a re-
markably complex process. In modeling the process
of human speech production one may recognize two
principal stages:
1. Formation in the mind of thoughts to be expressed
as well as the choice of words to be used. The
message is organized on the linguistic level and
structured grammatically and phonologically.
2. The string of phonemes is converted into a set of
continuous signals controlling the musculature of
the various articulators. This results in a highly
complex integrated movement sequence in which
generally all the articulators participate: the lips,
the tongue, the mandible, etc. Finally, the physical
interaction of the vibrating vocal cords and the
moving articulatory structure produces a continu-
ous acoustic signal perceived as speech.
Speech production is an activity embodied in a complex
physical system. Speech is produced by the cooperation of
the lungs, the glottis (with the vocal cords), and the
articulation tract (mouth and nose cavity). The speaker produces a
speech signal in the form of pressure waves that travel
from the speaker’s head to the listener’s ears. This
signal consists of variations in pressure as a function
of time and is usually measured directly in front of the
mouth, the primary sound source. The amplitude var-
iations correspond to deviations from atmospheric
pressure caused by traveling waves.
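The description of the speech signal as pressure deviations from atmospheric pressure can be illustrated with a minimal Python sketch. It is not part of the original entry. It assumes a pure 100 Hz tone as a stand-in for a real speech waveform, and the amplitude, sampling rate, and function names are hypothetical choices made only for illustration. The 20 micropascal reference is the conventional one for sound pressure level in air.

```python
import math

P_ATM = 101_325.0  # standard atmospheric pressure in pascals
P_REF = 20e-6      # conventional SPL reference pressure in air (20 micropascals)


def pressure_samples(amplitude_pa, freq_hz, sample_rate_hz, duration_s):
    """Absolute pressure of a pure tone, sampled at a fixed point (illustrative only)."""
    n = int(round(sample_rate_hz * duration_s))
    return [
        P_ATM + amplitude_pa * math.sin(2.0 * math.pi * freq_hz * k / sample_rate_hz)
        for k in range(n)
    ]


def deviation_from_atmospheric(samples):
    """The signal of interest: pressure minus the atmospheric baseline."""
    return [p - P_ATM for p in samples]


def rms(values):
    """Root-mean-square value of a sequence of samples."""
    return math.sqrt(sum(v * v for v in values) / len(values))


def sound_pressure_level_db(signal):
    """Sound pressure level in dB relative to P_REF."""
    return 20.0 * math.log10(rms(signal) / P_REF)


# Assumed values: 0.02 Pa peak amplitude, 100 Hz tone, 16 kHz sampling, 0.1 s.
absolute = pressure_samples(0.02, 100.0, 16_000, 0.1)
signal = deviation_from_atmospheric(absolute)
spl = sound_pressure_level_db(signal)
print(f"{len(signal)} samples, level = {spl:.1f} dB SPL")
```

The deviations in this sketch are several orders of magnitude smaller than the atmospheric baseline, which is why the signal is handled as a deviation rather than as absolute pressure.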