Sound localization refers to a listener's
ability to identify the location or origin of a detected sound in direction and
distance. It may also refer to the methods in acoustical engineering to
simulate the placement of an auditory cue in a virtual 3D space (see binaural
The sound localization mechanisms of the human auditory
system have been extensively studied. The human auditory system uses several
cues for sound source localization, including time- and level-differences
between both ears, spectral information, timing analysis, correlation analysis,
and pattern matching.
These cues are also used by animals, but there may be differences in usage, and
there are also localization cues which are absent in the human auditory system,
such as the effects of ear movements.
Sound localization by the human auditory system
Lateral information (left, ahead, right)
For determining the lateral input direction (left, front, right), the auditory
system analyzes the following cues in the ear signals:
Interaural time differences: Sound from the right side reaches the right ear
earlier than the left ear. The auditory system evaluates interaural time
differences from phase delays at low frequencies and group delays at high
frequencies.
Interaural level differences: Sound from the right side has a higher level at
the right ear than at the left ear, because the head shadows the left ear. These
level differences are highly frequency dependent, and they increase with
increasing frequency.
For frequencies below 800 Hz, mainly interaural time differences are evaluated
(phase delays); for frequencies above 1600 Hz, mainly interaural level
differences are evaluated. Between 800 Hz and 1600 Hz there is a transition
zone, where both mechanisms play a role.
Evaluation for low frequencies
For frequencies below 800 Hz the dimensions of
the head (ear distance 21.5 cm, corresponding to an interaural time delay
of 625 µs) are smaller than half the wavelength
of the sound waves, so the auditory system can determine phase delays between
both ears very precisely. Interaural level differences are very low in this
frequency range, so that a precise evaluation of the input direction is nearly
impossible on the basis of level differences. As the frequency drops below
80 Hz it becomes difficult or impossible to use either time difference or
level difference to determine a sound's lateral source, because the phase
difference between the ears becomes too small for a directional evaluation.
Evaluation for high frequencies
For frequencies above 1600 Hz the dimensions of
the head are greater than the length of the sound waves. An unambiguous determination
of the input direction based on interaural phases is not possible at these
frequencies. However, the interaural level differences become bigger, and these
level differences are evaluated by the auditory system. Also, group delays between the ears can be
evaluated; this is more pronounced at higher frequencies. This means, if there is
a sound onset, the delay of this onset between both ears can be used to
determine the input direction of the corresponding sound source. This mechanism
becomes especially important in reverberant environments. After a sound onset
there is a short time frame in which the direct sound reaches the ears, but not
yet the reflected sound. The auditory system uses this short time frame for
evaluating the sound source direction, and keeps this detected direction as
long as reflections and reverberation prevent an unambiguous direction
estimate.
The mechanisms described above cannot be used to
differentiate between a sound source ahead of the hearer or behind the hearer;
therefore additional cues have to be evaluated.
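The 625 µs delay and the 800 Hz and 1600 Hz limits quoted above can be related to the ear distance with a little arithmetic. The following is a minimal sketch, assuming a speed of sound of 344 m/s (a value not given in the text, chosen because it reproduces the quoted 625 µs); the function names are illustrative only.

```python
# Assumed speed of sound in air; not stated in the text.
SPEED_OF_SOUND_M_S = 344.0
# Ear distance quoted in the text.
EAR_DISTANCE_M = 0.215


def max_interaural_delay_s(ear_distance_m, c=SPEED_OF_SOUND_M_S):
    """Largest possible arrival-time difference between the ears."""
    return ear_distance_m / c


def frequency_for_wavelength_fraction(ear_distance_m, fraction, c=SPEED_OF_SOUND_M_S):
    """Frequency at which `fraction` of a wavelength equals the ear distance."""
    return fraction * c / ear_distance_m


delay_us = max_interaural_delay_s(EAR_DISTANCE_M) * 1e6
half_wave_hz = frequency_for_wavelength_fraction(EAR_DISTANCE_M, 0.5)
full_wave_hz = frequency_for_wavelength_fraction(EAR_DISTANCE_M, 1.0)

print(f"maximum interaural delay: {delay_us:.0f} µs")
print(f"ear distance equals half a wavelength at {half_wave_hz:.0f} Hz")
print(f"ear distance equals a full wavelength at {full_wave_hz:.0f} Hz")
```

Under this assumption the half-wavelength condition falls at the lower limit of the transition zone and the full-wavelength condition at its upper limit.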
Sound localization in the median plane (front, above, back, below)
The human outer ear, i.e. the structures of the pinna and the external ear
canal, forms direction-selective filters. Depending on
the sound input direction in the median plane, different filter resonances
become active. These resonances imprint direction-specific patterns on the frequency responses of the ears, which can be
evaluated by the auditory system (directional bands).
Together with other direction-selective reflections at the head, shoulders and
torso, they form the outer ear transfer functions.
These patterns in the ear's frequency responses are highly individual,
depending on the shape and size of the outer ear. If sound is presented through
headphones, and has been recorded via another head with different-shaped outer
ear surfaces, the directional patterns differ from the listener's own, and
problems will appear when trying to evaluate directions in the median plane
with these foreign ears. As a consequence, front–back confusions or
inside-the-head localization can appear when listening to dummy head recordings.
Distance of the sound source
The human auditory system has only limited
possibilities to determine the distance of a sound source. In the
close-up-range there are some indications for distance determination, such as
extreme level differences (e.g. when whispering into one ear) or specific pinna
resonances in the close-up range.
The auditory system uses these cues to estimate the
distance to a sound source:
Sound spectrum: High frequencies are damped more quickly by the air than
low frequencies. Therefore a distant sound source sounds more muffled than
a close one, because the high frequencies are attenuated. For sound with a
known spectrum (e.g. speech) the distance can be estimated roughly with
the help of the perceived sound.
Loudness: Distant sound sources have a lower loudness than close ones. This
aspect can be evaluated especially for well-known sound sources (e.g. known
speakers).
Movement: Similar to the visual system, there is also the phenomenon of motion
parallax in acoustical perception. For a moving listener, nearby sound sources
pass faster than distant sound sources.
Reflections: In enclosed rooms two types of sound arrive at a listener. The
direct sound arrives at the listener's ears without being reflected at a
wall. Reflected sound has been reflected at least once at a wall
before arriving at the listener. The ratio between direct sound and
reflected sound can give an indication about the distance of the sound
source.
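The loudness cue can be illustrated with a simple level calculation. The sketch below assumes an idealized point source in a free field (the inverse-square law), which is not stated in the text and does not hold in reverberant rooms; the function name is illustrative only.

```python
import math


def level_drop_db(near_distance_m, far_distance_m):
    """Drop in sound pressure level when moving from the near to the far distance.

    Assumes a point source in a free field (inverse-square law).
    """
    return 20.0 * math.log10(far_distance_m / near_distance_m)


# Each doubling of the distance lowers the level by about 6 dB.
for distance_m in (1.0, 2.0, 4.0, 8.0):
    print(f"{distance_m:4.1f} m: {level_drop_db(1.0, distance_m):5.2f} dB below the level at 1 m")
```

Because the drop depends only on the ratio of the distances, the cue is useful mainly when the listener already knows how loud the source is at some reference distance, which matches the remark about well-known sound sources.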
Sound processing in the human auditory system is
performed in so-called critical bands. The hearing
range is segmented into 24 critical bands, each with a width of 1 Bark.
For a directional analysis, the signals inside each critical band are analyzed
together.
The auditory system can extract the sound of a
desired sound source out of interfering noise. The auditory system can thus
concentrate on only one speaker while other speakers are also talking (the
cocktail party effect). With the help of the
cocktail party effect, sound from interfering directions is perceived as attenuated
compared to the sound from the desired direction. The auditory system can
increase the signal-to-noise ratio by up to 15 dB, which means
that interfering sound is perceived to be attenuated to half (or less) of its
actual loudness.
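The step from 15 dB to "half (or less)" can be made explicit. The sketch below assumes the common rule of thumb that a level reduction of 10 dB roughly halves the perceived loudness; this rule is not given in the text and is only an approximation.

```python
def perceived_loudness_ratio(attenuation_db):
    """Approximate perceived loudness relative to the unattenuated sound.

    Assumes the rule of thumb that every 10 dB of attenuation halves loudness.
    """
    return 2.0 ** (-attenuation_db / 10.0)


for attenuation_db in (10.0, 15.0):
    ratio = perceived_loudness_ratio(attenuation_db)
    print(f"{attenuation_db:.0f} dB attenuation: about {ratio:.2f} of the original loudness")
```

Under this rule an improvement of 10 dB corresponds to half the loudness, and the full 15 dB to roughly a third.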
Localization in enclosed rooms
In enclosed rooms not only the direct sound from a
sound source arrives at the listener's ears, but also sound which has been
reflected at the walls. For sound localization the auditory system
analyses only the direct sound, which arrives first, and not the reflected
sound, which arrives later (law
of the first wave front). So sound localization remains possible even in an
enclosed room.
In order to determine the time periods, where the
direct sound prevails and which can be used for directional evaluation, the
auditory system analyzes loudness changes in different critical bands and also
the stability of the perceived direction. If there is a strong attack of the
loudness in several critical bands and if the perceived direction is stable,
this attack is in all probability caused by the direct sound of a sound source,
which is entering newly or which is changing its signal characteristics. This
short time period is used by the auditory system for directional and loudness
analysis of this sound. When reflections arrive a little bit later, they do not
enhance the loudness inside the critical bands in such a strong way, but the
directional cues become unstable, because there is a mix of sound of several
reflection directions. As a result no new directional analysis is triggered by
the auditory system.
This first detected direction from the direct sound
is taken as the sound source direction until other strong loudness
attacks, combined with stable directional information, indicate that a new
directional analysis is possible (see Franssen effect).
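The gating described above, a strong loudness attack in several critical bands combined with a stable perceived direction, can be sketched as a simple decision rule. This is only an illustration: the function name, the thresholds (6 dB, 3 bands, 10 degrees) and the input format are hypothetical and do not come from the text.

```python
def should_update_direction(prev_loudness_db, curr_loudness_db, recent_directions_deg,
                            attack_db=6.0, min_bands=3, max_spread_deg=10.0):
    """Decide whether a new directional analysis should be triggered.

    prev_loudness_db, curr_loudness_db: loudness per critical band in two
        successive time frames.
    recent_directions_deg: direction estimates from the last few frames.
    All thresholds are hypothetical. Angle wrap-around is ignored for simplicity.
    """
    # Count bands in which the loudness rises sharply (a strong attack).
    attacking_bands = sum(
        1 for prev, curr in zip(prev_loudness_db, curr_loudness_db)
        if curr - prev >= attack_db
    )
    # The direction counts as stable if recent estimates stay close together.
    spread = max(recent_directions_deg) - min(recent_directions_deg)
    return attacking_bands >= min_bands and spread <= max_spread_deg


# Direct sound: strong attack in three bands, stable direction.
print(should_update_direction([40, 40, 40, 40], [52, 50, 49, 41], [30, 31, 29]))
# Later reflections: hardly any loudness increase, unstable direction.
print(should_update_direction([52, 50, 49, 41], [53, 51, 49, 42], [30, -60, 110]))
```

In the second call neither condition is met, so the previously detected direction would be kept.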
Since most animals also have two ears, many of the
effects of the human auditory system can also be found in animals. Therefore
interaural time differences (interaural phase differences) and interaural level
differences play a role in the hearing of many animals. But the influence of
these effects on localization depends on head size, ear distance, ear
position and the orientation of the ears.
Lateral information (left, ahead, right)
If the ears are located at the side of the head,
similar lateral localization cues as for the human auditory system can be used.
This means: evaluation of interaural time differences (interaural phase
differences) for lower frequencies and evaluation of interaural level
differences for higher frequencies. The evaluation of interaural phase
differences is useful, as long as it gives unambiguous results. This is the
case as long as the ear distance is smaller than half the length (at most one
wavelength) of the sound waves. For animals with a larger head than humans, the
evaluation range for interaural phase differences is shifted towards lower
frequencies; for animals with a smaller head, this range is shifted towards
higher frequencies.
The lowest frequency which can be localized depends
on the ear distance. Animals with a greater ear distance can localize lower
frequencies than humans can. For animals with a smaller ear distance the lowest
localizable frequency is higher than for humans.
If the ears are located at the side of the head,
interaural level differences appear for higher frequencies and can be evaluated
for localization tasks. For animals with ears at the top of the head, no
shadowing by the head will appear, and therefore there will be much smaller
interaural level differences that could be evaluated. Many of these animals
can move their ears, and these ear movements can be used as a lateral
localization cue.
Sound localization in the median plane (front, above, back, below)
For many mammals there are also pronounced structures
in the pinna near the entry of the ear canal. As a consequence,
direction-dependent resonances can appear, which could be used as an additional
localization cue, similar to the localization in the median plane in the human
auditory system. There are additional localization cues which are also used by animals.
For sound localization in the median plane (elevation
of the sound) also two detectors can be used, which are positioned at different
heights. In animals, however, rough elevation information is gained simply by
tilting the head, provided that the sound lasts long enough to complete the
movement. This explains the innate behavior of cocking the head to one side
when trying to localize a sound precisely. To get instantaneous localization in
more than two dimensions from time-difference or amplitude-difference cues
requires more than two detectors.
Localization with coupled ears (flies)
The tiny parasitic fly Ormia ochracea has become a model organism in sound
localization experiments because of its unique ear.
The animal is too small for the time difference of sound arriving at the two
ears to be calculated in the usual way, yet it can determine the direction of
sound sources with exquisite precision. The tympanic
membranes of opposite ears are directly connected mechanically, allowing
resolution of sub-microsecond time differences
and requiring a new neural coding strategy.
It has also been shown that the coupled-eardrum system in frogs can produce
increased interaural vibration disparities when only small arrival
time and sound level differences are available to the animal's head.
There have been efforts to build directional microphones based on the
coupled-eardrum structure.
Bi-coordinate sound localization in owls
Most owls are nocturnal or crepuscular birds
of prey. Because they hunt at night, they must rely on non-visual senses.
Experiments by Roger Payne 
have shown that owls are sensitive to the sounds made by their prey, not the
heat or the smell. In fact, the sound cues are both necessary and sufficient for
localization of mice from a distant location where they are perched. For this
to work, the owls must be able to accurately localize both the azimuth and the
elevation of the sound source.
ITD and ILD
Owls living above ground must be able to determine
the necessary angle of descent, i.e. the elevation, in addition to azimuth
(horizontal angle to the sound). This bi-coordinate sound localization is
accomplished through two binaural cues: the interaural time difference (ITD) and the
interaural level difference (ILD), also known as the interaural intensity
difference (IID). The ability in owls is unusual; in ground-bound mammals such
as mice, ITD and ILD are redundant cues for azimuth.
ITD occurs whenever the distance from the source of
sound to the two ears is different, resulting in differences in the arrival
times of the sound at the two ears. When the sound source is directly in front
of the owl, there is no ITD, i.e. the ITD is zero. In sound localization, ITDs
are used as cues for location in the azimuth. ITD changes systematically with
azimuth. Sounds to the right arrive first at the right ear; sounds to the left
arrive first at the left ear.
In mammals there is a level difference in sounds at
the two ears caused by the sound-shadowing effect of the head. But in many
species of owls, level differences arise primarily for sounds that are shifted
above or below the elevation of the horizontal plane. This is due to the
asymmetry in placement of the ear openings in the owl's head, such that sounds
from below the owl reach the left ear first and sounds from above reach the
right ear first.
IID is a measure of the difference in the level of the sound as it reaches each
ear. In many owls, IIDs for high-frequency sounds (higher than 4 or 5 kHz)
are the principal cues for locating sound elevation.
Parallel processing pathways in the brain
The axons of the auditory
nerve originate from the hair cells of the cochlea in the inner ear.
Different sound frequencies are encoded by different fibers of the auditory
nerve, arranged along the length of the auditory nerve, but codes for the
timing and level of the sound are not segregated within the auditory nerve.
Instead, the ITD is encoded by phase
locking, i.e. firing at or near a particular phase angle of the sinusoidal
stimulus sound wave, and the IID is encoded by spike rate. Both parameters are
carried by each fiber of the auditory nerve.
The fibers of the auditory nerve innervate
both cochlear nuclei in the brainstem, the cochlear
nucleus magnocellularis (mammalian anteroventral cochlear nucleus) and the cochlear
nucleus angularis (see figure; mammalian posteroventral and dorsal cochlear
nuclei). The neurons of the nucleus magnocellularis phase-lock, but are fairly
insensitive to variations in sound pressure, while the neurons of the nucleus
angularis phase-lock poorly, if at all, but are sensitive to variations in
sound pressure. These two nuclei are the starting points of two separate but
parallel pathways to the inferior colliculus: the pathway from nucleus
magnocellularis processes ITDs, and the pathway from nucleus angularis
processes IIDs.
Figure: Parallel processing pathways in the brain for time and level for sound
localization in the owl.
In the time pathway, the nucleus laminaris (mammalian
medial superior olive) is the first site of binaural convergence. It is here
that ITD is detected and encoded using neuronal delay lines
and coincidence detection, as in the Jeffress
model; when phase-locked impulses coming from the left and right ears coincide
at a laminaris neuron, the cell fires most strongly. Thus, the nucleus
laminaris acts as a delay-line coincidence detector, converting distance
traveled to time delay and generating a map of interaural time difference.
Neurons from the nucleus laminaris project to the core of the central nucleus
of the inferior colliculus and to the anterior lateral lemniscal nucleus.
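The delay-line and coincidence-detection idea can be sketched on discrete spike times. This is a toy illustration rather than a model of the nucleus laminaris: spike times are integer time steps, each hypothetical detector applies one internal delay to the left input, and the names and sign convention (a positive delay means the left ear leads) are assumptions.

```python
def jeffress_best_delay(left_spikes, right_spikes, max_delay):
    """Return the internal delay whose detector sees the most coincidences.

    left_spikes, right_spikes: spike times as integer time steps.
    Each detector delays the left input by `delay` steps and counts how
    often the delayed left spike coincides with a right spike.
    """
    right = set(right_spikes)
    counts = {}
    for delay in range(-max_delay, max_delay + 1):
        counts[delay] = sum(1 for t in left_spikes if t + delay in right)
    # The detector with the most coincidences marks the interaural delay.
    best_delay = max(counts, key=counts.get)
    return best_delay, counts


# Hypothetical phase-locked spikes; the right ear receives them 3 steps later.
left = [0, 10, 25, 37, 52]
right = [t + 3 for t in left]
best, counts = jeffress_best_delay(left, right, max_delay=5)
print("best internal delay:", best)
print("coincidences per delay:", counts)
```

The position of the most active detector in the array, not its firing pattern, carries the time difference, which is the sense in which the structure forms a map of interaural time difference.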
In the sound level pathway, the posterior lateral
lemniscal nucleus (mammalian lateral superior olive) is the site of binaural
convergence and where IID is processed. Stimulation of the contralateral ear
excites and that of the ipsilateral ear inhibits the neurons of the nuclei in
each brain hemisphere independently. The degree of excitation and inhibition
depends on sound pressure, and the difference between the strength of the
inhibitory input and that of the excitatory input determines the rate at which
neurons of the lemniscal nucleus fire. Thus the response of these neurons is a
function of the difference in sound pressure between the two ears.
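The excitation-minus-inhibition scheme can be written as a very small rate model. The sketch below is illustrative only: the rectified linear form, the gain and the maximum rate are hypothetical choices, not measured properties of lemniscal neurons.

```python
def level_difference_rate(contra_level_db, ipsi_level_db, gain=5.0, max_rate=200.0):
    """Toy firing rate (spikes per second) of a level-difference neuron.

    The contralateral ear excites and the ipsilateral ear inhibits, as
    described above. Gain and maximum rate are hypothetical values.
    """
    drive = gain * (contra_level_db - ipsi_level_db)
    # Firing rates cannot be negative and saturate at some maximum.
    return max(0.0, min(max_rate, drive))


print(level_difference_rate(60.0, 40.0))  # contralateral ear louder
print(level_difference_rate(50.0, 50.0))  # equal levels
print(level_difference_rate(40.0, 60.0))  # ipsilateral ear louder
```

In this sketch the output depends only on the level difference, so raising both ears by the same amount leaves the rate unchanged.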
The time and sound-pressure pathways converge at the lateral
shell of the central nucleus of the inferior colliculus. The lateral shell
projects to the external nucleus, where each space-specific neuron responds to
acoustic stimuli only if the sound originates from a restricted area in space,
i.e. the receptive field of that neuron. These neurons
respond exclusively to binaural signals containing the same ITD and IID that
would be created by a sound source located in the neuron’s receptive field.
Thus their receptive fields arise from the neurons’ tuning to particular
combinations of ITD and IID, simultaneously in a narrow range. These
space-specific neurons can thus form a map of auditory space in which the
positions of receptive fields in space are isomorphically projected onto the
anatomical sites of the neurons.
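The tuning of a space-specific neuron to a combination of ITD and IID can be sketched as the product of two tuning curves, so that the neuron responds strongly only when both cues match. Everything specific here is an assumption for illustration: the Gaussian shape, the tuning widths, and the preferred values in the example map.

```python
import math


def space_specific_response(itd_us, iid_db, pref_itd_us, pref_iid_db,
                            itd_width_us=30.0, iid_width_db=3.0):
    """Toy response (0 to 1) of a neuron tuned to one ITD/IID combination."""
    itd_term = math.exp(-0.5 * ((itd_us - pref_itd_us) / itd_width_us) ** 2)
    iid_term = math.exp(-0.5 * ((iid_db - pref_iid_db) / iid_width_db) ** 2)
    # Multiplying the terms means both cues must match at the same time.
    return itd_term * iid_term


def most_active_site(itd_us, iid_db, neuron_map):
    """Return the map site whose neuron responds most strongly."""
    return max(
        neuron_map,
        key=lambda site: space_specific_response(itd_us, iid_db, *neuron_map[site]),
    )


# Hypothetical map: site label -> (preferred ITD in µs, preferred IID in dB).
neuron_map = {
    "site A": (-100.0, -6.0),
    "site B": (-100.0, 6.0),
    "site C": (100.0, -6.0),
    "site D": (100.0, 6.0),
}
print(most_active_site(95.0, 5.0, neuron_map))
```

A sound that matches a neuron's ITD but not its IID yields almost no response in this sketch, which mirrors the restricted receptive fields described above.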
Significance of asymmetrical ears for localization of
The ears of many species of owls are asymmetrical.
For example, in barn owls (Tyto alba), the placement of the two ear flaps
(opercula) lying directly in front of the ear canal opening is different for
each ear. This
asymmetry is such that the center of the left ear flap is slightly above a
horizontal line passing through the eyes and directed downward, while the
center of the right ear flap is slightly below the line and directed upward. In
two other species of owls with asymmetrical ears, the saw-whet
owl and the long-eared owl, the asymmetry is achieved by different
means: in saw-whet owls, the skull is asymmetrical; in the long-eared owl, the skin
structures lying near the ear form asymmetrical entrances to the ear canals,
which is achieved by a horizontal membrane. Thus, ear asymmetry seems to have
evolved on at least three different occasions among owls. Because owls depend
on their sense of hearing for hunting, this convergent evolution in owl ears suggests that
asymmetry is important for sound localization in the owl.
Ear asymmetry causes sound originating from below
eye level to sound louder in the left ear, while sound originating from
above eye level sounds louder in the right ear. Asymmetrical ear placement
also causes IID for high frequencies (between 4 kHz and 8 kHz) to
vary systematically with elevation, converting IID into a map of elevation.
Thus, it is essential for an owl to have the ability to hear high frequencies.
Many birds have the neurophysiological machinery to process both ITD and IID,
but because they have small heads and low frequency sensitivity, they use both
parameters only for localization in the azimuth. Through evolution, the ability
to hear frequencies higher than 3 kHz, the highest frequency of owl flight
noise, enabled owls to exploit elevational IIDs, produced by small ear
asymmetries that arose by chance, and began the evolution of more elaborate
forms of ear asymmetry.
Another demonstration of the importance of ear
asymmetry in owls is that, in experiments, owls with symmetrical ears, such as
the screech owl (Otus asio) and the great
horned owl (Bubo virginianus), could not be trained to locate prey
in total darkness, whereas owls with asymmetrical ears could be trained.
In vertebrates, interaural time differences are known to be
calculated in the superior olivary nucleus of the brainstem. According to
Jeffress, this calculation relies on delay lines: neurons in the
superior olive which accept innervation from each ear with different connecting
axon lengths. Some
cells are more directly connected to one ear than the other, and thus they are
specific for a particular interaural time difference. This theory is
equivalent to the mathematical procedure of cross-correlation.
However, because Jeffress' theory is unable to account for the precedence
effect, in which only the first of multiple identical sounds is used to
determine the sounds' location (thus avoiding confusion caused by echoes), it
cannot fully explain the response.
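The equivalence to cross-correlation can be made concrete on sampled ear signals: the lag at which the two signals are most similar is taken as the interaural time difference. This is a minimal sketch in plain Python; the signals, the function name and the sign convention (a positive lag means the left ear leads) are assumptions for illustration.

```python
def cross_correlation_lag(left, right, max_lag):
    """Return the lag (in samples) that maximizes the cross-correlation.

    A positive lag means the right signal is a delayed copy of the left.
    """
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i, left_sample in enumerate(left):
            j = i + lag
            if 0 <= j < len(right):
                score += left_sample * right[j]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


# Hypothetical click that reaches the right ear two samples later.
left = [0.0, 0.0, 1.0, 0.5, 0.25, 0.0, 0.0, 0.0, 0.0, 0.0]
right = [0.0, 0.0, 0.0, 0.0, 1.0, 0.5, 0.25, 0.0, 0.0, 0.0]
lag = cross_correlation_lag(left, right, max_lag=4)
print("estimated lag in samples:", lag)
```

Dividing the lag by the sampling rate gives the time difference in seconds. Like the delay-line theory itself, this plain cross-correlation has no mechanism for favoring the first-arriving sound over later echoes.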
Neurons sensitive to ILDs are excited by stimulation of one
ear and inhibited by stimulation of the other ear, such that the response
magnitude of the cell depends on the relative strengths of the two inputs,
which, in turn, depend on the sound intensities at the ears.
In the auditory midbrain nucleus, the inferior colliculus (IC), many ILD-sensitive
neurons have response functions that decline steeply from maximum to zero
spikes as a function of ILD. However, there are also many neurons with much
more shallow response functions that do not decline to zero spikes.
Processing of head-related transfer functions for
biological sound localization occurs in the auditory