How sounds are perceived, in brief. The auditory analyzer. The mechanism of perception of sounds of different frequencies. The organs of sound perception

The auditory analyzer is a complex specialized organ consisting of three sections: the outer, middle and inner ear.

The outer ear is the sound-collecting apparatus. Sound vibrations are captured by the auricles and transmitted through the external auditory canal to the tympanic membrane, which separates the outer ear from the middle ear. Capturing sound with both ears, so-called binaural hearing, is important in determining the direction of a sound. Sound vibrations coming from the side reach the nearer ear a tiny fraction of a second (about 0.0006 s) earlier than the other. This extremely small difference in the arrival time of the sound at the two ears is enough to determine its direction.

The middle ear is an air-filled cavity connected to the nasopharynx through the Eustachian tube. Oscillations of the tympanic membrane are transmitted across the middle ear by three interconnected auditory ossicles - the malleus, incus and stapes - and the stapes, through the membrane of the oval window, passes these oscillations on to the fluid of the inner ear, the perilymph. Thanks to the ossicles, the amplitude of the oscillations decreases while their force increases, which allows the column of fluid in the inner ear to be set in motion. The middle ear has a special mechanism for adapting to changes in sound intensity. With strong sounds, special muscles increase the tension of the tympanic membrane and reduce the mobility of the stapes. This reduces the vibration amplitude and protects the inner ear from damage.
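As a rough illustration, the pressure gain produced by this area-and-lever mechanism can be estimated from commonly cited textbook values (an effective eardrum area of about 55 mm², an oval-window area of about 3.2 mm², an ossicular lever ratio of about 1.3). These figures are approximations assumed here, not numbers given in the article; a minimal sketch:

```python
import math

# Rough estimate of the middle ear's pressure gain from the ossicle/membrane
# geometry. The values below are commonly cited textbook approximations,
# not figures taken from this article.
eardrum_area_mm2 = 55.0      # effective area of the tympanic membrane
oval_window_area_mm2 = 3.2   # area of the stapes footplate / oval window
lever_ratio = 1.3            # mechanical advantage of the ossicular chain

pressure_gain = (eardrum_area_mm2 / oval_window_area_mm2) * lever_ratio
gain_db = 20 * math.log10(pressure_gain)

print(f"pressure gain ~{pressure_gain:.0f}x (~{gain_db:.0f} dB)")
# -> roughly 22x (~27 dB): smaller displacement but larger force,
#    enough to set the cochlear fluid in motion, as described above.
```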

The inner ear, with the cochlea inside it, lies in the pyramid of the temporal bone. The human cochlea forms 2.5 spiral turns. The cochlear canal is divided by two septa (the main, or basilar, membrane and the vestibular membrane) into three narrow passages: the upper (scala vestibuli), the middle (membranous cochlear canal) and the lower (scala tympani). At the apex of the cochlea there is an opening joining the upper and lower passages into a single channel that runs from the oval window to the apex of the cochlea and on to the round window. Their cavity is filled with a fluid, the perilymph, while the cavity of the middle membranous canal is filled with a fluid of different composition, the endolymph. The middle canal contains the sound-perceiving apparatus, the organ of Corti, which houses the receptors for sound vibrations - the hair cells.

Sound perception mechanism. The physiological mechanism of sound perception is based on two processes occurring in the cochlea: 1) the separation of sounds of different frequencies according to the place of their greatest effect on the main (basilar) membrane of the cochlea, and 2) the transformation of mechanical vibrations into nervous excitation by the receptor cells. Sound vibrations entering the inner ear through the oval window are transmitted to the perilymph, and the vibrations of this fluid displace the main membrane. The height of the column of oscillating fluid, and therefore the place of greatest displacement of the main membrane, depends on the pitch of the sound. Thus, sounds of different pitch excite different hair cells and different nerve fibers. An increase in sound intensity increases the number of excited hair cells and nerve fibers, which makes it possible to distinguish the intensity of sound vibrations.
The transformation of vibrations into the process of excitation is carried out by special receptors, the hair cells. The hairs of these cells are embedded in the tectorial (covering) membrane. Under the action of sound, mechanical vibrations displace the tectorial membrane relative to the receptor cells and bend the hairs. In the receptor cells, this mechanical displacement of the hairs triggers the process of excitation.
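The place coding described above can be illustrated with the Greenwood frequency-position function, a standard empirical fit for the human cochlea. The article gives no formula, so the constants below (A = 165.4, a = 2.1, k = 0.88) are an outside assumption used purely for illustration:

```python
import math

def greenwood_position(freq_hz: float) -> float:
    """Approximate place of maximum basilar-membrane displacement for a tone,
    as a fraction of the membrane length from the apex (0) to the base (1).
    Uses the commonly published human Greenwood constants A=165.4, a=2.1, k=0.88."""
    A, a, k = 165.4, 2.1, 0.88
    return math.log10(freq_hz / A + k) / a

for f in (100, 1000, 4000, 10000):
    print(f"{f:>6} Hz -> x = {greenwood_position(f):.2f} of the membrane length")
# Low tones produce their maximum near the apex, high tones near the base,
# so sounds of different pitch excite different hair cells and nerve fibers.
```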

Sound conduction. Air conduction and bone conduction are distinguished. Under normal conditions, air conduction predominates in humans: sound waves are captured by the outer ear, and air vibrations are transmitted through the external auditory canal to the middle and inner ear. In bone conduction, sound vibrations are transmitted through the bones of the skull directly to the cochlea. This mechanism of transmitting sound vibrations is important, for example, when a person dives under water.
A person usually perceives sounds with a frequency of 15 to 20,000 Hz (in the range of 10-11 octaves). In children, the upper limit reaches 22,000 Hz and decreases with age. The highest sensitivity was found in the frequency range from 1000 to 3000 Hz. This area corresponds to the most common frequencies of human speech and music.
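As a quick check of the "10-11 octaves" figure: the number of octaves between two frequencies is simply log2 of their ratio. A minimal sketch:

```python
import math

def octaves(f_low: float, f_high: float) -> float:
    """Number of octaves between two frequencies (each octave doubles the frequency)."""
    return math.log2(f_high / f_low)

print(f"15 Hz .. 20 kHz -> {octaves(15, 20_000):.1f} octaves")  # ~10.4
print(f"15 Hz .. 22 kHz -> {octaves(15, 22_000):.1f} octaves")  # ~10.5, using the children's upper limit
```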

The process of our perception of sounds depends on the quality of the incoming sound information and on the state of our psyche.

About sounds and what we hear.

Sound can be thought of as a wave of compression of a medium moving in a straight line from a source of vibrations at a certain speed. With distance, the wave loses its "density" and gradually fades away; the intensity of sound falls off in inverse proportion to the square of the distance from the source. The speed of sound propagation in gases depends on the nature of the gas, the density of the medium, the temperature and the static atmospheric pressure. For liquid and solid media it depends mainly on the nature of the medium. Thus, in air this value ranges from about 330 to 345 m/s as the temperature changes from 0 to 20 °C, in water it is about 1500 m/s, and in steel about 6000 m/s.
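The temperature dependence quoted for air (about 330-345 m/s between 0 and 20 °C) can be reproduced with the usual linear approximation c ≈ 331.3 + 0.606·T. This formula is a standard textbook approximation, not something given in the article; a minimal sketch:

```python
def speed_of_sound_air(temp_c: float) -> float:
    """Approximate speed of sound in dry air, m/s (standard linear approximation)."""
    return 331.3 + 0.606 * temp_c

for t in (0, 10, 20):
    print(f"{t:>2} C -> {speed_of_sound_air(t):.0f} m/s")
# 0 C -> ~331 m/s, 20 C -> ~343 m/s: consistent with the 330-345 m/s range above.
# For comparison, the article quotes ~1500 m/s in water and ~6000 m/s in steel.
```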

The article on the structure of the auditory analyzer describes the main mechanism of sound perception by the organs of hearing through the outer and middle ear and the transformation of sound waves into electrical impulses in the inner ear. In addition to the air pathway for conducting sound to the receptor cells of the inner ear, there is also a bone pathway of sound perception, since sound waves not only enter the external auditory canal but also set the bones of the skull vibrating. This mechanism is important for understanding why we hear the sound of our own voice distorted: bone conduction delivers mainly the low-frequency components of the voice to the receptor cells, so we hear our own voice as lower and fuller than others hear it.

There is also a microwave auditory effect - the auditory perception of microwave radiation. When exposed to pulsed or modulated microwave radiation, the perception of sounds occurs directly inside the human skull. Shock waves arise during this process and are perceived by the person as sound information that no one else can hear. It has also been found that, with an appropriate choice of the modulating signal, it is possible to transmit audio information to a person in the form of individual words or phrases by means of microwave radiation.

Selectivity of the auditory perception of sound information.

The sounds we hear are sound information decoded by the brain and converted into subjective sound representations or images. The sounds that reach us can be measured and objectively described, but the perception of sound is individual and selective. It depends not only on the quality of our auditory analyzer but also on our psychological state, mood and current needs.

Usually we do not hear the clock ticking or the fan humming, and we may not hear the conversation of people nearby if we are absorbed in something that interests us. Yet, if we listen closely, we can hear our own breathing. Loud sounds that do not concern us pass us by, while interesting and important sounds, even very quiet ones, can cause a serious emotional response. Our auditory apparatus is extremely selective towards sound information. This subjective perception of sounds occurs thanks to a kind of input filter in the brain, which inhibits the perception of sounds we do not need. Filtering out useless "spam" allows us to pick out the information that is really important at the moment.

However, the filtering of sound information without the participation of consciousness has a downside. Some sound structures with low frequencies and slow rhythms produce deep muscular or mental relaxation. The perception of such music and rhythms can also mobilize the body without the usual influence of conscious control. For example, it has been known since ancient times that the rhythm of a drum helps soldiers keep marching mechanically even when they are very tired. Such sound information is used to enhance the effect of suggestion by shamans, hypnotists and psychotherapists.

The transformation of the sound waves reaching us into sound information is carried out in the auditory analyzer, and the final processing of the incoming signals may take place in several hearing centers of the brain, which exchange information with other important centers, above all the motor and visual centers. Sound representations stored in memory can also be used for comparison and identification of a new sound image.

Determination of the direction of the sound stimulus.

To work out where sound information is coming from, a crocodile must turn its whole body, a cat only needs to swivel its ears, and a person does not need to make any movement at all.

A person has stereophonic perception of sound and determines the horizontal direction of a sound in two main ways: by the time delay between the sound entering one ear and its entering the other, and by the difference in the intensity of the sound at the two ears. The first mechanism works best at frequencies below 3000 hertz (Hz), and the second at higher frequencies, since at these frequencies the head is a more significant obstacle to sound.

If a person looks directly at the source of the sound, the sound information reaches both ears at the same time, but if one ear is closer to the stimulus than the other, the sound signals from the first ear enter the brain a fraction of a millisecond (a few hundred microseconds) before the sound information from the second.
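This interaural time difference is easy to estimate with the simple path-difference model Δt ≈ d·sin(θ)/c. The inter-ear spacing of 0.2 m and the speed of sound of 343 m/s below are assumptions chosen for illustration only:

```python
import math

def itd_seconds(angle_deg: float, ear_distance_m: float = 0.2, c: float = 343.0) -> float:
    """Interaural time difference for a source at the given azimuth
    (0 deg = straight ahead, 90 deg = fully to one side), simple path-difference model."""
    return ear_distance_m * math.sin(math.radians(angle_deg)) / c

for angle in (0, 15, 45, 90):
    print(f"{angle:>2} deg -> {itd_seconds(angle) * 1e6:6.0f} microseconds")
# A source fully to one side gives ~580 microseconds, the same order of magnitude
# as the 0.0006 s figure quoted earlier in the article.
```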

The distinction between whether the sound source is in front of or behind a person, as well as above or below, is achieved mainly with the help of a sophisticated shape of the auricles, which changes the intensity of the sound entering the ear, depending on the direction from which it is coming.

Psychoacoustics is a field of science that studies the auditory sensations of a person when sound is applied to the ears.

People with absolute (analytical) pitch can accurately determine the pitch, loudness and timbre of a sound, are able to memorize the sound of instruments and recognize them after a while. They can correctly analyze what they have heard and correctly pick out individual instruments.

People who do not have perfect pitch can determine the rhythm, timbre, tonality, but it is difficult for them to correctly analyze the material they listened to.

When listening to high-quality audio equipment, expert opinions as a rule differ. Some people prefer high transparency and faithful reproduction of every overtone and are annoyed by a lack of detail in the sound. Others prefer a blurred, indistinct character of sound and quickly tire of an abundance of detail in the musical image. Some focus on the harmony of the sound, some on the spectral balance, and some on the dynamic range. It turns out that everything depends on the type of individual. People's types are divided into the following dichotomies (paired classes): sensory and intuitive, thinking and feeling, extraverted and introverted, decisive and perceiving.

People with sensory dominance have a clear diction, perfectly perceive all the nuances of a speech or musical image. For them, the transparency of the sound is extremely important, when all sounding instruments are clearly distinguished.

Listeners with an intuitive dominance prefer a blurred musical image, attach the utmost importance to the balance of the sound of all musical instruments.

Listeners with a thinking dominant prefer musical works with a wide dynamic range, a clearly marked major or minor character, and a pronounced meaning and structure of the piece.

People with a feeling dominant attach great importance to harmony in musical works; they prefer works with only slight deviations of major and minor from the neutral value, i.e. "music for the soul."



A listener with an extroverted dominant successfully separates the signal from the noise and prefers to listen to music at a high volume level; the major or minor character of a piece of music is judged by the frequency position of the musical image at the given moment.

People with an introverted dominant pay considerable attention to the internal structure of a musical image; the major or minor character is likewise assessed by the frequency shift of one of the harmonics in the resonances that arise, and extraneous noise makes it difficult for them to perceive audio information.

People with a decisive dominant prefer regularity in music, the presence of internal periodicity.

Listeners with a perceiving dominant prefer improvisation in music.

Everyone knows for himself that the same music on the same equipment and in the same room is not always perceived in the same way. Probably, depending on the psychoemotional state, our feelings are either dulled or exacerbated.

On the other hand, excessive detail and naturalness of sound can so irritate a tired and worried listener with a sensory dominant that in such a state he will prefer blurred, soft music - roughly speaking, he would rather listen to live instruments while wearing a hat with earflaps.

To some extent, the sound quality is influenced by the "quality" of the mains voltage, which in turn depends on both the day of the week and the time of day (during peak hours, the mains voltage is most "polluted"). The time of day also affects the noise level in the room, and hence the real dynamic range.

A case from about twenty years ago illustrates the effect of ambient noise well. Late in the evening after a village wedding, the young people stayed behind to help clear the tables and wash the dishes. The music was set up in the courtyard: an electric accordion with a two-channel amplifier and two speakers; a four-channel power amplifier built to Shushurin's circuit, with the electric accordion connected to its input and two 3-way and two 2-way loudspeaker systems connected to its outputs; and a tape recorder with recordings made at a tape speed of 19 cm/s with anti-parallel bias. At about 2 o'clock in the morning, when everyone was free, the young people gathered in the yard and asked for something to be put on "for the soul." Imagine the surprise of the musicians and music lovers present when a medley of Beatles themes performed by Stars on 45 came on. For ears adapted to perceiving music against a background of heightened noise, the sound in the silence of the night became surprisingly clear and nuanced.

Perception by frequency

The human ear perceives an oscillatory process as sound only if its frequency lies in the range from 16...20 Hz to 16...20 kHz. Vibrations at frequencies below 20 Hz are called infrasonic, above 20 kHz ultrasonic. Sounds with frequencies below 40 Hz are rare in music and completely absent in colloquial speech. The perception of high sound frequencies depends strongly both on the individual characteristics of the hearing organs and on the age of the listener. For example, under the age of 18, sounds with a frequency of 14 kHz are heard by about 100% of listeners, whereas at the age of 50...60 only 20% hear them. Sounds with a frequency of 18 kHz are heard by about 60% of listeners under 18, and by only 10% at 40...50 years. But this does not at all mean that the requirements for the quality of the sound-reproduction path are lower for elderly people. It has been established experimentally that people who can barely perceive signals at 12 kHz very easily recognize a lack of high frequencies in a recording.

The resolution of hearing for frequency changes is about 0.3%. For example, two tones of 1000 and 1003 Hz, played one after the other, can be distinguished without instruments. And from the beats between two tones, a person can detect a frequency difference of down to tenths of a hertz. At the same time, it is difficult to detect by ear a deviation of the playback speed of a music recording within ±2%.

The subjective scale of the perception of sound in terms of frequency is close to the logarithmic law. Based on this, all frequency characteristics of sound transmission devices are plotted on a logarithmic scale. The degree of accuracy with which a person determines the pitch by ear depends on the acuity, musicality and fitness of his hearing, as well as on the intensity of the sound. At high volume levels, sounds with higher intensity appear lower than weak sounds.
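Because pitch perception is roughly logarithmic, musical intervals are conveniently expressed in cents, 1200·log2 of the frequency ratio (100 cents per equal-tempered semitone). A small sketch, using the just-noticeable frequency change mentioned above (about 0.3%, e.g. 1000 vs 1003 Hz) for comparison:

```python
import math

def cents(f1: float, f2: float) -> float:
    """Size of the interval between two frequencies in cents (100 cents = 1 semitone)."""
    return 1200 * math.log2(f2 / f1)

print(f"1000 -> 1003 Hz : {cents(1000, 1003):.1f} cents")    # ~5 cents, near the resolution limit of hearing
print(f"octave 440 -> 880 Hz : {cents(440, 880):.0f} cents")  # exactly 1200 cents
```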

With prolonged exposure to intense sound, the sensitivity of hearing gradually decreases, and the more so the higher the volume, which is connected with the response of hearing to overload, i.e. with its natural adaptation. After a certain time, sensitivity is restored. Systematic and prolonged listening to music at high volume levels causes irreversible changes in the organs of hearing; young people who use headphones suffer particularly.

Timbre is an important characteristic of sound. The ability of hearing to distinguish its shades makes it possible to distinguish a variety of musical instruments and voices. Thanks to their timbre coloring, their sound becomes multicolored and easily recognizable. The condition for the correct transmission of timbre is the undistorted transmission of the signal spectrum - a set of sinusoidal components of a complex signal (overtones). The overtones are multiples of the frequency of the fundamental tone and are smaller in amplitude. The timbre of the sound depends on the composition of overtones and their intensity.

The timbre of the sound of live instruments largely depends on the intensity of sound production. For example, the same note played on the piano with a light finger pressure and a sharp one has different attacks and signal spectra. Even an untrained person can easily detect the emotional difference between two such sounds by their attack, even if they are transmitted to the listener using a microphone and are balanced in volume. Sound attack is an initial stage, a specific transient process, during which stable characteristics are established: loudness, timbre, pitch. The duration of the attack of the sound of different instruments ranges from 0 ... 60 ms. For example, for percussion instruments it is in the range of 0 ... 20 ms, for a bassoon - 20 ... 60 ms. The attack characteristics of an instrument are highly dependent on the manner and technique of the musician's playing. It is these features of the instruments that make it possible to convey the emotional content of a piece of music.

The timbre of a signal source located less than 3 m from the listener is perceived as "heavier." Moving the source away from 3 to 10 m is accompanied by a proportional decrease in volume, while the timbre becomes brighter. With further removal of the source, energy losses in the air grow in proportion to the square of the frequency and depend in a complex way on the relative humidity of the air. The energy losses of the high-frequency components are greatest at a relative humidity in the range of 8 to 30...40% and smallest at 80% (Fig. 1.1). An increase in overtone losses leads to a decrease in the brightness of the timbre.
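For a point source in free space, the decrease in level with distance corresponds to the inverse-square law, i.e. about 6 dB per doubling of distance. A minimal free-field sketch (it ignores the humidity-dependent air absorption discussed above and any room reflections):

```python
import math

def level_drop_db(d_ref_m: float, d_m: float) -> float:
    """Sound pressure level drop of a point source in free field, relative to d_ref."""
    return 20 * math.log10(d_m / d_ref_m)

for d in (3, 6, 10, 20):
    print(f"{d:>2} m -> {level_drop_db(3, d):5.1f} dB below the level at 3 m")
# Each doubling of distance costs ~6 dB; high-frequency air absorption comes on top of this.
```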

Amplitude perception

Curves of equal loudness from the hearing threshold to the pain threshold for binaural and monaural listening are shown in Fig. 1.2.a, b, respectively. Perception in amplitude depends on frequency and has a significant variation associated with age-related changes.

Hearing sensitivity to sound intensity is discrete. The threshold for sensing a change in sound intensity depends on both the frequency and the volume of the sound (at high and medium levels it is 0.2 ... 0.6 dB, at low levels it reaches several decibels) and on average is less than 1 dB.

Haas effect

The auditory apparatus, like any other oscillatory system, is characterized by inertia. Because of this property, short sounds lasting up to 20 ms are perceived as quieter than sounds lasting more than 150 ms. Another manifestation of inertia is the inability of a person to detect distortions in pulses shorter than 20 ms. When two identical signals arrive at the ears with a time interval of 5...40 ms between them, hearing perceives them as a single signal; with an interval of more than 40...50 ms they are heard separately.
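A toy classifier based only on the delay figures quoted in this paragraph (the boundaries come from the article; the code is merely an illustration of the rule):

```python
def haas_perception(delay_ms: float) -> str:
    """Rough classification of how two identical signals, offset by the given delay,
    are perceived, using the interval boundaries quoted above."""
    if delay_ms <= 40:
        return "fused: heard as a single signal (precedence / Haas effect)"
    elif delay_ms <= 50:
        return "transition region: fusion begins to break down"
    else:
        return "heard as two separate signals (echo)"

for d in (5, 20, 45, 80):
    print(f"{d:>2} ms -> {haas_perception(d)}")
```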

Masking effect

At night, in quiet conditions, you can hear the whine of a mosquito, the ticking of a clock and other quiet sounds, while in noisy conditions it can be hard to make out even the loud speech of a companion. In real conditions an acoustic signal never exists in absolute silence. Extraneous noise, inevitably present in the listening area, masks the main signal to some extent and makes it harder to perceive. The raising of the threshold of audibility of one tone (or signal) under the simultaneous action of another tone (noise or signal) is called masking.

It has been established experimentally that a tone of any frequency is masked much more effectively by lower tones than by higher ones; in other words, low-frequency tones mask high-frequency tones more strongly than the reverse. For example, when sounds of 440 and 1200 Hz are played simultaneously at the same intensity, we hear only the 440 Hz tone, and only after it is switched off do we hear the 1200 Hz tone. The degree of masking depends on the frequency ratio and has a complex character related to the equal-loudness curves (Fig. 1.3a and 1.3b).

The higher the frequency ratio, the smaller the masking effect. This largely explains the phenomenon of "transistor" sound: the nonlinear-distortion spectrum of transistor amplifiers extends up to the 11th harmonic, whereas the spectrum of tube amplifiers is limited to the 3rd...5th harmonics. The narrowband-noise masking curves differ for tones of different frequencies and intensity levels. Clear perception of a sound is possible if its intensity exceeds a certain threshold of audibility: at frequencies of 500 Hz and below the signal must exceed this threshold by about 20 dB, at 5 kHz by about 30 dB, and at 10 kHz by 35 dB. This feature of auditory perception is taken into account when recording onto sound carriers. Thus, if the signal-to-noise ratio of an analog gramophone record is about 60...65 dB, the dynamic range of the recorded program can be no more than 45...48 dB.

The masking effect affects the subjectively perceived loudness of the sound. If the components of a complex sound are located close to each other in frequency and their mutual masking is observed, then the volume of such a complex sound will be less than the loudness of its components.

If several tones are located so far in frequency that their mutual masking can be neglected, then their total loudness will be equal to the sum of the loudness of each of the components.

Achieving "transparency" of the sound of all instruments of an orchestra or pop ensemble is a difficult task that is solved by a sound engineer - deliberately highlighting the most important instruments in a given place of work and other special techniques.

Binaural effect

The ability of a person to determine the direction to a sound source (owing to the presence of two ears) is called the binaural effect. The sound arrives at the ear nearer the source earlier than at the second ear, and therefore differs in phase and amplitude. When a real signal source is heard, the binaural signals (i.e. the signals arriving at the right and left ears) are statistically related (correlated). The accuracy of localizing a sound source depends both on the frequency and on its location (in front of or behind the listener). The hearing organ obtains additional information about the location of the sound source (front, back, above) by analyzing the spectral features of the binaural signals.

Up to 150 ... 300 Hz, human hearing has a very low directivity. At frequencies of 300 ... 2000 Hz, for which the half-wavelength of the signal is commensurate with the "inter-ear" distance equal to 20 ... 25 cm, phase differences are significant. Beginning with a frequency of 2 kHz, the directivity of hearing decreases sharply. At higher frequencies, the difference in signal amplitudes becomes more important. When the difference in amplitudes exceeds the 1 dB threshold, the sound source appears to be on the side where the amplitude is greater.

With an asymmetric position of the listener relative to the loudspeakers, additional intensity and time differences arise, which lead to spatial distortions. Moreover, the further the apparent sound image is from the center of the stereo base (ΔL > 7 dB or Δτ > 0.8 ms), the less it is subject to distortion. At ΔL > 20 dB or Δτ > 3...5 ms the apparent sound images collapse onto the real sources (the loudspeakers) and are no longer subject to spatial distortion.

It has been established experimentally that spatial distortions are absent (imperceptible) if the frequency band of each channel extends upward to at least 10 kHz, and if the high-frequency (above 10 kHz) and low-frequency (below 300 Hz) parts of the spectrum of these signals are reproduced monophonically.

The error in assessing the azimuth of the sound source in the horizontal plane in the front is 3 ... 4 °, behind and in the vertical plane - about 10 ... 15 °, which is explained by the shielding effect of the auricles.

Having considered the theory of propagation and the mechanisms by which sound waves arise, it is worth understanding how sound is "interpreted" or perceived by a person. The paired organ responsible for the perception of sound waves in the human body is the ear. The human ear is a very complex organ with two functions: 1) it perceives sound impulses; 2) it acts as the vestibular apparatus of the whole body, determining the position of the body in space and providing the vital ability to keep one's balance. The average human ear can pick up vibrations of 20 - 20,000 Hz, with individual deviations upward or downward. Ideally, the audible frequency range is 16 - 20,000 Hz, which corresponds to wavelengths of roughly 21 m down to 1.7 cm. The ear is divided into three parts: the outer, middle and inner ear. Each of these "departments" performs its own function, but all three are closely connected with one another and in effect pass the wave of sound vibrations on to each other.
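The wavelength bounds quoted here follow directly from λ = c/f. A minimal check, assuming a speed of sound of about 340 m/s in air:

```python
def wavelength_m(freq_hz: float, c: float = 340.0) -> float:
    """Wavelength of a sound wave in air at the given frequency, assuming c = 340 m/s."""
    return c / freq_hz

for f in (16, 20, 20_000):
    print(f"{f:>6} Hz -> {wavelength_m(f):.3f} m")
# 16 Hz -> ~21 m, 20 Hz -> ~17 m, 20 kHz -> ~0.017 m (1.7 cm)
```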

Outer (external) ear

The outer ear consists of the auricle and the external auditory canal. The auricle is an elastic cartilage of complex shape covered with skin. At the bottom of the auricle is the lobe, which consists of adipose tissue and is also covered with skin. The auricle acts as a receiver of sound waves from the surrounding space. The special shape of the auricle makes it possible to capture sounds better, especially sounds of the mid-frequency range, which carries speech information. This is largely an evolutionary necessity, since a person spends a large part of life in oral communication with members of his own species. The human auricle is practically immobile, unlike that of many animals, which use ear movements to tune more precisely to a sound source.

The folds of the human auricle introduce small corrections (minor distortions) related to the vertical and horizontal position of a sound source in space. It is thanks to this unique feature that a person can determine the location of an object in space relative to himself quite accurately, guided by sound alone. This feature is well known under the term "sound localization." The main function of the auricle is to pick up as many sounds as possible in the audible frequency range. The further fate of the "captured" sound waves is decided in the ear canal, whose length is 25-30 mm. In it the cartilaginous part of the external auditory canal passes into the bony part, and the skin surface of the canal contains sebaceous and ceruminous (wax) glands. At the end of the ear canal is the elastic tympanic membrane, which the sound-wave vibrations reach, causing it to vibrate in response. The eardrum, in turn, transmits these vibrations to the middle ear.

Middle ear

The vibrations transmitted by the eardrum enter an area of the middle ear called the tympanic cavity. This is a space of about one cubic centimeter in volume containing three auditory ossicles: the malleus, incus and stapes. It is these "intermediate" elements that perform the essential function of transmitting sound waves to the inner ear while simultaneously amplifying them. The auditory ossicles form an intricate chain of sound transmission. All three bones are closely connected with one another and with the tympanic membrane, so that the vibrations are passed "along the chain." On the way to the inner ear lies the window of the vestibule (the oval window), which is closed by the base of the stapes. To equalize the pressure on both sides of the tympanic membrane (for example, when the external pressure changes), the middle-ear cavity is connected to the nasopharynx through the Eustachian tube. We are all familiar with the ear-popping effect, which arises precisely because of this fine adjustment. From the middle ear the sound vibrations, now amplified, enter the inner ear, the most complex and sensitive part.

Inner ear

The inner ear has the most complex shape and is for this reason called the labyrinth. The bony labyrinth includes the vestibule, the cochlea and the semicircular canals, the latter two structures of the vestibular apparatus being responsible for balance. Of these, it is the cochlea that is directly related to hearing. The cochlea is a spiral membranous canal filled with lymphatic fluid. Internally the canal is divided into two parts by a membranous septum called the main (basilar) membrane. This membrane consists of fibers of various lengths (more than 24,000 in total), stretched like strings, each of which resonates to its own particular sound. The membrane divides the canal into an upper and a lower passage, which communicate at the apex of the cochlea. At the opposite end the canal is connected to the receptor apparatus of the auditory analyzer, which is covered with tiny hair cells; this receptor apparatus is also called the organ of Corti. When vibrations from the middle ear enter the cochlea, the lymphatic fluid filling the canal also begins to vibrate, transmitting the vibrations to the main membrane. At this point the receptor apparatus of the auditory analyzer comes into play: its hair cells, arranged in several rows, convert the sound vibrations into electrical "nerve" impulses, which are transmitted along the auditory nerve to the temporal region of the cerebral cortex. In this complex and intricate way a person finally hears the desired sound.

Features of perception and formation of speech

The mechanism of speech formation developed in humans over the entire course of evolution. The purpose of this ability is the transmission of verbal and non-verbal information: the former carries the verbal, semantic load, the latter conveys the emotional component. The process of creating and comprehending speech includes: formulating the message; encoding it into elements according to the rules of the given language; transient neuromuscular actions; movement of the vocal cords; and emission of the acoustic signal. Then the listener comes into play, carrying out: spectral analysis of the received acoustic signal and extraction of acoustic features in the peripheral auditory system; transmission of the extracted features over neural networks; recognition of the language code (linguistic analysis); and understanding of the meaning of the message.
The apparatus for forming speech signals can be compared with a complex wind instrument, however, the versatility and flexibility of setting and the ability to reproduce the slightest subtleties and details have no analogues in nature. The voice-forming mechanism consists of three inseparable components:

  1. Generator - the lungs, acting as a reservoir of air. The energy of excess pressure is stored in the lungs and is then expelled, with the help of the muscular system, through the trachea, which is connected to the larynx. At this stage the air stream is interrupted and modified;
  2. Vibrator - the vocal cords. The flow is also affected by turbulent air jets (which create edge tones) and by impulse sources (plosive bursts);
  3. Resonator - the resonant cavities of complex geometric shape (the pharynx, the mouth and the nasal cavity).

The individual configuration of these elements taken together forms the unique, individual timbre of each person's voice.

The energy of the air column is generated in the lungs, which create a certain air flow during inhalation and exhalation owing to the difference between atmospheric and intrapulmonary pressure. Energy is accumulated during inhalation and released during exhalation. This happens through the compression and expansion of the chest, carried out by two muscle groups, the intercostal muscles and the diaphragm; with deep, forced breathing and singing, the muscles of the abdominal press, chest and neck also contract. On inhalation the diaphragm contracts and moves down, the contraction of the external intercostal muscles raises the ribs and draws them out to the sides, and the sternum moves forward. The enlargement of the chest leads to a drop in pressure inside the lungs (relative to atmospheric), and this space is rapidly filled with air. On exhalation the muscles relax and everything returns to its previous state: the rib cage returns to its original position under its own weight, the diaphragm rises, the volume of the previously expanded lungs decreases, and the intrapulmonary pressure increases. Inhalation can thus be described as the active, energy-consuming process, and exhalation as the passive process in which the stored energy is released. Control of breathing and speech formation normally occurs unconsciously, but in singing the control of breathing requires a conscious approach and long additional training.

The amount of energy subsequently spent on forming speech and voice depends on the volume of stored air and on the amount of additional pressure in the lungs. The maximum sound pressure developed by a trained opera singer can reach 100-112 dB. The modulation of the air flow by the vibration of the vocal cords and the creation of excess subglottal pressure take place in the larynx, which is a kind of valve located at the end of the trachea. The valve has a dual function: it protects the lungs from foreign objects and it sustains high pressure. It is the larynx that acts as the source of speech and singing. The larynx is a set of cartilages connected by muscles; it has a rather complex structure, the main element of which is a pair of vocal cords. The vocal cords are the main (but not the only) source of voice formation, the "vibrator." During this process the vocal cords move with friction; to protect against it, a special mucous secretion is produced that acts as a lubricant. The formation of speech sounds is determined by the vibrations of the cords, which give the stream of air exhaled from the lungs a particular amplitude characteristic. Between the vocal folds there are small cavities that act as acoustic filters and resonators when required.

Features of auditory perception, listening safety, hearing thresholds, adaptation, correct volume level

As can be seen from the description of the structure of the human ear, this organ is very delicate and rather complex in structure. Taking this fact into account, it is not difficult to determine that this extremely thin and sensitive apparatus has a set of limitations, thresholds, etc. The human auditory system is adapted to the perception of soft sounds, as well as sounds of medium intensity. Prolonged exposure to loud sounds entails irreversible shifts in the auditory thresholds, as well as other hearing problems, up to complete deafness. The degree of damage is directly proportional to the exposure time in a loud environment. At this moment, the adaptation mechanism also comes into force - i.e. under the influence of prolonged loud sounds, the sensitivity gradually decreases, the perceived volume decreases, and the hearing adapts.

Adaptation is initially intended to protect the hearing organs from excessively loud sounds; however, it is precisely this process that most often makes a person turn the volume of an audio system up uncontrollably. Protection is provided by a mechanism of the middle and inner ear: the stapes is pulled back from the oval window, thereby shielding the inner ear from excessively loud sounds. But this protective mechanism is not ideal and has a time lag, triggering only 30-40 ms after the onset of the sound; moreover, full protection is not reached even after 150 ms. The protective mechanism is activated when the volume level exceeds 85 dB, and the protection itself amounts to no more than about 20 dB.
The most dangerous phenomenon in this context is the "auditory threshold shift," which in practice usually results from prolonged exposure to loud sounds above 90 dB. It can take up to 16 hours for the hearing system to recover from such harmful exposure. The threshold shift begins already at an intensity level of 75 dB and increases proportionally as the signal level rises.
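To put the 85-90 dB figures in context, occupational guidelines such as the NIOSH recommendation (85 dBA for 8 hours, with permissible time halving for every 3 dB increase) give a rough feel for how quickly the safe exposure time shrinks. This rule is an outside reference used only for orientation, not something stated in the article; a minimal sketch:

```python
def permissible_hours(level_dba: float, ref_level: float = 85.0,
                      ref_hours: float = 8.0, exchange_db: float = 3.0) -> float:
    """Permissible daily exposure under an 85 dBA / 8 h rule with a 3 dB exchange rate
    (NIOSH-style recommendation; used here purely for orientation)."""
    return ref_hours / (2 ** ((level_dba - ref_level) / exchange_db))

for level in (85, 91, 100, 110):
    hours = permissible_hours(level)
    print(f"{level} dBA -> {hours:6.2f} h ({hours * 60:7.1f} min)")
# 85 dBA -> 8 h, 91 dBA -> 2 h, 100 dBA -> ~15 min, 110 dBA -> ~1.5 min
```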

The worst thing to bear in mind when considering the problem of the correct sound intensity level is the fact that hearing problems (acquired or congenital) are practically untreatable even in this age of advanced medicine. This should lead any sensible person to think about caring for their hearing if, of course, they intend to preserve its original integrity and the ability to hear the entire frequency range for as long as possible. Fortunately, everything is not as frightening as it might seem at first glance: by observing a number of precautions you can easily preserve your hearing even into old age. Before considering these measures, one important feature of human auditory perception must be recalled. The auditory apparatus perceives sounds non-linearly. The phenomenon is as follows: if we take a single pure-tone frequency, say 300 Hz, the non-linearity shows up as overtones of this fundamental frequency appearing in the hearing organ (if the fundamental frequency is taken as f, the overtone frequencies will be 2f, 3f and so on in ascending order). This non-linearity is easier to grasp under the familiar name of "non-linear distortion." Since such harmonics (overtones) are not present in the original pure tone, it turns out that the ear itself introduces its own corrections and overtones into the original sound, although they can only be regarded as subjective distortions. At intensity levels below 40 dB these subjective distortions do not arise. As the intensity rises above 40 dB, the level of the subjective harmonics begins to grow, but even at 80-90 dB their negative contribution to the sound is relatively small (which is why this intensity level can conventionally be considered a kind of "golden mean" in music listening).

Based on this information, one can deduce a safe and acceptable volume level that will not harm the hearing organs and at the same time will make it possible to hear all the features and details of the sound, for example when working with a "hi-fi" system. This "golden mean" level is approximately 85-90 dB. It is at such a sound intensity that one can realistically hear everything the audio path has to offer, while the risk of premature damage and hearing loss is minimized; a loudness level of 85 dB can be considered almost completely safe. To understand why loud listening is dangerous and why too low a volume level does not allow all the nuances of the sound to be heard, let us consider the issue in more detail. As for low volume levels, the lack of point (though more often it is a matter of subjective preference) in listening to music at low levels is due to the following reasons:

  1. Non-linearity of human auditory perception;
  2. Features of psychoacoustic perception, which will be considered separately.

The non-linearity of auditory perception discussed above has a significant effect at any volume below 80 dB. In practice it looks like this: if you turn on music at a quiet level, say 40 dB, the mid-frequency range of the composition will be heard most clearly, whether it is the performer's vocal or the instruments playing in that range. At the same time there will be an obvious lack of low and high frequencies, owing precisely to the non-linearity of perception and to the fact that different frequencies are heard at different loudness. It is therefore clear that, for the full picture to be perceived in its entirety, the intensity levels at different frequencies must be equalized toward a single value as far as possible. Although even at a loudness level of 85-90 dB an idealized equalization of the loudness of different frequencies does not occur, the level becomes acceptable for normal everyday listening. The lower the volume, the more clearly this characteristic non-linearity will be heard, namely as a feeling that the proper amount of high and low frequencies is missing. At the same time, it is obvious that with such non-linearity one cannot speak seriously of high-fidelity "hi-fi" reproduction, because the accuracy of the original sound picture will be extremely low in this situation.

If you think these conclusions through, it becomes clear why listening to music at a low volume level, although the safest from a health point of view, is perceived so negatively by the ear: it creates obviously implausible images of the musical instruments and the voice and robs the sound stage of its scale. In general, quiet playback can be used as background accompaniment, but listening to true "hi-fi" quality at low volume is pointless for the reasons given above: it is impossible to recreate the naturalistic images of the sound stage that the sound engineer formed in the studio at the recording stage. But it is not only low volume that limits the perception of the final sound; the situation is much worse with excessive volume. It is quite easy to damage your hearing and appreciably reduce its sensitivity if you listen to music at levels above 90 dB for a long time. These conclusions are based on a large number of medical studies, which find that sound above 90 dB causes real and almost irreparable harm to health. The mechanism of this phenomenon lies in auditory perception and the structural features of the ear. When a sound wave with an intensity above 90 dB enters the auditory canal, the organs of the middle ear come into play and a phenomenon called auditory adaptation occurs.

The principle of what happens in this case is as follows: the stapes is pulled back from the oval window and protects the inner ear from excessively loud sounds. This process is called the acoustic reflex. To the ear it is perceived as a short-term decrease in sensitivity, which may be familiar to anyone who has ever attended a rock concert in a club, for example. After such a concert a short-term drop in sensitivity occurs, which is restored to its previous level after a certain time. However, sensitivity is not always restored, and recovery depends directly on age. Herein lies the great danger of listening loudly to music and other sounds whose intensity exceeds 90 dB. The onset of the acoustic reflex is not the only "visible" danger of losing auditory sensitivity. With prolonged exposure to excessively loud sounds, the hairs located in the inner ear (which respond to vibrations) are deflected very strongly: a hair responsible for the perception of a certain frequency is bent by sound vibrations of large amplitude, and at some point it may be deflected too far and never return. This causes a corresponding loss of sensitivity at that specific frequency!

The worst thing in this whole situation is that ear diseases are practically untreatable, even with the most modern methods known to medicine. This leads to certain serious conclusions: sound above 90 dB is hazardous to health and is almost guaranteed to cause premature hearing loss or a significant decrease in sensitivity. Even more unpleasant is that the adaptation property mentioned earlier comes into play over time. This process in the human hearing organs occurs almost imperceptibly: a person who is slowly losing sensitivity will, with near-100% probability, not notice it until the people around them start drawing attention to the constant questioning, such as "What did you just say?" The conclusion is extremely simple: when listening to music it is vital not to allow sound intensity levels above 80-85 dB! There is also a positive side to the same point: a volume level of 80-85 dB roughly corresponds to the level at which music is recorded in a studio environment. Thus arises the concept of the "golden mean," above which it is better not to go if health matters at all.

Even fairly brief listening to music at a level of 110-120 dB, for example at a live concert, can cause hearing problems. Obviously, avoiding this is sometimes impossible or very difficult, but it is extremely important to try in order to preserve the integrity of auditory perception. Theoretically, short-term exposure to loud sounds (not exceeding 120 dB), stopped before the onset of "auditory fatigue," does not lead to serious negative consequences. In practice, however, exposure to sound of such intensity is usually prolonged. People deafen themselves without realizing the full extent of the danger: in the car listening to the audio system, at home in similar conditions, or in the headphones of a portable player. Why does this happen, and what makes the sound get turned up louder and louder? There are two answers to this question: 1) the influence of psychoacoustics, which will be discussed separately; 2) the constant need to make the music "shout over" certain external sounds. The first aspect of the problem is quite interesting and will be examined in detail later; the second says more about a mistaken understanding of the true foundations of proper listening to "hi-fi"-class sound.

Without going into particulars, the general conclusion about listening to music at the correct volume is this: music should be listened to at sound intensity levels no higher than 90 dB and no lower than 80 dB, in a room in which extraneous sounds from outside sources are strongly muffled or absent altogether (such as neighbors' conversations and other noise from behind the apartment walls, or street and mechanical noise if you are in a car, etc.). I would like to stress once and for all that it is precisely by meeting these admittedly stringent requirements that the long-sought balance of loudness can be achieved, one that causes no premature, unwanted damage to the hearing organs and at the same time brings true pleasure from listening to your favorite music with the finest details of sound at high and low frequencies and with the precision pursued by the very concept of "hi-fi" sound.

Psychoacoustics and peculiarities of perception

In order to answer as fully as possible some important questions concerning the final human perception of sound information, there is a whole branch of science that studies a huge variety of such aspects. This branch is called psychoacoustics. The point is that auditory perception does not end with the work of the hearing organs. After the direct perception of sound by the organ of hearing (the ear), a most complex and still poorly understood mechanism for analyzing the received information comes into play, for which the human brain is entirely responsible. The brain is constructed in such a way that during operation it generates waves of certain frequencies, which are likewise measured in hertz (Hz). Different frequencies of brain waves correspond to particular states of a person. It thus turns out that listening to music helps to change the frequency tuning of the brain, and this is important to bear in mind when listening to musical compositions. On the basis of this theory there is also a method of sound therapy acting directly on a person's mental state. Brain waves are of five types (a minimal band lookup follows the list below):

  1. Delta waves (waves below 4 Hz). Corresponds to a state of deep sleep without dreams, while the sensations of the body are completely absent.
  2. Theta waves (waves 4-7 Hz). A state of sleep or deep meditation.
  3. Alpha waves (waves 7-13 Hz). Relaxation and rest during wakefulness; drowsiness.
  4. Beta waves (waves 13-40 Hz). The state of activity, everyday thinking and mental activity, arousal and cognition.
  5. Gamma waves (waves above 40 Hz). A state of intense mental alertness, fear, excitement, and awareness.
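A trivial lookup over the band boundaries listed above, just to make the classification explicit (the boundaries are the article's; the code is only an illustration):

```python
def brain_wave_band(freq_hz: float) -> str:
    """Map a brain-wave frequency to the band names listed above."""
    if freq_hz < 4:
        return "delta (deep dreamless sleep)"
    elif freq_hz < 7:
        return "theta (sleep or deep meditation)"
    elif freq_hz < 13:
        return "alpha (relaxed wakefulness, drowsiness)"
    elif freq_hz <= 40:
        return "beta (active everyday thinking, arousal)"
    else:
        return "gamma (intense mental activity, excitement)"

for f in (2, 6, 10, 25, 60):
    print(f"{f:>2} Hz -> {brain_wave_band(f)}")
```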

Psychoacoustics, as a branch of science, seeks answers to the most interesting questions concerning the final perception of sound information by a person. In the course of studying this process, a huge number of factors are revealed whose influence is invariably present both while listening to music and in any other case of processing and analyzing sound information. Psychoacoustics studies almost the entire range of possible influences, from the emotional and mental state of a person at the moment of listening, to the structure of the vocal cords (if we are talking about perceiving all the subtleties of a vocal performance) and the mechanism by which sound is converted into electrical impulses of the brain. The most interesting and most important of these factors (which it is vital to take into account every time you listen to your favorite musical compositions, and also when building a professional audio system) are considered below.

The concept of consonance and musical concord

The human auditory system is unique above all in its mechanism of sound perception, the non-linearity of the auditory system, and its ability to group sounds by pitch with a fairly high degree of accuracy. The most interesting feature of perception is the non-linearity of the auditory system, which manifests itself as additional harmonics that do not exist in the fundamental tone; it shows up especially often in people with a musical or absolute ear. If we dwell on this in more detail and analyze all the subtleties of the perception of musical sound, the concepts of "consonance" and "dissonance" of various chords and intervals emerge naturally. Consonance is defined as a concordant sound (from the French word for "agreement"), and dissonance, conversely, as a discordant, clashing sound. Despite the variety of interpretations of these characteristics of musical intervals, it is most convenient to use the "musical-psychological" interpretation of the terms: consonance is defined and felt by a person as a pleasant, comfortable, soft sound, while dissonance can be characterized as a sound that causes irritation, anxiety and tension. Such terminology is somewhat subjective; moreover, over the history of music entirely different intervals have at times been taken as "consonant" and vice versa.

Nowadays these concepts are also difficult to interpret unambiguously, since people differ in their musical preferences and tastes, and there is no universally recognized and agreed concept of harmony. The psychoacoustic basis for perceiving various musical intervals as consonant or dissonant rests on the concept of the critical band. The critical band is a certain bandwidth within which auditory sensations change sharply. The width of the critical bands increases with frequency. Therefore the sensation of consonance and dissonance is directly related to the existence of critical bands. As mentioned earlier, the human hearing organ (the ear) acts at a certain stage of the analysis of sound waves as a set of bandpass filters. This role is assigned to the basilar membrane, along which 24 critical bands of frequency-dependent width are arranged.

Thus, concord and discord (consonance and dissonance) depend directly on the resolving power of the auditory system. If two different tones sound in unison, i.e. the frequency difference is zero, this is perfect consonance. Consonance also occurs if the frequency difference is greater than the critical band. Dissonance arises only when the frequency difference is between 5% and 50% of the critical bandwidth, and the highest degree of dissonance in this range is heard when the difference is about a quarter of the critical bandwidth. On this basis it is easy to analyze any mixed musical recording and combination of instruments for consonance or dissonance of the sound. It is not hard to guess how large a role is played here by the sound engineer, the recording studio and the other components that shape the final digital or analog master of the soundtrack, all of this even before any attempt to reproduce it on sound-reproducing equipment.
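A sketch of the rule just described. The article mentions 24 critical bands but gives no formula, so the Glasberg-Moore equivalent rectangular bandwidth (ERB) is used here as a stand-in approximation of the critical bandwidth; that choice, and the 5%-50% boundaries applied in code form, are assumptions for illustration:

```python
def critical_bandwidth_hz(center_freq_hz: float) -> float:
    """Equivalent rectangular bandwidth (Glasberg & Moore) used here as an
    approximation of the critical band around a given center frequency."""
    return 24.7 * (4.37 * center_freq_hz / 1000.0 + 1.0)

def interval_character(f1_hz: float, f2_hz: float) -> str:
    """Classify a two-tone interval as consonant or dissonant using the
    5%-50%-of-critical-band rule quoted in the text (roughest near 25%)."""
    center = (f1_hz + f2_hz) / 2.0
    ratio = abs(f2_hz - f1_hz) / critical_bandwidth_hz(center)
    if ratio < 0.05 or ratio > 0.5:
        return f"consonant (difference = {ratio:.0%} of the critical band)"
    return f"dissonant (difference = {ratio:.0%} of the critical band, worst near 25%)"

print(interval_character(440, 440))   # unison -> consonant
print(interval_character(440, 466))   # a semitone -> dissonant
print(interval_character(440, 660))   # a fifth -> consonant
```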

Sound localization

The system of binaural hearing and spatial localization helps a person to perceive the fullness of the spatial sound picture. This perception mechanism is realized through two hearing receivers and two auditory canals. The sound information arriving through these channels is then processed in the peripheral part of the auditory system and subjected to spectral-temporal analysis. Further, this information is passed to the higher parts of the brain, where the difference between the left and right sound signals is compared and a single sound image is formed. This mechanism is called binaural hearing. Thanks to it, a person has the following unique capabilities:

1) localization of sound signals from one or more sources, while forming a spatial picture of the perception of the sound field
2) separation of signals coming from different sources
3) highlighting some signals against the background of others (for example, separating speech and voice from noise or the sound of instruments)

Spatial localization is easy to observe with a simple example. At a concert, with a stage and a number of musicians standing on it at certain distances from one another, you can easily (if you wish, even with your eyes closed) determine the direction of arrival of the sound of each instrument and assess the depth and spaciousness of the sound field. In the same way a good hi-fi system is valued for being able to reliably "reproduce" such effects of spaciousness and localization, thereby in effect "deceiving" the brain and making you feel fully present at a live performance of your favorite performer. The localization of a sound source is usually determined by three main factors: temporal, intensity and spectral. Independently of these factors, there are a number of regularities from which the basics of sound localization can be understood.

The localization effect perceived by the human hearing organs is greatest in the mid-frequency region, while it is practically impossible to determine the direction of sounds at frequencies above 8000 Hz and below 150 Hz. The latter fact is widely exploited in hi-fi and home-theater systems when choosing the placement of the subwoofer (the low-frequency link): because frequencies below 150 Hz are not localized, its position in the room is practically irrelevant, and the listener in any case receives a coherent image of the sound stage. Localization accuracy depends on the position of the source of the sound waves in space. The greatest accuracy of sound localization is observed in the horizontal plane, reaching a value of about 3°. In the vertical plane the human auditory system determines the direction of the source much more poorly, the accuracy being about 10-15° (because of the specific structure of the auricles and the complex geometry involved). Localization accuracy also varies somewhat with the angle of the sound-emitting objects relative to the listener, and the final result is influenced by the degree of diffraction of the sound waves around the listener's head. It should also be noted that wideband signals are localized better than narrowband noise.

The situation with judging the depth of a sound is even more interesting. A person can estimate the distance to an object by its sound, but this happens largely through the change of sound pressure in space: the farther the object is from the listener, the more the sound waves are attenuated in free space (in a room, the influence of reflected waves is added). We can therefore conclude that localization accuracy is higher in a closed room precisely because of the reverberation that arises there. The reflected waves in closed rooms give rise to such interesting effects as widening of the sound stage, envelopment and so on; these phenomena are possible precisely thanks to three-dimensional sound localization. The main dependencies that determine horizontal localization are: 1) the difference in the arrival time of the sound wave at the left and right ear; 2) the difference in intensity arising from diffraction at the listener's head. For judging the depth of a sound, the difference in sound pressure level and the difference in spectral composition are important. Localization in the vertical plane also depends strongly on diffraction in the auricle.
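
To make the first of these horizontal cues concrete, the sketch below estimates the interaural time difference with Woodworth's spherical-head approximation. The head radius, the speed of sound and the formula itself are standard textbook assumptions rather than values given in this article.

    import math

    SPEED_OF_SOUND = 343.0   # m/s in air at about 20 C
    HEAD_RADIUS = 0.0875     # m, a typical average head radius (assumed)

    def interaural_time_difference(azimuth_deg):
        """Woodworth's spherical-head estimate of the ITD (in seconds) for a
        distant source at the given azimuth (0 deg = straight ahead,
        90 deg = fully to one side)."""
        theta = math.radians(azimuth_deg)
        return (HEAD_RADIUS / SPEED_OF_SOUND) * (theta + math.sin(theta))

    # A source 30 deg to the side reaches the near ear roughly a quarter of a
    # millisecond earlier; at 90 deg the difference grows to about 0.66 ms.
    print(f"{interaural_time_difference(30) * 1e3:.2f} ms")
    print(f"{interaural_time_difference(90) * 1e3:.2f} ms")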

The situation is more complicated with modern surround sound systems based on Dolby Surround technology and its analogues. It would seem that the way home theater systems are built clearly prescribes how to recreate a fairly naturalistic spatial picture of 3D sound, with its inherent volume and localization of virtual sources in space. However, not everything is so trivial, because the mechanisms of perceiving and localizing a large number of sound sources are usually not taken into account. The transformation of sound by the organs of hearing involves summing the signals from different sources that arrive at the two ears. Moreover, if the phase structure of the different sounds is more or less synchronous, such a process is perceived by the ear as sound emanating from a single source. There are also a number of difficulties, including peculiarities of the localization mechanism, that reduce the accuracy of determining the direction of a source in space.

In view of the above, the most difficult task is separating sounds from different sources, especially when those sources reproduce a similar amplitude-frequency signal. And this is exactly what happens in practice in any modern surround sound system, and even in an ordinary stereo system. When a person listens to a large number of sounds from different sources, the hearing first assigns each specific sound to the source that produces it (grouping by frequency, pitch, timbre). Only at the second stage does the hearing try to localize the source. After that, the incoming sounds are divided into streams on the basis of spatial attributes (difference in arrival time, difference in amplitude). On the basis of this information a more or less static, fixed auditory image is formed, from which it is possible to determine where each specific sound comes from.

It is convenient to trace these processes using the example of an ordinary stage with musicians in fixed positions on it. Interestingly, if the vocalist or performer, having taken up an initial position on the stage, begins to move smoothly around it in any direction, the previously formed auditory image does not change! The perceived direction of the sound coming from the vocalist remains subjectively the same, as if he were standing in the place he occupied before moving. Only a sharp change in the performer's position on the stage splits the sound image. Besides the problems considered and the complexity of localizing sounds in space, in multichannel surround sound systems the reverberation of the final listening room plays a considerable role. This dependence is seen most clearly when a large number of reflected sounds arrive from all directions: localization accuracy deteriorates significantly. If the energy of the reflected waves exceeds (prevails over) that of the direct sounds, the localization criterion in such a room becomes extremely blurred, and it is extremely difficult (if not impossible) to speak of accurately locating such sources.

Nevertheless, even in a highly reverberant room localization is possible in principle: with broadband signals the hearing relies on the intensity difference, and the direction is then determined by the high-frequency part of the spectrum. In any room, localization accuracy depends on how soon the reflected sounds arrive after the direct ones. If the gap between these sound signals is very short, the "law of the first wavefront" comes to the aid of the auditory system. Its essence: if sounds arrive from different directions with a short delay between them, the whole sound is localized according to the first-arriving one; that is, the hearing to some extent ignores a reflected sound that arrives too soon after the direct one. A similar effect also appears when the direction of arrival is determined in the vertical plane, but there it is much weaker (because the sensitivity of the auditory system to vertical localization is noticeably worse).
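
The time scales at which the law of the first wavefront operates are not given in the article; the little sketch below uses commonly cited approximate thresholds (about 1 ms for summing localization and a few tens of milliseconds before a reflection is heard as a separate echo) purely to illustrate the idea.

    def fusion_category(delay_ms):
        """Rough psychoacoustic categories for a reflection arriving delay_ms
        after the direct sound. The thresholds are approximate, commonly cited
        values and depend on the signal type and level."""
        if delay_ms < 1.0:
            return "summing localization: both arrivals pull the image"
        if delay_ms < 30.0:
            return "precedence: the image stays at the first-arriving sound"
        return "echo: the reflection is heard as a separate event"

    for d in (0.5, 5.0, 15.0, 60.0):
        print(f"{d:5.1f} ms -> {fusion_category(d)}")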

The essence of the precedence effect is much deeper and is psychological rather than physiological in nature. A large number of experiments have been carried out, on the basis of which the following dependence was established: the effect arises mainly when the arrival time of the echo, its amplitude and its direction coincide with the listener's "expectation" of how the acoustics of this particular room should form the sound image. Perhaps the person has already had experience of listening in this room or a similar one, which predisposes the auditory system to the "expected" precedence effect. To get around these limitations of human hearing, when several sound sources are involved, various tricks are used with whose help a more or less plausible localization of musical instruments and other sound sources in space is ultimately formed. By and large, the reproduction of stereo and multichannel sound images rests on a great deal of deception and the creation of an auditory illusion.

When two or more loudspeakers (for example in a 5.1, 7.1 or even 9.1 setup) reproduce sound from different points in the room, the listener hears sounds coming from non-existent, imaginary sources and perceives a certain sound stage. The possibility of this deception lies in the biological features of the human body: most likely, people have not yet adapted to recognizing such a deception, because the principles of "artificial" sound reproduction appeared relatively recently. But although creating an imaginary localization has proved possible, the implementation is still far from perfect. The ear really does perceive a source where none actually exists, but the correctness and accuracy with which the sound information (in particular, the timbre) is conveyed is a big question. Numerous experiments in ordinary reverberant rooms and in anechoic chambers have shown that the timbre of sound waves from real and imaginary sources differs. This mainly affects the subjective perception of spectral loudness; the timbre in this case is changed significantly and noticeably (compared with a similar sound reproduced by a real source).

In multichannel home theater systems the level of distortion is noticeably higher, for several reasons: 1) many sound signals similar in amplitude-frequency and phase characteristics arrive at each ear canal simultaneously from different sources and directions (including reflected waves), which leads to increased distortion and comb filtering; 2) the large spacing of the loudspeakers (relative to each other, in multichannel systems this distance can be several meters or more) contributes to timbre distortion and coloration of the sound in the region of the imaginary source. As a result, we can say that tonal coloration in multichannel and surround sound systems arises in practice for two reasons: comb filtering and the reverberation processes in the particular room. If more than one source is responsible for reproducing the sound information (this also applies to a stereo system with two sources), the "comb filtering" effect is inevitable, caused by the different arrival times of the sound waves at each ear canal. The unevenness is especially pronounced in the upper midrange, around 1-4 kHz.
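
Comb filtering itself is easy to quantify: adding an equal-level copy of a signal delayed by tau multiplies its spectrum by 1 + e^(-j*2*pi*f*tau), producing peaks and notches that repeat every 1/tau hertz. The sketch below evaluates that magnitude response; the 0.5 ms delay is an assumed illustrative value, chosen because it puts the first notch at 1 kHz, right at the lower edge of the 1-4 kHz region mentioned above.

    import math

    def comb_magnitude_db(freq_hz, delay_s):
        """Magnitude (dB) of a signal summed with an equal-level copy delayed
        by delay_s: |1 + exp(-j*2*pi*f*tau)| = 2*|cos(pi*f*tau)|."""
        mag = 2.0 * abs(math.cos(math.pi * freq_hz * delay_s))
        return 20.0 * math.log10(mag) if mag > 1e-12 else float("-inf")

    # A 0.5 ms path difference (about 17 cm in air) gives notches at
    # 1 kHz, 3 kHz, 5 kHz ... and peaks in between.
    tau = 0.5e-3
    for f in (500, 1000, 2000, 3000):
        print(f"{f:5d} Hz: {comb_magnitude_db(f, tau):7.1f} dB")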

The human auditory analyzer is a specialized system for the perception of sound vibrations, the formation of auditory sensations and the recognition of sound images. The accessory apparatus of the peripheral part of the analyzer is the ear (Figure 15).

One distinguishes the outer ear, which includes the auricle, the external auditory canal and the tympanic membrane; the middle ear, consisting of a system of interconnected auditory ossicles, the malleus, incus and stapes; and the inner ear, which includes the cochlea, where the receptors that receive sound vibrations are located, as well as the vestibule and the semicircular canals. The semicircular canals form the peripheral receptor part of the vestibular analyzer, which will be discussed separately.

The outer ear is designed to deliver sound energy to the eardrum. The auricles provide a modest concentration of this energy, while the external auditory canal maintains constant temperature and humidity, factors that determine the stability of the sound-transmitting apparatus.

The eardrum is a thin septum, about 0.1 millimeter thick, made up of fibers running in different directions. The function of the tympanic membrane is well reflected in its name: it begins to oscillate when sound vibrations of the air reach it from the external auditory canal. Moreover, its structure allows it to transmit almost without distortion all the frequencies of the audio range. The ossicular system transfers vibrations from the eardrum to the cochlea.

The receptors that provide the perception of sound vibrations are located in the inner ear - in the cochlea (Figure 16). This name is associated with the spiral shape of this formation, consisting of 2.5 turns.

In the middle canal of the cochlea, on the main membrane, lies the organ of Corti (named after the Italian anatomist Alfonso Corti, 1822-1876). In this organ the receptor apparatus of the auditory analyzer is located (Figure 17).

How does the sensation of sound take shape? This question is still attracting the close attention of researchers. The first truly convincing interpretation of the processes in the inner ear was presented in 1863 by the German physiologist Hermann Ludwig Ferdinand Helmholtz, who developed the so-called resonance theory. He drew attention to the fact that the main membrane of the cochlea is formed by fibers running in the transverse direction, whose length increases towards the apex of the cochlea. Hence the analogy with the harp, in which different pitches are produced by strings of different lengths, is understandable. According to Helmholtz, under the action of sound vibrations the particular fiber responsible for the perception of that frequency comes into resonance. The theory is captivating in its simplicity and completeness, but, alas, it had to be abandoned: it turned out that the fibers, the "strings" of the main membrane, are too few to reproduce all the frequencies audible to a person, that they are too weakly tensioned, and, moreover, that their isolated vibration is impossible. These difficulties proved insurmountable for the resonance theory, but they served as an impetus for subsequent research.

According to modern concepts, the transmission and reproduction of sound vibrations are due to the frequency-resonance properties of all the media of the cochlea. Very ingenious experiments have shown that at low vibration frequencies (100-150 hertz, perhaps somewhat higher, but not more than 1000 hertz) the wave process covers the entire main membrane, and all the receptors of the organ of Corti located on it are excited. As the frequency of the sound waves increases, only part of the main membrane is involved in the oscillatory process, and the smaller that part, the higher the sound. In this case, the resonance maximum shifts towards the base of the cochlea.
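
A convenient way to express this place-frequency relationship numerically is Greenwood's empirical map of the human cochlea, sketched below. The formula and its constants are a widely used approximation from the literature, not values derived in this article.

    def greenwood_frequency(x):
        """Greenwood's place-frequency map for the human cochlea: the
        characteristic frequency (Hz) at relative position x along the main
        membrane, where x = 0 is the apex and x = 1 is the base. The constants
        are the commonly quoted human values (an assumption of this sketch)."""
        return 165.4 * (10.0 ** (2.1 * x) - 0.88)

    # Low frequencies have their maximum near the apex, high frequencies near
    # the base (the oval window), matching the shift described above.
    for x in (0.0, 0.25, 0.5, 0.75, 1.0):
        print(f"x = {x:.2f} -> {greenwood_frequency(x):8.0f} Hz")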

However, we have not yet considered how the energy of mechanical vibrations is transformed into the process of nervous excitation. The receptor apparatus of the auditory analyzer is represented by peculiar hair cells, which are typical mechanoreceptors, that is, cells for which mechanical energy, in this case oscillatory movement, serves as the adequate stimulus. A specific feature of hair cells is the presence of hairs at their apex, which are in direct contact with the integumentary membrane. In the organ of Corti one row (about 3.5 thousand) of inner hair cells and three rows (about 12 thousand) of outer hair cells are distinguished, and the two types differ in sensitivity. More energy is required to excite the inner cells, and this is one of the mechanisms that allows the organ of hearing to perceive sound stimuli over a wide range of intensities.

When an oscillatory process arises in the cochlea, the movements of the main membrane, and with it the organ of Corti, deform the hairs that abut the integumentary membrane. This deformation is the starting point in the chain of events leading to excitation of the receptor cells. In a special experiment it was found that if, while a sound signal is delivered, the biocurrents are picked up from the surface of the hair cells and then, after amplification, fed to a loudspeaker, a fairly accurate reproduction of the sound signal is obtained. This reproduction applies to all frequencies, including the human voice. Is this not a close analogy with a microphone? Hence the name: the microphone potential. It has been proven that this bioelectric phenomenon is the receptor potential. It follows that the hair receptor cell reflects quite accurately (up to a certain limit of intensity), through the parameters of its receptor potential, the parameters of the sound stimulus: frequency, amplitude and shape.

Electrophysiological recordings from the fibers of the auditory nerve, which run directly from the structures of the organ of Corti, register nerve impulses. It is noteworthy that the frequency of these impulses depends on the frequency of the acting sound vibrations: up to about 1000 hertz the two nearly coincide. Although higher frequencies are not reproduced one-for-one in the nerve, a certain quantitative relationship between the frequency of the sound stimulus and the afferent impulses is retained.

So, we have become acquainted with the properties of the human ear and with how the receptors of the auditory analyzer function when exposed to sound vibrations of the air. But transmission is possible not only through the air: there is also so-called bone conduction. In that case vibrations (for example, of a tuning fork) are transmitted by the bones of the skull and then, bypassing the middle ear, reach the cochlea directly. Although the way the acoustic energy is delivered is different, the mechanism of its interaction with the receptor cells remains the same, though the quantitative relations differ somewhat. In both cases the excitation that originally arose in the receptor and carries certain information is transmitted through the nervous structures to the higher auditory centers.

How is information about such parameters of sound vibrations as frequency and amplitude encoded? First, about frequency. You have probably noticed the peculiar bioelectric phenomenon of the microphone potential of the cochlea. It essentially testifies that over a significant range the fluctuations of the receptor potential (which reflect the work of the receptor both in perception and in subsequent transmission) correspond almost exactly in frequency to the sound vibrations. However, as already noted, in the fibers of the auditory nerve, that is, in the fibers that receive information from the receptors, the impulse frequency does not exceed about 1000 per second, which is far lower than the frequencies of sounds perceived in real conditions. How is this problem solved in the auditory system? Earlier, when examining the work of the organ of Corti, we noted that at low frequencies of sound exposure the entire main membrane vibrates; consequently all receptors are excited, and the vibration frequency is transmitted unchanged to the fibers of the auditory nerve. At high frequencies only part of the main membrane, and therefore only part of the receptors, is involved in the oscillatory process. These receptors transmit the excitation to the corresponding part of the nerve fibers, but with a transformation of the rhythm: a particular group of fibers corresponds to a particular frequency. This principle is referred to as spatial coding. Thus, frequency information is conveyed by frequency coding combined with spatial (place) coding.

However, it is well known that the overwhelming majority of the real sounds we perceive, including speech signals, are not regular sinusoidal oscillations but processes of a much more complex form. How is the transfer of information ensured in this case? Back in the early 19th century the outstanding French mathematician Jean Baptiste Fourier developed an original mathematical method that allows any periodic function to be represented as the sum of a series of sinusoidal components (a Fourier series). It is proved by rigorous mathematical methods that these components have periods equal to T, T/2, T/3 and so on, or, in other words, frequencies that are multiples of the fundamental frequency. And the German physicist Georg Simon Ohm (whom everyone knows for his law in electrical engineering) put forward in 1847 the idea that just such a decomposition takes place in the organ of Corti. This is how another "Ohm's law" appeared, one that reflects a very important mechanism of sound perception. Owing to its resonant properties the main membrane decomposes a complex sound into its components, each of which is perceived by the corresponding neuro-receptor apparatus. Thus, the spatial pattern of excitation carries information about the frequency spectrum of a complex sound vibration.
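
This decomposition is easy to demonstrate numerically. The short sketch below builds one period of a complex tone from a 200 Hz fundamental and two harmonics and then recovers their amplitudes with a discrete Fourier transform; all signal parameters are arbitrary illustrative values.

    import numpy as np

    # One period of a complex periodic tone: a 200 Hz fundamental plus two
    # harmonics, sampled at 8 kHz (all values chosen only for illustration).
    fs = 8000
    f0 = 200
    t = np.arange(fs // f0) / fs          # exactly one period (40 samples)
    signal = (1.00 * np.sin(2 * np.pi * 1 * f0 * t)
              + 0.50 * np.sin(2 * np.pi * 2 * f0 * t)
              + 0.25 * np.sin(2 * np.pi * 3 * f0 * t))

    # The discrete Fourier transform recovers the amplitude of each harmonic,
    # the same decomposition into multiples of the fundamental that Ohm
    # attributed to the organ of Corti.
    spectrum = np.fft.rfft(signal) / (len(signal) / 2)
    for k in range(1, 5):
        print(f"{k * f0:4d} Hz: amplitude {abs(spectrum[k]):.2f}")
    # -> 200 Hz: 1.00, 400 Hz: 0.50, 600 Hz: 0.25, 800 Hz: 0.00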

To transmit information about the intensity of sound, that is, the amplitude of the vibrations, the auditory analyzer uses a mechanism that also differs from the way other afferent systems work. Most often, information about intensity is transmitted by the frequency of nerve impulses; however, in the auditory system, as follows from the processes just considered, such a method is impossible. It turns out that here, too, the principle of spatial coding is used. As already noted, the inner hair cells have a lower sensitivity than the outer ones. Thus, different sound intensities correspond to different combinations of excited receptors of these two types, that is, to a specific form of the spatial pattern of excitation.

In the auditory analyzer the question of specific feature detectors (so clearly expressed in the visual system) is still open; nevertheless, there are mechanisms here that make it possible to single out progressively more complex features, a process that ultimately ends with the formation of a pattern of excitation corresponding to a certain subjective image, recognized against the appropriate "standard".