Understanding the Multi-Mass Model and Sound Generation of Vocal Fold Oscillation

When a speaker speaks, the vocal fold oscillates and generates a voice. The voice resonates in the vocal tract and mouth and is converted to speech as the speaker changes the shapes of the mouth and tongue. The oscillating vocal fold generates a voice because it repeatedly vibrates the surrounding air, and the oscillation produces the fundamental frequency and its harmonics at the same time. These sound generation principles are not easy to understand acoustically, however, without deep knowledge of physics and acoustics. In this paper, therefore, the vocal fold is simplified as a multi-mass model, and an easy way to understand how the fundamental frequency and the harmonic sound are generated simultaneously by vocal fold oscillation is presented.


Introduction
The sound wave is usually propagated by air and is dissipated through viscous damping. On the other hand, the acoustic energy can be harvested rather than dissipated. A scavenge device is utilized to convert the ambient environmental energy into electrical energy.
In one study, a helix structure was proposed to achieve low-frequency acoustic energy harvesting. At an acoustic resonance frequency of 175 Hz and a sound pressure level excitation of 100 dB, the measured experimental data showed that the harvested power could be up to 7.3 μW. The overall structure was compact, indicating that it could be organized into arrays for large-scale application. The proposed structure could also be used as a low-frequency acoustic notch filter for noise control applications [1].
The drawback of acoustic energy is that its power is low. On the other hand, with the rapid development of modern electrical technologies, the power requirements of embedded chips are substantially decreasing, so that the harvested power can be supplied to low-power devices.
In acoustic energy harvesting (AEH), acoustic energy is converted into electrical energy, and this method has potential applications for the Internet of Things.
A review article summarized the mechanisms of AEH, including the Helmholtz resonator approach, the quarter-wavelength resonator approach, and the acoustic metamaterial approach. The article argued that the AEH technique would become an essential part of the environmental energy-harvesting research field [2].
The human voice is generated by the oscillations of two laterally opposing vocal folds located in the larynx. The vocal fold oscillations are excited by the air flow through the trachea generated by the lungs. Recent research has revealed the 3D dynamics of the medial surface of the vocal folds across a number of laboratory models, including the excised human larynx, the in vivo canine larynx, and physical models. The 3DM (3D multi-mass model) was developed to help conceptualize these data into one coherent model of vocal fold vibration. Preliminary data from the 3DM demonstrate its ability to oscillate in patterns that are similar to available experimental data. A firmer connection may thus be established between biomechanical tissue properties and the resultant vocal fold dynamics [3].
Many instances of voice abuse can lead to benign or malignant voice disorders. The efficiency of computational simulations of vocal fold vibrations is thus associated with the accuracy of the input mechanical properties. The study of tissue biomechanics in pathological states of the larynx may help in understanding their etiology.
The phonatory functions of the vocal folds depend on their mechanical properties. A review article presented the various mechanical testing methods and constitutive models currently in use for characterizing the mechanical properties of vocal fold tissue, with the aim of helping researchers choose better characterization methods for evaluating tissue mechanical properties [4].
As such, the field of acoustics has developed to the extent that it can change sound into electrical energy and can medically treat the vocal fold.
When beginners in acoustics and phonetics and non-majors in physics start to study acoustics, however, they often do not understand the principle of harmonic generation: when the vocal fold or a guitar string generates sound, it simultaneously generates not only the fundamental frequency but also its harmonics. This principle is basic to acoustics and very important, but it is difficult to understand.
When a speaker speaks, the air comes out through the glottis while the vocal fold oscillates due to the pressure of the outgoing air. The oscillating vocal fold vibrates the surrounding air to generate sound, and in doing so it radiates not only the fundamental frequency but also harmonic sound into the air. This paper thus aims to simplify the vocal fold into a multi-mass model so as to offer an easy-to-understand explanation of how the vocal fold generates harmonic sound.
The voice resonating in the vocal tract is then changed into speech and comes out of the mouth. While the voice is resonating in the vocal tract and mouth, the speaker changes the resonance condition by changing the shapes of his or her mouth and tongue to form various consonants and vowels [8].

Vocal Fold Oscillation and Resonance
Resonance
When a swing is pushed, if the swing moves to the right, one would push it to the right, and if the swing moves to the left, one would push it to the left. In other words, force is applied in the direction in which the swing is already moving, because a swing pushed in this way can be swung easily, with minimum force. This principle is called "resonance," and it is also observed in vocal fold oscillation [8]. Figure 1 shows the resonating mechanism of a swing when one rides it.

Resonance of the Vocal Fold
When the glottis is closed and pressure is applied by the airflow under it, the air pressure pushes the vocal fold outwards, thereby opening the glottis and pushing out the air in the vocal tract, and then accelerating the air. When the glottis is opened, air pressure is applied in the direction of the vocal fold opening.
The accelerating air velocity in the vocal tract pulls the vocal fold inward and closes the glottis. When the glottis is closing, the air pressure exerts a force on the vocal fold in the closing direction. As such, when the vocal fold oscillates, the pressure generated by the airflow exerts a force in the direction of the vocal fold movement, so that the principle of resonance can be observed [8]. Figure 2 shows the two vocal folds oscillating by resonance.
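The swing analogy can be illustrated with a minimal numerical sketch. It assumes a lightly damped mass-spring oscillator, which is not a model of real vocal fold tissue, and compares a small push always applied in the direction of motion with the same push applied against the motion. The function name `simulate_push` and all parameter values are hypothetical and chosen only for illustration.

```python
def simulate_push(push_sign, steps=20000, dt=1e-3):
    """Integrate m*x'' = -k*x - c*x' + push_sign*F0*sign(x') (semi-implicit Euler).

    push_sign=+1 pushes along the direction of motion (the swing case),
    push_sign=-1 pushes against it. Returns the largest |x| observed
    during the final 10% of the simulated time.
    """
    m, k, c, F0 = 1.0, 40.0, 0.05, 0.1   # illustrative values only
    x, v = 0.01, 0.0                      # small initial displacement, at rest
    peak = 0.0
    for i in range(steps):
        direction = (v > 0) - (v < 0)     # sign of the velocity: -1, 0, or +1
        a = (-k * x - c * v + push_sign * F0 * direction) / m
        v += a * dt                       # update velocity first ...
        x += v * dt                       # ... then position
        if i >= steps * 0.9:
            peak = max(peak, abs(x))
    return peak

with_motion = simulate_push(+1)
against_motion = simulate_push(-1)
print("push along motion:  ", with_motion)
print("push against motion:", against_motion)
```

In this sketch the push along the motion makes the amplitude grow well beyond its starting value, while the push against the motion brings the mass nearly to rest.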
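For a mass hanging from a spring, the restoring force and the oscillation frequency follow from Hooke's law and the standard mass-spring relation:
$$F = -kx, \qquad f = \frac{1}{2\pi}\sqrt{\frac{k}{m}},$$
where m: mass, F: force acting on m, k: spring constant, x: variable length of the spring, and f: oscillation frequency of m. If the spring constant is nk and the mass m/n oscillates, the oscillation frequency $f_n$ can be expressed as shown below.
$$f_n = \frac{1}{2\pi}\sqrt{\frac{nk}{m/n}} = nf$$
Figure 4 shows mass oscillations hanging from a spring.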
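The scaling described above can be checked numerically. The sketch below assumes the standard mass-spring relation f = (1/2π)·sqrt(k/m); the function name `spring_frequency` and the numerical values are illustrative assumptions, not values from the paper.

```python
import math

def spring_frequency(k, m):
    """Natural frequency in Hz of a mass m (kg) on a spring of constant k (N/m)."""
    return math.sqrt(k / m) / (2 * math.pi)

k, m = 100.0, 0.25                          # illustrative values
f = spring_frequency(k, m)
for n in (1, 2, 3, 4):
    f_n = spring_frequency(n * k, m / n)    # n times stiffer spring, mass m/n
    print(n, round(f_n / f, 6))             # ratio f_n / f
```

Multiplying the spring constant by n and dividing the mass by n multiplies k/m by n squared, so the frequency itself scales by n.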

Length and Constant of Rubber Band
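Young's modulus, which is the coefficient (E) for tensile and compressive stresses, is known as shown below [5]:
$$E = \frac{F/A}{\Delta l / l},$$
where F: tensile force, A: cross-sectional area, l: original length, and $\Delta l$: elongation. Rearranged as $F = (EA/l)\,\Delta l$, this gives an elastic constant $k = EA/l$ that is inversely proportional to the length, so a rubber band of length l/n has the constant nk. Figure 5 shows an object being stretched by tension.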
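The link between the length of a band and its constant can be sketched in a few lines. The example assumes a uniform band obeying Hooke's law at small strain, so that k = E·A/l; the function name `band_constant` and the material values are hypothetical.

```python
def band_constant(E, area, length):
    """Elastic constant k = E*A/l of a uniform band (small-strain assumption)."""
    return E * area / length

E = 2.0e6        # Pa, illustrative value only
area = 2.0e-6    # m^2, cross-sectional area
length = 0.10    # m
k = band_constant(E, area, length)
k_half = band_constant(E, area, length / 2)   # same band cut to half length
print(k, k_half, k_half / k)
```

Halving the length doubles the constant, which is the property used later when each band section has length l/n.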

Lateral Oscillation of a Mass Hanging from a Rubber Band
A mass m is suspended between rigid supports by means of two identical springs [11].
The springs each have zero mass, spring constant k, and relaxed length $a_0$. They each have length a (= l/2) at the equilibrium position of mass m [Figure 6(a)]. We consider the motion of the mass along the y-direction (perpendicular to the axis of the springs) only.
At equilibrium each spring exerts tension $T_0 = k(a - a_0)$. In the general configuration [Figure 6(b)] each spring has length s and tension $T = k(s - a_0)$, which is exerted along CA or CB. The y-component of this force is $-T\sin\theta$, so each spring contributes a return force $T\sin\theta$ in the y-direction. Using Newton's second law, we have
$$m\ddot{y} = -2T\sin\theta = -2k(s - a_0)\frac{y}{s}. \qquad (4)$$
The x-components of the two forces due to the two springs balance each other, so that there is no motion along the x-direction. Thus, with $s = \sqrt{a^2 + y^2}$, we have
$$m\ddot{y} = -2ky\left(1 - \frac{a_0}{\sqrt{a^2 + y^2}}\right).$$
The above equation is not exactly in the form that gives rise to simple harmonic motion. For small oscillations, however, $1/s$ can be expanded in powers of y/a. If we neglect $(y/a)^3$ and higher-order terms, we obtain
$$m\ddot{y} = -\frac{2k(a - a_0)}{a}\,y = -\frac{2T_0}{a}\,y.$$
As a rubber band is elastic, like a spring, the oscillation frequency f can be expressed as follows when a mass hanging between two rubber bands is oscillated laterally:
$$f = \frac{1}{2\pi}\sqrt{\frac{2T_0}{ma}}.$$
Figure 7 shows a mass oscillating laterally while hanging between two rubber bands.
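The effect of neglecting the higher-order terms in equation (4) can be checked numerically. The sketch below compares the exact return force, -2k(s - a0)y/s with s = sqrt(a² + y²), against the small-amplitude form -(2·T0/a)·y with T0 = k(a - a0), and evaluates the corresponding frequency (1/2π)·sqrt(2·T0/(m·a)). The function names and all numerical values are illustrative assumptions.

```python
import math

def lateral_force(y, k, a, a0):
    """Exact return force -2*k*(s - a0)*y/s with s = sqrt(a^2 + y^2)."""
    s = math.hypot(a, y)
    return -2.0 * k * (s - a0) * y / s

def linearized_force(y, k, a, a0):
    """Small-amplitude form -(2*T0/a)*y with T0 = k*(a - a0)."""
    T0 = k * (a - a0)
    return -2.0 * T0 * y / a

k, a, a0, m = 50.0, 0.05, 0.04, 0.01    # illustrative values (N/m, m, m, kg)
for y in (0.0005, 0.005):               # y/a = 0.01 and y/a = 0.1
    exact = lateral_force(y, k, a, a0)
    approx = linearized_force(y, k, a, a0)
    print(y, exact, approx, abs(exact - approx) / abs(exact))

T0 = k * (a - a0)
f = math.sqrt(2.0 * T0 / (m * a)) / (2.0 * math.pi)   # frequency in Hz
print("small-amplitude frequency:", f)
```

The relative difference between the two forces shrinks as y/a becomes small, which is why the motion can be treated as simple harmonic for small lateral displacements.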

Multi-Mass Lateral Oscillation Hanging from a Rubber Band
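Assume that the oscillation frequency of the mass is $f_1$ when the total length of the two elastic bands is l, the suspended mass is m, and the effective lateral constant is k. For small lateral displacements $k = 4T/l$, so that
$$f_1 = \frac{1}{2\pi}\sqrt{\frac{k}{m}} = \frac{1}{2\pi}\sqrt{\frac{4T}{ml}},$$
where m: mass, k: spring constant, and T: tension.
As k is inversely proportional to l, the oscillation frequency $f_n$ can be expressed as follows when each rubber band section has a length of l/n and a mass of m/n oscillates laterally:
$$f_n = \frac{1}{2\pi}\sqrt{\frac{4T}{(m/n)(l/n)}} = nf_1.$$
Figure 8 shows a multi-mass oscillating on both rubber bands.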
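The scaling of the lateral frequency with the number of masses can be sketched as follows. The example assumes the small-amplitude result f = (1/2π)·sqrt(4T/(m·l)) for a mass m at the midpoint of a band of total length l under tension T, and then replaces m by m/n and l by l/n. The function name `lateral_frequency` and the values are hypothetical.

```python
import math

def lateral_frequency(tension, mass, length):
    """Small-amplitude lateral frequency (Hz) of a mass at the midpoint of a
    band of total length `length` (m) held under tension `tension` (N)."""
    return math.sqrt(4.0 * tension / (mass * length)) / (2.0 * math.pi)

T, m, l = 0.5, 0.01, 0.10       # illustrative values (N, kg, m)
f1 = lateral_frequency(T, m, l)
ratios = [lateral_frequency(T, m / n, l / n) / f1 for n in range(1, 5)]
print(ratios)                   # frequency ratios f_n / f_1 for n = 1..4
```

Under these assumptions the ratio f_n / f_1 equals n, mirroring the harmonic series.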

Oscillation Frequency of Strings
The frequency of strings is known as [5]
$$f = \frac{v}{2l} = \frac{1}{2l}\sqrt{\frac{T}{\mu}} = \frac{1}{2}\sqrt{\frac{T}{ml}},$$
where v: velocity, l: string length, T: tension, $\mu$: linear density ($\mu = m/l$), and m: mass of the strings.
Therefore, when a string with a length of l/n oscillates, the mass is m/n; as such, the oscillation frequency $f_n$ can be expressed as follows:
$$f_n = \frac{1}{2}\sqrt{\frac{T}{(m/n)(l/n)}} = nf,$$
where m: mass of the string, l: string length, and T: tension. Figure 9 shows the oscillation of strings from the first to the fourth harmonic mode.
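The string relation can be illustrated with a short calculation. The sketch assumes an ideal string fixed at both ends, for which the n-th mode frequency is n·v/(2l) with wave speed v = sqrt(T/μ) and μ = m/l. The function name `string_frequency` and the numerical values are illustrative assumptions.

```python
import math

def string_frequency(tension, mass, length, n=1):
    """n-th mode frequency (Hz) of an ideal string fixed at both ends."""
    mu = mass / length                  # linear density in kg/m
    v = math.sqrt(tension / mu)         # wave speed in m/s
    return n * v / (2.0 * length)

T, m, l = 60.0, 0.002, 0.65             # illustrative values (N, kg, m)
f1 = string_frequency(T, m, l)
for n in (1, 2, 3, 4):
    # First mode of a segment with length l/n and mass m/n ...
    segment = string_frequency(T, m / n, l / n)
    # ... compared with the n-th mode of the full string.
    full = string_frequency(T, m, l, n)
    print(n, round(segment, 3), round(full, 3))
```

The segment of length l/n and mass m/n has the same linear density as the full string, so its first-mode frequency coincides with the n-th harmonic of the full string.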

Multi-Mass Model of the Vocal Fold
If the first harmonic mode oscillation frequency is $f_1$ in the oscillation of strings, the oscillation frequency is $2f_1$ in the second harmonic mode. Likewise, if the oscillation frequency is $f_1$ when there is one mass hanging from both rubber bands, the oscillation frequency becomes $2f_1$ when two masses of m/2 oscillate on band sections of length l/2.
Meanwhile, if the length of the string is l and the mass is m in the first harmonic mode, then in the second harmonic mode the string oscillates as two segments, each with a length of l/2 and a mass of m/2, which corresponds to the two-mass case of the multi-mass model.
Similarly, by comparison with the multi-mass model, it is easy to understand that the oscillation frequency in the n-th harmonic mode is $nf_1$.

Vocal Fold Oscillation and Sound Generation

Sound Generation by the Oscillation of Objects
When an object oscillates, the air around it is repeatedly compressed and rarefied, and sound is generated through the oscillating air. As the object vibrates the air, the oscillation frequency of the air becomes equal to the oscillation frequency of the object, and sound with the same frequency as the object spreads out into the air [10]. Figure 11 shows the oscillation and sound generation of a tuning fork.

Sound Generation by Vocal Fold Oscillation
All waveforms are known to be made up of overlapping harmonic modes [5], and the oscillation of the vocal fold is also known to be made up of overlapping harmonic modes consisting of numerous harmonics (i.e., f1, f2, f3, f4, ...) [8]. Figure 12 shows the vocal fold oscillation in the multi-mass model oscillating from both rubber bands and the vocal fold oscillations with overlapping harmonic modes. The string appears to vibrate as a single string, as shown in the top right picture. Its motion, however, is the superposition of many string modes such as f1, f2, f3, f4, .... The vibration of these many modes can be easily understood when each is compared with the corresponding multi-mass model. As such, a single vocal fold vibrates, generating the vibration of air, and thus generating not only the fundamental frequency but also numerous harmonic sounds simultaneously. Figure 13 shows the normal modes of the vocal fold cover ribbon model [8].
For the most often observed mode, the amplitude of vibration is maximum in the middle of the ribbon and decreases gradually toward the end points (Figure 13a).
When the ribbon vibrates in its second mode, its length encompasses two half wavelengths, with the center of the ribbon not vibrating at all (Figure 13b). An integer m can identify the mode (m= 1, 2, 3...). The m = 3 mode is shown in Figure 13c.
Here the middle moves opposite to the anterior and posterior positions. The vocal fold oscillates with numerous harmonics, and it is easy to understand its oscillation mechanism when vocal fold oscillation is compared with the oscillation of the multi-mass models of numerous harmonics. This is the main content of this paper.
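The idea that a single vibrating body carries many harmonics at once can be illustrated numerically. The sketch below builds one waveform as a sum of sinusoids at f1, 2·f1, 3·f1, and 4·f1 and then recovers each component with a discrete Fourier transform written with the standard library only. The harmonic amplitudes, the sampling parameters, and the function names are hypothetical; the 120 Hz fundamental is chosen only as a typical speaking pitch.

```python
import cmath
import math

def synthesize(f1, amplitudes, sample_rate, n_samples):
    """Sum of sinusoids at f1, 2*f1, 3*f1, ... with the given amplitudes."""
    return [
        sum(a * math.sin(2 * math.pi * (h + 1) * f1 * t / sample_rate)
            for h, a in enumerate(amplitudes))
        for t in range(n_samples)
    ]

def dft_magnitude(signal, bin_index):
    """Magnitude of one DFT bin, scaled so that a unit sinusoid gives 1.0."""
    n = len(signal)
    acc = sum(x * cmath.exp(-2j * math.pi * bin_index * t / n)
              for t, x in enumerate(signal))
    return 2 * abs(acc) / n

f1 = 120.0                            # Hz, fundamental frequency
amps = [1.0, 0.5, 0.25, 0.125]        # hypothetical harmonic amplitudes
sample_rate, n_samples = 4800, 400    # 400 samples span exactly 10 periods of f1
wave = synthesize(f1, amps, sample_rate, n_samples)

# Bin b corresponds to the frequency b * sample_rate / n_samples = b * 12 Hz.
for h in range(1, 5):
    print(h * f1, "Hz:", round(dft_magnitude(wave, h * 10), 3))
print("180.0 Hz:", round(dft_magnitude(wave, 15), 3))   # between two harmonics
```

Although `wave` is a single sequence of samples, the analysis finds energy only at integer multiples of the fundamental, which is the sense in which one oscillation contains many harmonics.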
When the vocal fold oscillates, sound consisting of the same harmonics as those of the vocal fold is propagated into the air. Figure 14 shows an idealized amplitude spectrum of the glottal flow waveform. Each frequency component is entered as a vertical line. Relative amplitude is plotted vertically, and frequency is plotted horizontally. Frequency is labeled in kHz and relative amplitude in dB (decibels) as
$$\text{Relative amplitude} = 20\log_{10}\frac{A}{A_0},$$
where A is the amplitude of any component and $A_0$ is a common reference amplitude. The reference amplitude is totally arbitrary and serves only to scale the vertical axis into convenient logarithmic units.
Figure 14: Log magnitude spectrum of a glottal airflow pulse whose fundamental frequency is 120 Hz.
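The decibel scale used for the vertical axis can be sketched as follows, assuming the conventional amplitude definition 20·log10(A/A0). The function name `relative_amplitude_db` and the sample amplitudes are illustrative.

```python
import math

def relative_amplitude_db(amplitude, reference):
    """Relative amplitude in dB: 20 * log10(A / A0)."""
    return 20.0 * math.log10(amplitude / reference)

A0 = 1.0                                  # arbitrary reference amplitude
for A in (1.0, 0.5, 0.1):
    print(A, round(relative_amplitude_db(A, A0), 2), "dB")
```

Changing the reference A0 shifts every value by the same number of decibels, which is why the choice of reference only rescales the vertical axis.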

Filtering of Speech by Resonance in the Vocal Tract and Mouth
The voice generated from the glottis is filtered by resonance in the vocal tract and mouth before speech sounds like consonants and vowels are generated outside the mouth [8,9].
The spectrum of a vowel is the spectrum of the glottal source filtered by the vocal tract. The process is illustrated in Figure 15. Assume that we have a line spectrum for the glottal source, as shown in Figure 15a. This line spectrum has a fundamental frequency and a series of harmonics that are multiples of the fundamental frequency. The spectrum of the vocal tract filter (Figure 15b) is a continuous spectrum with multiple peaks, the formants. Multiplication of the two spectra (for every frequency) yields the vowel spectrum in Figure 15c. In effect, the formant structure is superimposed onto the source spectrum. If the spectra are expressed in dB, the multiplication literally becomes a superposition (addition) because $\log(ab) = \log a + \log b$, where a is the source spectrum and b is the filter spectrum. Note that the vowel spectrum in Figure 15c has the same frequency lines as the source spectrum, but the envelope reflects the formant structure.
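The source-filter multiplication can be sketched numerically. The example assumes a source whose harmonic amplitudes fall off as 1/n and a toy filter with a single resonance peak at 600 Hz; both shapes, the function `formant_gain`, and all values are hypothetical stand-ins rather than measured glottal or vocal tract spectra.

```python
import math

def to_db(x):
    """Amplitude ratio expressed in dB."""
    return 20.0 * math.log10(x)

def formant_gain(freq, center, bandwidth):
    """Hypothetical single-resonance gain curve peaking at `center` Hz."""
    return 1.0 / math.sqrt(1.0 + ((freq - center) / (bandwidth / 2.0)) ** 2)

f1 = 120.0
harmonics = [n * f1 for n in range(1, 9)]                   # source line frequencies
source = [1.0 / n for n in range(1, 9)]                     # assumed source roll-off
filt = [formant_gain(f, 600.0, 200.0) for f in harmonics]   # one toy formant
vowel = [a * b for a, b in zip(source, filt)]               # product at each line

for f, a, b, ab in zip(harmonics, source, filt, vowel):
    # Multiplying amplitudes is the same as adding their dB values.
    print(f, round(to_db(ab), 2), round(to_db(a) + to_db(b), 2))
```

The output lines sit at the same frequencies as the source harmonics, while their heights follow the filter shape, with the two dB columns agreeing at every harmonic.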

Conclusions
When a speaker speaks, the vocal fold oscillates, vibrating the surrounding air to generate sound, and the sound is propagated outside the mouth after resonating in the vocal tract and mouth.
When the vocal fold oscillates, it generates harmonics as well as its fundamental frequency. In this study, the vocal fold was simplified into a multi-mass model of vocal fold oscillation to facilitate the understanding of this mechanism.