The development of the interactive Schoeps Film Sound Application. The study of location dialogue recording

Bachelor Thesis, 2014
81 Pages



This thesis describes the development of an interactive online tool that comprehensively informs production sound mixers about microphones and their proper use for professional dialogue recording on location. It first considers the fundamental characteristics of the human voice as the primary sound source. It then presents the technical specifications of the microphones used and describes in detail their suitable deployment in both typical and special situations. An important part of the thesis is the creation of five short films that demonstrate the theoretical aspects in practice. For this purpose a custom video player was built, in an elaborate process, using modern HTML5. The player lets the user switch smoothly between multiple audio tracks while the video is playing, giving targeted insight into the function and behavior of the different microphones.

The thesis was created for the “Applications” section of the Schoeps website. It forms the foundation of a growing project on film sound to which numerous authors will contribute in the future.



The Human Voice
Steps to Create Voice
Timbre, Formants & Loudness
Male vs. Female Voice
Intelligibility and Directivity behavior of Sound Field around the Human Talker

Microphones Used for Location Sound
Transducer principle, frequency response, polar pattern
Fulfill requirements
Boom Microphones
Lavalier Microphones
Planted Microphones

Placing the Microphone
Boom Positions and Techniques
Lavalier Positions and Techniques
Planted Positions and Microphones Techniques
Sound Improvement

Field Tests
Test #1
Test #2
Test #3
Test #4
Test #5

Creating the Film Sound Application
Elements
Using the Elements
The Application as a whole

Conclusion

Appendix

Bibliography


In the film business it is commonly held that production sound and dialogue recording on location rarely yield radically new findings these days. Although it is a fairly young craft, people have been recording dialogue successfully since the late 1920s. Since then the challenge has always been the same: capture clean, consistent and intelligible audio.

For this reason, science has continued to help develop better equipment, setting new standards and providing practical tools so that sound mixers can get the best results in difficult recording situations and on noisy sets.

The microphone manufacturer Schoeps recognized the need for research into recording sound for picture and went on to develop high-end condenser microphones, both analog and digital, together with all accessories, which are now used on numerous film sets around the world.

One of the company’s prime concerns is not only to provide customers with the best equipment, but also to enhance its service by adding useful tools that show its products in action.

In 2008 four students of the University of Media in Stuttgart created the Schoeps Showroom, an interactive application that shows Schoeps microphones in use for piano, ensemble, and vocal recordings. Its advantage lies in how the recordings were made and how they are presented: the user can switch between a number of microphones and microphone setups and hear how they differ while the file is playing. It also presents information about the company itself, its microphones and so on.

Since the Showroom had been the only application on the Schoeps website for years, the company intended to create another application dedicated to film sound. The original idea was to deliver information in a way similar to the existing Showroom while keeping it a completely new and separate application.

This thesis serves as the kick-off for this project, which according to Schoeps itself is long overdue.

It develops an online tool that not only gives international customers the chance to learn about Schoeps products for film sound in an interactive and demonstrative way, but also helps them understand the details, specifications, and odds and ends of film sound. This thesis starts the project by covering microphone selection and placement for professional dialogue recording on location.

The application is designed and implemented in current HTML5 technology, which supports the interchangeability of content and allows the application to be expanded whenever needed. The Schoeps film sound application is intended to be a hands-on, growing project in which many authors share their knowledge, tips and experience with sound mixers worldwide.

The Human Voice

Recording production sound is really about one thing: “Capture clean, consistent and intelligible audio”. Production sound consists primarily of dialogue recordings. It has often been said that dialogue in a film should never distract the audience from the action on screen, so it has to come across as naturally and realistically as possible. For this reason it is very important to understand the sound source that is primarily to be recorded: the human voice.

What seems trivial and obvious at first becomes vital and deep on a closer look at what the human voice actually is and does. The question is not how to record mere words put into sentences, but how to record phonemes, words and phrases that are thoughtfully put together, arranged and performed by the talent in order to communicate emotion, sentiment and passion. It is the sound mixer’s job to get exactly that onto audiotape, or nowadays onto hard disk:

“The most intelligible thing about language is not the word itself, but the tone, force, modulation and tempo with which a series of words is spoken; in short, the music behind the words, the passion behind that music, the person behind that passion: everything, that is, that cannot be written.” – Friedrich Nietzsche

In this first chapter we look at how the voice is produced. We then break it down into the differences between male and female voices. The directivity of the sound field around the human talker rounds off the chapter.

1. Steps to Create Voice

The human voice originates in the larynx. The larynx contains cartilages and ligaments but is mostly hollow on the inside; only the vocal folds are located there. These are elastic bands about 15-20 mm in length. While the talent is breathing, the vocal folds are completely relaxed. When the talent decides to speak, the vocal muscles force the vocal folds to close. But since air from the lungs pushes against the vocal folds, trying to open the gap between them (the glottis), they begin to oscillate.

Depending on their length and size, they oscillate at about 120 Hz (men) or 220 Hz (women).

[This is a preview. Figures and tables are not included.]

Figure #1

This is called the neutral pitch of a voice. At this pitch the amplitude of the oscillation is at its maximum, which means that the talent can speak very powerfully and for a long time. All regular speech lies close to this frequency and within the same octave, in the lower two-thirds of the whole speech range. Pitch is raised by tensing the muscles in and around the larynx and lowered by relaxing them.

The glottis produces the fundamental frequency at roughly 100 Hz for men (200 Hz for women), along with many harmonics at integer multiples of it, e.g. 200 Hz, 300 Hz, 400 Hz and so on for a 100 Hz fundamental.
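The relation between the fundamental and its harmonics can be illustrated with a minimal sketch. The function name `harmonic_series` is hypothetical, and the fundamentals of 100 Hz and 200 Hz are the approximate values quoted above, treated as exact only for illustration.

```python
def harmonic_series(fundamental_hz: float, count: int) -> list[float]:
    """Return the first `count` partials: the fundamental and its integer multiples."""
    if fundamental_hz <= 0 or count < 1:
        raise ValueError("fundamental_hz must be positive and count at least 1")
    return [fundamental_hz * n for n in range(1, count + 1)]


if __name__ == "__main__":
    # Approximate fundamentals from the text (assumed exact here for illustration).
    print(harmonic_series(100.0, 4))  # [100.0, 200.0, 300.0, 400.0]
    print(harmonic_series(200.0, 4))  # [200.0, 400.0, 600.0, 800.0]
```

A real voice does not hold a fixed fundamental, so the partials move together with the pitch; the sketch only shows the integer-multiple spacing.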

2. Timbre, Formants & Loudness

The area above the glottis, the vocal tract (all the way up to the oral and nasal cavities), now shapes the final vocal tone. Every vocal tract has its own natural resonances, which reinforce some harmonics and dampen others. In this way formants, “the spectral peaks of the sound spectrum of the voice”, are formed, giving each voice its unique character.

Different vowels change the shape of the vocal tract and thus its resonant behavior. Basically, vowels can be regarded as emphases on particular formants. High frequencies (vowel “e”, emphasis on 200-400 Hz and 3-3.5 kHz) resonate in the head area, mid frequencies cause the nasal cavity to vibrate, and lower frequencies are perceived further down in the throat and mouth cavity, all the way into the chest (vowel “u”, 200-400 Hz). Consonants are mostly unvoiced (frequencies above 500 Hz), with a few exceptions depending on word and language, e.g. “vanilla”, “they”, “zero”.

Another component is the loudness of a voice. To produce a loud sound, air is moved through the glottis at high pressure, so the vocal folds oscillate at a higher amplitude (not a higher frequency). As a result, the air molecules in the vocal tract have a higher sound particle velocity (and thus higher sound pressure and higher sound intensity) than with low-pressure air, and they excite more molecules in the adjacent resonating cavities into oscillation. This of course changes the sound of the voice. Additionally, when the voice is raised, “the emphasis in the speech spectrum shifts one or two octaves towards higher frequencies” (see Appendix figure #1).

The distinct sound of a voice is therefore defined by its fundamental frequency, the number and amplitude of its harmonics, and its loudness.

3. Male vs. Female Voice

The frequency range of the human voice can extend from F1 (43 Hz) up to e4 (2607 Hz, the sound of a newborn). The following figure shows the average ranges for male (bass, baritone, tenor) and female voices (alto, mezzo-soprano, soprano).

Figure #2

The frequency ranges shown above do not refer to the average fundamental frequencies but rather to the average timbre of a voice.

As mentioned before, the usual speech range does not exceed an interval of a fifth to an octave around the fundamental frequency. The whole vocal range spans two to at most three octaves. The mid pitch of the voice (always close to the fundamental frequency) and the other important parameters that distinguish voices can be estimated as follows:

[This is a preview. Figures and tables are not included.]

The whole physiological dynamic range of the human voice is 50-55 dB. The level of regular speech is given as 70-80 dB; other literature states roughly 60-65 dB at a listening distance of up to 1 m. For comparison, a clean and intelligible voice recording requires a level of at least 50 dB. Over a longer period of time, speech can have a dynamic range of 27 dB for adults (5-12 dB for children), and the momentary speech level also fluctuates, by approximately 5 dB.

4. Intelligibility and Directivity behavior of Sound Field around the Human Talker

On the one hand, vowels carry the true sound of a human voice; on the other hand, especially in Western languages, it is the consonants that are crucial for speech intelligibility. Both should be captured as well as possible, so it is important to know how speech energy is distributed. Tests show that applying a high-pass filter (e.g. at 500 Hz) does reduce the speech energy, but the signal remains understandable. Conversely, a low-pass filter (cutting at 1 kHz) costs about 60% of the intelligibility. To keep speech easy to understand, background noise should not cover or completely mask the frequencies between 1 and 4 kHz. Figure #4 shows the formants of speech and their frequencies.

Figure #4 Figure #5

Too much reverb is also perceived as noise and worsens intelligibility, so a high signal-to-noise ratio (s/n ratio) is key. Background noise below 40 dB(A) is not considered a problem if the speech level is constant (referring to a listening distance of 1 m). If the noise level rises to 40 dB(A) or above, the optimum speech level gives an s/n ratio of 15 dB or more. Raising the speech level in postproduction has a significant impact on how noise is perceived, because the noise level rises with it. As figure #5 shows, whether the s/n ratio is positive or negative, the ideal speech level lies at about 60-75 dB. Noise seems louder and more disturbing at higher levels.
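The noise guideline above can be written down as a small check. This is a simplified sketch and not a measurement procedure: the function names are hypothetical, the 40 dB(A) limit and the 15 dB margin are taken from the paragraph above, and speech and noise levels are assumed to be expressed on comparable scales at the same listening distance.

```python
def snr_db(speech_db: float, noise_db: float) -> float:
    """Signal-to-noise ratio in dB: the difference between two levels in dB."""
    return speech_db - noise_db


def noise_is_acceptable(
    speech_db: float,
    noise_dba: float,
    quiet_limit_dba: float = 40.0,  # from the text: below this, noise is not a problem
    min_snr_db: float = 15.0,       # from the text: margin wanted at 40 dB(A) and above
) -> bool:
    """Apply the guideline: quiet rooms pass, noisy ones need a sufficient s/n ratio."""
    if noise_dba < quiet_limit_dba:
        return True
    return snr_db(speech_db, noise_dba) >= min_snr_db


def apply_gain(speech_db: float, noise_db: float, gain_db: float) -> tuple[float, float]:
    """Gain added in postproduction raises speech and recorded noise by the same amount."""
    return speech_db + gain_db, noise_db + gain_db


if __name__ == "__main__":
    print(noise_is_acceptable(65.0, 45.0))  # s/n ratio of 20 dB
    print(noise_is_acceptable(55.0, 45.0))  # s/n ratio of only 10 dB
    louder_speech, louder_noise = apply_gain(55.0, 45.0, 10.0)
    print(snr_db(louder_speech, louder_noise))  # gain leaves the s/n ratio unchanged
```

The last call illustrates why raising the level in postproduction does not help: the s/n ratio stays the same while the noise itself becomes louder.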

While frequency and level influence the intelligibility of the voice, it is also important to look at the directivity of the human talker and the effect that not only the vocal tract but also the body and head have on it. The two polar patterns below show how the level varies with direction. The level drops by almost exactly 7 dB from front to back and by about 3 dB from front to side. The slight boost at 330° in the vertical plane is interesting and can be attributed to reflection off the chest. The overall behavior is similar for male and female performers.

Figure #6, levels are A-weighted, measured at 1 m
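To get a feeling for what the front-to-back and front-to-side drops mean, the level differences can be converted into linear ratios. The sketch below uses the standard decibel definitions (20·log10 for sound pressure, 10·log10 for power); the function names are hypothetical, and the 7 dB and 3 dB inputs are the rounded values quoted above.

```python
def pressure_ratio(level_drop_db: float) -> float:
    """Sound pressure relative to the reference direction for a given drop in dB."""
    return 10 ** (-level_drop_db / 20)


def power_ratio(level_drop_db: float) -> float:
    """Sound power (intensity) relative to the reference direction for a given drop in dB."""
    return 10 ** (-level_drop_db / 10)


if __name__ == "__main__":
    for label, drop_db in (("front to back", 7.0), ("front to side", 3.0)):
        print(
            f"{label}: {drop_db} dB drop, "
            f"pressure x{pressure_ratio(drop_db):.2f}, "
            f"power x{power_ratio(drop_db):.2f}"
        )
```

Under these definitions a 7 dB drop leaves a little under half of the on-axis sound pressure behind the talker, and a 3 dB drop leaves roughly 70% at the side.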

The frequency-dependent polar pattern then gives a better idea of how directivity increases with frequency: at higher frequencies the polar pattern takes on more of a cardioid shape, with a drop of 18 dB from 160 Hz to 8 kHz at 180° (see Appendix figure #2).

As shown in a recent and very accurate test by the National Center for Voice and Speech, University of Utah, directivity indeed increases further for the highly important high-frequency energy (HFE), especially for male voices. Another, very subtle effect (starting at 1 kHz) is that loud speech is more directional than soft speech (~3 dB deviation).

A distance of 1 m gives a good overall picture of directivity and frequency behavior. For film sound, however, the microphone is most likely placed even closer to the mouth. Eddy B. Brixen investigated positions 10, 20, 40 and 80 cm from the mouth at 0°, 45° and 315° and presented his results at the 104th AES Convention. If changing the distance and direction (later referred to as distance and angle) had no effect on the frequency spectrum, the curves would be straight lines. However, the charts show that the deviation at 45° is quite small, which means that the microphone distance does not affect the sound much at this angle.

At 0° on-axis, the spectrum changes moderately as the distance changes. The chart for a microphone position at 315° makes clear that the spectrum is affected tremendously by varying the microphone position: the body reflection gives a boost at 1400 Hz.

Figure #7

Now that we have covered the fundamentals and physical aspects of how the human voice ‘works’, the following chapters will return to this information to see how knowledge of the human voice as a sound source affects the work of the production sound mixer.


Excerpt out of 81 pages


Quote paper
Markus Rebholz (Author), 2014, The development of the interactive Schoeps Film Sound Application. The study of location dialogue recording, Munich, GRIN Verlag,

