Electronic Journal «Technical Acoustics» http://webcenter.ru/~eeaa/ejta/
2005, 15
Sylvio R. Bistafa1*, Milton V. Granado, Jr.2
1 Department of Technology, School of Architecture and Urban Planning, University of Sao Paulo, 05508-080, Sao Paulo, SP, Brazil
2 School of Architecture and Urban Planning, Mackenzie Presbyterian University, 01302-907, Sao Paulo, SP, Brazil
A survey of the acoustic quality for speech in auditoriums
Received 02.04.2005, published 27.05.2005
A sample of auditoriums in the city of Sao Paulo was surveyed in terms of the acoustic quality for speech. The sample was selected based on interviews with well-known actors and actresses of the Brazilian drama community, and with other professionals of the scenic arts as well. Sound decays, the strength and objective measures of speech intelligibility were measured throughout the auditoriums, which were then ranked based on position-averaged values. The variations in the values of the measured parameters with the measuring position are discussed and correlations between them are established. An attempt was made to validate the Objective Support (ST1-Gade) as an objective measure of support to actors on stage. The preliminary diagnosing results with ray tracing simulations are presented. A guideline of less than 5 cubic meters per seat, to achieve reverberation times of less than 1 s in auditoriums for drama has emerged from the results of the measurements.
INTRODUCTION
Although it is common to characterise room acoustics in terms of reverberation time, there is much more detail to be considered to understand how rooms influence speech perception. In a room, listeners hear the direct sound of the talker followed by many delayed reflections of the speech. Although what is typically heard is the combined effect of many thousands of reflections, not all reflections are equal, and they do not all affect the listener in the same way. Early-arriving reflections within about 50 ms after the direct sound are particularly important because the hearing system integrates them with the direct sound, making it seem louder. Thus increased early-reflection energy is expected to increase intelligibility. Later-arriving speech sounds are not integrated and cause one speech sound to mask the next, decreasing the intelligibility of speech.
Although the importance of early-arriving reflections is not widely appreciated, it is not a new concept: Joseph Henry explained the key points in the 1850s [1]. By conducting simple experiments in which he listened to hand claps at various distances from a large reflecting wall outdoors, he determined that early reflections were only separately identifiable if they arrived more than about 50 ms after the direct sound. He also went on to design rooms for speech that were shaped to make the most use of early-reflection energy. Somehow this information seems to have been lost for nearly 100 years, and it is now usual to attribute the understanding of the importance of early reflections for speech in rooms to Haas [2], in what is referred to as the Haas Effect. In practice, a number of publications indicate a good understanding of the effects of early reflections before Haas’ work [3]. In 1935 Aigner and Strutt [4] published a remarkable paper essentially proposing the useful-to-detrimental sound ratio concept that was later re-invented by Lochner and Burger in the 1960s [5].
*Corresponding author, e-mail: [email protected]
Several studies showed that our hearing system integrates early arriving reflections with the direct sound. Reflections delayed by 10 to 20 ms can be as much as 10 dB greater than the direct sound and still be perceived as no more than equal to the direct sound. Perhaps one of the reasons why the benefits of early-arriving reflections are not widely appreciated may be that much of the early work focussed on when they become disturbing as echoes rather than on when they are beneficial. For example, Haas examined how the disturbance caused by early reflections varied with speech rate, reflection level, tone quality, and reverberation time. However, in typical rooms reflections arriving within about 50 ms after the direct sound will usually all be beneficial.
The concept that the ratio of early-arriving to late-arriving speech sounds, commonly referred to as C50, would relate to speech intelligibility developed from work by Thiele, who proposed the “Deutlichkeit” (Definition, D50) measure [6]. Several room acoustics measures can be calculated from measured or predicted impulse responses. These include D50, C50, C80 and the Centre Time, TS. They all build on our understanding of the different perceived effects of early-arriving and later-arriving speech sounds in rooms. They are also usually very highly correlated with each other [7]. C80 has been used to relate to the perceived “clarity” of musical sounds, whereas, similar to D50, the early-to-late sound ratio, C50, has been used to relate to speech intelligibility in rooms.
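As an illustration of how such measures follow from an impulse response, the sketch below computes Definition, the early-to-late ratio and the Centre Time from a list of samples. It is a minimal example and not the procedure of any particular measuring system: the function names, the synthetic exponential decay and the assumption that the first sample coincides with the arrival of the direct sound are all introduced here for illustration.

```python
import math

def energy_ratios(h, fs, split_ms=50.0):
    """Return (D, C in dB, Ts in ms) for impulse-response samples h at rate fs.

    Assumption: h[0] is the arrival of the direct sound.
    split_ms=50 gives D50/C50; split_ms=80 gives the C80 split.
    """
    n_split = int(round(split_ms * 1e-3 * fs))
    energy = [x * x for x in h]
    early = sum(energy[:n_split])
    late = sum(energy[n_split:])
    total = early + late
    definition = early / total
    clarity_db = 10.0 * math.log10(early / late)
    # Centre Time: first moment of the squared impulse response, in ms
    centre_time_ms = 1000.0 * sum(i * e for i, e in enumerate(energy)) / (fs * total)
    return definition, clarity_db, centre_time_ms

def exponential_decay(rt_s, fs, duration_s):
    """Hypothetical test signal: amplitude falls by 60 dB in rt_s seconds."""
    k = math.log(1000.0) / rt_s
    return [math.exp(-k * i / fs) for i in range(int(duration_s * fs))]

h = exponential_decay(rt_s=1.0, fs=8000, duration_s=3.0)
d50, c50, ts = energy_ratios(h, fs=8000)
d80, c80, _ = energy_ratios(h, fs=8000, split_ms=80.0)
print(round(d50, 3), round(c50, 2), round(c80, 2), round(ts, 1))
```

For this idealised decay with a reverberation time of 1 s, D50 comes out close to 0.5 and C50 close to 0 dB, which gives a feel for why reverberation times of about 1 s or less are sought for speech.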
Some acoustical measures have been developed that combine the effects of signal-to-noise ratios and room acoustics into a single quantity and hence can more completely assess conditions in rooms for speech such as theatres for drama. Lochner and Burger developed the concept of useful-to-detrimental sound ratios [5], where “useful” is the combination of the direct and early-reflected sound and “detrimental” is the sum of the late-arriving speech sounds plus the ambient noise. Latham [8] successfully tested this concept in a number of theatres. U50 is one such useful-to-detrimental ratio where early-reflection energy is summed over the first 50 ms.
The Speech Transmission Index, STI, is a more recent measure that combines both room acoustics and signal-to-noise ratio aspects into a single measure [9]. It is based on the presumption that both room acoustics and noise degrade the natural amplitude modulations of speech and hence lead to reduced speech intelligibility. STI values are determined from modulation transfer functions and are calculated from a complete matrix of values at combinations of acoustical frequencies and speech modulation frequencies. Although the STI measure is quite complex and appears to be very different from the useful-to-detrimental sound ratio, U50, the two measures are actually very closely related [10].
The objective measures of speech intelligibility discussed above, together with other measures that have been proposed, were compared by Bistafa and Bradley [11] under the diffuse sound field assumption with exponential decays, to estimate optimum reverberation times and maximum background-noise levels for classrooms.
The present work proposes to analyse the acoustical quality for speech in auditoriums using C50 and STI as objective measures of speech intelligibility.
1. THE AUDITORIUMS ANALYSED
With the objective of selecting a sample of auditoriums in the city of Sao Paulo for the investigation, interviews were conducted with well-known actors and actresses of the Brazilian drama community, and with professionals of the scenic arts. Six actors, two actresses, four directors and one scenographer were interviewed. They were asked to describe their overall (acoustic) impression of auditoriums in which they have worked or performed with some regularity, trying to focus on what they felt was the audience response. Because these interviews served mainly as a guide for the selection of a representative sample of the city auditoriums, no attempt was made to organize the subjective evaluations by setting up a questionnaire of acoustical qualities, as is usually done in subjective studies of halls for music. The main reason is that in rooms for speech the acoustic quality is unidimensional and concerns mainly speech intelligibility.
Several auditoriums with good and less than good acoustical impression were identified. On mentioning attributes of sound quality, actors and actresses often revealed another acoustical subjective dimension that should be considered in the acoustical design of theatres: support to actors on stage.
Only theatres with proscenium type of stage (the most common type of stage among the city’s auditoriums) were included in the sample. Other criteria for choosing the auditorium sample included: audience sizes that would normally allow drama performances without sound reinforcement systems, and the importance of the auditorium (its popularity among public and performers).
Eight auditoriums were chosen — three auditoriums with “good” acoustical impression for speech (almost unanimously identified by the interviewees) and five auditoriums with “less than good” (on average) acoustical impression for speech. These auditoriums range in volume from 2693 to 13473 cubic meters. It should be pointed out that none of the chosen auditoriums has more than one balcony. Basic auditorium characteristics are listed in Table 1. This table also shows what seems to be a consensus among the interviewees, on the acoustic impression for speech, from the point of view of the audience, for each auditorium.
The shapes in plan and view of the selected auditoriums can be seen in Figure 1. The variety of sizes and shapes included in the sample, as well as the different sound impressions revealed by the interviewees, allowed for an extensive investigation of the acoustic quality for speech in these auditoriums.
2. MEASUREMENT PROCEDURE, MEASURED PARAMETERS AND CRITERIA OF ACCEPTABILITY
The measurements were conducted with the commercially available Symphonie® measuring system, a dual-channel acquisition unit that transfers data in real time to a notebook computer. The accompanying software package, dBBATI32®, uses the Maximum Length Sequence (MLS) signal to obtain the impulse response, and calculates most of the room criteria related to speech intelligibility, as well as the traditional room acoustical parameters such as the reverberation time and the early decay time.
According to Barron [12], the sound source used for measurements in theatres must have the correct directivity for meaningful results. To reproduce the directivity of a human speaker he suggests mounting a small loudspeaker in an enclosure approximating the human head and torso. For the measurements to be reported here, a loudspeaker in a cabinet was used, however without the torso. The loudspeaker directivity was obtained from the manufacturer, and a calibration was performed to obtain the free field sound pressure over a reflecting plane at 1 m straight ahead, for various input signal levels. This information was needed for the calculation of the strength at the measuring positions, and for the source input in the computer simulations.
The source was placed at the center of the proscenium, just under the (open) curtain, with the center of the loudspeaker at the height of the human mouth. Two source orientations were chosen, one facing the audience and another facing across stage. These will be called central and lateral source positions, respectively.
Table 1. Basic characteristics of eight auditoriums in the city of Sao Paulo and acoustical impression for speech (on average) of actors, actresses and professionals of the scenic arts, from the point of view of the audience

| Auditorium | Shape | Volume(a), m3 | Seats, total | Seats, stalls | Seats, balcony | Volume per seat, m3/seat | Acoustic impression |
|---|---|---|---|---|---|---|---|
| Alfa | pseudo-rectangular | 7333 | 1182(b) | 858 | 324 | 6.05 | fair |
| Esther Mesquita | circular fan | 6676 | 1151 | 1151 | - | 5.80 | good |
| Jardel Filho | rectangular | 2693 | 665 | 417 | 248 | 4.05 | poor |
| Paulo Autran | rectangular | 3965 | 649 | 649 | - | 6.11 | good |
| Paulo Eiro | elongated fan | 3534 | 600 | 418 | 182 | 5.89 | fair |
| Sergio Cardoso | pseudo-horseshoe | 5406 | 842 | 635 | 207 | 6.42 | fair |
| Sesc Vila Mariana | rectangular | 3778 | 647 | 487 | 160 | 5.84 | good |
| Simon Bolivar | fan | 13473 | 862 | 862 | - | 15.63 | poor |

(a) Volume is of auditorium only, excluding the fly tower.
(b) 30 seats in boxes.
[Figure 1 panels legible in the source: Jardel Filho, Paulo Eiró, Paulo Autran, Sesc Vila Mariana, Simon Bolivar]
Figure 1. Shapes in plan and view of the auditoriums
The measurements were taken in the unoccupied auditoriums, and the number of measuring positions was between six and ten, according to the size of the auditorium. About two thirds of the measuring positions were in the stalls and one third in the balconies. These were usually chosen on one side of the axis of symmetry of the auditorium. The seats were upholstered in all auditoriums considered here.
The Reverberation Time, RT, and the Early Decay Time, EDT, were both computed by the software of the measuring system in octave frequency bands. In the present work, RT and EDT are given as average values at mid-frequencies (500, 1000 and 2000 Hz). The usual recommendation for reverberation times in theatres is 1 s; however, shorter reverberation times, down to 0.7 s, may be desirable [12]. Shorter reverberation times are usually beneficial to speech intelligibility.
The early-to-late sound ratio, C50, was calculated as an objective measure of speech intelligibility. It was obtained from Definition, D50, which is computed by the software of the measuring system in octave frequency bands, using the relation C50 = 10 log [D50/(1-D50)]. However, only the 1-kHz band results were used, because a recent publication [13] provides a more generally applicable regression equation relating speech intelligibility and useful-to-detrimental sound ratio at 1 kHz.
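The relation between C50 and D50 follows directly from the two definitions. In the short derivation below, the symbols E_e (energy arriving in the first 50 ms) and E_l (energy arriving later) are notation introduced here for illustration; they do not appear in the original text.

```latex
D_{50} = \frac{E_e}{E_e + E_l}
\quad\Longrightarrow\quad
\frac{E_e}{E_l} = \frac{D_{50}}{1 - D_{50}},
\qquad
C_{50} = 10\log_{10}\frac{E_e}{E_l}
       = 10\log_{10}\frac{D_{50}}{1 - D_{50}}\ \text{dB},
\qquad
D_{50} = \frac{1}{1 + 10^{-C_{50}/10}} .
```

As a worked value, D50 = 0.5 corresponds to C50 = 0 dB, and a C50 of 2 dB corresponds to a D50 of about 0.61.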
It was found during measurements that the auditoriums were silent, and that the typical acceptable ambient noise level criterion corresponding to the NC-25 curve was met in all auditoriums. The sound strength measured in the auditoriums and the low ambient noise levels were sufficient to provide a signal-to-noise ratio high enough that the useful-to-detrimental sound ratio, U50, reduces to the measured early-to-late sound ratio, C50. This means that speech intelligibility in these silent auditoriums is determined by the acoustical characteristics of the enclosures, that is, by the room’s reflections, echoes and reverberation. According to [13], a 1-kHz U50 value of +2 dB would provide conditions for “very good” speech communication in rooms.
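The way U50 approaches C50 as the noise becomes negligible can be sketched numerically. The expression used below is derived here from the definition of "useful" (early speech energy) and "detrimental" (late speech energy plus noise); it assumes that the signal-to-noise ratio is the total speech level minus the ambient noise level, which is an assumption of this sketch rather than a statement from the measuring software. The function names are hypothetical.

```python
import math

def c50_from_d50(d50):
    """Early-to-late ratio in dB from Definition D50 (0 < d50 < 1)."""
    return 10.0 * math.log10(d50 / (1.0 - d50))

def u50_from_d50(d50, snr_db):
    """Useful-to-detrimental ratio in dB.

    Assumption: snr_db is total speech level minus ambient noise level,
    so the noise energy relative to the speech energy is 10**(-snr_db/10).
    """
    noise_to_speech = 10.0 ** (-snr_db / 10.0)
    return 10.0 * math.log10(d50 / (1.0 - d50 + noise_to_speech))

d50 = 0.6  # illustrative value, not a measured one
for snr in (0.0, 10.0, 20.0, 30.0):
    print(snr, round(u50_from_d50(d50, snr), 2), round(c50_from_d50(d50), 2))
```

With increasing signal-to-noise ratio the noise term vanishes and the two quantities coincide, which is the situation described for the silent auditoriums.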
Both the Speech Transmission Index, STI, and its simplified version, RaSTI, were also measured. They were computed by the software of the measuring system by Fourier transforming the impulse responses. Only the STI values are reported here, since STI reflects the frequency dependence of the parameters relevant to speech intelligibility better than RaSTI and because, nowadays, the measurement of the full STI no longer poses problems. The STI has a well-known subjective intelligibility scale [14]: STI values between 0.75 and 1.00 are considered “excellent”, between 0.60 and 0.75 are considered “good”, between 0.45 and 0.60 are considered “fair”, between 0.32 and 0.45 are considered “poor” and below 0.32 are considered “bad”.
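The scale above can be written as a small lookup. This is only a sketch: the function name is hypothetical, and the treatment of values that fall exactly on a boundary (assigned here to the higher category) is an assumption, since the scale as quoted does not say which side a boundary value belongs to.

```python
def sti_category(sti):
    """Map an STI value to the subjective intelligibility scale quoted in the text.

    Assumption: a value exactly on a boundary goes to the higher category.
    """
    if not 0.0 <= sti <= 1.0:
        raise ValueError("STI must lie between 0 and 1")
    if sti >= 0.75:
        return "excellent"
    if sti >= 0.60:
        return "good"
    if sti >= 0.45:
        return "fair"
    if sti >= 0.32:
        return "poor"
    return "bad"

for value in (0.80, 0.71, 0.59, 0.40, 0.20):
    print(value, sti_category(value))
```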
The strength is the stationary sound pressure level relative to the direct sound level at 10 m, expressed in dB. When measured in theatres, Barron [12] called it the Speech Sound Level, S. This author, based on a signal-to-ambient-noise ratio of 12 dB and on an ambient noise level of 27 dB (corresponding to the mean level of the NC-25 curve at mid-frequencies), arrived at a minimum acceptable value for the Speech Sound Level, S, of 0 dB for theatres. In the present work, S was calculated by subtracting, from the measured sound pressure levels, the direct sound pressure level at 10 m from the sound source. The latter value was available from the sound source calibration results described earlier. Barron [12] reported mean S values for theatres at mid-frequencies. In the present work S values are given for the 1-kHz band only.
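The subtraction described above can be sketched as follows. The extrapolation of the calibrated level from 1 m to 10 m by spherical spreading (a 20 dB drop per tenfold increase in distance) is an assumption of this sketch, as are the function names and the numerical levels in the example; the text only states that the 10 m direct level was obtained from the calibration.

```python
import math

def direct_level_at(distance_m, level_at_1m_db):
    """Direct sound level at a distance, assuming free-field spherical spreading."""
    return level_at_1m_db - 20.0 * math.log10(distance_m)

def speech_sound_level(measured_level_db, level_at_1m_db):
    """S in dB: measured level minus the direct sound level at 10 m."""
    return measured_level_db - direct_level_at(10.0, level_at_1m_db)

# Hypothetical numbers: 80 dB calibrated at 1 m, 65 dB measured at a seat.
print(speech_sound_level(65.0, 80.0))  # 65 - (80 - 20) = 5 dB
```

A positive result means that the seat receives more sound than the direct sound alone would deliver at 10 m, which is the sense of the 0 dB criterion.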
3. EXPERIMENTAL RESULTS
3.1. Position-averaged value results
To obtain an overall picture of each parameter in each auditorium, a single number was obtained by averaging the results from the different measuring positions. Position-averaged values for the five parameters and for both source orientations are shown in Table 2. The criterion of acceptability for each parameter was the following: RT < 1.0 s [12], EDT < 1 s, C50 > 2 dB [13], STI > 0.75 [14] and S > 0 dB [12]. In Table 2, whenever the value of a parameter meets the criterion of acceptability, it is marked in bold.
Table 2 also shows the number of points that each auditorium scored under the following scoring procedure, applied whenever the criteria of acceptability were met: 1 point for an RT average value < 1 s; 1 point for an EDT average value < 1 s; 2 points for a C50 average value > 2 dB; 2 points for an STI average value > 0.75, or 1 point for STI values in the range 0.60 < STI < 0.75; and finally 1 point for an S average value > 0 dB. Otherwise no points were given. More points were given to the speech measures STI and C50 than to RT and EDT because the objective speech measures are considered to be better descriptors of the room effects on speech intelligibility than the measures of sound decay.
To obtain an overall picture of the acoustic quality for speech of the auditoriums analyzed, they were then ranked based on the total score by applying the following ranking scale: 7 - “excellent”, 5 and 6 - “good”, 3 and 4 - “fair”, 2 and below - “poor”. Table 2 also shows the ranking of the auditoriums for both source orientations.
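The scoring and ranking rules can be expressed compactly in code. The sketch below uses hypothetical function names; one detail is an assumption: an STI of exactly 0.60 is counted in the "good" range, which appears consistent with the way the tabulated scores treat a position-averaged STI of 0.60.

```python
def score(rt, edt, c50, sti, s):
    """Total points for one auditorium from position-averaged values."""
    points = 0
    if rt < 1.0:        # reverberation time, s
        points += 1
    if edt < 1.0:       # early decay time, s
        points += 1
    if c50 > 2.0:       # early-to-late ratio at 1 kHz, dB
        points += 2
    if sti > 0.75:      # "excellent" on the STI scale
        points += 2
    elif sti >= 0.60:   # "good" range; boundary handling is an assumption
        points += 1
    if s > 0.0:         # speech sound level at 1 kHz, dB
        points += 1
    return points

def rank(points):
    """Ranking scale: 7 excellent, 5-6 good, 3-4 fair, 2 and below poor."""
    if points >= 7:
        return "excellent"
    if points >= 5:
        return "good"
    if points >= 3:
        return "fair"
    return "poor"

# Central source position, values as listed in Table 2
jardel_filho = score(rt=0.93, edt=0.65, c50=5.0, sti=0.71, s=5.8)
alfa = score(rt=1.42, edt=1.16, c50=3.6, sti=0.66, s=5.1)
print(jardel_filho, rank(jardel_filho))
print(alfa, rank(alfa))
```

The maximum of seven points requires all five criteria to be met, including STI > 0.75.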
Table 2 shows that the Jardel Filho is the only auditorium with an average RT value that meets the criterion of acceptability (< 1.0 s). Table 1 shows that this auditorium has the lowest volume per seat ratio, 4.05 m3/seat. When the volume per seat ratio of six British proscenium theatres is calculated from the work of Barron [12], it is found that they range from 3.40 to 4.82 m3/seat, with reverberation times ranging from 0.7 to 1.1 s. The results of both works seem to indicate that, to achieve the criterion of acceptability for reverberation times in auditoriums for speech (RT < 1.0 s), it would be desirable to have a volume per seat ratio of less than 5 m3/seat.
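The volume per seat figures can be checked with a few lines of arithmetic on the Table 1 data. In this sketch the variable names are hypothetical, and the Alfa seat count adds the 30 box seats to the 1182 listed, which is an assumption made here because it reproduces the tabulated 6.05 m3/seat.

```python
# (volume in m3, number of seats) taken from Table 1
auditoriums = {
    "Alfa": (7333, 1182 + 30),  # assumption: box seats added to the listed total
    "Esther Mesquita": (6676, 1151),
    "Jardel Filho": (2693, 665),
    "Paulo Autran": (3965, 649),
    "Paulo Eiro": (3534, 600),
    "Sergio Cardoso": (5406, 842),
    "Sesc Vila Mariana": (3778, 647),
    "Simon Bolivar": (13473, 862),
}

GUIDELINE_M3_PER_SEAT = 5.0  # guideline suggested in the text

volume_per_seat = {name: v / n for name, (v, n) in auditoriums.items()}
below_guideline = [name for name, r in volume_per_seat.items()
                   if r < GUIDELINE_M3_PER_SEAT]

for name, ratio in volume_per_seat.items():
    print(f"{name}: {ratio:.2f} m3/seat")
print("Below guideline:", below_guideline)
```

Only the Jardel Filho auditorium falls below the guideline, and it is also the only one with a position-averaged RT under 1 s.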
It can be seen in Table 2, for the central source position, that most auditoriums have met the criterion of acceptability for C50 ( > 2 dB), whereas none reached “excellent” speech intelligibility conditions according to STI (STI > 0.75). This shows that the criterion of acceptability for C50 is more easily met than that of STI. For the lateral source position, conditions for speech intelligibility deteriorate considerably, and none meets the criterion of acceptability according to both measures, STI and C50.
For the central source position, all auditoriums have STI values characteristic of “good” speech intelligibility (0.60 < STI < 0.75), with the exception of the Esther Mesquita auditorium, with a STI value of 0.59. For the lateral source position, only the Jardel Filho and the Simon Bolivar auditoriums, both with a STI of 0.63, maintain characteristics of “good” speech intelligibility, whereas the speech intelligibility conditions for the other auditoriums are characterized by “fair” values only (0.45 < STI < 0.60).
Table 2. Position-averaged values of the measured parameters, the score and the objective rank for the acoustic quality for speech of each auditorium

| Source position | Parameter | Alfa | Esther Mesquita | Jardel Filho | Paulo Autran | Paulo Eiro | Sérgio Cardoso | Sesc Vila Mariana | Simon Bolivar |
|---|---|---|---|---|---|---|---|---|---|
| Central | RT, s | 1.42 | 1.48 | 0.93 | 1.21 | 1.68 | 1.64 | 1.26 | 1.68 |
| Central | EDT, s | 1.16 | 1.24 | 0.65 | 1.02 | 1.51 | 0.91 | 0.91 | 1.29 |
| Central | C50 (1 kHz), dB | 3.6 | 1.8 | 5.0 | 2.3 | 1.4 | 3.9 | 2.7 | 3.9 |
| Central | STI | 0.66 | 0.59 | 0.71 | 0.66 | 0.60 | 0.72 | 0.67 | 0.69 |
| Central | S (1 kHz), dB | 5.1 | 6.9 | 5.8 | 7.2 | 8.9 | 5.1 | 6.6 | 3.0 |
| Central | Score | 4 | 1 | 6 | 4 | 2 | 5 | 5 | 4 |
| Central | Rank | Fair | Poor | Good | Fair | Poor | Good | Good | Fair |
| Lateral | RT, s | 1.54 | 1.45 | 0.96 | 1.50 | 1.77 | 2.73 | 1.39 | 1.69 |
| Lateral | EDT, s | 1.50 | 1.45 | 0.93 | 1.31 | 1.65 | 2.02 | 1.23 | 1.74 |
| Lateral | C50 (1 kHz), dB | 1.0 | -2.8 | 0.8 | 0.0 | -3.1 | -0.7 | 0.7 | 0.1 |
| Lateral | STI | 0.59 | 0.50 | 0.63 | 0.56 | 0.52 | 0.56 | 0.58 | 0.63 |
| Lateral | S (1 kHz), dB | 1.0 | 2.5 | -0.1 | 3.9 | 4.9 | -1.2 | 3.1 | -3.4 |
| Lateral | Score | 1 | 1 | 3 | 1 | 1 | 0 | 1 | 1 |
| Lateral | Rank | Poor | Poor | Fair | Poor | Poor | Poor | Poor | Poor |
The Jardel Filho auditorium has met the criterion of acceptability of all parameters, with the exception of STI (although the position-averaged value of 0.71 is characteristic of “good” speech intelligibility in the STI subjective scale). This justifies the two different scores applied to STI: two points would be scored in auditoriums with average STI > 0.75, and one point would be scored in auditoriums with average STI values in the range 0.60 < STI < 0.75. Had the Jardel Filho auditorium met the criterion of average STI > 0.75, it would have had “excellent” speech intelligibility according to the STI subjective intelligibility scale, scoring a total of seven points instead of six, and it would have been ranked here as an “excellent” auditorium. This basically translates to an auditorium being ranked as “excellent” whenever the STI average value is > 0.75. An STI average value in this range seems to guarantee that the criteria of acceptability of all the other parameters would be met as well.
The criterion of acceptability for the Speech Sound Level, S, was easily met for the central source position, with values well above 0 dB in all auditoriums. For the lateral source position, S values below 0 dB were measured in the Jardel Filho, Sérgio Cardoso and Simon Bolivar auditoriums.
When comparing, for the central source position, the objective (Table 2) and subjective (Table 1) evaluations, it can be seen that they are in agreement in the case of the Alfa and the Sesc Vila Mariana auditoriums, and that the agreement is “reasonable” in the case of the Paulo Autran, Paulo Eiro, Sergio Cardoso and Simon Bolivar auditoriums. However, there is no agreement at all in the case of the Esther Mesquita and Jardel Filho auditoriums. Surprisingly, the Esther Mesquita auditorium, which is very much praised by the professionals of the scenic arts, was objectively evaluated here as only “poor”. The Jardel Filho auditorium, with the best score among the auditoriums analyzed, was on average judged to be inadequate by the interviewees. These discrepancies seem to show that other subjective dimensions, even non-acoustic ones, might have permeated the acoustic impression revealed during the interviews.
3.2. Objective Support results
It is speculated that when revealing their acoustic impression of the auditoriums, actors and actresses might have taken into consideration the acoustical support that the auditorium gives to their performance on stage. This subjective requirement was verbalized during the interviews with expressions such as: “...I need to have the sensation that my voice has filled the auditorium...”; “...I know when my voice reaches the last row of seats...” and, in one case, “...it is important to feel that my voice has slapped back to the stage...” Would this last impression argue in favour of reflections reaching the stage from the remote parts of the auditorium? When this happens, the performers do not have to strain their voices, and the performance is more natural and less tiring.
Objective Support (ST1-Gade) is a measure of a musician’s ability to hear himself and to maintain a proper balance between his own instrument and useful sound from co-players. The ST1 is defined as the energy of the impulse response between 20 and 100 ms relative to the “direct” energy up to 10 ms, for a microphone placed at 1 m from the source, expressed in dB. It is here hypothesized that ST1 could provide an objective measure of the auditorium support to the actor on stage.
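The definition of ST1 translates directly into an energy ratio over two time windows of the impulse response. The following is a minimal sketch under stated assumptions: the function name is hypothetical, the first sample is taken as the arrival of the direct sound, and the impulse response in the example is synthetic rather than measured.

```python
import math

def st1_db(h, fs):
    """ST1 in dB: energy from 20 to 100 ms relative to energy from 0 to 10 ms.

    Assumptions: h[0] is the direct-sound arrival and the microphone
    is 1 m from the source.
    """
    def window_energy(t0_ms, t1_ms):
        i0 = int(round(t0_ms * 1e-3 * fs))
        i1 = int(round(t1_ms * 1e-3 * fs))
        return sum(x * x for x in h[i0:i1])

    return 10.0 * math.log10(window_energy(20.0, 100.0) / window_energy(0.0, 10.0))

# Synthetic example at 1 kHz sampling: direct sound, one reflection at 50 ms
# inside the window, and one at 150 ms that falls outside it.
fs = 1000
h = [0.0] * 200
h[0] = 1.0
h[50] = 0.25
h[150] = 0.5
print(round(st1_db(h, fs), 2))
```

In this example only the 50 ms reflection contributes, so the result is set by its energy of 0.0625 relative to the direct sound, that is, about -12 dB.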
Table 3 shows values of ST1 measured in the auditoriums that were analyzed. At this point, the adequacy of ST1 as suggested here cannot be ascertained; even more precarious, with the available information, would be the establishment of a range of values that would guarantee adequate support to actors on stage.
According to Gade [15], the optimum range for ST1 for ensemble in symphony orchestras is -12 dB ± 1 dB, allowing higher values for smaller groups. The measured values seem to suggest that for actors, higher ST1 values than those recommended for musicians might be desirable. This indication comes from the fact that auditoriums subjectively classified as “good” by the interviewees have ST1 values greater than -9.5 dB, and the Esther Mesquita auditorium which, as mentioned earlier, is very much praised by the performers, has a ST1 value of -8.6 dB.
Table 3. Values of ST1-Gade measured on the stage of the auditoriums

| Auditorium | Alfa | Esther Mesquita | Jardel Filho | Paulo Autran | Paulo Eiro | Sesc Vila Mariana | Simon Bolivar |
|---|---|---|---|---|---|---|---|
| ST1-Gade, dB | -12.4 | -8.6 | -9.1 | -6.6 | -6.7 | -9.5 | -9.6 |
3.3. Variation of the measured parameters with position
Table 4 shows, for the central source position and for each auditorium, the relative variation of the measured parameters with position. These were calculated as the ratio of the standard deviation to the mean value. The averages (across auditoriums) are also shown in the far right-hand column of Table 4.
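The relative variation defined above is the coefficient of variation expressed in percent. The sketch below is illustrative only: the function name is hypothetical, the use of the sample standard deviation and of the absolute value of the mean are assumptions made here, and the example uses just the five Jardel Filho positions whose values are legible in Figure 4, so it is not expected to reproduce the tabulated figures exactly.

```python
import statistics

def relative_variation_percent(values):
    """Standard deviation divided by the mean, in percent.

    Assumptions: sample standard deviation (n - 1 in the denominator) and
    the absolute value of the mean, so that negative means (possible for
    C50 or S) still give a positive percentage.
    """
    mean = statistics.mean(values)
    return 100.0 * statistics.stdev(values) / abs(mean)

# Jardel Filho, central source, positions R3 to R7 as read from Figure 4
rt_values = [1.00, 0.90, 0.82, 0.92, 0.92]    # s
edt_values = [0.79, 0.31, 0.53, 0.77, 1.00]   # s

rv_rt = relative_variation_percent(rt_values)
rv_edt = relative_variation_percent(edt_values)
print(round(rv_rt, 1), round(rv_edt, 1))
```

Even on this subset, EDT varies several times more with position than RT. Note also that the measure becomes very large when the mean is close to zero, which is relevant to parameters such as C50 that can take values around 0 dB.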
Table 4. Relative variation of the measured parameters with position (%)

| Parameter | Alfa | Esther Mesquita | Jardel Filho | Paulo Autran | Paulo Eiro | Sérgio Cardoso | Sesc Vila Mariana | Simon Bolivar | Average |
|---|---|---|---|---|---|---|---|---|---|
| RT | 3.9 | 2.7 | 6.5 | 2.9 | 2.8 | 6.7 | 2.5 | 6.0 | 4.3 |
| EDT | 28.3 | 25.9 | 36.1 | 31.2 | 6.2 | 47.7 | 26.7 | 29.1 | 28.9 |
| C50 | 65.6 | 236.3 | 77.4 | 62.0 | 149.3 | 72.9 | 58.9 | 173.5 | 112.0 |
| STI | 7.0 | 17.0 | 8.7 | 7.1 | 3.5 | 6.4 | 6.1 | 16.5 | 9.0 |
| S | 34.3 | 54.2 | 26.1 | 12.3 | 23.2 | 25.2 | 33.4 | 124.9 | 41.7 |
It can be seen in Table 4 that the variation of reverberation time with position is small, with an average relative variation of only 4.3%. This confirms the reverberation time as being a parameter that characterizes the sound decay in the space as a whole, whereas the early decay time is more position dependent, with an average relative variation of 28.9%. Therefore, as far as a measure of sound decay is concerned, EDT is more sensitive to “local” conditions than RT.
Table 4 also shows that C50 is very much position dependent, with an average relative variation of 112.0%. The variation of STI with position is much smaller, with an average relative variation of only 9.0%. Therefore, similar to RT, the STI seems to be a descriptor of the adequacy for speech intelligibility of the space as a whole, whereas C50, similar to EDT, is more sensitive to local conditions.
Table 4 shows that S varies considerably with position, with an average relative variation of 41.7%.
3.4. Correlations between the measured parameters
Table 5 shows correlation coefficients between pairs of measured parameters. These were generated from a 64×4 matrix of data: 64 values obtained at different positions in the eight auditoriums for each of the 4 parameters, namely RT, EDT, C50 and STI. Correlation coefficients between pairs of parameters were obtained based on 64 values for each parameter in the pair.
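A correlation coefficient of this kind can be computed as sketched below. The text does not name the estimator, so the use of the Pearson product-moment coefficient is an assumption, as is the function name. Because the 64 individual position values are not listed, the example uses the eight position-averaged C50 and STI values for the central source position from Table 2, and therefore only illustrates the calculation.

```python
import math

def pearson(x, y):
    """Pearson product-moment correlation coefficient of two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Position-averaged values, central source position (Table 2)
c50 = [3.6, 1.8, 5.0, 2.3, 1.4, 3.9, 2.7, 3.9]          # dB
sti = [0.66, 0.59, 0.71, 0.66, 0.60, 0.72, 0.67, 0.69]

r_c50_sti = pearson(c50, sti)
print(round(r_c50_sti, 2))
```

Even with only eight averaged pairs, the two speech measures come out strongly and positively correlated.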
Table 5. Correlation coefficients between pairs of parameters

| RT : EDT | RT : C50 | EDT : C50 | RT : STI | EDT : STI | C50 : STI |
|---|---|---|---|---|---|
| 0.46 | -0.18 | -0.60 | -0.15 | -0.68 | 0.87 |
It can be seen in Table 5 that the strongest correlation occurs between the measures of speech intelligibility, namely between C50 and STI, with a correlation coefficient of 0.87. This, as mentioned earlier, confirms that despite being based on different formulations, C50 and STI are in fact strongly correlated.
As far as the correlation between the measures of sound decay and those of speech intelligibility is concerned, EDT is more strongly correlated with C50 than RT is, with correlation coefficients of -0.60 between EDT and C50 and -0.18 between RT and C50. The same holds for STI, with correlation coefficients of -0.68 between EDT and STI and -0.15 between RT and STI.
4. DIAGNOSING TWO AUDITORIUMS
4.1. Computer simulations
Local acoustical conditions in a room can only be predicted by means of physical or computer simulations. Simulations with ray-tracing type computer programs are very popular nowadays. The possibility of exporting to ray-tracing type programs the geometry generated by AutoCAD® is very appealing to the designer who wishes to simulate the acoustic behavior of a room.
The AutoCAD® interface for CATT-Acoustics® consists of a set of AutoLISP® procedures that create commands to be used within AutoCAD®. The interface allows the drawing of the auditorium to be created inside AutoCAD®, including all surface planes, source and receiver positions. AutoLISP® then writes a geometry-file, a source-file and a receiver-file used by CATT-Acoustics® to acoustically model the auditorium.
A good understanding of the output types and formats was found to be necessary for interpreting the results generated by the different prediction modules of the computer program. There is scarce information on the absorption coefficients of lining materials to be used in computer programs (particularly those of the audience area), but it is in the choice of the scattering coefficients of the room surfaces that the greatest uncertainties lie. Because these input data have considerable impact on predicted values, the authors decided not to include here the numerical values output by the program until further testing and verification is done.
However, the output of one of the prediction modules, the audience area coverage mapping, revealed a qualitative agreement with the measured speech intelligibility parameters, and was found useful in identifying architectural design characteristics that determine local acoustical conditions.
Figures 2 and 3 show the audience area color mapping, for the central source position, for the Jardel Filho and Paulo Autran auditoriums, respectively, over a grid covering the audience planes in the stalls of both auditoriums. The mappings for C50 are on the left-hand side and those for STI are on the right-hand side of these figures. In the bars over the tops of Figures 2 and 3, a qualitative scale of “high” and “low” replaces the numerical values output by the program, because of the above-mentioned decision not to report numerical values here.
Figure 2. Audience area color mapping for the central source position, for C50 (on the left) and STI (on the right), over a grid covering the audience plane in the stalls of the Jardel Filho auditorium
Figure 3. Audience area color mapping for the central source position, for C50 (on the left) and STI (on the right), over a grid covering the audience plane in the stalls of the Paulo Autran auditorium
4.2. Auditorium design features influential to speech intelligibility
The plan view of the Jardel Filho auditorium is shown in Figure 4 (top - stalls, bottom - balcony), and only the plan view of the stalls of the Paulo Autran auditorium is shown in Figure 5 because this auditorium has no balcony. The measuring points and the values of the measured parameters, for the central source position, at each measuring point are also shown in these figures.
In the stalls of the Jardel Filho auditorium (Fig. 4 - top), the measured values of C50 and STI at the position R7 reveal that this position is the worst for speech intelligibility that was measured in this auditorium, due to the poor directivity of the source in this direction and the lack of reflecting surfaces that would redirect sound to this position.
Values legible in Figure 4 (panels: Stalls, Balcony):

| Point | C50, dB | STI | RaSTI | RT, s | EDT, s | S, dB |
|---|---|---|---|---|---|---|
| R7 | -1.56 | 0.62 | 0.61 | 0.92 | 1.00 | 3.90 |
| R6 | 7.80 | 0.77 | 0.78 | 0.92 | 0.77 | 8.50 |
| R5 | 6.06 | 0.74 | 3.75 [sic] | 0.82 | 0.53 | 5.55 |
| R4 | 10.51 | 0.78 | 0.81 | 0.90 | 0.31 | 6.70 |
| R3 | 2.61 | 0.65 | 0.64 | 1.00 | 0.79 | 5.30 |
Figure 4. The Jardel Filho auditorium plan view, the measuring points and the values of the measured parameters at each point, for the central source position
At the R6 position, the measured values of C50 and STI are characteristic of “excellent” speech intelligibility conditions. At the R5 position, which is further away from the source and already under the balcony, the conditions for speech intelligibility deteriorate a little, with an STI value just below the criterion of “excellent”. However, conditions for speech intelligibility improve considerably deep under the balcony, at the last row of seats in the stalls, as revealed by the values of C50 and STI at the R4 position. Here, the values of these parameters are the best that were measured in the Jardel Filho auditorium. Positions deep under the balcony are shielded from late sound, which is detrimental to speech intelligibility. Position R4, being at one of the back corners of the auditorium, also benefits from early-reflected sound due to the proximity of these reflecting surfaces.
In the balcony of the Jardel Filho auditorium (Fig. 4 - bottom), the values of C50 and STI at the R1 position, which is at the last row of seats near the back wall, are the best that were measured in the balcony because, like the R4 position in the stalls, it is surrounded by reflecting surfaces.
The EDT values shown in Figure 4 reveal that this parameter is much more sensitive to local conditions than the reverberation time. This is clearly shown by comparing the values of EDT and RT at the R7 and R4 positions. At these positions, the RT values are about the same (0.92 s at the R7 position and 0.90 s at the R4 position), whereas the EDT value of 0.31 s at position R4 indicates that conditions for speech intelligibility there are much better than at the R7 position, where the EDT value is 1.00 s. This is in fact confirmed by the values of C50 and STI at these positions.
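The difference in sensitivity can be made concrete by comparing the spread (maximum minus minimum) of RT and EDT over the four stalls positions of Figure 4. The sketch below uses only the values printed in that figure; the `spread` helper is a hypothetical name introduced for illustration, and the spread is just one simple way of expressing position-to-position variation.

```python
# RT and EDT (in seconds) at the stalls positions of the Jardel Filho
# auditorium, central source position, as listed in Figure 4.
positions = ["R7", "R6", "R5", "R4"]
rt_s = [0.92, 0.92, 0.82, 0.90]
edt_s = [1.00, 0.77, 0.53, 0.31]


def spread(values):
    """Range of a set of measurements: largest value minus smallest."""
    return max(values) - min(values)


rt_spread = spread(rt_s)
edt_spread = spread(edt_s)

print(f"RT spread over {len(positions)} positions:  {rt_spread:.2f} s")   # 0.10 s
print(f"EDT spread over {len(positions)} positions: {edt_spread:.2f} s")  # 0.69 s
```

RT varies by only about 0.1 s across these positions, whereas EDT varies by about 0.7 s.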
The color mappings of the stalls of the Jardel Filho auditorium in Figure 2 show good qualitative agreement with the measured values of C50 and STI of Figure 4 (top). They reproduce the effects of the source directivity, which near the stage is responsible for high values straight ahead of the source (position R6) and low values at the sides (position R7); the lower values in the intermediate audience zone under the balcony (position R5), which are probably due to the lack of early reflections in this region; and the subsequent improvement in the values as one goes further under the balcony (position R4) because, as mentioned earlier, the deep overhang shields most of the late reflected sound and, particularly in the seats near the walls, because of the proximity of these reflecting surfaces.
The measured values of C50 and STI in the Paulo Autran auditorium (Fig. 5) at the R3 position reveal that the poor directivity of the source in this direction is not compensated by the benefits generated by the proximity of a reflecting surface.
Stalls (one column per measuring point)
C50 (dB)   2.53   0.90   1.33   1.45   5.57
STI        0.69   0.58   0.62   0.65   0.72
RaSTI      0.70   0.56   0.62   0.67   0.75
RT (s)     1.27   1.25   1.20   1.23   1.21
EDT (s)    1.43   1.35   1.20   0.85   0.53
S          7.70   5.40   6.90   6.90   8.40
Figure 5. The Paulo Autran auditorium plan view, the measuring points and the values of the measured parameters at each point, for the central source position
At the R2 position in the Paulo Autran auditorium (Fig. 5), the effect of the source directivity is not so pronounced, and the values of C50 and STI are higher than at the R3 position. Although the R4, R5 and R6 positions are further away from the source, the proximity of the auditorium walls is responsible for higher values of C50 and STI there than at the R2 position. Particularly at the R5 position, which is at the last row of seats close to the back wall, the values of C50 and STI are the highest that were measured in this auditorium, owing to local conditions. This position suffers less from the detrimental effects of late sound because of the proximity of a heavy curtain hanging at the rear door of the auditorium. This is confirmed by the EDT value of only 0.53 s, the shortest that was measured in this auditorium.
The color mappings for C50 and STI of the Paulo Autran auditorium in Figure 3 show good qualitative agreement with the measurements, since they confirm that regions close to the back corners of the auditorium have speech intelligibility conditions almost as good as those close to the stage. These mappings, and those of the Jardel Filho auditorium (Fig. 1), show clearly what Barron called the “exposed midriff phenomenon” [12]: in the intermediate zone of the stalls the early-reflected sound is lacking, which is responsible for the relatively poorer speech intelligibility conditions in this zone.
CONCLUSIONS
The main conclusions related to the objective parameters used to analyze the acoustic quality for speech in the auditoriums are: a) the criterion of acceptability for C50 is more easily met than that of STI; b) an average STI value > 0.75 seems to guarantee that the criteria of acceptability of RT, EDT and C50 would be met as well; c) Objective Support, ST1, values for actors higher than those recommended for musicians might be desirable; d) RT and STI seem to be descriptors of the adequacy for speech intelligibility of the auditorium as a whole, whereas C50 and EDT are more sensitive to local conditions; e) EDT is more strongly correlated than RT with C50 and STI.
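Conclusion e) can be illustrated on a small scale with the five measuring points of the Paulo Autran auditorium listed in Figure 5. The sketch below computes Pearson correlation coefficients from those printed values only. It is an illustration under the assumption that each column of Figure 5 belongs to one measuring point, and it does not reproduce the correlations established over the full data set. The `pearson` helper is a hypothetical name.

```python
import math

# Values per measuring point in the Paulo Autran auditorium (Figure 5),
# central source position. Index i in every list is the same position.
c50_db = [2.53, 0.90, 1.33, 1.45, 5.57]
sti = [0.69, 0.58, 0.62, 0.65, 0.72]
rt_s = [1.27, 1.25, 1.20, 1.23, 1.21]
edt_s = [1.43, 1.35, 1.20, 0.85, 0.53]


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


r_edt_c50 = pearson(edt_s, c50_db)
r_rt_c50 = pearson(rt_s, c50_db)
r_edt_sti = pearson(edt_s, sti)
r_rt_sti = pearson(rt_s, sti)

print(f"EDT vs C50: r = {r_edt_c50:+.2f}   RT vs C50: r = {r_rt_c50:+.2f}")
print(f"EDT vs STI: r = {r_edt_sti:+.2f}   RT vs STI: r = {r_rt_sti:+.2f}")
```

For these five points the correlations with EDT are negative (shorter early decay goes with higher C50 and STI) and larger in magnitude than those with RT. With so few points the coefficients are only indicative.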
As far as computer simulations are concerned, experience in choosing the room scattering coefficients (and, to a certain extent, the absorption coefficients as well, particularly those of the audience) was found necessary for meaningful comparisons of the simulated and measured values. However, one form of the simulation output, the audience area coverage mapping, revealed a qualitative agreement with the objective measures of speech intelligibility, and was found useful for identifying architectural design characteristics that determine local acoustical conditions in the auditoriums.
The overall philosophy required for the acoustical design of the proscenium type of theatre has emerged from the results of the measurements in existing auditoriums. One should first focus on shortening the long section, which can be compensated, in terms of the number of seats, with deep overhangs, since both approaches keep the auditorium volume small, giving the desirable short reverberation time. Deep balcony overhangs prove not to be a great problem for speech, since what is mainly absent below an overhang is late sound, which is itself detrimental to intelligibility. A tight design in long section is also desirable to allow the audience to see the facial expressions of the performers, which is part of the theatrical experience.
Acoustically reflective suspended ceilings are also recommended, to provide early-reflected sound to the intermediate zone of the stalls and so avoid the relatively poorer speech intelligibility conditions in this zone, which were clearly detected by the computer simulations.
A guideline has emerged for the volume per seat ratio: less than 5 m3/seat, to achieve reverberation times of less than 1 s in auditoriums for drama.
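The guideline lends itself to a simple design check. The sketch below is only an illustration: the hall volume, seat count and total absorption are hypothetical numbers, not data from the survey, and the function names are invented for this example. Sabine's formula, RT = 0.161 V/A, is included as a rough estimate to show why a small volume per seat favours a short reverberation time; it is an added assumption and not the method by which the guideline was obtained.

```python
GUIDELINE_M3_PER_SEAT = 5.0  # upper limit stated in the text for drama


def volume_per_seat(volume_m3: float, seats: int) -> float:
    """Auditorium volume divided by the number of seats, in m3/seat."""
    if seats <= 0:
        raise ValueError("number of seats must be positive")
    return volume_m3 / seats


def meets_drama_guideline(volume_m3: float, seats: int) -> bool:
    """True when the volume per seat is below the 5 m3/seat guideline."""
    return volume_per_seat(volume_m3, seats) < GUIDELINE_M3_PER_SEAT


def sabine_rt(volume_m3: float, absorption_m2: float) -> float:
    """Sabine estimate of reverberation time (s) from volume (m3) and
    total equivalent absorption area (m2). Assumption: diffuse field."""
    if absorption_m2 <= 0:
        raise ValueError("absorption area must be positive")
    return 0.161 * volume_m3 / absorption_m2


# Hypothetical hall: 3000 m3, 700 seats, 560 m2 of total absorption.
hall_volume_m3 = 3000.0
hall_seats = 700
hall_absorption_m2 = 560.0

ratio = volume_per_seat(hall_volume_m3, hall_seats)
print(f"Volume per seat: {ratio:.2f} m3/seat")
print(f"Meets guideline: {meets_drama_guideline(hall_volume_m3, hall_seats)}")
print(f"Sabine RT estimate: {sabine_rt(hall_volume_m3, hall_absorption_m2):.2f} s")
```

Since the audience usually supplies a large share of the absorption, keeping the volume per seat low keeps V/A low, which is consistent with the short reverberation times sought for drama.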
ACKNOWLEDGEMENTS
The authors wish to thank the actors, actresses and professionals of the scenic arts for the interviews; 01 dB-Stell for the use of the Symphonie® measuring system; CATT-Acoustics® for the computer program used in the simulations; and the managers and staff of the auditoriums analyzed in this work.
REFERENCES
1. R. S. Shankland. Architectural acoustics in America to 1930. J. Acoust. Soc. Am., 1977, 61(2), 250-254.
2. H. Haas. Über den Einfluss des Einfachechos auf die Hörsamkeit von Sprache. Acustica, 1951, 1, 49-58.
3. R. D. Fay, W. M. Hall. Historical notes on the Haas effect. J. Acoust. Soc. Am., 1956, 28, 131-132.
4. F. Aigner, M. J. O. Strutt. On the physiological effect of several sources of sound on the ear and its consequences in architectural acoustics. J. Acoust. Soc. Am., 1935, 6, 155-159.
5. J. P. A. Lochner, J. F. Burger. The influence of reflections on auditorium acoustics. J. Sound Vibr., 1964, 1(4), 426-454.
6. R. Thiele. Richtungsverteilung und Zeitfolge der Schallrückwürfe in Räumen. Acustica, 1953, 3, 291-302.
7. J. S. Bradley. Relationships among measures of speech intelligibility in rooms. J. Audio Eng. Soc., 1998, 46, 396-405.
8. H. G. Latham. The signal-to-noise ratio for speech intelligibility - An auditorium acoustics design index. Appl. Acoust., 1979, 12, 253-320.
9. IEC Std. 60268-16. Objective rating of speech intelligibility by the Speech Transmission Index. 2nd ed., International Electrotechnical Commission, Geneva, Switzerland (1998-03).
10. S. R. Bistafa, J. S. Bradley. Revisiting algorithms for predicting the articulation loss of consonants Alcons. J. Audio Eng. Soc., 2000, 48, 531-544.
11. S. R. Bistafa, J. S. Bradley. Reverberation time and maximum background-noise level for classrooms from a comparative study of speech intelligibility metrics. J. Acoust. Soc. Am., 2000, 107(2), 861-875.
12. M. Barron. Auditorium Acoustics and Architectural Design. London, E & FN SPON, 1993.
13. J. S. Bradley, S. R. Bistafa. Relating speech intelligibility to useful-to-detrimental sound ratios (L). J. Acoust. Soc. Am., 2002, 112(1), 27-29.
14. T. Houtgast, H. J. M. Steeneken. A multi-language evaluation of the RaSTI-Method for estimating speech intelligibility in auditoria. Acustica, 1984, 54, 185-199.
15. A. C. Gade. Investigations of Musicians’ Conditions in Concert Halls. II: Field Experiments and Synthesis of Results. Acustica, 1989, 69, 249-262.