Orban Loudness Meter 2.0 for Win/Mac
Loudness is subjective: it is the intensity of sound as perceived by the ear/brain system. No simple meter, whether peak program meter (PPM) or VU, provides a reading that correlates well to perceived loudness. A meter that purports to measure loudness must agree with a panel of human listeners.
The Orban Loudness Meter receives a two-channel stereo signal from any Windows sound device and measures its loudness and level. It can simultaneously display instantaneous peaks, VU, PPM, CBS Technology Center loudness, and ITU BS.1770 loudness. The meter includes peak-hold functionality that makes the peak indications of the meters easy to see.
Jones & Torick (CBS Technology Center) Meter
The CBS meter is a “short-term” loudness meter that displays the details of moment-to-moment loudness with dynamics similar to a VU meter. It uses the Jones & Torick algorithm developed at the CBS Technology Center [Bronwyn L. Jones and Emil L. Torick, “A New Loudness Indicator for Use in Broadcasting,” J. SMPTE September 1981, pp. 772-777]. Created using Orban-developed modeling software, the DSP implementation typically matches the original analog meter within 0.5 dB on sinewaves, tone bursts and noise.
The Jones & Torick algorithm improves upon the original loudness measurement algorithm developed by CBS researchers in 1967. Its foundation is psychoacoustic studies done at CBS Laboratories over a two-year period by Torick and the late Benjamin Bauer, who built on S. S. Stevens’ 1950s-era work at Harvard University.
After surveying existing equal-loudness contour curves (like the famous Fletcher-Munson set) and finding them inapplicable to measuring the loudness of broadcasts, Torick and Bauer organized listening tests that resulted in a new set of equal-loudness curves based on octave-wide noise reproduced by calibrated loudspeakers in a semireverberant 16 x 14 x 8 room, which is representative of a room in which broadcasts are normally heard. They published this work in “Researches in Loudness Measurement,” IEEE Transactions on Audio and Electroacoustics, Volume AU-14, Number 3, September 1966, pp. 141-151, along with results from other tests whose goal was to model the loudness integration time constants of human hearing. These studies concentrated on the moderate sound levels typically preferred by people listening to broadcasts (60 to 80 phons) and did not attempt to characterize loudness perception at very low and high levels. (The phon is a unit of perceived loudness, equal in number to the intensity in decibels of a 1 kHz tone judged to be as loud as the sound being measured. [The American Heritage® Science Dictionary, 2005])
According to this research and its predecessors, the four most important factors that correlate to the subjective loudness of broadcasts are these:
1. The power of the sound.
2. The spectral distribution of the power. The ear’s sensitivity depends strongly on frequency. It is most sensitive to frequencies between 2 and 8 kHz. Sensitivity falls off fastest below 200 Hz.
3. Whether the power is concentrated in a wide or narrow bandwidth. For a given total sound power, the sound becomes louder as the power is spread over a larger number of critical bands (about 1/3 octave). This is called “loudness summation.”
4. Temporal integration: As its duration increases, a sound at a given level appears progressively louder until its duration exceeds about 200 milliseconds, at which point no further loudness increase occurs.
Bauer and Torick used the results of this research to create a loudness meter with eight octave-wide filters, each of which covers three critical bands. (B & T did not use one filter per critical band because this would have made the meter, which was realized using analog circuitry, prohibitively expensive.) Each filter feeds a full-wave rectifier and each rectifier feeds a nonlinear lowpass filter that has a 10 ms attack time and a 200 ms release time, somewhat like the sidechain filter in an AGC. This models the “instantaneous loudness” perception mechanism in the ear. Instantaneous loudness is not directly perceived but is an essential part of the total loudness model.
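The nonlinear lowpass filter described above can be sketched as a one-pole smoother that switches coefficients depending on whether its input is rising or falling. This is only an illustration, not Orban's or CBS's implementation: the function name, the 48 kHz sample rate, and the interpretation of "attack time" and "release time" as one-pole time constants (the 63% point) are all assumptions.

```python
import math

def nonlinear_lowpass(rectified, sample_rate, attack_s=0.010, release_s=0.200):
    """Smooth a full-wave-rectified signal with separate attack/release times.

    Assumption: each time is treated as a one-pole time constant.
    """
    a_att = math.exp(-1.0 / (attack_s * sample_rate))
    a_rel = math.exp(-1.0 / (release_s * sample_rate))
    state = 0.0
    out = []
    for x in rectified:
        # Fast coefficient when the input exceeds the state, slow otherwise.
        coeff = a_att if x > state else a_rel
        state = coeff * state + (1.0 - coeff) * x
        out.append(state)
    return out

# Usage: a 100 ms burst followed by 100 ms of silence (already rectified).
fs = 48000
burst = [1.0] * (fs // 10) + [0.0] * (fs // 10)
env = nonlinear_lowpass(burst, fs)
```

With these assumptions the envelope rises to nearly full level within the burst and has decayed by only about 4 dB after 100 ms of silence, which is the asymmetry the text compares to an AGC sidechain.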
To map the instantaneous loudness to perceived short-term loudness, the outputs of the eight nonlinear lowpass filters are arithmetically summed with gains chosen to follow the 70 phon equal-loudness curves of the ear. The sum is applied to a second, slower nonlinear lowpass filter, which has an attack time of 120 ms and a release time of 730 ms. Along with the eight nonlinear lowpass filters following the individual filters, this filter models temporal integration and maps it to the visual display. Meanwhile, the arithmetic addition models loudness summation.
The accepted unit of subjective loudness is the sone. With a sinewave, 40 phons = 1 sone. A doubling of sones corresponds to a doubling of loudness. However, because broadcasters were accustomed to working in decibel units, J & T chose to map loudness on a display encompassing –20 to +5 dB in 0.5 dB increments, with the understanding that the perceived loudness doubles every 10 dB at loudness levels typically heard by broadcast audiences. A reasonable calibration level is 0 dB = 75 phons = 11.3 sones.
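The relationship between phons, sones, and the meter's dB scale can be checked with a few lines of arithmetic. The sketch below assumes the doubling-per-10-dB rule quoted above holds across the range of interest and uses the 0 dB = 75 phons calibration suggested in the text; the function names are illustrative.

```python
def phons_to_sones(phons):
    """Sones double for every 10-phon increase; 40 phons = 1 sone."""
    return 2.0 ** ((phons - 40.0) / 10.0)

def meter_db_to_sones(meter_db, phons_at_0_db=75.0):
    """Map a meter reading in dB to sones, assuming 0 dB = 75 phons."""
    return phons_to_sones(phons_at_0_db + meter_db)

# Usage
sones_at_reference = meter_db_to_sones(0.0)     # about 11.3 sones
sones_10_db_down = meter_db_to_sones(-10.0)     # half as loud
```

The rule is only claimed for the moderate levels typical of broadcast listening, so the conversion should not be trusted near the threshold of hearing.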
In V2 of the Orban meter, we modified the CBS meter scales to be the same as the scales specified in the EBU – TECH 3341 standard. (Refer to the description of the “Meter Scale” and “Meter Range” controls above.) The purpose of this change was to allow the readings of the CBS and BS.1770 meters to be compared easily.
The J & T meter is monophonic. Psychoacoustic studies indicate that when multiple acoustic sources are present in a room, loudness is most accurately expressed by summing the power in the sources: Driving two loudspeakers with identical program produces 3 dB higher loudness than a single speaker produces. Therefore, to extend the J & T algorithm to multichannel reproduction, we implement one eight-filter filterbank for each channel and compute RMS sums of the outputs of corresponding filters in each channel before these sums are applied to the eight nonlinear lowpass filters. As in the monophonic J & T algorithm, the sum of these lowpass filters drives a second nonlinear filter, which drives the display.
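The per-band channel combination can be illustrated as follows. This sketch assumes that "RMS sum" means the square root of the summed squares (a power sum), which is the reading consistent with the 3 dB figure above; the function name and data layout are hypothetical.

```python
import math

def combine_channels(band_levels_per_channel):
    """Power-sum corresponding filter outputs across channels.

    band_levels_per_channel: one list of band levels per channel,
    all the same length (eight bands in the meter described above).
    """
    n_bands = len(band_levels_per_channel[0])
    return [
        math.sqrt(sum(channel[b] ** 2 for channel in band_levels_per_channel))
        for b in range(n_bands)
    ]

# Usage: two channels carrying identical program, two bands shown for brevity.
left = [1.0, 0.5]
right = [1.0, 0.5]
combined = combine_channels([left, right])
gain_db = 20.0 * math.log10(combined[0] / left[0])   # about 3 dB
```

The combined band levels would then feed the eight nonlinear lowpass filters exactly as the single-channel levels do in the monophonic meter.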
BS.1770 Loudness Meter
In 2006, the ITU-R published Recommendation ITU-R BS.1770: “Algorithms to measure audio programme loudness and true-peak audio level.” In 2011, this was updated to BS.1770-2, which adds gating so that the meter ignores silence and is weighted toward louder program material, which contributes most to a listener’s perception of loudness. BS.1770-2 indicates only sounds that fall within a floating window that extends from the loudest sounds within the preset integration period down to 10 dB below the average level of the material that is not silence.
Developed by G.A. Soulodre, the original BS.1770 loudness meter uses a frequency-weighted RMS measurement intended to be integrated over several seconds — perhaps as long as an entire program segment. As such, it is considered a “long-term” loudness measurement because it does not take into account the loudness integration time constants of human hearing, as does the CBS meter.
A major disadvantage of the BS.1770-1 meter is that it weights silence and low-loudness material the same as high loudness material. This will cause the meter to under-read program material (like dialog) having substantial pauses that contain only low-level ambience because louder program material contributes most to a listener’s perception of overall program loudness.
To address this problem, the BS.1770-2 algorithm adds gating to the BS.1770-1 algorithm. There are two steps in the gating process: first, an absolute gate removes silent passages; second, a relative gate weights louder parts of the program more heavily than quieter parts.
A more detailed explanation of the algorithm is this:
1. Using the BS.1770-1 algorithm (i.e., a K-weighting filter followed by RMS summation and averaging), calculate the RMS value in a 400 ms time window. One number is computed for every 400 ms time window. Start computing a new 400 ms window every 100 ms, so there is 75% time overlap between windows. Continue computing the RMS values of new 400 ms windows throughout the entire duration of the measurement and store all of these results, one number for each 400 ms window.
2. If any 400 ms window has a value below –70 LKFS, throw it away.
3. Compute the average of the remaining windows over the total time period of the measurement. If any window is more than 10 dB below this average, throw it away.
4. Compute the average of the remaining windows. Display this reading on the meter.
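The four steps above can be sketched in a few lines. This is an illustration under stated assumptions, not the meter's code: it handles a single channel whose samples have already been K-weighted, it averages windows in the mean-square (power) domain, and it applies the –0.691 dB offset from BS.1770 when converting to LKFS. All function names are hypothetical.

```python
import math

def block_powers(kweighted, sample_rate, window_s=0.4, hop_s=0.1):
    """Step 1: mean-square value of each 400 ms window, started every 100 ms."""
    win = int(round(window_s * sample_rate))
    hop = int(round(hop_s * sample_rate))
    return [
        sum(x * x for x in kweighted[start:start + win]) / win
        for start in range(0, len(kweighted) - win + 1, hop)
    ]

def to_lkfs(power):
    """Convert a single-channel mean-square value to LKFS."""
    return -0.691 + 10.0 * math.log10(power) if power > 0.0 else float("-inf")

def gated_loudness(powers, absolute_gate=-70.0, relative_gate_db=10.0):
    # Step 2: discard windows below the absolute gate.
    kept = [p for p in powers if to_lkfs(p) >= absolute_gate]
    if not kept:
        return None  # nothing but silence
    # Step 3: discard windows more than 10 dB below the average of the rest.
    threshold = to_lkfs(sum(kept) / len(kept)) - relative_gate_db
    kept = [p for p in kept if to_lkfs(p) >= threshold]
    # Step 4: the average of the surviving windows is the reading.
    return to_lkfs(sum(kept) / len(kept))

# Usage: loud passages, low-level ambience, and digital silence.
powers = [1e-2] * 10 + [1e-5] * 10 + [0.0] * 10
reading = gated_loudness(powers)
ungated = to_lkfs(sum(powers) / len(powers))
```

In this example the gated reading equals the level of the loud passages alone, while the ungated average reads several dB lower because the ambience and silence pull it down.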
Experimental CBS Long-Term Loudness Measurement
The V1 Orban meter offered an experimental long-term loudness indication (first developed in 2008) by post-processing the CBS algorithm’s output. This is unchanged in the V2 meter. Displayed by a single cyan bar on the CBS loudness meter, this uses a relatively simple algorithm and we welcome any feedback on its perceived usefulness. This algorithm attempts to mimic a skilled operator’s mental integration of the peak swings of a meter with “VU-like” dynamics. The operator will concentrate most on the highest indications but will tend to ignore a single high peak that is atypical of the others. This algorithm can be seen to share certain characteristics with the floating gate introduced in EBU R 128.
The algorithm displays the average of the peak indications of the meter over a user-determined period. The average is performed before dB conversion. All peak indications within the period are weighted equally with the following exceptions:
• If the maximum peak in the window is more than 3 dB higher than the second highest peak, it is discarded.
• All peaks more than 6 dB below the maximum (or second-to-maximum, if the maximum peak was discarded) are also discarded.
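The peak-averaging rule above can be sketched as follows. The text does not say how individual peak indications are detected or whether the linear values are amplitudes or powers, so this example assumes a list of already-detected, positive, amplitude-like peak values (20 log10 conversion); the function names are hypothetical.

```python
import math

def _db(value):
    return 20.0 * math.log10(value)

def long_term_loudness_db(peaks_linear):
    """Average the qualifying peaks in the linear domain, then convert to dB."""
    peaks = sorted(peaks_linear, reverse=True)
    if not peaks:
        return None
    # Discard a lone maximum more than 3 dB above the second-highest peak.
    if len(peaks) >= 2 and _db(peaks[0]) - _db(peaks[1]) > 3.0:
        peaks = peaks[1:]
    # Discard peaks more than 6 dB below the remaining maximum.
    floor = peaks[0] * 10.0 ** (-6.0 / 20.0)
    kept = [p for p in peaks if p >= floor]
    # Average before dB conversion, as the text specifies.
    return _db(sum(kept) / len(kept))

# Usage: one atypical high peak, two typical peaks, one low peak.
reading = long_term_loudness_db([1.0, 0.5, 0.5, 0.1])
```

Here the atypical peak at 1.0 and the low peak at 0.1 are both ignored, so the reading reflects only the two typical peaks.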
Comparison of the CBS and BS.1770-2 Meters
The BS.1770-2 “Momentary” meter, which uses a 400 ms integration time with equal time weighting throughout the interval, is closest in spirit to the CBS meter, as both were designed for use by operators for real-time production loudness monitoring. The “Momentary” meter does not directly model the multiple loudness integration time constants of human hearing, all of which are faster than 400 ms. In addition, because it is a simple integrated power measurement, it does not model “loudness summation” (which was described in the section above on the CBS meter).
In the subjective testing to validate the BS.1770 meter, there were outliers as large as 6 dB (i.e., the meter disagreed with human subjective perception by as much as 6 dB). The subjective testing to validate the CBS meter found outliers of up to 3 dB, although fewer items were used in this testing. We hypothesize that the worst-case error of the BS.1770 meter was substantially larger than that of the CBS meter because the BS.1770 meter does not model loudness summation or the loudness integration time constants of human hearing in any detail.
BS.1770-2 states: “It should be noted that while this algorithm has been shown to be effective for use on audio programmes that are typical of broadcast content, the algorithm is not, in general, suitable for use to estimate the subjective loudness of pure tones.” We have noted that the meter tends to over-indicate the loudness of program material that has been subjected to large amounts of “artistic” dynamic compression, as is often done for commercials and promotional material. In other words, the meter over-indicates the loudness of program material having an unusually low peak-to-average ratio, which, at the limit, approaches the peak-to-average ratio of a pure tone. We have encountered complaints by mixers and producers who stated that such material, when automatically matched to the surrounding program material via the BS.1770 meter, can air up to 3 dB quieter in subjective terms. In turn, this has constrained the ability of producers to specify the type of audio processing they had previously used to give this material excitement and punch. We hypothesize that this problem is related to the fact that BS.1770 does not accurately indicate the loudness of pure tones.
In addition, BS.1770 specifically excludes the LFE channel, as attempts to add it using the existing algorithm caused disparities of approximately 10 dB between listening tests and the meter indication. Some researchers believe that the K-weighting curve used in BS.1770 is incorrect below 80 Hz [Cabrera, Densil; Dash, Ian; Miranda, Luis, “Multichannel Loudness Listening Test,” AES Convention Paper 7451, 124th AES Convention, Amsterdam 2008]. In addition, LFE material typically has a low peak-to-average ratio. We hypothesize that both of these issues contribute to the inability of the current BS.1770 algorithm to work with LFE material. Work continues on incorporating LFE into BS.1770, but as of this writing, it is not yet finished.
The CBS meter, as implemented in this software and in Orban’s TV loudness controllers, includes the LFE channel. Orban’s implementation of the CBS Loudness Controller, which uses the CBS loudness meter as a reference, sounds smooth and natural even when the program has substantial LFE energy. This suggests that the CBS meter does not make gross errors in estimating the loudness of material having substantial LFE energy. This satisfactory performance may relate to two facts: the swept sinewave frequency response of the CBS meter rolls off much faster at low frequencies than the response of BS.1770, and the CBS meter’s psychoacoustic model is sophisticated enough to allow it to perform well on program material having a low peak-to-average ratio.
Based on a lot of listening and observation, we believe that the peak reading of the CBS meter locks onto the “anchor element” (as defined in ATSC A/85, and which is usually dialog) more accurately than the BS.1770 Short-Term and Integrated meters do. In turn, this allows the CBS meter to be more effective as the core measurement in an automatic loudness controller — unlike the BS.1770 meter, the CBS meter does not over-indicate dialog level in the presence of relatively loud underscoring and effects, so it does not push down the dialog level unnaturally when used in a loudness controller.
Orban’s white paper, “Using the ITU BS.1770-2 and CBS Loudness Meters to Measure Loudness Controller Performance,” provides more information on comparing the meters.
Of course, the above comparison is only our opinion. The availability of both meters in the Orban software allows you to make your own tests and form your own opinion on these issues.
In their original publications and standards, each of the meters implemented in the Orban Loudness Meter has a different specified scale and range. To best allow users to compare the indications of the VU, PPM, and Reconstructed Peak meters under dynamic program conditions, we chose to present their indications on identical linear-dB scales extending from +5 to –30 dB with respect to digital full-scale. (In V2, the scale was extended from 0 dB to +5 dB to accommodate the Reconstructed Peak meter, which can indicate higher than 0 dBFS as explained above.)
The loudness meters obey the EBU – TECH 3341 standard, which specifies a different scale than the remaining meters.
The CBS and VU meters have gain adjustments that allow users to choose their preferred lineup level. In V2, the range of the VU Gain control has been extended so that you can use –18 dBFS (EBU) or –20 dBFS (SMPTE) line-up level.
Conformance to Published Standards
Our implementation of the PPM can be switched for 5 ms or 10 ms attack times, because there are standards for both variations. The “10 ms attack” mode follows EBU Tech. 3205-E as closely as possible. In V2, we have oversampled the meter’s detector to 384 kHz to prevent the meter from significantly under-reading 0.5 ms bursts. Nevertheless, the meter does not meet all dynamic response specifications of 3205-E, falling outside the tolerance specified in 3205-E by about 1 dB with certain tone bursts. We surmise that this is because 3205-E was created for a meter having a combination of pulse-shaping electronics and a mechanical movement (which causes the meter to have complex, multi-time-constant dynamics), while our PPM has a single time-constant attack and release characteristic. (We would welcome information on a more complex model of PPM dynamics obeying 3205-E.)
Our implementation of the VU meter reaches 99% (–0.09 dB) of steady-state when presented with a 1 kHz tone burst with an “on” duration of 300 ms and an “off” duration of 500 ms or more. In accordance with the standard, the meter has an overshoot of 1%. Because its reading is presented on a dB-linear scale instead of a standard VU “A” or “B” scale, we believe that this is the closest we could come to the spirit of this meter.
V2 provides two true peak-reading meters. The red bar appearing in the VU and PPM meters reads the peak values of the internal 48 kHz digital samples within the meter. By oversampling 8x, the Reconstructed Peak meter interpolates between samples to estimate the peaks of the signal after D/A conversion, as specified in the BS.1770 standard.
Because sample rate conversion changes the value of peaks in the sample-data domain, the “sample peak” meter will not indicate the true peak sample values of material not originally at 48 kHz sample rate. However, the Reconstructed Peak meter’s reading is essentially independent of the original material’s sample rate, having a maximum error of approximately ±0.2 dB compared to the true peak output of an ideal D/A converter and reconstruction filter.
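The difference between a sample peak and a reconstructed peak can be demonstrated with a small interpolation experiment. The sketch below is not the filter used in the Orban meter or the one given in BS.1770; it assumes a Hann-windowed sinc interpolator evaluated at eight points per sample period, and the function name and filter length are arbitrary choices.

```python
import math

def reconstructed_peak(samples, oversample=8, half_width=32):
    """Estimate the inter-sample peak by windowed-sinc interpolation."""
    peak = 0.0
    # Skip the edges, where the interpolation kernel would run off the data.
    for i in range(half_width, len(samples) - half_width):
        for k in range(oversample):
            t = i + k / oversample
            acc = 0.0
            for j in range(i - half_width + 1, i + half_width + 1):
                d = t - j
                if d == 0.0:
                    weight = 1.0
                else:
                    sinc = math.sin(math.pi * d) / (math.pi * d)
                    hann = 0.5 * (1.0 + math.cos(math.pi * d / half_width))
                    weight = sinc * hann
                acc += samples[j] * weight
            peak = max(peak, abs(acc))
    return peak

# Usage: a full-scale sine at one quarter of the sample rate, phased so that
# every sample lands 3 dB below the crest of the waveform.
tone = [math.sin(2.0 * math.pi * n / 4.0 + math.pi / 4.0) for n in range(256)]
sample_peak = max(abs(x) for x in tone)
true_peak = reconstructed_peak(tone)
```

For this test tone the sample peak reads about 0.707 (–3 dB) while the interpolated estimate comes out close to 1.0, which is why a sample-peak meter can under-read signals that a D/A converter will reproduce at full scale.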