Methods of Sound Data Compression: Comparison of Different Standards
Norbert Nowak, Wojciech Zabierowski
Abstract: This article discusses methods of sound data compression. Technological progress has made it easier to record audio on different media, such as CD-Audio, and the development of audio data compression has made everyday use of audio considerably easier. In recent years, much has been achieved in the field of audio and speech compression, and many standards have been established. They are characterized by better sound quality at lower bitrates. This makes it possible to store CD-Audio recordings using lossy or lossless compression algorithms in order to reduce the amount of data, with an almost unnoticeable difference in the quality of the recording. To compare methods of sound data compression, I have used Adobe Audition 3.0 together with the encoder programs provided by the makers of each compression system. To illustrate the problem, I have used spectrum graphs and spectrograms of a musical composition. The comparison is based on an uncompressed music track taken from the original CD-Audio.
Index Terms: sound data compression, MP3, FLAC, comparison.
I. Introduction
Nowadays, audio data can be stored on various media, such as hard drives or portable flash memory. Despite technological progress, uncompressed audio data still takes up a great deal of memory space. Since other kinds of data can be compressed, audio files can also be reduced in size without much loss in quality by discarding unwanted components, such as frequencies inaudible to the human ear.
Placing audio files on the Internet without using compression algorithms would be impractical. What is more, without compression, cell phones would not be able to offer communication of comparable quality. Data compression has quickly become ubiquitous in our lives, and yet for many years it was of interest only to a small group of engineers and scientists.
Data compression is a change in the way information is recorded that reduces the volume of the data set. In other words, the same set of information is expressed using fewer bits. Compression is used in multimedia devices, DVD movies, digital television, data transmission, the Internet, and so on.
Manuscript received November 8, 2011.
Norbert Nowak M.Sc., Wojciech Zabierowski, Ph.D. TUL, Department of Microelectronics and Computer Science, ul. Wolczanska 221/223 90924 Lodz, POLAND, e-mail: [email protected].
II. Definition
Modeling and coding
The user's requirements decide which type of compression is applied. However, the choice between a lossy and a lossless method also depends on other factors. One of the most important is the characteristics of the data to be compressed. For instance, an algorithm that compresses text effectively may be completely useless for video or sound.
It is worth remembering that compression is an experimental science. The best option is chosen depending on the nature of the redundancy present in the data. The design of a compression algorithm for a given kind of data is divided into two stages. The first stage is called modeling: the redundancy occurring in the data is described by a model. The second stage is coding: the description of the model, together with a description of how the data differ from the model, is encoded using a binary alphabet. The difference between the data and the model is called a deviation, or residual (Fig. 1).
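The two stages can be illustrated with a minimal sketch. The model used here ("each sample equals the previous one") and the sample values are assumptions chosen for illustration, not the model of any codec discussed in this article. The sketch only shows that the deviations from a reasonable model need fewer bits than the raw values, and that the original can still be rebuilt exactly.

```python
def encode(samples):
    """Split samples into the first value and the residuals of a simple model.

    Model (an assumption for this sketch): every sample equals the previous one.
    The residual is the difference between the actual and the predicted value.
    """
    first = samples[0]
    residuals = [cur - prev for prev, cur in zip(samples, samples[1:])]
    return first, residuals


def decode(first, residuals):
    """Rebuild the samples exactly by adding each residual to the prediction."""
    samples = [first]
    for r in residuals:
        samples.append(samples[-1] + r)
    return samples


def bits_per_value(values):
    """Width of a fixed-length signed code that can hold every value."""
    largest = max(abs(v) for v in values)
    return largest.bit_length() + 1  # one extra bit for the sign


samples = [1000, 1003, 1007, 1006, 1002, 999, 1001, 1004]  # made-up data
first, residuals = encode(samples)

print(residuals)
print(bits_per_value(samples), "bits per raw sample")
print(bits_per_value(residuals), "bits per residual")
```

In a real coder the residuals would then be passed to an entropy coder rather than stored with a fixed width; the fixed width is used here only to keep the comparison simple.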
Lossless compression algorithms
Lossless data compression does not allow any loss of information. Certain types of files can be compressed only by a lossless method because they must be reconstructed exactly during decompression: text files, program code files, and audio and image files in professional applications. If text data were compressed by a lossy method, some information would be lost, with adverse and unexpected effects such as substituted letters, mistakes in words, or even whole sentences being dropped. Audio data for professional applications, where the sound is often subjected to further processing, likewise requires the most faithful reconstruction after decompression. Moreover, some data is difficult or even impossible to compress, such as streams of random numbers or data that has already been compressed using the same algorithm.
A lossless compression algorithm works well where the information contains redundancy. The most commonly used methods are dictionary methods, which find repeated occurrences of a string and replace them with a reference that takes fewer bits than the string itself, and statistical methods, which use fewer bits for frequently occurring symbols. There are many situations where it is necessary to use a form of compression that ensures the data before compression and after decompression (reconstruction) is identical.
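These properties can be checked with a short sketch using Python's standard `zlib` module. Its DEFLATE format combines a dictionary stage (LZ77) with a statistical stage (Huffman coding); it is used here purely as a convenient example and is not one of the audio codecs compared in this article. The test strings are made up for illustration.

```python
import os
import zlib

# Highly redundant input: the same sentence repeated many times.
text = b"the quick brown fox jumps over the lazy dog. " * 200
# Input without redundancy: random bytes of the same length.
random_bytes = os.urandom(len(text))

packed_text = zlib.compress(text, 9)
packed_random = zlib.compress(random_bytes, 9)

# Lossless: decompression returns exactly the original bytes.
restored = zlib.decompress(packed_text)

print("identical after decompression:", restored == text)
print("redundant text:", len(text), "->", len(packed_text), "bytes")
print("random data:   ", len(random_bytes), "->", len(packed_random), "bytes")
```

The repeated text shrinks to a small fraction of its size, while the random data does not shrink at all, which matches the remark above about data that is impossible to compress.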
Lossy compression algorithms
Lossy compression reduces the number of bits needed to express a particular piece of information. The reconstructed information is usually not identical to the original: some information is lost and some distortion is introduced. In return, a better compression ratio is obtained than with lossless compression. The inability to reconstruct the data exactly is not always an obstacle. In some applications exact reconstruction is not required; for example, transmitting a speech signal does not require the exact value of each sample. Depending on the required quality of reconstruction, various distortions and differences from the original are allowed. If, for instance, the speech signal only has to reach telephone quality, a considerable loss of information can be permitted. When the signal has to reach CD quality, some (relatively small) loss of information is still acceptable.
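The trade-off can be sketched with the simplest lossy step, uniform quantization, in which each 16-bit sample is stored with only 8 bits. This is an illustrative assumption and not the method used by MP3, WMA, Ogg Vorbis or Musepack, which rely on far more elaborate perceptual models. The sample values are made up.

```python
def quantize(sample, bits_in=16, bits_out=8):
    """Keep only the most significant bits of a signed integer sample."""
    shift = bits_in - bits_out
    return sample >> shift  # arithmetic shift, rounds toward minus infinity


def dequantize(code, bits_in=16, bits_out=8):
    """Reconstruct a sample at the middle of its quantization interval."""
    shift = bits_in - bits_out
    return (code << shift) + (1 << (shift - 1))


samples = [0, 1000, -12345, 32767, -32768]  # made-up 16-bit samples
codes = [quantize(s) for s in samples]
reconstructed = [dequantize(c) for c in codes]

for s, r in zip(samples, reconstructed):
    print(f"original {s:7d}  reconstructed {r:7d}  error {abs(s - r):4d}")
```

Half of the bits are saved, and the reconstruction is close to the original but no longer identical: every sample is off by at most half a quantization step (128 in this configuration).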
When designing algorithms for lossy compression, methods are needed to measure their quality. Because the areas of application differ, a number of concepts have been introduced to describe and measure compression quality.
Measures of compression quality
Compression algorithms can be assessed using different criteria, for example the complexity of the algorithm, its speed, the memory required to implement it, the degree of compression, and the similarity of the decompressed data to the original data.
The degree of compression is a measure of how effectively an algorithm compresses data. It is commonly defined as the ratio of the number of bits needed to represent the data before compression to the number of bits needed to represent it after compression. Tables 1 and 2 report the same information in inverse form, as the size of the compressed file expressed as a percentage of the original.
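As a worked example, both forms of the measure can be computed from the file sizes reported in Table 1 for MP3 at 320 kbps (a 64.1 MB original and a 14.5 MB compressed file). The function names are hypothetical and exist only for this sketch.

```python
def compression_ratio(original_size, compressed_size):
    """Bits before compression divided by bits after compression."""
    return original_size / compressed_size


def size_percentage(original_size, compressed_size):
    """Compressed size as a percentage of the original size."""
    return 100.0 * compressed_size / original_size


original_mb = 64.1   # original WAV file, from Table 1
mp3_320_mb = 14.5    # MP3 at 320 kbps, from Table 1

print(f"ratio:      {compression_ratio(original_mb, mp3_320_mb):.2f} : 1")
print(f"percentage: {size_percentage(original_mb, mp3_320_mb):.2f}%")
```

The two numbers carry the same information; the units cancel, so sizes may be given in bits, bytes or megabytes as long as both use the same unit.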
With lossy compression, the data obtained after decompression differ from the original. To determine the effectiveness of the algorithm, ways of measuring these differences are needed. The differences are called distortion. Lossy compression is usually applied to data that originally had an analog form, for instance audio or video sequences. The coding of such signals is often referred to as waveform coding.
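Two objective distortion measures that are commonly used for this purpose are the mean squared error and the signal-to-noise ratio. They are shown here as a sketch of how such differences can be quantified; the comparison in this article relies on listening and on spectra rather than on these measures, and the sample values below are made up.

```python
import math


def mean_squared_error(original, reconstructed):
    """Average of the squared differences between corresponding samples."""
    return sum((x - y) ** 2 for x, y in zip(original, reconstructed)) / len(original)


def snr_db(original, reconstructed):
    """Ratio of signal power to distortion power, in decibels."""
    signal_power = sum(x * x for x in original) / len(original)
    noise_power = mean_squared_error(original, reconstructed)
    if noise_power == 0:
        return math.inf  # identical signals: no distortion at all
    return 10.0 * math.log10(signal_power / noise_power)


original = [100, -200, 300, -400]        # made-up samples
reconstructed = [98, -203, 301, -400]    # slightly distorted copy

print("MSE:", mean_squared_error(original, reconstructed))
print("SNR:", round(snr_db(original, reconstructed), 1), "dB")
```

A higher signal-to-noise ratio means the reconstruction is numerically closer to the original, although it does not by itself say whether the difference is audible.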
The ultimate arbiter of the quality of an encoded sound signal is the human listener. Because such assessments are difficult to reproduce mathematically, models of perception are applied. One of these is the psychoacoustic model. The terms fidelity and quality are used to describe the differences between the original and the decompressed signal. If the fidelity or quality of the decompression (reconstruction) is high, the reconstructed data does not differ significantly from the original data.
III. Analysis And Comparison Of Audio Data Compression Standards
In my analysis I used four lossy compression systems and two lossless ones. I compared every standard described with the uncompressed source file taken from the original CD. The analysis of the selected files was based on one specific criterion: psychoacoustic quality, that is, the way the sound is perceived by human hearing.
For the study I used the demo version of Adobe Audition 3.0, a professional program for processing and analyzing audio. The results are presented using its two most important tools: a graph showing the spectrum of the acoustic signal (Fig. 2) and a spectrogram, that is, a diagram of the signal's amplitude spectrum over time (Fig. 3).
The main problem in the lossy compression systems was poor reproduction of high frequencies. This effect occurred at lower bit rates because the algorithms use filters that cut off the high-frequency band above a limit that depends on the bit rate, for example everything from 16 kHz upwards. The lower the bit rate, the narrower the bandwidth the system offers.
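The effect of such a cut-off can be sketched with an idealized low-pass filter built on a naive discrete Fourier transform. This is only an illustration under simplifying assumptions: real encoders use filter banks and perceptual models rather than this brute-force approach, and the test tones (1 kHz and 18 kHz), the 16 kHz cut-off and the 44.1 kHz sampling rate are chosen for the example.

```python
import cmath
import math

SAMPLE_RATE = 44_100
N = 441  # 10 ms of audio; DFT bins are SAMPLE_RATE / N = 100 Hz apart


def dft(signal):
    """Naive discrete Fourier transform, O(n^2), adequate for a short block."""
    n = len(signal)
    return [sum(signal[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def inverse_dft(spectrum):
    """Inverse transform; the input signal was real, so keep the real part."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)).real / n
            for t in range(n)]


def low_pass(signal, cutoff_hz):
    """Zero every frequency bin above the cut-off and transform back."""
    spectrum = dft(signal)
    n = len(signal)
    for k in range(n):
        # Bins above n/2 mirror the negative frequencies.
        freq = min(k, n - k) * SAMPLE_RATE / n
        if freq > cutoff_hz:
            spectrum[k] = 0
    return inverse_dft(spectrum)


def tone(freq_hz):
    """A unit-amplitude sine wave, N samples long."""
    return [math.sin(2 * math.pi * freq_hz * t / SAMPLE_RATE) for t in range(N)]


def amplitude_at(signal, freq_hz):
    """Amplitude of the component at freq_hz, read from a single DFT bin."""
    n = len(signal)
    k = round(freq_hz * n / SAMPLE_RATE)
    bin_value = sum(signal[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
    return 2 * abs(bin_value) / n


mixed = [a + b for a, b in zip(tone(1_000), tone(18_000))]
filtered = low_pass(mixed, 16_000)

print("1 kHz before/after: ", round(amplitude_at(mixed, 1_000), 3), round(amplitude_at(filtered, 1_000), 3))
print("18 kHz before/after:", round(amplitude_at(mixed, 18_000), 3), round(amplitude_at(filtered, 18_000), 3))
```

The 1 kHz tone passes through unchanged, while the 18 kHz tone is removed completely, which is the kind of missing upper band that becomes visible in a spectrum or spectrogram of a file encoded at a low bit rate.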
The best lossy compression system turned out to be the little-known Musepack, which offers exemplary sound quality at a bit rate of 210 kbps. As far as lossless compression is concerned, Monkey's Audio was the most effective system, reducing the file to 67.39% of its original size.
Fig. 2. The spectrum of a musical composition after applying MP3 compression at 320 kbps
Fig. 3. Spectrogram of a musical composition after applying MP3 compression at 320 kbps
The following table shows the results of compression using the chosen lossy standards. The comparison criteria are the degree of compression, the compressed file size, and the sound quality after compression of the original WAV file, whose size is 64.1 MB (Table 1).
The next table presents the results of compression using the chosen lossless standards. In this case, the comparison criteria are the degree of compression and the file size after compression of the same 64.1 MB WAV file. The sound quality after decompression is the same in all cases and identical to the original (Table 2).
TABLE 1
Compression using chosen standards that apply lossy compression
(compression ratio given as compressed size relative to the original 64.1 MB WAV file)
Compression system      Compression ratio   Compressed size   Sound quality
MP3 320 kbps            22.62%              14.5 MB           very good
MP3 128 kbps            9%                  5.81 MB           good
MP3 96 kbps             6.79%               4.35 MB           low
WMA 320 kbps            22.62%              14.5 MB           very good
WMA 128 kbps            9.13%               5.85 MB           good
WMA 96 kbps             6.86%               4.4 MB            low
Ogg Vorbis 320 kbps     22.78%              14.6 MB           high
Ogg Vorbis 128 kbps     9.2%                5.9 MB            very good
Ogg Vorbis 96 kbps      6.9%                4.42 MB           good
Ogg Vorbis 64 kbps      4.6%                2.95 MB           low
Musepack 210 kbps       15%                 9.62 MB           high
Musepack 180 kbps       12.8%               8.22 MB           very good
Musepack 130 kbps       9.6%                6.18 MB           good
Musepack 90 kbps        6.72%               4.31 MB           acceptable
TABLE 2
Compression using given standards that apply lossless compression
(compression ratio given as compressed size relative to the original 64.1 MB WAV file)
Compression system                   Compression ratio   Compressed size
Monkey's Audio, "Extra High" mode    67.39%              43.2 MB
Monkey's Audio, "High" mode          67.86%              43.5 MB
Monkey's Audio, "Normal" mode        68.02%              43.6 MB
Monkey's Audio, "Fast" mode          69.89%              44.8 MB
FLAC, mode "8"                       69.58%              44.6 MB
FLAC, mode "5"                       70.20%              45.0 MB
FLAC, mode "0"                       75.19%              48.2 MB
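The lossy file sizes in Table 1 follow closely from simple bit-rate arithmetic, which can be sketched as below. The sketch rests on several assumptions that are not stated in the article: the source WAV is standard CD audio (44.1 kHz, 16 bits, stereo), its header is negligible, "MB" means 1024 x 1024 bytes, and the encoders work at a constant bit rate.

```python
CD_BITRATE_BPS = 44_100 * 16 * 2   # assumed CD audio: 1,411,200 bits per second
MIB = 1024 * 1024                  # assumed meaning of "MB" in the tables


def duration_seconds(wav_size_mib):
    """Playing time of an uncompressed CD-quality file of the given size."""
    return wav_size_mib * MIB * 8 / CD_BITRATE_BPS


def expected_size_mib(bitrate_kbps, seconds):
    """Size of a constant-bit-rate stream lasting the given time."""
    return bitrate_kbps * 1000 * seconds / 8 / MIB


track_seconds = duration_seconds(64.1)  # original WAV size from the tables
print(f"estimated track length: {track_seconds:.0f} s")

for kbps in (320, 128, 96):
    print(f"{kbps} kbps -> about {expected_size_mib(kbps, track_seconds):.2f} MB")
```

Under these assumptions the estimates for 320, 128 and 96 kbps land within a few hundredths of a megabyte of the MP3 rows in Table 1, which suggests that for the lossy formats the degree of compression is determined almost entirely by the chosen bit rate.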
IV. Summary
Over the past years, much has been achieved in the field of audio and speech compression. Many standards have been created that are characterized by increasingly high sound quality at lower data rates. Their efficiency and capabilities have increased significantly. The large amount of memory now available makes it possible to store a huge amount of music compressed by different codecs, using either a lossy method, such as MP3, WMA or Musepack, or a lossless method, such as the increasingly popular FLAC standard. Admittedly, large amounts of audio data could be stored and moved even without compression. However, with compression, storing the data is 10 times more efficient, at the cost of a slight, almost imperceptible loss of quality.
After this analysis, I conclude that systems applying lossless compression make it possible to reduce audio data by about 30% without any loss in quality. In this way, a perfect copy of the original is obtained. Using lossy compression schemes, one can obtain a file about 90% smaller than the original, with a noticeable loss of quality. Thanks to their small size, such files are well suited to transmission over the Internet. The second option is to reduce the file by about 80% while retaining a high-quality recording, with no differences noticeable to an average listener.
Wojciech Zabierowski (Assistant Professor at the Department of Microelectronics and Computer Science, Technical University of Lodz) was born in Lodz, Poland, on April 9, 1975. He received the M.Sc. and Ph.D. degrees from the Technical University of Lodz in 1999 and 2008, respectively. He is an author or co-author of more than 70 publications: journal articles and, for the most part, papers in international conference proceedings. He was a reviewer in six international conferences. He has supervised more than 90 M.Sc. theses. He is focused on Internet technologies and automatic generation of music. He is working on linguistic analysis of musical structure.