AUdIoCoUrSeS

Joined: 31 Oct 2002
Posts: 2014
Week 8 - Perceptual coding systems
Perceptual coding systems
There are just 13 questions here, so longer answers please. I want you to cover each topic in as much depth as you can, yet concisely.
1. Describe and explain the following perceptual coding systems:
• MPEG, including AAC
• Dolby Digital and AC3
• ATRAC
• Windows Media
• Real Media
• Other systems of current relevance
2. Why is perceptual coding necessary?
3. Describe briefly the use of perceptual coding in the following:
Internet audio
Film sound
DVD-Video
Digital television
Personal stereo / iPod
4. What can be done, other than perceptual coding, to reduce bitrate?
5. What is masking?
6. How do perceptual coding systems handle signals that are probably going to be masked by other audio?
7. What is Huffman coding?
8. What is the typical bitrate for an MP3 file intended for Internet distribution?
9. What is the bitrate of Dolby AC3 as used in film sound?
10. What is metadata?
11. Explain the principles of predictive coding.
12. Explain the basic difference between downloading text & graphics files as compared to the streaming of a sound file over the internet.
13. Describe the process of creating a DVD master.
Mon Apr 25, 2005 8:40 am
Rico1210
Joined: 03 Aug 2004
Posts: 39
Location: Newcastle, UK
Hi,
answers to week 8
1. Describe and explain the following perceptual coding systems:
• MPEG, including AAC
MPEG Layer-3 (MP3) and MPEG-4 AAC are modern perceptual coding techniques used in internet audio. They exploit the limitations of the human ear and of sound perception to reduce the bit rate with little or no perceptible loss of quality. Advanced Audio Coding (AAC) is also used in Apple's iPod and iTunes products. AAC was designed as an improved-performance codec relative to MP3 and MPEG-2 Audio. Depending on the AAC profile and the MP3 encoder, 96 kbit/s AAC can give a perceived quality that is nearly the same as, or better than, 128 kbit/s MP3. AAC supports sampling frequencies from 8 kHz to 96 kHz, whereas official MP3 supports 16 kHz to 48 kHz. MPEG declared AAC an international standard in April 1997.
• Dolby Digital and AC3
Dolby Digital is a digital audio encoding system from Dolby used in cinemas and home theaters. First used in cinemas in 1992, Dolby Digital employs Dolby's AC-3 (Audio Coding-3) coding and compression technology and provides six channels of audio, known as 5.1: front left, front right, front center, rear left, rear right and subwoofer. Dolby AC-3 is used in the cinema at a bit rate of 320 kbps, and is also the recognised coding system in DVD players. About 95% of THX installations are based on AC-3, which is carried on laserdiscs and DVDs at a bit rate of 384 kbps.
• ATRAC
The ATRAC (Adaptive Transform Acoustic Coding) system compresses compact disc audio to approximately 1/5 of the original data rate with virtually no loss in sound quality. It does this by splitting the incoming signal into three bands (below 5.5 kHz, 5.5-11 kHz and above 11 kHz) and analyzing the frequency content and level of each band individually. The ATRAC coding system can be found in Sony's MiniDisc players and Memory Stick devices, and also in SDDS (Sony Dynamic Digital Sound), which is widely used in cinema audio.
• Windows Media
The Windows Media Audio (WMA) perceptual codec reduces 5.1 surround sound data rates to as low as 128 kbps, far below the original bandwidth of nearly 3,600 kbps for six tracks of compact disc audio (16-bit/44.1 kHz). This is a compression ratio of approximately 28:1, with acceptable mid-fidelity music quality.
• Real Media
The RealMedia 5.0 G2 codec is the perceptual codec used with Real Player. Its compression ratio and sound quality are not currently up to the standard of other perceptual coding systems such as the WMA system or the QuickTime system.
• Other systems of current relevance
QDesign Music Codec 2 is the perceptual codec used with the QuickTime player. It can reduce a compact disc quality (16-bit/44.1 kHz) 11 MB audio file to as small as 150 kB, which is a compression ratio of 73:1. It could also reduce a 28 MB audio file to as small as 200 kB, which is a compression ratio of 140:1.
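The ratios quoted in this answer can be checked with a little arithmetic. This is only a sketch: the function name is made up for illustration, and it assumes decimal units (1 MB = 1000 kB), which the sources do not state.

```python
def compression_ratio(original, compressed):
    """Return the ratio original:compressed as a single number (same units for both)."""
    return original / compressed

KB_PER_MB = 1000  # assumption: decimal megabytes

# QuickTime / QDesign figures quoted above (file sizes in kB)
print(round(compression_ratio(11 * KB_PER_MB, 150)))  # 73
print(round(compression_ratio(28 * KB_PER_MB, 200)))  # 140

# WMA figure quoted above (bit rates in kbps)
print(round(compression_ratio(3600, 128)))            # 28
```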
2. Why is perceptual coding necessary?
Perceptual coding is necessary to reduce the bit rate of a signal and give a more efficient representation of it. Perceptual coding removes parts of the audio signal that cannot be heard by the human ear. These parts are unnecessary because they fall outside the limits of human perception.
3. Describe briefly the use of perceptual coding in the following:
Internet audio
MPEG Layer-3 (MP3) and MPEG-4 AAC (Advanced Audio Coding) are modern perceptual coding techniques used in internet audio. They exploit the limitations of the human ear and of sound perception to reduce the bit rate with little or no perceptible loss of quality.
Film sound
The ATRAC (Adaptive Transform Acoustic Coding) system is a perceptual coding system that can be found in SDDS (Sony Dynamic Digital Sound), which is widely used in cinema audio. ATRAC makes a substantial saving in bit rate by splitting the incoming signal into three bands (below 5.5 kHz, 5.5-11 kHz and above 11 kHz) and analyzing the frequency content and level of each band individually. Dolby Digital (AC-3) is another perceptual coding technique used in film sound. Dolby AC-3 is used in the cinema at a bit rate of 320 kbps.
DVD-Video
The perceptual coding systems used in DVD-Video are Dolby Digital (AC-3), DTS (Digital Theater Systems) and MPEG-2 audio. About 95% of THX installations are based on AC-3, which is carried on laserdiscs and DVDs at a bit rate of 384 kbps. All NTSC DVD players will replay Dolby Digital AC-3, and newer models will also replay DTS. DVD players in Europe may use MPEG-2 multichannel audio for encoding up to six channels.
Digital television
The MPEG-2 perceptual coding system is the standard in the digital television industry. It is used to broadcast digital video across cable, satellite and other channels. High Definition Television (HDTV) under the ATSC standard uses the AC-3 perceptual coding system.
Personal stereo / iPod
Sony's MiniDisc personal stereos use a form of perceptual coding. The ATRAC (Adaptive Transform Acoustic Coding) system compresses compact disc audio to approximately 1/5 of the original data rate with virtually no loss in sound quality. It does this by splitting the incoming signal into three bands (below 5.5 kHz, 5.5-11 kHz and above 11 kHz) and analyzing the frequency content and level of each band individually. Precision Adaptive Sub-band Coding (PASC) is a perceptual coding system used in the Philips Digital Compact Cassette (DCC) that achieves 75% data reduction. Apple's iPod and iTunes products support Advanced Audio Coding (AAC) technology.
4. What can be done, other than perceptual coding, to reduce bitrate?
Predictive coding is an alternative method of reducing bit rate. In all signals, part of the signal is obvious from what has gone before and what may come after. Most signals have a degree of predictability; for example, a sine wave is highly predictable because all cycles look the same. A signal can be transmitted with parts omitted (encoded), provided that there is a suitable decoder that can predict the omissions from the previous and next data. All encoders must contain a model of the decoder to be sure that the information will be correctly re-created. Predictive codecs contain two identical predictors, one in the coder and one in the decoder. These predictors examine the previous data values and estimate what the next value will be. The estimated value is then subtracted from the actual next value to produce a prediction error (residual) that is transmitted from the encoder. The decoder receives the prediction error and adds it to its own prediction, which produces the output code value. There will be no loss of information provided the prediction error (residual) is transmitted intact. However, not all signals can be correctly predicted. All codecs have difficulties with noise, as noise is totally unpredictable.
5. What is masking?
The human hearing system is not equally sensitive at all frequencies; it is less sensitive at low and high frequencies. The threshold of hearing at a particular frequency is raised in the presence of another sound at a similar frequency. In simpler terms, when two sounds of a similar frequency are played together and only the louder one can be heard, the louder sound 'masks' the quieter one. Because the perception threshold is raised, a sound must be louder to be heard. Masking has its uses in audio engineering and is widely used in noise reduction. For instance, low-level noise that exists in the same frequency band as a high-level music signal will be masked by the music. Masking is also used in digital compression systems, as it allows the use of lower resolution in frequency bands where the resulting noise will be masked by the signal.
6. How do perceptual coding systems handle signals that are probably going to be masked by other audio?
Perceptual coding systems remove the parts of the signal that are imperceptible to the human ear. The remaining signal is then coded with more of the available bits, as none are wasted on masked parts of the signal. This process is called dynamic bit allocation. In effect, perceptual coding systems ignore the signals that are probably going to be masked by other audio, as they do not need to be coded.
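As a rough illustration of dynamic bit allocation, here is a toy sketch. It is not how any particular codec works: the band levels and masking thresholds are invented numbers, the function name is hypothetical, and it relies on the common rule of thumb that each bit of quantiser resolution buys roughly 6 dB of signal-to-noise ratio.

```python
import math

DB_PER_BIT = 6.02  # rule of thumb: about 6 dB of SNR per bit of resolution

def allocate_bits(signal_db, mask_db, max_bits=16):
    """Return the number of bits for each band, given signal level and masking threshold in dB."""
    bits = []
    for sig, mask in zip(signal_db, mask_db):
        smr = sig - mask  # signal-to-mask ratio for this band
        if smr <= 0:
            bits.append(0)  # band lies below the masking threshold: do not code it
        else:
            # just enough bits to push quantisation noise under the threshold
            bits.append(min(max_bits, math.ceil(smr / DB_PER_BIT)))
    return bits

# Invented example values for four frequency bands
signal_db = [60.0, 35.0, 50.0, 20.0]
mask_db = [30.0, 40.0, 44.0, 25.0]
print(allocate_bits(signal_db, mask_db))  # [5, 0, 1, 0]
```

The two masked bands get no bits at all, which leaves the bit budget free for the bands that can actually be heard.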
7. What is Huffman coding?
The Huffman code is designed to be used with a data source that has known statistics, and is similar in principle to Morse code. Variable-length coding is used, in which frequently used values are allocated short codes and values that occur infrequently are allocated long codes. The probability of each code value is studied, and the most frequent values are transmitted with short wordlength symbols. As the probability of a value falls, it is allocated a longer wordlength. For example, in Morse code the letter E is a single dot, because it is the most frequently used letter in the English language. The letter Z is allocated the long pattern of dash, dash, dot, dot, as it is one of the most infrequently used letters in the English language.
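The idea can be sketched in a few lines of Python. This is a minimal illustration of the standard Huffman construction, with a made-up test string chosen so that 'e' is common and 'z' is rare; the function name is my own.

```python
import heapq
from collections import Counter

def huffman_code(text):
    """Build a prefix code {symbol: bitstring} from the symbol frequencies in text."""
    freq = Counter(text)
    if not freq:
        return {}
    if len(freq) == 1:
        return {next(iter(freq)): "0"}  # a single symbol still needs one bit
    # Heap entries: (weight, tie_breaker, {symbol: code_so_far})
    heap = [(w, i, {sym: ""}) for i, (sym, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        # Merge the two least frequent groups, prefixing their codes with 0 and 1
        w1, _, c1 = heapq.heappop(heap)
        w2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in c1.items()}
        merged.update({s: "1" + c for s, c in c2.items()})
        heapq.heappush(heap, (w1 + w2, counter, merged))
        counter += 1
    return heap[0][2]

codes = huffman_code("eeeeeeeetttaz")
for symbol, code in sorted(codes.items(), key=lambda kv: len(kv[1])):
    print(symbol, code)
```

With this input the frequent 'e' ends up with a 1-bit code, 't' with 2 bits, and the rare 'a' and 'z' with 3 bits each.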
8. What is the typical bitrate for an MP3 file intended for Internet distribution?
The typical bit rate for an MP3 file intended for internet distribution is 128 kbit/s, although 192 kbit/s is becoming popular on file sharing networks. Bit rates of 160, 256 and 320 kbit/s also appear now and again on file sharing networks.
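To put those bit rates in perspective, here is a quick calculation of file size and of the reduction relative to CD audio. The 4-minute track length is just an assumed example, and sizes are in decimal megabytes.

```python
def file_size_mb(bitrate_kbps, seconds):
    """Size in decimal megabytes of a constant-bit-rate stream."""
    return bitrate_kbps * 1000 * seconds / 8 / 1_000_000

# CD audio: 44.1 kHz x 16 bits x 2 channels = 1411.2 kbit/s
CD_KBPS = 44.1 * 16 * 2

TRACK_SECONDS = 240  # assumption: a 4-minute song

for rate in (128, 192, 320):
    size = file_size_mb(rate, TRACK_SECONDS)
    ratio = CD_KBPS / rate
    print(f"{rate} kbit/s: {size:.2f} MB, about {ratio:.1f}:1 versus CD")
```

At 128 kbit/s the 4-minute track comes to 3.84 MB, roughly an 11:1 reduction from the CD data rate.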
9. What is the bitrate of Dolby AC3 as used in film sound?
Dolby AC-3 is used in the cinema at a bit rate of 320 kbps. About 95% of THX installations are based on AC-3, which is carried on laserdiscs and DVDs at a bit rate of 384 kbps. On DVD, Dolby AC-3 supports 64 to 448 kbit/s, with 384 kbit/s being the normal rate for 5.1 channels and 192 kbit/s the normal rate for stereo. The AC-3 standard itself allows bit rates from 32 to 640 kbit/s.
10. What is metadata?
Meta is a prefix that means an underlying definition or description, so metadata can be described as data about data. Metadata is the background information that describes the content, quality, condition and other appropriate characteristics of the data. For instance, the title, subject, author, location and size of a file are metadata, and the index of a book or the catalogue of books in a library is metadata.
11. Explain the principles of predictive coding.
In all signals, part of the signal is obvious from what has gone before and what may come after. Most signals have a degree of predictability; for example, a sine wave is highly predictable because all cycles look the same. A signal can be transmitted with parts omitted (encoded), provided that there is a suitable decoder that can predict the omissions from the previous and next data. All encoders must contain a model of the decoder to be sure that the information will be correctly re-created. Predictive codecs contain two identical predictors, one in the coder and one in the decoder. These predictors examine the previous data values and estimate what the next value will be. The estimated value is then subtracted from the actual next value to produce a prediction error (residual) that is transmitted from the encoder. The decoder receives the prediction error and adds it to its own prediction, which produces the output code value. There will be no loss of information provided the prediction error (residual) is transmitted intact. However, not all signals can be correctly predicted. All codecs have difficulties with noise, as noise is totally unpredictable.
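The encoder/decoder pair described above can be sketched like this. The predictor here is the simplest one I could think of (it guesses that the next value equals the previous one); real codecs use more elaborate predictors, and the sample values are invented.

```python
def predict(history):
    """Hypothetical first-order predictor: guess that the next value equals the last one."""
    return history[-1] if history else 0

def encode(samples):
    """Return the prediction errors (residuals) that would be transmitted."""
    history, residuals = [], []
    for s in samples:
        residuals.append(s - predict(history))  # actual value minus predicted value
        history.append(s)
    return residuals

def decode(residuals):
    """Rebuild the samples by adding each residual to the decoder's own prediction."""
    history = []
    for r in residuals:
        history.append(predict(history) + r)
    return history

samples = [100, 102, 105, 107, 108, 108, 106]
residuals = encode(samples)
print(residuals)          # [100, 2, 3, 2, 1, 0, -2]
print(decode(residuals))  # [100, 102, 105, 107, 108, 108, 106]
```

Because both sides run the same predictor, the decoder recovers the samples exactly as long as the residuals arrive intact, and the residuals are mostly small numbers that need fewer bits than the original values.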
12. Explain the basic difference between downloading text & graphics files as compared to the streaming of a sound file over the internet.
The basic difference between downloading text & graphics files and streaming audio over the internet is that a file being streamed does not have to be downloaded completely before it can be played. When a file is streamed over the internet, it is heard as it is being downloaded. Data is sent in real time, rather than waiting until the file has been completely downloaded. The file is buffered until the minimum number of packets needed to start playback has been received. This works because a stub file, which contains the URL of the actual file, is used in place of the real file. The browser downloads the stub file and passes it to the player, which parses the stub file, then downloads (and plays) the actual file.
13. Describe the process of creating a DVD master
The process of creating a DVD master, or DVD authoring, is as follows. The first step is to plan how the various elements of the project are going to work together. The source materials to be used are then gathered; these may include video or stereo/multichannel audio, plus menus. The source material is then encoded into a DVD-compliant format, and navigational instructions are added for end-user control. The encoded material and navigational instructions are then multiplexed into a DVD-compliant stream, which is then burned to a DVD-Video or DVD-Audio master.
The Art of Digital Audio - John Watkinson
Sound and Recording - Francis Rumsey, Tim McCormick
http://www.microsoft.com/windows/windowsmedia/howto/articles/surroundsoundcodecs.aspx
http://www.5ddesign.com/html/audioS.html
http://www.mastermix.com/dvdhome
Mon May 02, 2005 12:11 pm