IEOR 170: Interaction and Experience Design for Engineers
Lecture- 04.16.2003
Text, Image, and Sound Compression
Speaker: Anthony Levandowski
TA for IEOR 170
Email: [email protected]
Introduction to Formats
-Information can be obtained in many ways: through text/data, images, sound, movies,
smell, taste, force, texture, etc.
-Files can be stored in either binary (1, 0) or plain text formats. Proprietary types, on
the other hand, are a mixture of plain text and binary formats.
Examples of ‘proprietary types’
-MS Word – 75% text, 25% binary
-PDF – 99% binary (file modification is not allowed)
-Windows Media Player
Bytes vs. bits
- 1 byte = 8 bits
- A bit is a 0 or 1 (voltage off or on)
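As a small illustration of the byte/bit relationship, the sketch below prints the 8 bits behind each byte of a short string. It assumes plain ASCII text, where one character is stored as one byte; the sample word is only an example.

```python
# Show the 8 bits that make up each byte of a short ASCII string.
text = "Ask"
encoded = text.encode("ascii")          # one byte per ASCII character

bits = [format(byte, "08b") for byte in encoded]
total_bits = len(encoded) * 8           # 1 byte = 8 bits

for char, pattern in zip(text, bits):
    print(char, pattern)
print(len(encoded), "bytes =", total_bits, "bits")
```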
3 steps in file compression (how we store/transmit files):
1. Analyze files
2. Look for redundancy or patterns
- the more repetitive the pattern, the better
- the larger the pattern, the better
3. Create an (LZ) dictionary and replace
- use the patterns found in step 2 as a dictionary
- substitute each pattern with its dictionary index
Example of file compression
1. Analyze files
"Ask not what your country can do for you -- ask what you can do for your country."
–17 words
–61 letters
–16 spaces
–1 dash
–1 period
2. Look for redundancy or patterns
"ask" appears two times
"what" appears two times
"your" appears two times
"country" appears two times
"can" appears two times
"do" appears two times
"for" appears two times
"you" appears two times
3. Create LZ dictionary and replacement
1.ask
2.what
3.your
4.country
5.can
6.do
7.for
8.you
Compressed: "1 not 2 3 4 5 6 7 8 -- 1 2 8 5 6 7 3 4."
Original: "Ask not what your country can do for you -- ask what you can do for your country."
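Example using patterns:
‘ou’ appears in “your” and “country”
"can do for" is also repeated
Compare a single entry "your country" with two entries, "you" and "r country": the split
lets "you" be reused on its own.
There is a tradeoff between dictionary length and document length, e.g. whether replacing
“ou” is worthwhile.
Dictionary (_ marks a space):
1. ask_
2. what_
3. you
4. r_country
5. _can_do_for_you
Thus, the compressed text is now smaller.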
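The two dictionaries can be compared with a short Python sketch. This is an illustration under stated assumptions, not a real LZ implementation: the function names are hypothetical, capitalization is ignored, every dictionary index or leftover character is counted as one symbol, and the pattern encoder simply tries the longest dictionary entry first.

```python
import re
from collections import Counter

TEXT = ("Ask not what your country can do for you -- "
        "ask what you can do for your country.")

def build_word_dictionary(text):
    """Repeated words (case-insensitive), in order of first appearance."""
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(words)
    dictionary = []
    for word in words:
        if counts[word] > 1 and word not in dictionary:
            dictionary.append(word)
    return dictionary

def compress_words(text, dictionary):
    """Replace each dictionary word with its 1-based index."""
    index = {word: str(i) for i, word in enumerate(dictionary, start=1)}
    return re.sub(r"[A-Za-z]+",
                  lambda m: index.get(m.group().lower(), m.group()),
                  text)

# Pattern dictionary from the notes; "_" stands for a space.
PATTERNS = ["ask_", "what_", "you", "r_country", "_can_do_for_you"]

def encode_patterns(text, dictionary):
    """Greedy encoder: emit an index when an entry matches, else the character."""
    # Try longer entries first so "_can_do_for_you" wins over "you".
    order = sorted(range(len(dictionary)), key=lambda i: -len(dictionary[i]))
    tokens, pos = [], 0
    while pos < len(text):
        for i in order:
            if text.startswith(dictionary[i], pos):
                tokens.append(i + 1)
                pos += len(dictionary[i])
                break
        else:
            tokens.append(text[pos])
            pos += 1
    return tokens

def decode_patterns(tokens, dictionary):
    """Invert encode_patterns: indices become entries, characters pass through."""
    return "".join(dictionary[t - 1] if isinstance(t, int) else t
                   for t in tokens)

word_dict = build_word_dictionary(TEXT)
word_compressed = compress_words(TEXT, word_dict)

underscored = TEXT.lower().replace(" ", "_")
tokens = encode_patterns(underscored, PATTERNS)

# Total cost = compressed document + dictionary, for each scheme.
word_total = len(word_compressed) + sum(len(w) for w in word_dict)
pattern_total = len(tokens) + sum(len(p) for p in PATTERNS)

print(word_compressed)
print(tokens)
print(len(TEXT), word_total, pattern_total)
```

Counting the dictionary together with the document is what exposes the tradeoff: longer entries make the dictionary bigger but leave fewer symbols in the document.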
Question: How well does compression work on plain text files?
Answer: 8 KB -> 4 KB (a 50% reduction); typically a 10-20x reduction on large files
Log files: 100x or more. Why? Log entries are highly repetitive, and the more repetitive
the pattern, the better.
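The effect of repetition can be checked with Python's standard zlib module. This is a sketch under assumptions: the lecture does not name a specific tool, zlib's DEFLATE is only one LZ-based scheme, and the log line is made up for illustration. Real log files vary more than this, so their ratios will be lower.

```python
import random
import zlib

def compression_ratio(data):
    """Original size divided by zlib-compressed size."""
    return len(data) / len(zlib.compress(data))

# An artificial "log file": the same made-up line repeated many times.
log_line = b"2003-04-16 10:00:00 INFO request served status=200\n"
log_data = log_line * 2000

# Same amount of data with no repetition to exploit (seeded for repeatability).
rng = random.Random(0)
noise = bytes(rng.randrange(256) for _ in range(len(log_data)))

log_ratio = compression_ratio(log_data)
noise_ratio = compression_ratio(noise)

print("repetitive log:", round(log_ratio, 1), "x")
print("random bytes:  ", round(noise_ratio, 2), "x")
```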
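[Figure: Zipf's Law. Word frequency plotted against words, with the curve labeled
"Exponential Function". Labels recovered from the figure: "a, the, etc" (most frequent
words), "200 words", "Words hardly ever used".]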
Question: What about PDF? How well does compression work on PDF files?
Answer: 16 KB -> 12.1 KB; typically a 2-5x reduction on large files.
Audience’s Question: Which of the most popular words are taught first in ESL school?
Answer: basic words such as a, the, etc. (a handful of words that fall on the very left side
of the graph)
-Dr. Seuss books are based on these words
Types of Images
- Vector – an image defined by mathematical objects (i.e. a set of lines, points, curves,
equations). Adobe Illustrator uses a vector format.
- Raster – an image made up of small dots, known as pixels, in different colors (i.e. a
matrix, pixels, grid values). If we scale the grid, the image quality degrades (e.g. in
Photoshop). We can’t precisely reproduce an image with different shades and colors.
Note: There are many types of image formats from .art to .wpg
Image Comparison:
•lossy vs. lossless
•.gif image compression (CompuServe), 1980s
–color and run length
–good for “flat” images / rectangular images / images with little color change
•.jpg image compression (Joint Photographic Experts Group), 1980s and 1990s
–extract an 8x8 pixel block from the picture
–calculate the discrete cosine transform (DCT) coefficient for each element in the block
–a quantizer rounds off the DCT coefficients according to the specified image quality
(this phase is where most of the original image information is lost, so it is dubbed the
lossy phase of the JPEG algorithm)
–the coefficients are compressed using an encoding scheme such as Huffman coding or
arithmetic coding
–good for pictures
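The transform and quantization steps for one 8x8 block can be sketched in plain Python. This is a simplified illustration, not the JPEG standard itself: it assumes a single uniform quantization step (real JPEG uses a table with a different step per coefficient) and it skips the final Huffman or arithmetic coding stage. The function names are hypothetical.

```python
import math

N = 8  # block size: 8x8 pixels

def dct_2d(block):
    """Orthonormal 2-D DCT-II of an N x N block given as a list of rows."""
    def alpha(k):
        return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)

    coeffs = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            total = 0.0
            for x in range(N):
                for y in range(N):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            coeffs[u][v] = alpha(u) * alpha(v) * total
    return coeffs

def quantize(coeffs, step):
    """Lossy step: divide by the step size and round to the nearest integer."""
    return [[round(c / step) for c in row] for row in coeffs]

# A "flat" block: every pixel has the same value.
flat_block = [[100] * N for _ in range(N)]
quantized = quantize(dct_2d(flat_block), step=16)

# Only the top-left coefficient survives; everything else rounds to zero.
nonzero = sum(1 for row in quantized for c in row if c != 0)
print(quantized[0][0], nonzero)
```

A larger step rounds more coefficients to zero, which is why lower quality settings give smaller files: long runs of zeros are cheap to encode in the final stage.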
SOUND
Streaming
- Example: WMA
- Music is sampled 44,100 times per second. Each sample is 2 bytes (16 bits) long.
Separate samples are taken for the left and right speakers in a stereo system.
- Capped bit rate:
- 128 kbps (CD quality)
- 64 kbps (tape quality)
- 16 kbps (low quality)
Non-streaming
- Examples: WAV, MP3
- optimized compression algorithm (i.e. gets rid of sounds you can’t hear)
MP3
MP3 compression (Moving Picture Experts Group Audio Layer 3)
-Gets rid of sounds the ear cannot hear
- 44,100 samples/second
x 16 bits/sample
x 2 channels
= 1,411,200 bits per second
-About 32 MB per 3 min
Reduce to about 3 MB per 3 min by exploiting the facts that:
- There are certain sounds that the human ear cannot hear.
- There are certain sounds that the human ear hears much better than others.
- If two sounds play simultaneously, we hear the louder one but cannot hear the softer
one.
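The size figures above follow from simple arithmetic, sketched here in Python. Two assumptions are not stated in the lecture: 1 MB is taken as 1,000,000 bytes, and the compressed track is assumed to use the 128 kbps rate listed earlier (128,000 bits per second).

```python
SAMPLE_RATE = 44_100      # samples per second
BITS_PER_SAMPLE = 16      # 2 bytes per sample
CHANNELS = 2              # left and right
SECONDS = 3 * 60          # a 3-minute track

def size_megabytes(bits_per_second, seconds):
    """Size in MB (assuming 1 MB = 1,000,000 bytes) at a constant bit rate."""
    return bits_per_second * seconds / 8 / 1_000_000

raw_bps = SAMPLE_RATE * BITS_PER_SAMPLE * CHANNELS
mp3_bps = 128_000         # assumed MP3 bit rate

raw_mb = size_megabytes(raw_bps, SECONDS)
mp3_mb = size_megabytes(mp3_bps, SECONDS)

print(raw_bps, "bits per second uncompressed")
print(round(raw_mb, 1), "MB uncompressed")
print(round(mp3_mb, 2), "MB at 128 kbps")
print(round(raw_bps / mp3_bps, 1), "x reduction")
```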
How the ear translates sound:
-Sound waves propagate through a medium such as air.
1) Outer ear (ear lobe, canal, etc.): sound waves pass through, but this amplifies them
only a small amount, about 4x
2) Middle ear (ear drum and the bones: hammer, anvil, stirrup): much of the amplification
occurs here; the ear drum is vibrated by the sound waves
3) Inner ear (see pics below): hair cells vibrate and transmit signals to the brain
Note: The exact process by which sound is heard is still unknown