A Brief Introduction to Information Theory
12/2/2004
陳冠廷

Outline

- Motivation and Historical Notes
- Information Measure
- Implications and Limitations
- Classical and Beyond
- Conclusion

Motivation: What is information?

- By definition, information is knowledge about certain things, which may or may not be perceived by an observer.
- Examples: music, stories, news, etc.
- Information means.
- Information propagates.
- Information corrupts…

Digression: Information and Data Mining

- Data mining is the process of making data meaningful, i.e., of extracting the statistical correlations of the data in certain aspects.
- In some sense, we can view this as a process of generating bases with which to represent the information content of the data.
- Information is not always meaningful!

Motivation: Information Theory tells us…

- What exactly information is
- How it is measured and represented
- Its implications, limitations, and applications

Historical Notes

- Claude E. Shannon (1916-2001), in 1948, single-handedly established almost everything we will talk about today.
- He was dealing with the communication aspects of the problem.
- He was the first to use the term “bit” in print.

How do we define information?

- The information within a source is the uncertainty of the source.

Example

- Every time we roll a die, we get a number from 1 through 6. The information we get is larger than when we toss a coin or roll an unfair die.
- The less we know, the more information is contained!
- Consider a source (random process) known to generate the sequence 01010101…, with a 0 right after every 1 and a 1 right after every 0. Although on average it produces 0 or 1 with a 50% chance each, its information content is not the same as a fair coin’s: once the first symbol is seen, every later symbol is already determined.
- If we know the outcome for sure, nothing is gained.

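As a worked comparison (anticipating the entropy formula defined below), the information gained from one roll of a fair die, one toss of a fair coin, and an outcome known in advance:

```latex
H(\text{fair die}) = \log_2 6 \approx 2.585 \text{ bits},
\qquad
H(\text{fair coin}) = \log_2 2 = 1 \text{ bit},
\qquad
H(\text{known outcome}) = 0 \text{ bits}.
```
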
Information Measure: behavioral definitions

- H should be maximized when the object is most unknown.
- H(X) = 0 if X is determined.
- The information measure H should be additive for independent objects; i.e., for two information sources that have no relation to each other, H = H1 + H2.
- H is the information entropy!

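Stated formally (a sketch, using the notation of the next slide; n is the number of possible outcomes of X):

```latex
\begin{aligned}
&\text{(1)}\quad H(X) \le \log n, \text{ with equality iff all outcomes are equally likely;}\\
&\text{(2)}\quad H(X) = 0 \text{ iff } p(x) = 1 \text{ for a single outcome } x;\\
&\text{(3)}\quad H(X, Y) = H(X) + H(Y) \text{ whenever } X \text{ and } Y \text{ are independent.}
\end{aligned}
```
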
Information Measure: Entropy

- The entropy H(X) of a random variable X is defined by H(X) = −∑x p(x) log p(x).
- We can verify that the measure H(X) satisfies the three criteria stated above.
- If we choose the logarithm in base 2, then the entropy may be said to be in units of bits; the use of this unit will be clarified later.

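A minimal sketch in Python (the example distributions are illustrative choices, not from the slides) that computes H(X) in bits and checks the three criteria numerically:

```python
from math import log2

def entropy(probs):
    """Shannon entropy H(X) = -sum p(x) log2 p(x), in bits."""
    return -sum(p * log2(p) for p in probs if p > 0)

fair_coin   = [0.5, 0.5]
biased_coin = [0.9, 0.1]
fair_die    = [1/6] * 6
certain     = [1.0]                 # outcome known in advance

print(entropy(fair_coin))           # 1.0 bit: maximal for two outcomes
print(entropy(biased_coin))         # ~0.469 bits: less uncertainty, less information
print(entropy(fair_die))            # ~2.585 bits = log2(6)
print(entropy(certain))             # 0.0: a determined source carries no information

# Additivity for independent sources: H(coin, die) = H(coin) + H(die)
joint = [p * q for p in fair_coin for q in fair_die]
print(entropy(joint), entropy(fair_coin) + entropy(fair_die))   # both ~3.585
```
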
Digression: Entropy in thermodynamics and statistical mechanics

- Entropy is the measure of disorder of a thermodynamic system.
- The definition is identical to the information entropy, but the summation now runs over all possible physical states.
- In fact, entropy was first introduced in thermodynamics, and Shannon found that his measure is just the entropy of physics!

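For reference, the corresponding Gibbs entropy of statistical mechanics, where the sum runs over the microstates i of the system and k_B is Boltzmann’s constant:

```latex
S = -k_B \sum_i p_i \ln p_i
```
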
Conditional Entropy and Mutual Information

- If the objects (for example, random variables) are not independent of each other, then the total entropy does not equal the sum of the individual entropies.
- Conditional entropy:
  H(Y|X) = ∑x p(x) H(Y|X=x), where
  H(Y|X=x) = −∑y p(y|x) log p(y|x)
- Clearly, H(X,Y) = H(X) + H(Y|X).
- Mutual information:
  I(X;Y) = H(X) − H(X|Y) = H(Y) − H(Y|X) = H(X) + H(Y) − H(X,Y),
  which represents the information common to X and Y.
- Mutual information is the overlap!

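A short sketch (the joint distribution is an illustrative choice) that computes these quantities from a joint p(x, y) and checks the identities above:

```python
from math import log2

# Illustrative joint distribution p(x, y) for binary X (rows) and Y (columns).
p_xy = [[0.4, 0.1],
        [0.1, 0.4]]

p_x = [sum(row) for row in p_xy]              # marginal p(x)
p_y = [sum(col) for col in zip(*p_xy)]        # marginal p(y)

def H(probs):
    return -sum(p * log2(p) for p in probs if p > 0)

H_X  = H(p_x)
H_Y  = H(p_y)
H_XY = H([p for row in p_xy for p in row])    # joint entropy H(X, Y)

H_Y_given_X = H_XY - H_X                      # chain rule: H(X,Y) = H(X) + H(Y|X)
I_XY        = H_X + H_Y - H_XY                # mutual information I(X;Y)

print(H_X, H_Y, H_XY)      # 1.0 1.0 ~1.722
print(H_Y_given_X)         # ~0.722 bits of Y left unexplained by X
print(I_XY)                # ~0.278 bits shared between X and Y
```
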
Applications

- The Source Coding Theorem
- The Channel Coding Theorem

The Source Coding Theorem

- To encode a random information source X into bits, we need at least H(X) bits per symbol on average (with the logarithm in base 2).
- That is why H(X) in base 2 is measured in bits.
- This establishes the possibility of lossless compression.

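A small sketch of the bound (the four-symbol source and the prefix code {0, 10, 110, 111} are illustrative choices): for this source the expected code length exactly meets the entropy lower bound, and no lossless code can average fewer bits.

```python
from math import log2

# Illustrative source: four symbols with dyadic probabilities.
probs = {'a': 0.5, 'b': 0.25, 'c': 0.125, 'd': 0.125}
code  = {'a': '0', 'b': '10', 'c': '110', 'd': '111'}     # a prefix-free code

entropy = -sum(p * log2(p) for p in probs.values())       # H(X) = 1.75 bits
avg_len = sum(probs[s] * len(code[s]) for s in probs)     # expected bits per symbol

print(entropy, avg_len)   # 1.75 1.75
```
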
The Channel Coding Theorem

- The channel is characterized by its input X and output Y; its capacity is C = max I(X;Y), maximized over the input distribution.
- If the coding rate is less than C, we can transmit with arbitrarily small error; if the coding rate is greater than C, errors are bound to occur.
- This limits the ability of a channel to convey information.

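As a standard example (not on the slides), the binary symmetric channel that flips each bit with crossover probability p has capacity C = 1 − H_b(p), where H_b is the binary entropy function:

```python
from math import log2

def binary_entropy(p):
    """H_b(p) = -p*log2(p) - (1-p)*log2(1-p)."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * log2(p) - (1 - p) * log2(1 - p)

def bsc_capacity(p):
    """Capacity (bits per use) of a binary symmetric channel with crossover probability p."""
    return 1.0 - binary_entropy(p)

for p in (0.0, 0.1, 0.5):
    print(p, bsc_capacity(p))   # 1.0, ~0.531, 0.0 (a 50% flip channel conveys nothing)
```
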
Classical and Beyond

- Quantum entanglement and its probable application: quantum computing
- How does it relate to classical information theory?

Quantum vs. Classical

- Qubit vs. bit
- Measurement and collapse; the no-cloning property
- Parallel vs. sequential access

Aspects of Quantum Information: transmitting

Conclusion

- Information is uncertainty.
- Information theory tells us how to measure information and whether it can be transmitted; both may be counted in bits if desired.
- Quantum information offers an intriguing new possibility for information processing and computation.

Thanks for your attention!