One-Minute Survey
– What is the muddiest point in binary image compression, including data compression basics?
– What do you think of the workload of this course so far? A) too heavy B) normal C) too light
– Do you think the TA's grading of your HWs and CAs is fair? If not, please give an example (which assignment? what is your reason?)

EE465: Introduction to Digital Image Processing

A MATLAB Tour of Data Compression
– Entropy calculation
  • Given the pdf of a discrete source
  • Given observation data
– Prefix codes and Huffman coding
  • Encoding: codebook, coding efficiency
  • Decoding: codeword boundary
– Run-length coding of binary images

Shannon's Entropy Formula

  H(X) = -sum_{i=1}^{N} p_i * log2(p_i)   (bits/sample, or bps)

% p is the vector containing p1, ..., pN
>sum(p)
ans = 1
>H = sum(-p.*log2(p));

Source of Infinite Symbols

Consider a source with p_k = (1/2)^k, k = 1, 2, ...:

  H(X) = -sum_{k=1}^{inf} p_k * log2(p_k) = sum_{k=1}^{inf} k * (1/2)^k = 2 bps

Solution 1 (closed form):
  H  = 1/2 + 2/4 + 3/8 + 4/16 + ...            (1)
  2H = 1 + 2/2 + 3/4 + 4/8 + 5/16 + ...        (2)
  (2) - (1):  H = 1 + 1/2 + 1/4 + 1/8 + 1/16 + ... = 2 bps

Solution 2 (numerical approximation):
>k = 1:100; H = sum(k.*(0.5).^k);

How to Calculate the Entropy of Image Source?
– Short answer: impossible.
– Long answer: it can't be done unless we know the true probability distribution function of images; given only one image (a single sample), it is impossible to estimate the pdf.
– Then why do we still often see statements such as "image A has a higher entropy than image B"?
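The two calculations above can be cross-checked with a short sketch. The slides use MATLAB; this is an illustrative Python equivalent, and the function name `entropy` is our own choice:

```python
import math

def entropy(p):
    """Shannon entropy H(X) = -sum p_i log2 p_i, in bits per sample."""
    assert abs(sum(p) - 1.0) < 1e-9  # p must be a valid pmf
    return -sum(pi * math.log2(pi) for pi in p if pi > 0)

# Finite source example
print(entropy([0.5, 0.25, 0.125, 0.125]))  # 1.75 bps

# Infinite source p_k = (1/2)^k, truncated at k = 100 (Solution 2 above)
H = sum(k * 0.5 ** k for k in range(1, 101))
print(H)  # approaches 2 bps
```

Truncating the infinite sum at k = 100 is safe because the discarded tail is on the order of 2^-100.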
Image Entropy (Approximation Only)
– Entropy can only be calculated for a given pdf.
– Although we don't know the true pdf of the image source, there are approximations such as first-order entropy, fourth-order entropy, and conditional entropy*.
– Those entropy values are calculated based on additional assumptions about the source which the actual observation data might NOT satisfy.

First-order Entropy
[Figure: histogram of image X; first-order entropy H(X) = 4.9 bps]

Zero-frequency Problem
– Question: in a given grayscale image there is no occurrence of symbol 200; does that mean P(X=200) = 0?
– Zero probability would mean an infinite amount of self-information.
– Answer: No, because any single image is only one realization of the source (randomness plays a role).
>freq(find(freq==0))=a;  % a>0

High-order Entropy (Vector Case)
– We are calculating the number of bits per 4-pixel vector.

Example
– Checkerboard image
– First-order entropy: P(X=0) = P(X=1) = 1/2, so H(X) = 1 bpp
– Fourth-order entropy: H(X) << 1 bpp
– Question: if the "1"s are randomly located, the difference between first-order and fourth-order entropy won't be significant. Why?
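The checkerboard claim can be verified numerically. This is an illustrative Python sketch (the course code is MATLAB; the helper name `empirical_entropy` is ours) that estimates first-order entropy per pixel and fourth-order entropy from non-overlapping 2x2 blocks:

```python
import math
from collections import Counter

def empirical_entropy(symbols):
    """First-order entropy estimate (bits/symbol) from observed data."""
    counts = Counter(symbols)
    n = len(symbols)
    h = -sum(c / n * math.log2(c / n) for c in counts.values())
    return h if h > 0 else 0.0

# 8x8 checkerboard image of 0s and 1s
img = [[(r + c) % 2 for c in range(8)] for r in range(8)]

# First-order: each pixel is one symbol -> H = 1 bpp
pixels = [v for row in img for v in row]
H1 = empirical_entropy(pixels)

# Fourth-order: each non-overlapping 2x2 block is one symbol;
# divide by 4 to convert bits/block into bits/pixel
blocks = [(img[r][c], img[r][c + 1], img[r + 1][c], img[r + 1][c + 1])
          for r in range(0, 8, 2) for c in range(0, 8, 2)]
H4 = empirical_entropy(blocks) / 4

print(H1, H4)  # the regular structure drives the block estimate to 0
```

Every 2x2 block of the checkerboard is identical, so the fourth-order estimate collapses to 0 bpp while the first-order estimate stays at 1 bpp; with randomly placed "1"s the blocks would be nearly uniform over all 16 patterns and the two estimates would nearly coincide.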
A MATLAB Tour of Data Compression
– Entropy calculation
  • Given the pdf of a discrete source
  • Given observation data
– Prefix codes and Huffman coding
  • Encoding: codebook, coding efficiency
  • Decoding: codeword boundary
– Run-length coding of binary images

Codebook: Collection of Codewords

symbol k   p_k     fixed-length codeword   variable-length codeword
S          0.5     00                      0
N          0.25    01                      10
E          0.125   10                      110
W          0.125   11                      111

Fixed-length codes are easy to implement in MATLAB; variable-length codes are more difficult. One option is a struct:

>Codebook = struct('codeword',{{'0','10','110','111'}},'length',[1 2 3 3]);

Encoding Function
– Basically a table look-up operation (easy for hardware implementation).

Decoding Function
– Note that an artificial codeword boundary exists due to the use of a cell structure for storing the coded bit-stream, but the prefix condition guarantees that a unique match exists even when the codeword boundary is gone.

Huffman Coding Procedure
Stage 1: Source reduction
– List all probabilities in descending order.
– Merge the two symbols with the smallest probabilities into a new compound symbol.

Huffman Coding Procedure (Con't)
Stage 2: Codeword assignment
– Start from the smallest (most reduced) source and work back to the original source.
– Each merging point corresponds to a node in the binary code tree.

A MATLAB Tour of Data Compression
– Entropy calculation
– Prefix codes and Huffman coding
– Run-length coding of binary images

Binary Image Modeling

Run-length Coding: Case 1

X: 0000 111 000000 11111 00
Y:   4   3    6      5    2

– "0" run-lengths and "1" run-lengths alternate.
– Run-lengths are positive integers.

Run-length Coding: Case 2

X: 00000100000001000011000000001...
Y: 5 7 4 0 8 ...

– All coded symbols are "0" run-lengths.
– A run-length is a nonnegative integer.

Binary Image Compression Summary
Theoretical aspect
– Shannon's source entropy formula tells us the lower bound for coding a memoryless discrete source.
– To achieve the source entropy, we need to use variable-length codes (i.e., long codewords assigned to small-probability events and vice versa).
– Huffman's algorithm (generating the optimal prefix codes) offers a systematic solution.
Practical aspect
– Data in the real world contains memory (dependency, correlation, ...).
– We don't know the probability distribution function of the source.
– Tricky point: the goal of image compression (processing) is not to make your algorithm work for one image, but for a class of images.

Data Compression = Source Modeling

binary image X -> source modeling -> Y -> entropy coding -> binary bit stream
(P(Y) supplied by probability estimation)

Q: What kind of model should we use?
A: The model that matches the probabilistic characteristics of the given data.

The Art of Source Modeling
– How do I know whether a model matches or not?
– Model: Bernoulli distribution (P(X = black) = 0.1)

A Matched Example
– Model -> Synthesis -> Data

Good Models for Binary Images

discrete source X -> transformation by run-length counting or Lempel-Ziv parsing -> Y -> Huffman coding -> binary bit stream
(P(Y) supplied by probability estimation)

– Y is the sequence of run-lengths or dictionary entries from which X can be recovered losslessly.

Remaining Questions
How to choose between different compression techniques, say RLC vs. LZC?
– Look at the source statistics.
– For instance, compare binary images (good for RLC) vs. English text (good for LZC).
What is the best binary image compression technique?
– It is called JBIG2, an enhanced version of JBIG.
What is the entropy of the binary image source?
– We don't know, and the question itself is questionable.
– A more relevant question would be: what is the entropy of a probabilistic model which we believe generates the source?

MATLAB Programming Session
Program development
– Planning the program
– Using pseudo-code
– Selecting the right data structures
– General coding procedures
– Naming a function uniquely
– The importance of comments
Optimizing for speed
– Vectorizing your code

Planning the Program
Modular programming
– Break the problem down into a series of smaller independent tasks.
– Implement each task as a separate function.
When do I decide to implement a task as a new function?
– Rule of thumb: if the task will be performed more than once (e.g., I might want to use this function in other programs).

Using Pseudo-code

For each pixel x {
  - Take a window A(x,t) centered at x, of size (2t+1) x (2t+1).
  - Take a window W(x,f) centered at x, of size (2f+1) x (2f+1).
  wmax = 0;
  For each pixel y in A(x,t), with y different from x {
    - Compute the difference between W(x,f) and W(y,f): d(x,y).
    - Compute the weight from the distance d(x,y): w(x,y).
      w(x,y) = exp(-d(x,y) / h);
    - If w(x,y) is bigger than wmax, then wmax = w(x,y).
    - Accumulate the weighted average: average += w(x,y) * u(y);
    - Accumulate the sum of the weights: totalweight += w(x,y);
  }
}

Selecting the Right Data Structure
Usually data structure selection is easy, but sometimes it gets tricky:
– Scenario 1: handling data with varying length (e.g., variable-length codes)
– Scenario 2: handling a large amount of data (due to memory constraints)
– Scenario 3: running-speed considerations (e.g., uint8 vs. double)

General Coding Practices
– Use descriptive function and variable names to make your code easier to understand.
– Order subfunctions alphabetically in an M-file to make them easier to find.
– Precede each subfunction with a block of help text describing what that subfunction does.
– Don't extend lines of code beyond the 80th column.
– Make sure you have saved your function in a directory recognized by the MATLAB path setting.

Naming a Function Uniquely
– It appears to be a simple rule, yet it is not easy to obey (we simply can't remember them all):
>which -all <function name>
– Doesn't a long function name take more time to type?
  • Yes, but you only need to type it once (you can recall the command using the up/down arrow keys).
  • The TAB key can also help you finish the typing automatically.

The Importance of Comments
– I can never emphasize its importance enough, even though I often do a poor job of commenting my own MATLAB programs.
– Comments make it easier for both you and others to maintain the program.
– Add comments generously; explain each major section and any smaller segments of code that are not obvious.
– Know how to remove or insert the comment character %.

Optimizing for Speed
How to vectorize your code?
– Avoid using for loops as much as you can.
– Functions used in vectorizing (see page 58 of programming_tips.pdf):
  • all: true if all elements of a vector are nonzero
  • any: true if any element of a vector is nonzero or logical 1 (true)
  • sum/prod/cumsum (cumsum is more general than sum)
  • end: can also serve as the last index in an indexing expression
  • find (you will find it the most useful)
  • squeeze: removes singleton dimensions
  • permute/ipermute, reshape, sort, repmat, shiftdim, ndgrid

Vectorize Your Code
– Even if a loop is inevitable, try to reduce the number of iterations.
– Example: histogram calculation for a grayscale image sized 512-by-512
  • Scheme 1: loop over row and column indexes (512x512 = 262144 iterations)
  • Scheme 2: loop over intensity values 0-255 (256 iterations)
  • Scheme 2 is better!
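The two histogram schemes can be sketched for illustration. In MATLAB, Scheme 2's inner count would be the vectorized sum(sum(img==i)); here is an assumed Python equivalent on a small toy image (variable names are ours), where the point is the number of interpreted-loop iterations, not total work:

```python
# 4x4 toy "image" with intensities 0-255
img = [[0, 17, 17, 255],
       [0, 0, 17, 99],
       [255, 99, 99, 0],
       [17, 17, 0, 99]]

# Scheme 1: explicit loop over every pixel (rows x cols iterations)
hist1 = [0] * 256
for row in img:
    for v in row:
        hist1[v] += 1

# Scheme 2: loop over the 256 intensity values; the per-value count
# is a single built-in pass (analogous to sum(img == i) in MATLAB)
flat = [v for row in img for v in row]
hist2 = [flat.count(i) for i in range(256)]

assert hist1 == hist2
print(hist1[0], hist1[17], hist1[99], hist1[255])  # 5 5 4 2
```

Both schemes produce the same histogram; Scheme 2 wins in an interpreted language because the interpreted loop runs 256 times while the per-pixel work happens inside a fast built-in.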