IMAGE COMPRESSION AND DECOMPRESSION USING
NEURAL NETWORKS
A PROJECT REPORT
Submitted in partial fulfillment of the requirements
for the award of the degree of
BACHELOR OF TECHNOLOGY
IN
ELECTRONICS AND COMMUNICATION ENGINEERING
BY
This is to certify that this project work entitled "IMAGE COMPRESSION AND DECOMPRESSION USING NEURAL NETWORKS" is being submitted in partial fulfillment of the requirements for the award of the degree of BACHELOR OF TECHNOLOGY in ELECTRONICS AND COMMUNICATION ENGINEERING, Jawaharlal Nehru Technological University, and is a record of bona fide work carried out by
INDEX
1. ABSTRACT
2. INTRODUCTION
3. THEORY
3.1 NEURAL NETWORKS
- Artificial Neural Networks
- The Analogy to the Brain
- The Biological Neuron
- The Artificial Neuron
- Design
- Layers
- Communication and Types of Connections
- Learning Laws
- Applications of Neural Networks
3.2 IMAGE PROCESSING
- Image Compression
- Principles of Image Compression
- Performance Measurement of Image Compression
- Compression Standards
4. IMAGE COMPRESSION WITH NEURAL NETWORKS
- Back-Propagation Image Compression
- Hierarchical Back-Propagation Neural Network
- Adaptive Back-Propagation Neural Network
- Hebbian Learning Based Image Compression
- Vector Quantization Neural Networks
- Predictive Coding Neural Networks
5. PROPOSED IMAGE COMPRESSION USING NEURAL NETWORK
- Levenberg-Marquardt Algorithm
- Training Procedure
6. IMPLEMENTATION OF IMAGE COMPRESSION AND DECOMPRESSION USING MATLAB
- MATLAB Code
- Functions Used in the MATLAB Program
- Results
7. CONCLUSION
8. FUTURE SCOPE
9. BIBLIOGRAPHY
ABSTRACT
Uncompressed multimedia (graphics, audio and video) data requires considerable
storage capacity and transmission bandwidth. Despite rapid progress in mass-storage
density, processor speeds, and digital communication system performance, demand for data
storage capacity and data-transmission bandwidth continues to outstrip the capabilities of
available technologies. The recent growth of data intensive multimedia-based web
applications has not only sustained the need for more efficient ways to storage and
communication technology.
Apart from the existing technology on image compression represented by series of
JPEG, MPEG and H.26x standards, new technology such as neural networks and genetic
algorithms are being developed to explore the future of image coding. Successful
applications of neural networks to vector quantization have now become well established, and other aspects of neural network involvement in this technology are stepping up to play significant roles in assisting with those traditional techniques. Here we
present an extensive survey on the development of neural network for image compression.
One of the most promising ways to utilize the power of neural networks for image compression lies in
(a) the selection of an efficient multi-layered network,
(b) the selection of training methods, and
(c) the test vectors.
Based on these criteria, networks are trained and implemented.
In this project, a literature survey has been carried out to find an efficient multi-layered neural network; a suitable network has been trained and tested using MATLAB for a test case of an image of size 64 x 64, and the trained weights and biases have been recorded.
INTRODUCTION
Neural networks are inherently adaptive systems; they are suitable for handling nonstationarities in image data. Artificial neural networks can be employed with success for image compression. The advantages of realizing a neural network in digital hardware are:
- Fast multiplication, leading to fast updates of the neural network.
- Flexibility, because different network architectures are possible.
- Scalability, as the proposed hardware architecture can be used for arbitrarily large networks, constrained only by the number of neurons in one layer.
The greatest potential of neural networks is the high speed processing that is
provided through massively parallel VLSI implementations. The choice to build a neural
network in digital hardware comes from several advantages that are typical for digital
systems:
1. Low sensitivity to electric noise and temperature.
2. Weight storage is no problem.
3. The availability of user-configurable, digital field programmable gate arrays, which
can be used for experiments.
4. Well-understood design principles that have led to new, powerful tools for digital design.
The crucial problems of neural network hardware are fast multiplication, building a large number of connections between neurons, and fast memory access for weight storage and nonlinear function look-up tables.
The most important part of a neuron is the multiplier, which performs high-speed pipelined multiplication of synaptic signals with weights. As the neuron has only one multiplier, the degree of parallelism is node parallelism. Each neuron has a local weight ROM (as it performs the feed-forward phase of the back-propagation algorithm) that stores as many values as there are connections to the previous layer. An accumulator is used to add signals from the pipeline to the neuron's bias value, which is stored in its own register.
The aim is to design and implement image compression using a neural network to achieve better SNR and compression levels. The compression is first obtained by modeling the neural network in MATLAB; this provides offline training.
3. THEORY
3.1 NEURAL NETWORKS
3.1.1 Artificial Neural Networks
An Artificial Neural Network (ANN) is an information-processing paradigm that is
inspired by the way biological nervous systems, such as the brain, process information. The
key element of this paradigm is the novel structure of the information processing system. It
is composed of a large number of highly interconnected processing elements (neurons)
working in unison to solve specific problems. ANNs, like people, learn by example. An
ANN is configured for a specific application, such as pattern recognition or data
classification, through a learning process. Learning in biological systems involves
adjustments to the synaptic connections that exist between the neurons. This is true of ANNs as well. An Artificial Neural Network is a system loosely modeled on the human brain. The field goes by many names, such as connectionism, parallel distributed processing, neurocomputing, natural intelligent systems, machine learning algorithms, and artificial neural networks. It is an attempt to simulate, within specialized hardware or sophisticated software, multiple layers of simple processing elements called neurons. Each neuron is linked to some of its neighbors with varying coefficients of connectivity that represent the strengths of these connections.
Neural networks, with their remarkable ability to derive meaning from complicated
or imprecise data, can be used to extract patterns and detect trends that are too complex to
be noticed by either humans or other computer techniques. A trained neural network can be
thought of as an "expert" in the category of information it has been given to analyze.
This expert can then be used to provide projections given new situations of interest and to answer "what if" questions.
Other advantages include:
Adaptive learning: An ability to learn how to do tasks based on the data given for training
or initial experience.
Self-Organization: An ANN can create its own organization or representation of the
information it receives during learning time.
Real Time Operation: ANN computations may be carried out in parallel, and special
hardware devices are being designed and manufactured which take advantage of this
capability.
Fault Tolerance via Redundant Information Coding: Partial destruction of a network leads to a corresponding degradation of performance. However, some network capabilities may be retained even with major network damage.
3.1.2 The Analogy to the Brain
Neural networks process information in a similar way to the human brain. The network is composed of a large number of highly interconnected processing elements (neurons) working in parallel to solve a specific problem. Neural networks learn by example; they cannot be programmed to perform a specific task. The examples must be selected carefully, otherwise useful time is wasted or, even worse, the network might function incorrectly. The disadvantage is that, because the network finds out how to solve the problem by itself, its operation can be unpredictable.
On the other hand, conventional computers use a cognitive approach to problem solving; the way the problem is to be solved must be known and stated in small unambiguous instructions. These instructions are then converted into a high-level language program and then into machine code that the computer can understand. These machines are totally predictable; if anything goes wrong, it is due to a software or hardware fault. Neural networks and conventional algorithmic computers are not in competition but complement each other. There are tasks that are more suited to an algorithmic approach, like arithmetic operations, and tasks that are more suited to neural networks. Moreover, a large number of tasks require systems that use a combination of the two approaches (normally a conventional computer is used to supervise the neural network) in order to perform at maximum efficiency.
The most basic components of neural networks are modeled after the structure of the brain. Some neural network structures are not closely modeled on the brain and some do not have a biological counterpart in the brain. However, neural networks have a strong similarity to the brain and therefore a great deal of the terminology is borrowed from neuroscience.
3.1.3 The Biological Neuron
The most basic element of the human brain is a specific type of cell, which provides us with the ability to remember, think, and apply previous experiences to our every action. These cells are known as neurons; each of these neurons can connect with up to 200,000 other neurons. The power of the brain comes from the number of these basic components and the multiple connections between them.
All natural neurons have four basic components: dendrites, soma, axon, and synapses. Basically, a biological neuron receives inputs from other sources, combines them in some way, performs a generally nonlinear operation on the result, and then outputs the final result. The figure below shows a simplified biological neuron and the relationship of its four components. In the human brain, a typical neuron collects signals from others through a host of fine structures called dendrites. The neuron sends out spikes of electrical activity through a long, thin strand known as an axon, which splits into thousands of branches. At the end of each branch, a structure called a synapse converts the activity from the axon into electrical effects that inhibit or excite activity in the connected neurons. When a neuron receives excitatory input that is sufficiently large compared with its inhibitory input, it sends a spike of electrical activity down its axon. Learning occurs by changing the effectiveness of the synapses so that the influence of one neuron on another changes.
Fig 3.1 BIOLOGICAL NEURON
3.1.4 The Artificial Neuron
The basic unit of neural networks, the artificial neuron, simulates the four basic functions of natural neurons. Artificial neurons are much simpler than biological neurons; the figure below shows the basics of an artificial neuron.
Figure 3.2 SINGLE NEURON
The various inputs to the network are represented by the mathematical symbol x(n). Each of these inputs is multiplied by a connection weight; these weights are represented by w(n). In the simplest case, these products are simply summed, fed through a transfer function to generate a result, and then output.
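As a minimal sketch of this weighted-sum-and-transfer-function operation (the input values, weights and bias below are purely illustrative, and a sigmoid is assumed as the transfer function), a single artificial neuron can be written in MATLAB as:

% Single artificial neuron sketch (illustrative values only)
x = [0.2; 0.7; 0.1];        % inputs x(n)
w = [0.5; -0.3; 0.8];       % connection weights w(n)
b = 0.1;                    % bias term
s = w' * x + b;             % weighted sum of the inputs plus bias
y = 1 / (1 + exp(-s));      % sigmoid transfer function generates the result
disp(y)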
Even though all artificial neural networks are constructed from this basic building block, the details of these building blocks may vary, and there are differences between network types.
3.1.5 Design
The developer goes through a period of trial and error in the design decisions before coming up with a satisfactory design. The design issues in neural networks are complex and are the major concerns of system developers.
Designing a neural network consists of:
- Arranging neurons in various layers.
- Deciding the type of connections among neurons of different layers, as well as among the neurons within a layer.
- Deciding the way a neuron receives input and produces output.
- Determining the strength of connections within the network by allowing the network to learn the appropriate values of the connection weights using a training data set.
The process of designing a neural network is an iterative process.
3.1.6 Layers
Biologically, neural networks are constructed in a three-dimensional way from microscopic components. These neurons seem capable of nearly unrestricted interconnections. This is not true of any man-made network. Artificial neural networks are simple clusterings of primitive artificial neurons. This clustering occurs by creating layers, which are then connected to one another. How these layers connect may also vary. Basically, all artificial neural networks have a similar structure or topology. Some of the neurons interface with the real world to receive their inputs and other neurons provide the real world with the network's outputs. All the rest of the neurons are hidden from view.
Figure 3.3 THREE-LAYER PERCEPTRON FOR IMAGE COMPRESSION
As the figure above shows, the neurons are grouped into layers. The input layer consists of neurons that receive input from the external environment. The output layer consists of neurons that communicate the output of the system to the user or the external environment. There are usually a number of hidden layers between these two layers; the figure above shows a simple structure with only one hidden layer.
When the input layer receives the input, its neurons produce output, which becomes input to the other layers of the system. The process continues until a certain condition is satisfied or until the output layer is invoked and fires its output to the external environment.
To determine the number of hidden neurons the network should have to perform its best, one is often left with the method of trial and error. If the number of hidden neurons is increased too much, overfitting occurs; that is, the net will have problems generalizing. The training set of data will be memorized, making the network useless on new data sets.
3.1.7 Communication and types of connections
Neurons are connected via a network of paths carrying the output of one neuron as input to another neuron. These paths are normally unidirectional; there might, however, be a two-way connection between two neurons, because there may be another path in the reverse direction.
3.1.7.1 Inter-layer connections
There are different types of connections used between layers; these connections
between layers are called inter-layer connections.
Fully connected
Each neuron on the first layer is connected to every neuron on the second layer.
Partially connected.
A neuron of the first layer does not have to be connected to all neurons on the
second layer.
Feed forward.
The neurons on the first layer send their output to the neurons on the second layer,
but they do not receive any input back from the neurons on the second layer.
Bi-directional.
There is another set of connections carrying the output of the neurons of the second
layer into the neurons of the first layer.
Feed forward and bi-directional connections could be fully or partially connected.
Hierarchical.
If a neural network has a hierarchical structure, the neurons of a lower layer may only communicate with neurons of the next layer up.
Resonance.
The layers have bi-directional connections, and they can continue sending messages
across the connections a number of times until a certain condition is achieved.
3.1.7.2 Intra-layer connections.
In more complex structures the neurons communicate among themselves within a
layer, this is known as intra-layer connections. There are two types of intra-layer
connections.
Recurrent.
The neurons within a layer are fully or partially connected to one another. After
these neurons receive input from another layer, they communicate their outputs with one
another a number of times before they are allowed to send their outputs to another layer.
Generally some conditions among the neurons of the layer should be achieved before they
communicate their outputs to another layer.
On-center/off-surround.
A neuron within a layer has excitatory connections to itself and its immediate neighbors, and has inhibitory connections to other neurons. One can imagine this type of connection as a competitive gang of neurons. Each gang excites itself and its gang members and inhibits all members of other gangs. After a few rounds of signal interchange, the neurons with an active output value will win and are allowed to update their own and their gang members' weights. (There are two types of connections between two neurons, excitatory or inhibitory. In an excitatory connection, the output of one neuron increases the action potential of the neuron to which it is connected. When the connection type between two neurons is inhibitory, the output of the neuron sending a message reduces the activity or action potential of the receiving neuron. One causes the summing mechanism of the next neuron to add while the other causes it to subtract. One excites while the other inhibits.)
3.1.8 Learning.
The brain basically learns from experience. Neural networks are sometimes called machine-learning algorithms, because changing their connection weights (training) causes the network to learn the solution to a problem. The strength of the connection between neurons is stored as a weight value for the specific connection. The system learns new knowledge by adjusting these connection weights.
The learning ability of a neural network is determined by its architecture and by the
algorithmic method chosen for training.
The training method usually consists of one of three schemes:
1. Unsupervised learning.
It uses no external teacher and is based only upon local information. It is also referred to as self-organization, in the sense that it self-organizes the data presented to the network and detects their emergent collective properties. Paradigms of unsupervised learning are Hebbian learning and competitive learning. Another aspect of learning concerns the distinction between a separate phase during which the network is trained and a subsequent operation phase. We say that a neural network learns off-line if the learning phase and the operation phase are distinct. A neural network learns on-line if it learns and operates at the same time. Usually, supervised learning is performed off-line, whereas unsupervised learning is performed on-line. The hidden neurons must find a way to organize themselves without help from the outside. In this approach, no sample outputs are provided to the network against which it can measure its predictive performance for a given vector of inputs. This is learning by doing.
2. Reinforcement learning
This incorporates an external teacher, so that each output unit is told what its desired
response to input signals ought to be. During the learning process global information may
be required. Paradigms of supervised learning include error-correction learning,
reinforcement learning and stochastic learning. An important issue concerning supervised
learning is the problem of error convergence, i.e. the minimization of error between the
desired and computed unit values. The aim is to determine a set of weights which minimizes
the error. One well-known method, which is common to many learning paradigms, is least mean square (LMS) convergence.
This method works on reinforcement from the outside. The connections among the neurons
in the hidden layer are randomly arranged, then reshuffled as the network is told how close
it is to solving the problem. Reinforcement learning is also called supervised learning,
because it requires a teacher. The teacher may be a training set of data or an observer who
grades the performance of the network results.
Both unsupervised and reinforcement learning suffer from relative slowness and inefficiency, relying on random shuffling to find the proper connection weights.
3. Back propagation
This method has proven highly successful in training multilayered neural nets. The network is not just given reinforcement for how it is doing on a task; information about errors is also filtered back through the system and is used to adjust the connections between the layers, thus improving performance. It is a form of supervised learning.
Off-line or On-line
One can categorize the learning methods into yet another group, off-line or on-line.
When the system uses input data to change its weights to learn the domain knowledge, the
system could be in training mode or learning mode. When the system is being used as a
decision aid to make recommendations, it is in the operation mode, this is also sometimes
called recall.
Off-line
In off-line learning methods, once the system enters the operation mode, its weights are fixed and do not change any more. Most networks are of the off-line learning type.
On-line
In on-line or real time learning, when the system is in operating mode (recall), it
continues to learn while being used as a decision tool. This type of learning has a more
complex design structure.
3.1.9 Learning laws
There are a variety of learning laws which are in common use. These laws are mathematical algorithms used to update the connection weights. Most of these laws are some sort of variation of the best-known and oldest learning law, Hebb's Rule. Man's understanding of how neural processing actually works is very limited. Learning is certainly more complex than the simplification represented by the learning laws currently developed. Research into different learning functions continues as new ideas routinely show up in trade publications. A few of the major laws are given as examples below.
Hebb’ Rule
The first and the best known learning rule was introduced by Donald Hebb. The
description appeared in his book The organization of Behavior in 1949. This basic rule is:
If a neuron receives an input from another neuron, and if both are highly active
(mathematically have the same sign), the weight between the neurons should be
strengthened.
Hopfield Law
This law is similar to Hebb’s Rule with the exception that it specifies the magnitude
of the strengthening or weakening. It states, “if the desired output and the input are both
active or both inactive, increment the connection weight by the learning rate, otherwise
decrement the weight by the learning rate.” (Most learning functions have some provision
for a learning rate, or learning constant. Usually this term is positive and between zero and
one.)
The Delta Rule
The Delta Rule is a further variation of Hebb's Rule, and it is one of the most commonly used. This rule is based on the idea of continuously modifying the strengths of the input connections to reduce the difference (the delta) between the desired output value and the actual output of a neuron. This rule changes the connection weights in the way that minimizes the mean squared error of the network. The error is back propagated into previous layers one layer at a time, and the process of back-propagating the network errors continues until the first layer is reached. The network type called feed-forward back-propagation derives its name from this method of computing the error term. This rule is also referred to as the Widrow-Hoff Learning Rule and the Least Mean Square Learning Rule.
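A rough MATLAB sketch of the delta rule for a single linear neuron is given below; the training sample, initial weights and learning rate are illustrative assumptions rather than values used later in this report:

% Delta rule sketch: w <- w + eta * (d - y) * x
x   = [0.5; 0.2; 0.9];            % input vector (assumed)
d   = 0.7;                        % desired output for this input
w   = zeros(3, 1);                % initial connection weights
eta = 0.1;                        % learning rate, between zero and one
for epoch = 1:100
    y = w' * x;                   % actual output of the neuron
    w = w + eta * (d - y) * x;    % reduce the delta between desired and actual output
end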
Kohonen’s Learning Law
This procedure, developed by Teuvo Kohonen, was inspired by learning in
biological systems. In this procedure, the neurons compete for the opportunity to learn, or
to update their weights. The processing neuron with the largest output is declared the winner
and has the capability of inhibiting its competitors as well as exciting its neighbors. Only
the winner is permitted output, and only the winner plus its neighbors are allowed to update
their connection weights.
The Kohonen rule does not require a desired output. Therefore it is used in unsupervised methods of learning. Kohonen has used this rule, combined with the on-center/off-surround intra-layer connection, to create the self-organizing neural network, which has an unsupervised learning method.
3.1.10 Applications of Neural Networks
The most common use for neural networks is to project what will most likely happen. There are many areas where prediction can help in setting priorities. For example, the emergency room at a hospital can be a hectic place; knowing who needs the most critical help can enable a more successful operation. Basically, all organizations must establish priorities, which govern the allocation of their resources. Neural networks have been used as a mechanism of knowledge acquisition for expert systems in stock market forecasting with astonishingly accurate results. Neural networks have also been used for bankruptcy prediction for credit card institutions.
Although one may apply neural network systems for interpretation, prediction, diagnosis, planning, monitoring, debugging, repair, instruction, and control, the most successful applications of neural networks are in categorization and pattern recognition. Such a system classifies the object under investigation (e.g. an illness, a pattern, a picture, a chemical compound, a word, or the financial profile of a customer) as one of numerous possible categories that, in return, may trigger the recommendation of an action (such as a treatment plan or a financial plan).
A company called Nestor has used neural networks for financial risk assessment for mortgage insurance decisions, categorizing the risk of loans as good or bad. Neural networks have also been applied to convert text to speech; NETtalk is one of the systems developed for this purpose. Image processing and pattern recognition form an important area of neural networks, probably one of the most actively researched areas of neural networks.
Another area of research for the application of neural networks is character recognition and handwriting recognition. This area has uses in banking, credit card processing and other financial services, where reading and correctly recognizing handwriting on documents is of crucial significance. The pattern recognition capability of neural networks has been used to read handwriting in processing checks, where a human must normally enter the amount into the system. A system that could automate this task would expedite check processing and reduce errors.
One of the best-known applications is the bomb detector installed in some U.S. airports. This device, called SNOOPE, determines the presence of certain compounds from the chemical configurations of their components.
In documents from an International Joint Conference, one can find reports on using neural networks in areas ranging from robotics, speech, signal processing, vision and character recognition to musical composition, detection of heart malfunction and epilepsy, fish detection and classification, optimization, and scheduling. Basically, most applications of neural networks fall into the following five categories:
Prediction
Uses input values to predict some output, e.g. pick the best stocks in the market, predict weather, identify people at risk of cancer.
Classification
Uses input values to determine the classification, e.g. is the input the letter A, is the blob of video data a plane, and what kind of plane is it.
Data Association
Like classification, but it also recognizes data that contain errors, e.g. not only identify the characters that were scanned but also identify when the scanner is not working properly.
Data Conceptualization
Analyze the inputs so that grouping relationships can be inferred, e.g. extract from a database the names of those most likely to buy a particular product.
Data Filtering
Smooth an input signal, e.g. take the noise out of a telephone signal.
3.2 IMAGE PROCESSING
The importance of visual communication has increased tremendously in the last few decades. The progress in microelectronics and computer technology, together with the creation of networks operating with various channel capacities, is the basis of an infrastructure for a new era of telecommunications. New applications are preparing a revolution in the everyday life of our modern society. Communication-based applications include ISDN surveillance. Storage-based audiovisual applications include training, education, entertainment, advertising, video mail and document annotation. Essential for the introduction of new communication services is low cost. Visual information is one of the richest and most bandwidth-consuming modes of communication.
The digital representation of raw video requires a large amount of data, and the transmission of this raw video data requires a large transmission bandwidth. To reduce the transmission and storage requirements, the video must be handled in compressed formats. To meet the requirements of the new applications, powerful data compression techniques are needed to reduce the global bit rate drastically, even in the presence of growing communication channels offering increased bandwidth. The issue of quality is of prime importance in most applications of compression. In fact, although most applications require high compression ratios, this requirement is in general in contradiction with the desire for high quality in the resulting pictures.
The standardization of video coding techniques has become a high priority because only a standard can reduce the high cost of video compression codecs and resolve the critical problem of interoperability of equipment from different manufacturers. The existence of standards is often the trigger for the volume production of the integrated circuits (VLSI) necessary for significant cost reductions. Bodies such as the International Standards Organization (ISO) and the International Telecommunication Union (ITU-T) today recommend the video compression standards used in practice.
3.2.1 Image processing
Digital image processing can be classified broadly into four areas:
I. Image Enhancement,
II. Image Restoration,
III. Image Coding,
IV. Image Understanding.
3.2.1.1 Image Enhancement
Image enhancement is the use of image processing algorithms to remove certain types of distortion in an image. Removing noise, making the edge structures in the image stand out, or any other operation that makes the image look better is considered enhancement.
The most widely used algorithms for enhancement are based on pixel functions that are known as window operations. A window operation performed on an image is nothing more than the process of examining the pixels in a certain region of the image, called the window region, and computing some type of mathematical function derived from the pixels in the window.
3.2.1.2 Image Restoration
In image restoration, an image has been degraded in some manner and the objective
is to reduce or eliminate the degradation. The development of an image restoration system
depends on the type of degradation.
3.2.1.3 Image Coding
The objective of image coding is to represent an image with as few bits as possible while preserving the level of image quality and intelligibility acceptable for a given application. Image coding can be used to reduce the bandwidth of a communication channel when an image is transmitted and to reduce storage requirements when an image needs to be retrieved.
3.2.1.4 Image Understanding
The objective of image understanding is to symbolically represent the contents of an image. Applications of image understanding include computer vision and robotics.
Image understanding differs from the other three areas in one major respect. In image enhancement, restoration and coding, both the input and the output are images, and signal processing has been the backbone of many successful systems in these areas. In image understanding, the input is an image, but the output is a symbolic representation of the contents of the image. Successful development of systems in this area involves not only signal processing but also other disciplines such as artificial intelligence.
3.2.2 Image Compression
Direct transmission of video data requires a high-bit-rate (bandwidth) channel. When such a high-bandwidth channel is unavailable or not economical, compression techniques have to be used to reduce the bit rate and ideally maintain the same visual quality. Similar arguments can be applied to storage media, in which the concern is memory space. Video sequences contain a significant amount of redundancy within and between frames. It is this redundancy that allows video sequences to be compressed. Within each individual frame, the values of neighboring pixels are usually close to one another. This spatial redundancy can be removed from the image without degrading the picture quality, using "intraframe" techniques.
Also, most of the information in a given frame may be present in adjacent frames. This temporal redundancy can also be removed, in addition to the "within frame" redundancy, by "interframe" coding.
3.2.3 Principles of Image Compression
The principles of image compression are based on information theory. The amount of information that a source produces is called entropy. The amount of information one receives from a source is equivalent to the amount of uncertainty that has been removed.
A source produces a sequence of variables from a given symbol set. For each symbol, there is a product of the symbol probability and its logarithm. The entropy is the negative summation of these products over all the symbols in a given symbol set.
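For illustration, the entropy of a source whose symbol probabilities are assumed to be 0.5, 0.25, 0.125 and 0.125 can be computed in MATLAB as:

% Entropy of a discrete source in bits per symbol (example probabilities assumed)
p = [0.5 0.25 0.125 0.125];      % symbol probabilities, summing to one
H = -sum(p .* log2(p));          % negative summation of probability times its logarithm
fprintf('Entropy = %.3f bits/symbol\n', H);   % 1.750 bits/symbol for this example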
Compression algorithms are methods that reduce the number of symbols used to represent source information, therefore reducing the amount of space needed to store the source information or the amount of time necessary to transmit it for a given channel capacity. The mapping from the source symbols into fewer target symbols is referred to as compression, and the reverse mapping as decompression.
Image compression refers to the task of reducing the amount of data required to store or transmit an image. At the system input, the image is encoded into its compressed form by the image coder. The compressed image may then be subjected to further digital processing, such as error control coding, encryption or multiplexing with other data sources, before being used to modulate the analog signal that is actually transmitted through the channel or stored in a storage medium. At the system output, the image is processed step by step to undo each of the operations that was performed on it at the system input. At the final step, the image is decoded into its original uncompressed form by the image decoder. If the reconstructed image is identical to the original image, the compression is said to be lossless; otherwise, it is lossy.
3.2.4 Performance Measurement of Image Compression
There are three basic measurements for an image compression algorithm.
1. Compression Efficiency
It is measured by the compression ratio, which is defined as the ratio of the size (number of bits) of the original image data to the size of the compressed image data.
2. Complexity
The number of data operations required to perform the encoding and decoding processes measures the complexity of an image compression algorithm. The data operations include additions, subtractions, multiplications, divisions and shift operations.
3. Distortion Measurement (DM)
For a lossy compression algorithm, DM is used to measure how much information has been lost when a reconstructed version of a digital image is produced from the compressed data. The common distortion measure is the mean-square-error between the original data and the compressed data. The signal-to-noise ratio is also used to measure the performance of lossy compression algorithms. A small MATLAB sketch of these measures follows.
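The sketch below shows how these quantities can be computed; the original and reconstructed images and the assumed compressed size are placeholders used only to illustrate the calculations (the peak signal-to-noise ratio, PSNR, is used here as one common form of the signal-to-noise measure):

% Compression ratio, mean-square-error and PSNR (placeholder data)
orig  = double(randi([0 255], 64, 64));    % placeholder original image
recon = orig + randn(64, 64);              % placeholder reconstructed image
origBits       = numel(orig) * 8;          % size of the original data in bits
compressedBits = 0.25 * origBits;          % assumed size of the compressed data
CR      = origBits / compressedBits;       % compression ratio
mse     = mean((orig(:) - recon(:)).^2);   % mean-square-error distortion
psnr_dB = 10 * log10(255^2 / mse);         % peak signal-to-noise ratio in dB
fprintf('CR = %.1f, MSE = %.2f, PSNR = %.2f dB\n', CR, mse, psnr_dB);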
3.2.5 Compression Standards
Digital images and digital video are normally compressed in order to save space on hard disks and to speed up transmission. There are presently several compression standards used for transmission of digital signals on a network. Data sent by a camera using video standards contain still images mixed with data describing changes, so that unchanged data (for instance the background) are not sent in every image. Consequently, the frame rate measured in frames per second (fps) is much greater.
3.2.6 Image compression techniques
Still images are simple and easy to send. However, it is difficult to obtain single images from a compressed video signal. The video signal uses less data to send or store a video image, and it is not possible to reduce the frame rate using video compression. Sending single images is easier when using a modem connection or, in any case, with a narrow bandwidth.
Main compression standards for still images    Main compression standards for video signals
JPEG                                            M-JPEG (Motion JPEG)
Wavelet                                         H.261, H.263, etc.
JPEG 2000                                       MPEG-1
GIF                                             MPEG-2
                                                MPEG-3
                                                MPEG-4
Table 3.1 Compression standards
JPEG (Joint Photographic Experts Group)
A popular compression standard used exclusively for still images. Each image is divided into blocks of 8 x 8 pixels; each block is then individually compressed. When using very high compression, the 8 x 8 blocks can actually be seen in the image. Due to the compression mechanism, the decompressed image is not the same as the image which was compressed; this is because the standard has been designed considering the performance limits of human eyes. The degree of detail loss can be varied by adjusting compression parameters. It can store up to 16 million colors.
Wavelet
Wavelets are functions used in representing data or other functions. They analyze the signal at different frequencies with different resolutions. It is an optimized standard for images containing large amounts of data with sharp discontinuities. Wavelet compression transforms the entire image, differently from JPEG, and is more natural as it follows the shape of the objects in the picture. Since this is a non-standardized compression method, special software is necessary for viewing.
JPEG2000
Based on Wavelet technology. Rarely used.
GIF (Graphics Interchange Format)
A graphic format used widely for Web images. It is limited to 256 colors and is a good standard for images which are not too complex. It is not recommended for network cameras because the compression ratio is too limited.
M-JPEG (Motion JPEG)
This is not a separate standard but rather a rapid flow of JPEG images that can be viewed at a rate sufficient to give the illusion of motion. Each frame within the video is stored as a complete image in JPEG format. Single images do not interact among themselves. Images are then displayed sequentially at a high frame rate. This method produces high quality video, but at the cost of large files.
H.261, H.263, etc.
Standards approved by the ITU (International Telecommunication Union). They are designed for videoconference applications and produce images with a high degree of compression.
DCT-Based Image Coding Standard
The idea of compressing an image is not new. The discovery of the DCT in 1974 was an important achievement for the research community working on image compression. The DCT can be regarded as a discrete-time version of the Fourier cosine series. It is a close relative of the DFT, a technique for converting a signal into elementary frequency components. Thus the DCT can be computed with a Fast Fourier Transform (FFT) like algorithm in O(n log n) operations. Unlike the DFT, the DCT is real-valued and provides a better approximation of a signal with fewer coefficients. The DCT of a discrete signal x(n), n = 0, 1, ..., N-1, is defined as:
X(u) = sqrt(2/N) * C(u) * sum_{n=0}^{N-1} x(n) * cos[ (2n+1) u pi / (2N) ],    u = 0, 1, ..., N-1

where C(u) = 0.707 for u = 0 and C(u) = 1 otherwise.
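A direct, unoptimized evaluation of this definition in MATLAB might look like the following sketch (the 8-point signal is an assumed example; a real coder would use a fast algorithm instead of the loop):

% Direct evaluation of the 1-D DCT definition above
x = [52 55 61 66 70 61 64 73];               % example 8-point signal (assumed values)
N = length(x);
X = zeros(1, N);
for u = 0:N-1
    C = 1; if u == 0, C = 1/sqrt(2); end     % C(u) = 0.707 for u = 0, 1 otherwise
    n = 0:N-1;
    X(u+1) = sqrt(2/N) * C * sum(x .* cos((2*n+1)*u*pi/(2*N)));
end
disp(X)    % X(1) is the DC coefficient of the signal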
In 1992 JPEG established the first international standard for still image compression, where the encoders and decoders are DCT-based. The JPEG standard specifies three modes, namely sequential, progressive, and hierarchical, for lossy encoding, and one mode of lossless encoding. The baseline JPEG coder, which is the sequential encoding in its simplest form, is briefly discussed here. Figures 3.1 and 3.2 show the key processing steps in such an encoder and decoder for grayscale images. Color image compression can be approximately regarded as compression of multiple grayscale images, which are either compressed entirely one at a time, or are compressed by alternately interleaving 8 x 8 sample blocks from each in turn. Here, we focus on grayscale images only.
The DCT-based encoder essentially compresses a stream of 8 x 8 blocks of image samples. Each 8 x 8 block makes its way through each processing step and yields output in compressed form into the data stream. Because adjacent image pixels are highly correlated, the 'forward' DCT (FDCT) processing step lays the foundation for achieving data compression by concentrating most of the signal in the lower spatial frequencies; for a typical block, most of the higher spatial frequencies have zero or near-zero amplitude and need not be encoded. In principle, the DCT introduces no loss to the source image samples; it merely transforms them to a domain in which they can be more efficiently encoded.
After output from the FDCT, each of the 64 DCT coefficients is uniformly quantized in conjunction with a carefully designed 64-element Quantization Table (QT). At the decoder, the quantized values are multiplied by the corresponding QT elements to recover the original unquantized values. After quantization, all of the quantized coefficients are ordered into a "zigzag" sequence; this ordering helps to facilitate entropy encoding by placing low-frequency non-zero coefficients before high-frequency coefficients. The DC coefficient, which contains a significant fraction of the total image energy, is differentially encoded.
Entropy Coding (EC) achieves additional compression losslessly by encoding the quantized DCT coefficients more compactly based on their statistical characteristics. The JPEG proposal specifies both Huffman coding and arithmetic coding. The baseline sequential codec uses Huffman coding, but codecs with both methods are specified for all modes of operation. Arithmetic coding, though more complex, normally achieves 5-10% better compression than Huffman coding.
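The quantization/dequantization step described above can be sketched as follows; the 8 x 8 coefficient block and the quantization table are placeholders, not the actual JPEG luminance table:

% Uniform quantization of one 8x8 block of DCT coefficients (sketch)
coeffs = randn(8, 8) * 50;                 % placeholder DCT coefficients
QT     = 16 * ones(8, 8);                  % placeholder 64-element quantization table
q      = round(coeffs ./ QT);              % encoder: uniform quantization with the QT
recon  = q .* QT;                          % decoder: multiply back by the QT elements
maxErr = max(abs(coeffs(:) - recon(:)));   % error is bounded by half a QT step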
4. IMAGE COMPRESSION WITH NEURAL NETWORKS:
Apart from the existing technology on image compression represented by series of
JPEG, MPEG, and H.26x standards, new technology such as neural networks and genetic
algorithms are being developed to explore the future of image coding. Successful
applications of neural networks to vector quantization have now become well established,
and other aspects of neural network involvement in this area are stepping up to play
significant roles in assisting with those traditional compression techniques. Existing
research can be summarized as follows:
1. Back-Propagation image Compression;
2. Hierarchical Back-Propagation Neural Network
3. Adaptive Back-Propagation Neural Network
4. Hebbian Learning Based Image Compression
5. Vector Quantization Neural Networks;
6. Predictive Coding Neural Networks.
4.1 Basic Back-Propagation Neural Network
The neural network structure is illustrated in Fig. 4.1. Three layers, one input layer, one output layer and one hidden layer, are designed. Both the input layer and the output layer are fully connected to the hidden layer. Compression is achieved by designing the value of K, the number of neurons at the hidden layer, to be less than the number of neurons at both the input and output layers. The input image is split up into blocks or vectors of 8 x 8, 4 x 4 or 16 x 16 pixels. When the input vector is referred to as N-dimensional (N being the number of pixels in each block), the coupling weights connecting it to each neuron at the hidden layer can be represented by {w_ji, j = 1, 2, ..., K and i = 1, 2, ..., N}, which can also be described by a matrix of size K x N. From the hidden layer to the output layer, the connections can be represented by {w'_ij: 1 <= i <= N, 1 <= j <= K}, which is another weight matrix of size N x K. Image compression is achieved by training the network in such a way that the coupling weights {w_ji} scale the input vector of dimension N into a narrow channel of dimension K (K < N) at the hidden layer and produce the optimum output value which makes the quadratic error between input and output minimum. In accordance with the neural network structure, the operation can be described as follows:
For encoding:

h_j = sum_{i=1}^{N} w_ji * x_i,    1 <= j <= K

and for decoding:

x̂_i = sum_{j=1}^{K} w'_ij * h_j,    1 <= i <= N

where x_i in [0, 1] denotes the normalized pixel values for grey-scale images with grey levels [0, 255]. The reason for using normalized pixel values is that neural networks operate more efficiently when both their inputs and outputs are limited to a range of [0, 1].
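In matrix form, this forward pass can be sketched in MATLAB as below; the weight matrices here are random placeholders standing in for trained values:

% Narrow-channel forward pass sketch: N inputs compressed into K hidden values
N = 64;  K = 16;                 % an 8x8 block squeezed into 16 hidden neurons
x  = rand(N, 1);                 % normalized pixel block, values in [0, 1]
W  = 0.1 * randn(K, N);          % input-to-hidden weights {w_ji} (untrained placeholder)
Wp = 0.1 * randn(N, K);          % hidden-to-output weights {w'_ij} (placeholder)
h    = W * x;                    % encoding: h_j = sum_i w_ji * x_i
xhat = Wp * h;                   % decoding: reconstructed pixel block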
Figure 4.1 BACK – PROPAGATION NEURAL NETWORK
The above linear network can also be made non-linear if a transfer function such as the sigmoid is added to the hidden layer and the output layer to scale the summation down in the above equations.
With this basic back-propagation neural network, compression is conducted in two phases: training and encoding. In the first phase, a set of image samples is used to train the network via the back-propagation learning rule, which uses each input vector as the desired output. This is equivalent to compressing the input into the narrow channel represented by the hidden layer and then reconstructing the input from the hidden layer at the output layer.
The second phase simply involves the entropy coding of the state vector h_j at the hidden layer. In cases where adaptive training is conducted, the entropy coding of the coupling weights is also required in order to catch up with input characteristics that are not encountered at the training stage. The entropy coding is normally designed as simple fixed-length binary coding, although many advanced variable-length entropy coding algorithms are available.
This neural network development, in fact, is in the direction of K-L transform technology, which actually provides the optimum solution for all linear narrow-channel type image compression neural networks. Equations (1) and (2) are represented in matrix form as:

[h] = [W]^T [x]

for encoding, and

[x̂] = [W'][h] = [W'][W]^T [x]

for decoding.
The K-L transform maps input images into a new vector space where all the coefficients in the new space are de-correlated. This means that the covariance matrix of the new vectors is a diagonal matrix whose elements along the diagonal are the eigen-values of the covariance matrix of the original input vectors. Let e_i and λ_i, i = 1, 2, ..., n, be the eigen-vectors and eigen-values of C_x, the covariance matrix for the input vector x, and let the eigen-values be arranged in descending order so that λ_i >= λ_{i+1}, for i = 1, 2, ..., n-1.
To extract the principal components, the K eigen-vectors corresponding to the K largest eigen-values in C_x are used to construct the K-L transform matrix [A_K]. In addition, all eigen-vectors in [A_K] are ordered in such a way that the first row of [A_K] is the eigen-vector corresponding to the largest eigen-value and the last row is the eigen-vector corresponding to the smallest eigen-value. Hence, the forward K-L transform or encoding can be defined as:

[y] = [A_K]([x] - [m_x])

and the inverse K-L transform or decoding can be defined as:

[x̂] = [A_K]^T [y] + [m_x]
where [m_x] is the mean value of [x] and [x̂] represents the reconstructed vectors or image blocks. Thus the mean square error between x and x̂ is given by the following equation:

e_m = E{(x - x̂)^2} = (1/M) sum_{k=1}^{M} (x_k - x̂_k)^2 = sum_{j=1}^{n} λ_j - sum_{j=1}^{K} λ_j = sum_{j=K+1}^{n} λ_j
where the statistical mean value E{·} is approximated by the average value over all the input vector samples which, in image coding, are all the non-overlapping blocks of 4 x 4 or 8 x 8 pixels.
Therefore, by selecting the K eigen-vectors associated with the largest eigen-values to run the K-L transform over the input pixels, the resulting error between the reconstructed image and the original one can be minimized, due to the fact that the values of λ_i decrease monotonically.
Comparing the equation pair (3)-(4) with the equation pair (5)-(6), it can be concluded that the linear neural network reaches the optimum solution whenever the following condition is satisfied:

[W'][W]^T = [A_K]^T [A_K]

Under this circumstance, the neuron weights from input to hidden and from hidden to output can be described respectively as follows:

[W'] = [A_K]^T [U]^{-1} ;   [W]^T = [U][A_K]

where [U] is an arbitrary K x K matrix such that [U][U]^{-1} gives an identity matrix of size K x K. Hence, it can be seen that the linear neural network can achieve the same compression performance as that of the K-L transform without necessarily having its weight matrices be equal to [A_K]^T and [A_K].
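The K-L transform described above can be sketched directly in MATLAB through the eigen-decomposition of the block covariance matrix; the training blocks below are random placeholders rather than real image data:

% K-L transform (principal component) sketch over 16-dimensional image blocks
blocks = rand(16, 500);                  % 500 training blocks as column vectors
mx     = mean(blocks, 2);                % mean vector [m_x]
Cx     = cov(blocks');                   % covariance matrix C_x of the input vectors
[E, D] = eig(Cx);                        % eigen-vectors (columns) and eigen-values
[~, idx] = sort(diag(D), 'descend');     % order eigen-values in descending order
K  = 4;                                  % number of principal components kept
AK = E(:, idx(1:K))';                    % K x 16 forward transform matrix [A_K]
x    = blocks(:, 1);                     % one block to encode
y    = AK * (x - mx);                    % forward K-L transform (encoding)
xrec = AK' * y + mx;                     % inverse K-L transform (decoding)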
4.2 Hierarchical Back-Propagation Neural Network
The basic back-propagation network can be further extended to construct a hierarchical neural network by adding two more hidden layers to the existing network, in which the three hidden layers are termed the combiner layer, the compressor layer and the decombiner layer. The structure is shown in Figure 4.2. The idea is to exploit correlation between pixels by the inner hidden layer and to exploit correlation between blocks of pixels by the outer hidden layers. From the input layer to the combiner layer and from the decombiner layer to the output layer, local connections are designed, which have the same effect as M fully connected neural sub-networks.
Figure 4.2 HIERARCHICAL NEURAL NETWORK STRUCTURE
Training such a neural network can be conducted in terms of: (i) Outer Loop Neural Network (OLNN) Training; (ii) Inner Loop Neural Network (ILNN) Training; and (iii) coupling weight allocation for the overall neural network.
4.3 Adaptive Back-Propagation Neural Network
The adaptive back-propagation neural network is designed to make the neural network compression adaptive to the content of the input image. The general structure for a typical adaptive scheme is illustrated in Fig. 4.3, in which a group of neural networks with an increasing number of hidden neurons (from h_min to h_max) is designed. The basic idea is to classify the input image blocks into a few sub-sets with different features according to their complexity measurement. A fine-tuned neural network then compresses each sub-set.
Figure 4.3 ADAPTIVE NEURAL NETWORK STRUCTURE
Training of such a neural network can be designed as: (a) parallel training; (b) serial training; and (c) activity-based training.
The parallel training scheme applies the complete training set simultaneously to all neural networks and uses the S/N (signal-to-noise) ratio to roughly classify the image blocks into the same number of sub-sets as there are neural networks. After this initial coarse classification is completed, each neural network is then further trained by its corresponding refined sub-set of training blocks.
Serial training involves an adaptive searching process to build up the necessary number of neural networks to accommodate the different patterns embedded inside the training images. Starting with a neural network with a predefined minimum number of hidden neurons, h_min, the neural network is roughly trained by all the image blocks. Whenever the S/N ratio of a block falls below a preset threshold, further training is started on the next neural network with the number of hidden neurons increased and the corresponding threshold readjusted for further classification. This process is repeated until the whole training set is classified into a maximum number of sub-sets corresponding to the same number of neural networks established.
In the next two training schemes, two extra parameters, the block activity A(P_k) and four directions, are defined to classify the training set rather than using the neural networks. Hence the back-propagation training of each neural network can be completed in one phase by its appropriate sub-set.
The so-called activity of a block P_k is defined as:

A(P_k) = sum over all pixels (i, j) of A_p(P_k(i, j))

and

A_p(P_k(i, j)) = sum_{r=-1}^{+1} sum_{s=-1}^{+1} (P_k(i, j) - P_k(i+r, j+s))^2

where A_p(P_k(i, j)) is the activity of each pixel, which involves its 8 neighbouring pixels as r and s vary from -1 to +1.
Prior to training, all image blocks are classified into four classes according to their activity values, which are identified as very low, low, high and very high activity. Hence four neural networks are designed with an increasing number of hidden neurons to compress the four different sub-sets of input images after the training phase is completed.
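As an illustrative sketch (the block values are placeholders and only interior pixels are visited so that all eight neighbours exist), the activity of one block can be computed in MATLAB as:

% Block activity sketch: sum of squared differences to the 8 neighbouring pixels
P = 255 * rand(8, 8);                      % one image block (placeholder values)
A = 0;
for i = 2:7
    for j = 2:7
        nb = P(i-1:i+1, j-1:j+1);          % the pixel and its 8 neighbours
        A  = A + sum((P(i,j) - nb(:)).^2); % activity contribution of this pixel
    end
end
% A would then be compared against thresholds chosen from the training set to label
% the block as very low, low, high or very high activity.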
On top of the activity parameter, a further feature extraction technique is applied by considering four main directions present in the image details, i.e., horizontal, vertical and the two diagonal directions. These preferential direction features can be evaluated by calculating the values of the mean squared differences among neighboring pixels along the four directions.
For the image patterns classified as high activity, four further neural networks corresponding to the above directions are added to refine their structure and tune their learning processes to the preferential orientations of the input. Hence the overall neural network system is designed to have six neural networks, among which two correspond to the low-activity and medium-activity sub-sets and the other four networks correspond to the high-activity and four-direction classifications.
4.4 Hebbian Learning Based Image Compression
While the back-propagation based narrow-channel neural network aims at achieving compression upper bounded by the K-L transform, a number of Hebbian learning rules have been developed to address the issue of how the principal components can be directly extracted from input image blocks to achieve image data compression. The general neural network structure consists of one input layer and one output layer. The Hebbian learning rule comes from Hebb's postulation that if two neurons are very active at the same time, which is illustrated by the high values of both the output and one of its inputs, the strength of the connection between the two neurons will grow or increase. Hence, for the output values expressed as [h] = [W]^T [x], the learning rule can be described as:
W_i(t+1) = ( W_i(t) + η h_i(t) X(t) ) / || W_i(t) + η h_i(t) X(t) ||

where W_i(t+1) = {w_i1, w_i2, ..., w_iN} is the i-th new coupling weight vector in the next cycle (t+1); 1 <= i <= M, and M is the number of output neurons; η is the learning rate; h_i(t) is the i-th output value; X(t) is the input vector, corresponding to each individual image block; and ||·|| is the Euclidean norm used to normalize the updated weights and make the learning stable.
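A minimal MATLAB sketch of this normalized Hebbian update is given below; the image block, initial weights and learning rate are assumptions for illustration only:

% Normalized Hebbian update sketch for one output neuron
x   = rand(16, 1);                      % input image block (placeholder)
w   = randn(16, 1);  w = w / norm(w);   % initial coupling weight vector, unit length
eta = 0.01;                             % learning rate
for t = 1:1000
    h = w' * x;                         % output value h_i(t)
    w = w + eta * h * x;                % Hebbian growth of the connection strength
    w = w / norm(w);                    % Euclidean normalization keeps the learning stable
end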
From the basic learning rule, a number of variations have been developed in the
existing research.
4.5 Vector Quantization Neural Networks
Since neural networks are capable of learning from input information and optimizing themselves to obtain the appropriate environment for a wide range of tasks, a family of learning algorithms has been developed for vector quantization. The input vector is constructed from a K-dimensional space. M neurons are designed to compute the vector quantization code-book, in which each neuron relates to one code-word via its coupling weights. The coupling weight vector {w_ij} associated with the i-th neuron is eventually trained to represent the code-word c_i in the code-book. As the neural network is being trained, all the coupling weights will be optimized to represent the best possible partition of all the input vectors. To train the network, a group of image samples known to both encoder and decoder is often designated as the training set, and the first M input vectors of the training data set are normally used to initialize all the neurons. With this general structure, various learning algorithms have been designed and developed, such as Kohonen's self-organising feature mapping, competitive learning, frequency-sensitive competitive learning, fuzzy competitive learning, general learning, distortion-equalized fuzzy competitive learning and PVQ (predictive VQ) neural networks.
Let W_i(t) be the weight vector of the i-th neuron at the t-th iteration; the basic competitive learning algorithm can be summarized as follows:

z_i = 1 if d(x, W_i(t)) = min_j d(x, W_j(t)), and z_i = 0 otherwise

W_i(t+1) = W_i(t) + α (x - W_i(t)) z_i

where d(x, W_i(t)) is the distance in the L2 metric between the input vector x and the coupling weight vector W_i(t) = {w_i1, w_i2, ..., w_iK}; K = p x p; α is the learning rate; and z_i is the output of the i-th neuron.
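A small MATLAB sketch of this basic competitive learning loop is shown below; the training vectors, code-book size and learning rate are illustrative assumptions:

% Competitive learning sketch for a vector-quantization code-book
K = 16;  M = 8;                        % 16-dimensional vectors, 8 code-words
X = rand(K, 200);                      % 200 training vectors (placeholder)
W = X(:, 1:M);                         % initialize the neurons with the first M vectors
alpha = 0.05;                          % learning rate
for t = 1:size(X, 2)
    x = X(:, t);
    [~, i] = min(sum((W - x).^2, 1));  % winning neuron: minimum L2 distance to x
    W(:, i) = W(:, i) + alpha * (x - W(:, i));   % move only the winner towards x
end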
A so-called under-utilization problem occurs in competitive learning, which means some of the neurons are left out of the learning process and never win the competition. Various schemes have been developed to tackle this problem. The Kohonen self-organising neural network overcomes the problem by updating the winning neuron as well as those in its neighborhood.
The frequency-sensitive competitive learning algorithm addresses the problem by keeping a record of how frequently each neuron wins, to ensure that all neurons in the network are updated an approximately equal number of times. To implement this scheme, the distance is modified to include the total number of times that neuron i has been the winner. The modified distance measurement is defined as:

d'(x, Wi(t)) = d(x, Wi(t)) × ui(t)

where ui(t) is the total number of winning times for neuron i up to the t'th training cycle. Hence, the more the i'th neuron wins the competition, the greater its distance from the next input vector becomes, and thus its chance of winning the competition diminishes. This way of tackling the under-utilization problem does not provide interactive solutions in optimizing the code-book.
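Within the competitive-learning sketch above, the frequency-sensitive modification only changes how the winner is picked, for example:

% Frequency-sensitive winner selection (sketch, continuing the previous example).
u = ones(1, M);                              % win counters for the M neurons
d = sum((W - x*ones(1, M)).^2) .* u;         % distance scaled by how often each neuron has won
[dmin, i] = min(d);                          % winner under the modified distance
u(i) = u(i) + 1;                             % frequent winners become harder to pick next time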
Around the competitive learning scheme, fuzzy membership functions are introduced to control the transition from soft to crisp decisions during the code-book design process. The essential idea is that one input vector is assigned to a cluster only to a certain extent, rather than being either 'in' or 'out'. The fuzzy assignment is particularly useful at earlier training stages, because it guarantees that all input vectors are included in the formation of the new code-book represented by all the neuron coupling weights. Representative examples include direct fuzzy competitive learning, fuzzy algorithms for learning vector quantization, and the distortion-equalized fuzzy competitive learning algorithm.
4.6 Predictive Coding Neural Networks
Predictive coding has proved to be a powerful technique for de-correlating input data in speech and image compression, where a high degree of correlation is embedded among neighboring data samples. Although general predictive coding is classified into various models such as AR and ARMA, it is the auto-regressive (AR) model that has been successfully applied to image compression. Hence, predictive coding for image compression can be further classified into linear and non-linear AR models. Conventional technology provides a mature environment and a well-developed theory for predictive coding, represented by LPC (linear predictive coding), PCM (pulse code modulation), DPCM (differential PCM) and their modified variations. Non-linear predictive coding, however, is very limited due to the difficulties involved in optimizing the coefficient extraction to obtain the best possible predictive values. Under this circumstance, neural networks provide a very promising approach to optimizing non-linear predictive coding.
With the linear AR model, predictive coding can be described by the following equation:

Xn = Σ (i = 1 … N) ai Xn−i + vn = p + vn

where p represents the predictive value for the pixel Xn, which is to be encoded in the next step. Its neighboring pixels, Xn−1, Xn−2, …, Xn−N, are used by the linear model to produce the predictive value. vn stands for the error between the input pixel and its predictive value, and can be modeled by a set of zero-mean, independent and identically distributed random variables.
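A hedged MATLAB illustration of this linear predictor, using an arbitrary order and made-up coefficients rather than anything derived from this project, might look like:

% Sketch of a linear AR predictor of order N (coefficients are assumptions).
N = 4;
a = [0.5 0.25 0.15 0.10];             % assumed predictor coefficients a1..aN
X = round(255*rand(1, 100));          % stand-in row of pixel values
v = zeros(size(X));                   % prediction errors
for n = N+1:length(X)
    p = a * X(n-1:-1:n-N)';           % predicted value p from the N previous pixels
    v(n) = X(n) - p;                  % error vn = Xn - p is what would be encoded
end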
Based on the above linear AR model, a multi-layer perceptron neural network can be constructed to realize the corresponding non-linear predictor, as shown in Fig. 4.4. For the pixel Xn which is to be predicted, its N neighboring pixels obtained from its predictive pattern are arranged into a one-dimensional input vector x = {Xn−1, Xn−2, …, Xn−N} for the neural network. A hidden layer is designed to carry out back-propagation learning for training the neural network. The output of each hidden neuron, say the j'th neuron, is derived from the equation given below:

hj = f( Σ (i = 1 … N) Wji Xn−i )

where f(v) = 1 / (1 + e−v) is a sigmoid transfer function.
Figure 4.4 PREDICTIVE NEURAL NETWORK I
To predict drastically changing features inside the image, such as edges and contours, high-order terms are added to improve the predictive performance. This corresponds to a non-linear AR model expressed as follows:

Xn = Σi ai Xn−i + Σi Σj aij Xn−i Xn−j + Σi Σj Σk aijk Xn−i Xn−j Xn−k + …
Hence, another, so-called functional-link type neural network can be designed to implement this type of non-linear AR model with high-order terms. The structure of the network is illustrated in Fig. 4.5. It contains only two layers of neurons, one for input and the other for output. The coupling weights, {wi}, between the input layer and the output layer are trained towards minimizing the residual energy, which is defined as:

RE = Σn en² = Σn (Xn − X̂n)²

where X̂n is the predictive value for the pixel Xn.
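For illustration, the residual energy being minimised can be evaluated as below; X̂ (Xhat) is a hypothetical vector of predicted pixel values, not output from an actual trained network:

% Sketch: residual energy between true pixels X and predicted pixels Xhat.
X    = round(255*rand(1, 100));       % stand-in pixel values
Xhat = X + randn(size(X));            % hypothetical predictions
RE   = sum((X - Xhat).^2);            % residual energy minimised during training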
Figure 4.5 PREDICTIVE NEURAL NETWORK II
5. PROPOSED IMAGE COMPRESSION USING NEURAL NETWORK
A two-layer feed-forward neural network trained with the Levenberg-Marquardt algorithm was considered. Image coding using a feed-forward neural network consists of the following steps:
An image, F, is divided into r×c blocks of pixels. Each block is then scanned to form an input vector x(n) of size P = r×c.
It is assumed that the hidden layer of the two-layer network consists of L neurons, each with P synapses, and that it is characterized by an appropriately selected weight matrix Wh.
All N blocks of the original image are passed through the hidden layer to obtain the hidden signals, h(n), which represent the encoded input image blocks x(n). If L < P, such coding delivers image compression.
It is assumed that the output layer consists of m = P = r×c neurons, each with L synapses. Let Wy be an appropriately selected output weight matrix. All N hidden vectors h(n), representing the encoded image H, are passed through the output layer to obtain the output signals, y(n). The output signals are reassembled into r×c image blocks to obtain the reconstructed image, Fr.
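As a simple worked example of the bookkeeping involved (the numbers happen to match the block and layer sizes used later in the MATLAB implementation, but are otherwise arbitrary):

% Sketch: compression ratio of the feed-forward coder.
r = 4; c = 4;               % block size
P = r*c;                    % length of each input vector x(n): 16
L = 4;                      % number of hidden neurons, L < P
ratio = P/L;                % each 16-pixel block is stored as 4 hidden values, i.e. 4:1 compression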
There are two error metrics that are used to compare the various image compression techniques: the Mean Square Error (MSE) and the Peak Signal-to-Noise Ratio (PSNR). The MSE is the cumulative squared error between the compressed and the original image, whereas the PSNR is a measure of the peak error.
MSE = (1 / (M·N)) Σ (y = 1 … M) Σ (x = 1 … N) [ I(x, y) − I′(x, y) ]²   ………… 5.1
The quality of image coding is typically assessed by the Peak signal-to-noise ratio (PSNR)
defined as
PSNR = 20 log 10 [255/sqrt(MSE)]………………5.2
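Both measures are straightforward to evaluate in MATLAB; the sketch below assumes two same-sized grey-scale images stored as doubles in the range 0-255, generated here only as stand-ins:

% Sketch: MSE and PSNR between an original image I and a reconstruction Ir.
I  = double(round(255*rand(64)));     % stand-in original 64 x 64 image
Ir = I + 2*randn(64);                 % stand-in reconstructed image
[M, N] = size(I);
MSE  = sum(sum((I - Ir).^2)) / (M*N); % equation 5.1
PSNR = 20*log10(255/sqrt(MSE));       % equation 5.2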
Training is conducted for a representative class of images using the Levenberg-Marquardt algorithm.
Once the weight matrices have been appropriately selected, any image can be quickly encoded using the Wh matrix and then decoded (reconstructed) using the Wy matrix.
Levenberg Marquardt Algorithm
The Levenberg Marquardt algorithm is a variation of Newton’s method that was
designed for minimizing functions that are sums of squares of other nonlinear functions.
This is very well suited to neural network training where the performance index is the mean
squared error.
Basic Algorithm:
Consider the form of Newton's method where the performance index is a sum of squares. Newton's method for optimizing a performance index F(x) is

Xk+1 = Xk − Ak⁻¹ gk   ………… 5.3

where Ak = ∇²F(x)|x=Xk and gk = ∇F(x)|x=Xk.
It is assumed that F(x) is a sum of squares function:

F(x) = Σ (i = 1 … n) vi²(x) = vᵀ(x) v(x)   ………… 5.4

Then the j'th element of the gradient would be

[∇F(x)]j = ∂F(x)/∂xj = 2 Σ (i = 1 … n) vi(x) ∂vi(x)/∂xj   ………… 5.5
The gradient can be written in matrix form:

∇F(x) = 2 Jᵀ(x) v(x)   ………… 5.6

where J(x) is the Jacobian matrix.
Next the Hessian matrix is considered. The (k, j) element of the Hessian matrix would be

[∇²F(x)]k,j = ∂²F(x) / ∂xk ∂xj

The Hessian matrix can then be expressed in matrix form:

∇²F(x) = 2 Jᵀ(x) J(x) + 2 S(x)

where

S(x) = Σ (i = 1 … n) vi(x) ∇²vi(x)

Assuming that S(x) is small, the Hessian matrix is approximated as

∇²F(x) ≈ 2 Jᵀ(x) J(x)
Substituting the values of ∇²F(x) and ∇F(x), we obtain the Gauss-Newton method:

Xk+1 = Xk − [Jᵀ(Xk) J(Xk)]⁻¹ Jᵀ(Xk) v(Xk)

One problem with Gauss-Newton over the standard Newton's method is that the matrix H = JᵀJ may not be invertible. This can be overcome by using the following modification to the approximate Hessian matrix:

G = H + µI

This leads to the Levenberg-Marquardt algorithm:

Xk+1 = Xk − [Jᵀ(Xk) J(Xk) + µkI]⁻¹ Jᵀ(Xk) v(Xk)

or

ΔXk = − [Jᵀ(Xk) J(Xk) + µkI]⁻¹ Jᵀ(Xk) v(Xk)

This algorithm has the very useful feature that as µk is increased it approaches the steepest descent algorithm with a small learning rate.
The iterations of the Levenberg-Marquardt back-propagation algorithm (LMBP) can be summarized as follows:
1. Present all inputs to the network and compute the corresponding network outputs and the errors eq = tq − aMq (the final-layer output for input q). Compute the sum of squared errors over all inputs, F(x):
F(x) = Σq eqᵀ eq = Σq,j (ej,q)² = Σi (vi)²
2. Compute the Jacobian matrix. Calculate the sensitivities with the recurrence relation and augment the individual matrices into the Marquardt sensitivities.
3. Obtain ΔXk from the Levenberg-Marquardt equation above.
4. Recompute the sum of squared errors using Xk + ΔXk. If this new sum of squares is smaller than that computed in step 1, then divide µ by v, let Xk+1 = Xk + ΔXk and go back to step 1. If the sum of squares is not reduced, then multiply µ by v and go back to step 3.
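The heart of each iteration is the damped Gauss-Newton step given above. The following generic sketch applies it to a small made-up least-squares problem; the residual function, Jacobian, damping value and adjustment factor are all assumptions for illustration, not the project's actual network computation:

% Sketch of the Levenberg-Marquardt adjustment for a generic least-squares problem.
mu = 0.01; vfac = 10;                         % assumed damping parameter and adjustment factor
x = [1; 1];                                   % stand-in parameter vector
v = @(x) [x(1)^2 - 2; x(2) - 1];              % hypothetical residual vector v(x)
J = @(x) [2*x(1) 0; 0 1];                     % its Jacobian J(x)
for k = 1:20
    dx = -(J(x)'*J(x) + mu*eye(2)) \ (J(x)'*v(x));   % dXk from the LM equation
    if sum(v(x + dx).^2) < sum(v(x).^2)
        x = x + dx;  mu = mu/vfac;            % step accepted: behave more like Gauss-Newton
    else
        mu = mu*vfac;                         % step rejected: behave more like steepest descent
    end
end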
Training Procedure
During the training procedure, data from a representative image or class of images is encoded into the structure of the hidden and output weight matrices.
It is assumed that the image, F, used in training is of size R × C and consists of r×c blocks.
1. The first step is to convert the block matrix F into a matrix X of size P × N containing the training vectors, x(n), formed from the image blocks. That is:
P = r·c and P·N = R·C
2. The target data is made equal to the input data, that is:
D = X
3. The network is then trained until the mean squared error, MSE, is sufficiently small.
The matrices Wh and Wy will subsequently be used in the image encoding and decoding steps.
Image Encoding
The hidden half of the two-layer network is used to encode images. The encoding procedure can be described as follows:
F → X,  H = σ(Wh · X)
where X is the block-vector form of the image F and H is the encoded image.
Image Decoding
The image is decoded (reconstructed) using the output half of the two-layer network. The decoding procedure is described as follows:
Y = σ(Wy · H),  Y → F
where σ denotes the corresponding layer's transfer function.
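As a sketch of how these two halves can be applied explicitly once training has finished, the weight and bias matrices can be pulled out of the trained network object (net_s and in3 refer to the program in the next section; note that the network also uses bias vectors, which the equations above omit):

% Sketch: explicit encoding and decoding with the trained two-layer network net_s.
X  = in3;                                % P x N matrix of scaled block vectors (from blkM2vc)
Wh = net_s.IW{1,1};  b1 = net_s.b{1};    % hidden (compression) layer weights and biases
Wy = net_s.LW{2,1};  b2 = net_s.b{2};    % output (decompression) layer weights and biases
N  = size(X, 2);
H  = tansig(Wh*X + b1*ones(1, N));       % encoded image: L x N with L < P
Y  = purelin(Wy*H + b2*ones(1, N));      % reconstructed block vectors: P x N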
These steps were performed using MATLAB (Matrix Laboratory). The compression so obtained was through off-line learning. In off-line learning methods, once the system enters the operation mode, its weights are fixed and do not change any more.
LEVENBERG-MARQUARDT ALGORITHM
6. IMPLEMENTATION OF IMAGE COMPRESSION USING
MATLAB
A sample image was taken as the input to be compressed. A 64 × 64 block of pixels, (1:64, 1:64), was considered. Using the blkM2vc function, the image was arranged column-wise into vectors of 4 × 4 blocks. The target was made equal to the input and the matrix was scaled down to the range 0–1. The network was developed with 4 neurons in the first layer (compression) and 16 neurons in the second layer (decompression).
The first layer used the tangent sigmoid transfer function and the second layer used the linear transfer function. Training was then performed using the Levenberg-Marquardt algorithm. The training goal and the maximum number of epochs were set with the following parameters:
net_c.trainParam.goal = 1e-3;
net_c.trainParam.epochs = 100;
After this the network was simulated and its output was compared against the target. The function used for this purpose was:
A = sim(net_s, in3);
Rearranging of the matrix back into blocks was done using the function vc2blkM, followed by scaling up.
MATLAB CODE
comp.m
I = imread('J:\matlab\toolbox\images\imdemos\autumn.tif');
size(I)
image(I)
in1 = I(1:64,1:64);                  % take a 64 x 64 block of the image
figure(1)
r = 4;                               % block size (r x r)
imshow(in1)
in2 = blkM2vc(in1,[r r]);            % arrange 4 x 4 blocks as 16-element columns
in3 = in2/255;                       % scale pixel values down to the range 0-1
in4 = in3;                           % target equals the input
net_c = newff(minmax(in3),[4 16],{'tansig','purelin'},'trainlm');
net_c.trainParam.show = 5;           % training parameters are set on the network object net_c
net_c.trainParam.epochs = 300;
net_c.trainParam.goal = 1e-5;
[net_s,tr] = train(net_c,in3,in4);   % train with Levenberg-Marquardt
a = sim(net_s,in3);                  % simulate the trained network
fr = vc2blkM(a,r,64);                % rearrange the columns back into a 64 x 64 image
asc = fr*255;                        % scale back up
az = uint8(asc);
figure(2)
imshow(az)
disp('training is achieved');
disp('consider a new image to be compressed')
II = imread('J:\matlab\toolbox\images\imdemos\fabric.png');
a1 = II(1:64,1:64);
figure(5)
imshow(a1)
a2 = blkM2vc(a1,[r r]);
a3 = a2/255;
out = sim(net_s,a3);                 % compress and decompress the new image
a4 = vc2blkM(out,r,64);
a5 = a4*255;
a6 = uint8(a5);
figure(6)
imshow(a6);
blkM2vc.m
function vc = blkM2vc(M, blkS)
%blkM2vc  Rearrange the r x c blocks of matrix M into the columns of vc.
%   Each r x c block of M becomes one rc-element column of vc.
[rr cc] = size(M);
r = blkS(1);
c = blkS(2);
if (rem(rr, r) ~= 0) | (rem(cc, c) ~= 0)
    error('blocks do not fit into matrix')
end
nr = rr/r;                          % number of block rows
nc = cc/c;                          % number of block columns
rc = r*c;                           % pixels per block
vc = zeros(rc, nr*nc);
for ii = 0:nr-1
    % reshape one strip of blocks (r rows of M) into rc-element columns
    vc(:,(1:nc)+ii*nc) = reshape(M((1:r)+ii*r,:),rc,nc);
end
vc2blkM.m
function M = vc2blkM(vc, r, rM)
%vc2blkM  Reshape a matrix vc of rc-element column vectors into a block matrix M
%   of size rM by cM. Each rc-element column of vc is converted into an r by c
%   block of M and placed as a block-row element.
[rc nb] = size(vc);
disp(rc);                           % diagnostic display of block length
disp(nb);                           % diagnostic display of number of blocks
pxls = rc*nb;                       % total number of pixels
if ( (rem(pxls, rM) ~= 0) | (rem(rM, r) ~= 0) )
    error('incorrect number of rows of the matrix')
end
cM = pxls/rM;
if ( (rem(rc, r) ~= 0) | (rem(nb*r, rM) ~= 0) )
    error('incorrect block size')
end
c = rc/r;
xM = zeros(r, nb*c);
xM(:) = vc;                         % unstack each column into an r x c block
nrb = rM/r;                         % number of block rows in M
M = zeros(rM, cM);
for ii = 0:nrb-1
    M((1:r)+ii*r, :) = xM(:, (1:cM)+ii*cM);
end
Functions used in MATLAB program:
newff
Create a feed-forward back propagation network
Syntax
net = newff
net = newff(PR,[S1 S2...SNl],{TF1 TF2...TFNl},BTF,BLF,PF)
Description
net = newff creates a new network with a dialog box.
newff(PR,[S1 S2...SNl],{TF1 TF2...TFNl},BTF,BLF,PF) takes,
PR -- R x 2 matrix of min and max values for R input elements
Si -- Size of ith layer, for Nl layers
TFi -- Transfer function of ith layer, default = 'tansig'
BTF -- Backpropagation network training function, default = 'trainlm'
BLF -- Backpropagation weight/bias learning function, default = 'learngdm'
PF -- Performance function, default = 'mse'
and returns an N layer feed-forward backprop network.
The transfer functions TFi can be any differentiable transfer function such as tansig, logsig,
or purelin.
The training function BTF can be any of the backprop training functions such as trainlm, trainbfg, trainrp, traingd, etc.
Caution: trainlm is the default training function because it is very fast, but it requires a lot of memory to run. If you get an "out-of-memory" error when training, try doing one of these:
1. Slow trainlm training, but reduce memory requirements, by setting net.trainParam.mem_reduc to 2 or more. (See help trainlm.)
2. Use trainbfg, which is slower but more memory-efficient than trainlm.
3. Use trainrp, which is slower but more memory-efficient than trainbfg.
The learning function BLF can be either of the backpropagation learning functions such as
learngd or learngdm.
The performance function can be any of the differentiable performance functions such as
mse or msereg.
Algorithm
Feed-forward networks consist of Nl layers using the dotprod weight function, netsum net
input function, and the specified transfer functions.
The first layer has weights coming from the input. Each subsequent layer has a weight
coming from the previous layer. All layers have biases. The last layer is the network output.
Each layer's weights and biases are initialized with initnw.
Adaption is done with trains, which updates weights with the specified learning function.
Training is done with the specified training function. Performance is measured according to
the specified performance function.
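For reference, the newff call used in this project (see comp.m above) maps onto these arguments as follows; the annotations are explanatory comments only:

net_c = newff(minmax(in3), ...            % PR: min/max of every row of the training matrix in3
              [4 16], ...                 % S1 = 4 hidden (compression) neurons, S2 = 16 output neurons
              {'tansig','purelin'}, ...   % TF1, TF2: layer transfer functions
              'trainlm');                 % BTF: Levenberg-Marquardt training function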
trainParam
This property defines the parameters and values of the current training function.
net.trainParam
The fields of this property depend on the current training function (net.trainFcn). Evaluate
the above reference to see the fields of the current training function.
train
Train a neural network
Syntax
[net,tr,Y,E,Pf,Af] = train(net,P,T,Pi,Ai,VV,TV)
Description
train trains a network net according to net.trainFcn and net.trainParam.
train(NET,P,T,Pi,Ai,VV,TV) takes,
net -- Neural Network
P -- Network inputs
T -- Network targets, default = zeros
Pi -- Initial input delay conditions, default = zeros
Ai -- Initial layer delay conditions, default = zeros
VV -- Structure of validation vectors, default = []
TV -- Structure of test vectors, default = []
and returns,
net -- New network
TR -- Training record (epoch and perf)
Y -- Network outputs
E -- Network errors.
Pf -- Final input delay conditions
Af -- Final layer delay conditions
sim
Simulate a neural network
Syntax
[Y,Pf,Af,E,perf] = sim(net,P,Pi,Ai,T)
Description
sim simulates a neural network. It takes the network net, the network inputs P, optional initial input delay conditions Pi, initial layer delay conditions Ai and targets T, and returns the network outputs Y together with the final delay conditions Pf and Af, the network errors E and the network performance perf. For a static feed-forward network, only net and P are required. In this program the trained network is simulated with A = sim(net_s,in3), which returns the reconstructed (decompressed) block vectors for the input matrix in3.
uint8
Convert data to unsigned 8-bit integers
Syntax
I = uint8(X)
Description
uint8(X) converts the elements of the array X into unsigned 8-bit integers. Values are rounded to the nearest integer, and values outside the range [0, 255] saturate at 0 or 255. In this program uint8 converts the rescaled network output (a double-precision matrix) back into an 8-bit image that imshow can display.
MATLAB Results:
6.1 Training procedure until the MSE becomes less than 1e-5
ORIGINAL IMAGE
COMPRESSED IMAGE
DECOMPRESSED IMAGE
7. CONCLUSION
The project “IMAGE COMPRESSION AND DECOMPRESSION
USING NEURAL NETWORKS” has been successfully programmed using MATLAB and
tested.
The computing world has a lot to gain from neural networks. Their ability to learn
by example makes them very flexible and powerful. Furthermore there is no need to devise
an algorithm in order to perform a specific task; i.e. there is no need to understand the
internal mechanisms of that task. They are also very well suited for real-time systems because of their fast response and computational times, which are due to their parallel
architecture. Neural networks also contribute to other areas of research such as neurology
and psychology. They are regularly used to model parts of living organisms and to
investigate the internal mechanisms of the brain. Perhaps the most exciting aspect of neural
networks is the possibility that some day 'conscious' networks might be produced. A number of scientists argue that consciousness is a 'mechanical' property and that 'conscious' neural networks are a realistic possibility.
Even though neural networks have huge potential, we will only get the best out of them when they are integrated with computing, AI, fuzzy logic and related subjects. Neural networks perform successfully where other methods do not, recognizing and matching complicated, vague, or incomplete patterns.
FUTURE SCOPE:
Artificial neural networks are currently a hot research area in image processing, and it is believed that they will receive extensive application in various fields over the next few years. In contrast with other technologies, neural networks can be used in almost every field, such as medicine, marketing and industrial process control. This makes our application flexible, so that it can be extended to any field of interest. Integrated with other fields like artificial intelligence and fuzzy logic, neural networks have huge potential. Neural networks have been applied to solving a wide variety of problems. It is an emerging and fast-growing field, and there is a huge scope for research and development.
BIBLIOGRAPHY
1. H. Demuth and M. Beale, Neural Network Toolbox User's Guide (for use with MATLAB), The MathWorks Inc., 1998.
2. H. Ossoinig, E. Reisinger and R. Weiss, Design and FPGA-Implementation of a Neural Network.
3. K. Z. Pekmestzi, Multiplexer-Based Array Multipliers, IEEE Transactions on Computers, Vol. 48, January 1998.
4. J. L. Hennessy and D. A. Patterson, Computer Architecture: A Quantitative Approach, Morgan Kaufmann, 1990.
5. J. Jiang, Image compression with neural networks, Signal Processing: Image Communication 14 (1999) 737-760.