Survey
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Author: Tzu-Fang Sheu ,Nen-Fu Huang and Hsiao-Ping Lee Publisher: IEEE Globecom , 2006 Presenter: Tsung-Lin Hsieh Date: 2012/05/16 1 Introduction Related Work A Cost-Effective String Matching Algorithm Results 2 The Aho-Corasick algorithm (AC) had the best worst-case computational time complexity( compared to other algorithms [1] ~ [7]). But is also needs a lot of memory. Tuck et al modified the AC with a compressed data structure, which reduced the memory size, but also increased the processing time[10]. 3 In this paper, we will propose a practical multiplepattern matching algorithm that has better worstcase performance as well as smaller required memory. The proposed novel scheme is based on the property of Chinese Remainder Theorem and contributes modifications to the AC. 4 Aho-Corasick Algorithm each node – 1028 bytes -> too much memory 5 Aho-Corasick Algorithm with Bitmap It can reduce the memory to only 44 bytes/per state But it needs doing “popcount” to calculate the offset of the starting pointer -> too much time 6 Assume that we can find a simple function so that the input symbols {h,s,i} can be mapped to {0,1,2} Assume that there is a magic number X 7 8 For example, assume we have three valid symbols {h, s, i} that have paths to the child state as shown in Figure 3. Assign three prime numbers {2, 3, 5} for {h, s, i} respectively. We want to find X % 2 = 0 ,X % 3 = 1 ,X % 5 = 2. According to CRT -> X = 22 Using “s” for test ,prime represent “s” is 3. 22 % 3 = 1 -> when input is “s” ,visit child 1. 9 52 bytes The Multiple-Pattern Matching with a Magic Number Input “ish” 10 11 ACO-100 means penalty of external access is 100 cycles 12 As the required time and memory are usually tradeoff, to compare the overall costs of these three algorithms, we define an evaluation function C: C = CM × CT. (CM: total memory, CT: average time) 13