Abandoned object detection in traffic surveillance videos
Group 18b: Rahul Sankhwar
CS365A: Project Presentation 1

Motivation
With the rise in terrorism, there is a need for a good surveillance system. Detection of abandoned objects is an integral part of it.

D Pathak, A Sharang, A Mukerjee, “Anomaly Localization in Topic-based Analysis of Surveillance Videos”, IEEE Winter Conference on Applications of Computer Vision (WACV 2015).
• Modelling: Unsupervised Modelling
• Detection: Anomalous Clip Detection
• Localization: Spatio-Temporal Anomaly Localization

Modeling
The authors model anomaly detection as a problem analogous to topic modeling in NLP.
Reference: BTP report. D Pathak and A Sharang

Visual word formation
Video Frame → ViBe Foreground Extractor → Foreground Image → 3 dimensions of visual word: Location, HOG-HOF descriptor, Blob Size
• Location:
  – Each frame of dimension m x n is divided into blocks of 20 x 20.
• HOG-HOF descriptor:
  – For each block, a foreground pixel is selected at random and a spatio-temporal descriptor is computed around it.
  – From the descriptors obtained from the training set, 200,000 descriptors are randomly selected. 20 cluster centres are obtained from these descriptors by k-means clustering.
  – Each descriptor is assigned to one of these centres.
• Size:
  – In each block, the connected components of the foreground pixels are computed.
  – The size of the connected components is quantised to two values: large and small.

Parametric Bayesian Modeling
Video Clip → Visual Words Extraction → Parametric Bayesian Model (pLSA)

pLSA
The pLSA model gives the likelihood of documents in the topic space, i.e. given a document, it gives the probability that the document belongs to a certain topic.

Topic vector
Given a document, from pLSA, we can get the probability that the document belongs to a certain topic.
If there are K topics, we get a K-dimensional vector whose i-th dimension gives the likelihood corresponding to the i-th topic. This is called the topic vector.

Detection
The authors proposed an efficient projection model algorithm for anomaly detection.

Preliminaries
• Bhattacharyya distance:
  – If the documents $d_x$ and $d_y$ are represented by the probability distributions $\theta^x$ and $\theta^y$ in topic space, then the distance is defined by $d = -\log \sum_i \sqrt{\theta_i^x \, \theta_i^y}$
• Cumulative histogram of m documents:
  – A histogram obtained by stacking the word count histograms of the m documents.
• Spatial neighbourhood of a word:
  – For a word at location $(i, j)$, all words at locations $(i \pm 1, j \pm 1)$, $(i \pm 1, j)$ and $(i, j \pm 1)$ are the spatial neighbours of the word.
• Significant distribution of neighbourhood word:
  – The distribution of a word is significant if its frequency in the cumulative histogram is greater than a threshold $th_{nbr}$.

Localization
• Spatial localization: Every word carries location information. Therefore we can directly localize the anomalous words in a test document to their spatial locality.
• Temporal localization: If we maintain a list of frame numbers corresponding to each document-word pair, we can tag the frames that contain anomalous words.

Relation to detection of abandoned objects
The paper above does not detect abandoned objects, because abandoned objects are not captured by the features of the visual words.
Reason: The foreground extraction mechanism used in the paper, ViBe, is based on motion cues. It models abandoned objects/vehicles as foreground for a few frames, but then this information dies out. The problem with abandoned objects is therefore that they lose their foreground characteristic after some time.

Solution
Instead of ViBe, we can use a different foreground extraction mechanism.
The following paper efficiently captures abandoned objects in the foreground:
Y.L. Tian, R.S. Feris, H. Liu, A. Hampapur and M.T. Sun, “Robust detection of abandoned and removed objects in complex surveillance videos,” IEEE Transactions on Systems, Man, and Cybernetics, Part C: Applications and Reviews, vol. 41, no. 5, pp. 565-576, 2011.
http://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=5571035&tag=1

Work done so far
Ran the code for, and analysed, “Anomaly Localization in Topic-based Analysis of Surveillance Videos” on the traffic highway dataset.
The code was made available to us by the authors: http://www.cse.iitk.ac.in/users/abhisg/btp/abhisg.zip
Dataset: http://www.eecs.qmul.ac.uk/~andrea/avss2007_d.html

Future work
• In the algorithm proposed in the anomaly detection paper, employ a different foreground extraction mechanism instead of ViBe.
• Use cosine distance instead of Bhattacharyya distance.

References
• D. Pathak, A. Sharang, A. Mukerjee. “Anomaly Localization in Topic-based Analysis of Surveillance Videos.” IEEE Winter Conference on Applications of Computer Vision (WACV 2015).
• O. Barnich and M. Van Droogenbroeck. “ViBe: A universal background subtraction algorithm for video sequences.” IEEE Transactions on Image Processing, 20(6):1709–1724, 2011.
• I. Laptev, M. Marszalek, C. Schmid, and B. Rozenfeld. “Learning realistic human actions from movies.” IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 1–8, 2008.
• T. Hofmann. “Probabilistic latent semantic indexing.” Proceedings of the 22nd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval. ACM, 1999.
• V. Mahadevan, et al. “Anomaly detection in crowded scenes.” IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2010.
• D. Pathak, A. Sharang, A. Mukerjee. BTP report and presentation on “Unsupervised Modeling, Detection and Localization of Anomalies in Surveillance Videos.”
Thank you

pLSA: Topic Model
• Fixed number of topics: $z_1, z_2, \ldots, z_k$. Each word is associated with a single topic.
• Topics are hidden variables, used for modelling the probability distribution.
• Computation
  – Marginalise over the hidden variables
  – Conditional independence assumption: given a topic z, the word w and the document d are independent, i.e. p(w, d | z) = p(w | z) p(d | z)

EM Algorithm: Intuition
• E-Step
  – Expectation step, where the expectation of the likelihood function is calculated with the current parameter values
• M-Step
  – Update the parameters using the calculated posterior probabilities
  – Find the parameters that maximize the likelihood function

EM: Formalism
Reference: BTP report. D Pathak and A Sharang

EM in pLSA: E Step
• p(z|d,w) is the probability that a word w occurring in a document d is explained by aspect z

EM in pLSA: M Step
• The update equations all use p(z|d,w) calculated in the E step
• Converges to a local maximum of the likelihood function
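
Sketch: EM for pLSA and distance between topic vectors
The E and M steps above, together with the topic vector and the Bhattacharyya distance from the Preliminaries slide, can be illustrated with a minimal NumPy sketch. This is not the authors' code. It assumes the asymmetric parameterisation p(w|d) = Σ_z p(z|d) p(w|z), so that p(z|d) can be read off directly as the topic vector. The function names `plsa_em` and `bhattacharyya_distance`, the toy count matrix, and the values of `n_iter`, `seed` and `eps` are all hypothetical choices made for illustration.

```python
import numpy as np


def plsa_em(counts, n_topics, n_iter=50, seed=0, eps=1e-12):
    """Fit pLSA by EM on a (documents x words) count matrix.

    Assumed parameterisation: p(w|d) = sum_z p(z|d) p(w|z).
    Returns p(z|d) with shape (D, K) and p(w|z) with shape (K, W).
    """
    rng = np.random.default_rng(seed)
    n_docs, n_words = counts.shape

    # Random initialisation, normalised so each row is a distribution.
    p_z_d = rng.random((n_docs, n_topics))
    p_z_d /= p_z_d.sum(axis=1, keepdims=True)
    p_w_z = rng.random((n_topics, n_words))
    p_w_z /= p_w_z.sum(axis=1, keepdims=True)

    for _ in range(n_iter):
        # E-step: p(z|d,w) is proportional to p(z|d) * p(w|z).
        joint = p_z_d[:, :, None] * p_w_z[None, :, :]            # (D, K, W)
        post = joint / (joint.sum(axis=1, keepdims=True) + eps)  # p(z|d,w)

        # M-step: re-estimate parameters from counts weighted by p(z|d,w).
        weighted = counts[:, None, :] * post                     # (D, K, W)
        p_w_z = weighted.sum(axis=0)                             # (K, W)
        p_w_z /= p_w_z.sum(axis=1, keepdims=True) + eps
        p_z_d = weighted.sum(axis=2)                             # (D, K)
        p_z_d /= p_z_d.sum(axis=1, keepdims=True) + eps

    return p_z_d, p_w_z


def bhattacharyya_distance(theta_x, theta_y, eps=1e-12):
    """d = -log sum_i sqrt(theta_x[i] * theta_y[i]) for two topic vectors."""
    coefficient = np.sum(np.sqrt(theta_x * theta_y))
    # Clamp to avoid log(0) when the two distributions do not overlap.
    return -np.log(max(coefficient, eps))


# Toy data (made up): 4 documents over a 6-word vocabulary.
counts = np.array([
    [5, 4, 6, 0, 0, 0],
    [4, 6, 5, 0, 1, 0],
    [0, 0, 1, 6, 5, 4],
    [0, 1, 0, 5, 6, 5],
], dtype=float)

theta, phi = plsa_em(counts, n_topics=2)
# Row d of theta is the topic vector of document d.
d_01 = bhattacharyya_distance(theta[0], theta[1])
d_02 = bhattacharyya_distance(theta[0], theta[2])
print("topic vectors:\n", theta)
print("distance doc0-doc1:", d_01, " doc0-doc2:", d_02)
```

Each row of `theta` is a K-dimensional topic vector, so pairs of documents can be compared with `bhattacharyya_distance`. Since EM only converges to a local maximum, results depend on the random initialisation.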