Searching in a Geographical Dataset Using a Modified K-Mean Clustering Algorithm

Anita Kumari Kumawat
M.Tech Scholar, Department of Computer Science, SKIT, Jaipur
[email protected]

Mr. Saurabh Ranjan Srivastava
Assistant Professor, Department of Computer Science, SKIT, Jaipur

Abstract- Document clustering is used in a number of different areas for mining and information retrieval. Initially, document clustering was used as an efficient way of finding the nearest neighbors of a document. In clustering, N data points in a multi-dimensional space are arranged into groups called clusters. Here we use a geographical map dataset that is clustered by the distance between data objects. Similar objects belong to the same cluster and dissimilar objects belong to other clusters. The map dataset is used to implement a modified k-mean algorithm. The standard k-mean algorithm works only with a linear search, whereas a geographical search has to locate the points in a circular region around a position. A different algorithm is therefore used: a point is selected by a mouse click, the click defines the cluster, and the nearest points of that cluster are then located. Existing approaches are not efficient for queries where geographical location is important, such as finding petrol pumps, hospitals and police stations within an area or close to a place of interest. We use a map on which clusters of city areas are formed, and images of nearby locations, stored in a database, are allocated to each cluster. To search for a particular location, we locate its cluster, which holds the information for multiple images, so the location is found easily and in minimum time.

Keywords: data mining, data clustering, modified k-mean algorithm, GIS

1. INTRODUCTION

1.1 Document clustering

Document clustering automatically groups related documents into clusters.
A fundamental problem in data mining, knowledge discovery and pattern classification is how to abstract data for knowledge discovery; this is the clustering problem. A very large amount of data is available in the real world, and it is very difficult to access the useful information in such a huge database and to deliver it to those who need it within a time limit and in the required pattern. Data mining is the tool for extracting information from a huge database and presenting it in human-readable form.

Cluster analysis of data is an important task in knowledge discovery and data mining. Cluster analysis aims to group data on the basis of similarities and dissimilarities among the data elements. The process can be performed in a supervised, semi-supervised or unsupervised manner. A cluster is a collection of data objects that are similar to one another within the same cluster and dissimilar to the objects in other clusters. Cluster analysis groups a set of data objects into clusters. Clustering is unsupervised classification, meaning that there are no predefined classes. Different algorithms have been proposed which take into account the nature of the data and the input parameters in order to cluster the data.

Given a geographic query composed of query keywords and a location, a geographic search engine retrieves the documents that are most textually and spatially relevant to the query keywords and the location.

Fig. 1.1 - An example of a data set with a clear cluster structure

1.2 Modified k-mean Clustering Algorithms

We first state the standard k-mean algorithm, which works only with a linear search.

K-MEAN ALGORITHM
1. Place K points into the space represented by the objects that are being clustered. These points represent initial group centroids.
2. Assign each object to the group that has the closest centroid.
3. When all objects have been assigned, recalculate the positions of the K centroids.
4. Repeat Steps 2 and 3 until the centroids no longer move. This produces a separation of the objects into groups from which the metric to be minimized can be calculated.

In this GIS system we use the modified k-mean algorithm.

MODIFIED K-MEAN ALGORITHM
1. Enter the client request in the dropdown list.
2. Call the search function, which has three sub-functions: police station, hospital and petrol pump.
3. After making the selection, click on the map at the place where you want to search.
4. According to the mouse click, define the cluster center and mark it red, and mark the nearest points in the surrounding circle green.
5. Load the images nearest to the location.
6. If images are found, load them; otherwise show a blank.
7. Repeat the process.

1.3 Uses of Clustering

Interpretability and usability
Discovery of clusters with arbitrary shape
Ability to deal with noisy data
Incremental clustering and insensitivity to input order
High dimensionality
Data reduction. Summarization: preprocessing for regression, PCA, classification, and association analysis. Compression: image processing, vector quantization
Prediction based on groups: cluster and find characteristics/patterns for each group
Finding K-nearest neighbors: localizing search to one or a small number of clusters

The process steps of this algorithm are:

[Flowchart: Procedure of searching. Start; Search (Hospital / Petrol pump / Police station); Mouse click on map as input; Search() function call; According to the mouse click, define the cluster center and mark it red, and mark the circular nearest points green; Load images (shown in Fig. 3.2); Show images / Not show images; End]

2. THE PROPOSED MAP DATASET

Here we use the concept of clustering and build clusters for the map dataset. A map is a geographical dataset and contains a large amount of data. The algorithms discussed above each have their own functionality and their own drawbacks.
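As an illustration of the search procedure listed in Section 1.2, the following is a minimal sketch and not the authors' implementation. It assumes that cluster centers are predefined and that centers and places are stored as planar (x, y) map coordinates. The names `Place`, `CLUSTER_CENTERS`, `PLACES`, `nearest_cluster`, `assign_places` and `search`, as well as all coordinates and image file names, are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Place:
    name: str
    category: str  # "hospital", "petrol pump" or "police station"
    x: float
    y: float
    image: str     # hypothetical image file name


# Hypothetical, predefined cluster centers in map (pixel) coordinates.
CLUSTER_CENTERS = {
    "area_a": (100.0, 100.0),
    "area_b": (400.0, 120.0),
    "area_c": (250.0, 380.0),
}

# Hypothetical places stored in the database.
PLACES = [
    Place("Hospital 1", "hospital", 110.0, 90.0, "hospital_1.jpg"),
    Place("Petrol pump 1", "petrol pump", 95.0, 130.0, "pump_1.jpg"),
    Place("Petrol pump 2", "petrol pump", 390.0, 100.0, "pump_2.jpg"),
    Place("Police station 1", "police station", 260.0, 370.0, "police_1.jpg"),
]


def distance(a, b):
    """Euclidean distance between two (x, y) points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def nearest_cluster(point):
    """Return the name of the cluster whose center is closest to the point."""
    return min(CLUSTER_CENTERS, key=lambda c: distance(point, CLUSTER_CENTERS[c]))


def assign_places(places):
    """Assign every place to its closest center (the k-mean assignment step)."""
    clusters = {name: [] for name in CLUSTER_CENTERS}
    for place in places:
        clusters[nearest_cluster((place.x, place.y))].append(place)
    return clusters


def search(category, click, clusters):
    """Locate the cluster for a mouse click and return the images of the
    requested category in that cluster, nearest first (empty if none)."""
    cluster = nearest_cluster(click)
    hits = [p for p in clusters[cluster] if p.category == category]
    hits.sort(key=lambda p: distance(click, (p.x, p.y)))
    return cluster, [p.image for p in hits]


clusters = assign_places(PLACES)
print(search("petrol pump", (105.0, 105.0), clusters))  # ('area_a', ['pump_1.jpg'])
print(search("hospital", (395.0, 110.0), clusters))     # ('area_b', [])
```

Restricting the lookup to the clicked cluster means that only the places in that cluster are scanned rather than the whole dataset. An empty result corresponds to the "show a blank" branch of the procedure.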
The k-mean algorithm is a document clustering algorithm, but it cannot be applied to a large dataset; we therefore use a cluster concept based on nearby data points. MapReduce is a software framework that allows developers to write programs that process massive amounts of unstructured data in parallel across a distributed cluster of processors or stand-alone computers.

Fig. 2.1 - Geographical map

Here we define the cluster centroid: when we click on the map, the corresponding cluster and its nearest points are selected, and we show the nearest data points of that cluster.

Fig. 3.1 - To locate particular cluster and nearest data points

When we locate a particular position on the map, the corresponding cluster is selected and filled in red, and the nearby points are marked green. The hospitals, police stations and petrol pumps are then assigned accordingly. If we want to search for a petrol pump, the petrol pumps near that location are loaded and the information is found easily.

Fig. 2.2 - Define cluster

The primary task of a geographical information system is to respond to a client's request. All of the data is already stored according to cluster. We use the Jaipur map dataset and show only a particular area of the city. Each area corresponds to a cluster, and each cluster contains images of petrol pumps, police stations and hospitals.

3. IMPLEMENTATION

In this work we first design the map and, according to this design, the clusters. Each cluster is then assigned its city location, and the data points near each cluster are located. We use the Jaipur city map data. For a particular location in Jaipur city, the cluster is located and assigned its nearby data points.

Fig. 3.2 - Locations of petrol pumps near the selected cluster

4. CONCLUSION

This application is designed for a geographical map dataset, but it can also be used for local searching. For example, in a mall, a GPS system could be added to this project to find the position of a particular person within a wide area. It is not limited to maps; because it is a fast technique, it can be used over any wide area. This paper presents a modified k-means clustering algorithm based on the clustering problem with balancing constraints, and draws several conclusions. In addition, many numerical computations are made to analyze and verify the performance of the proposed algorithm.

ACKNOWLEDGMENT

A scholarly, high-quality work such as designing a project can be accomplished only through the motivation, guidance and inspiration of certain quarters in addition to individual effort. Let me express here my heartiest gratitude to all those who helped me at the various stages of this study. I am very thankful to Mr. Saurabh Ranjan Srivastava, Dept. of CS, for giving us permission to take up this opportunity and for providing all other necessary facilities.

REFERENCES

[1] Wikipedia, http://en.wikipedia.org/wiki/geographical map dataset
[2] Wikipedia, http://en.wikipedia.org/wiki/ K-means map
[3] Wikipedia, http://en.wikipedia.org/wiki/ k-means geographical map
[4] Zhu Shunzhi, Wang Dingding, Li Tao, "Data clustering with size constraints", Knowledge-Based Systems, vol. 23, no. 8, pp. 883-889, 2010.
[5] Arindam Banerjee, Joydeep Ghosh, "Scalable clustering algorithms with balancing constraints", Data Mining and Knowledge Discovery, vol. 13, no. 3, pp. 365-395, 2006.
[6] Ian Davidson, S. S. Ravi, "Clustering with constraints: Feasibility issues and the k-means algorithm", in Proc. of the 5th SIAM International Conference on Data Mining (SDM-05), 2005.