Searching In Geographical Dataset by using
modified k-mean Clustering algorithm
Anita Kumari Kumawat
M.Tech Scholar
Department of Computer Science
SKIT, Jaipur
[email protected]
Mr. Saurabh Ranjan Srivastava
Assistant Professor
Department of Computer Science
SKIT, Jaipur
Abstract- Document clustering is used in a number of
different areas for data mining and information retrieval.
Initially, document clustering was studied as an efficient
way of finding the nearest neighbors of a document. In
clustering, N data points in a multidimensional space are
arranged into groups called clusters. Here we use a
geographical map dataset, in which clustering is based on
the distance between data objects. Similar objects are
placed in the same cluster and dissimilar objects in other
clusters. The map dataset is used to implement a modified
k-mean algorithm. The standard k-mean algorithm works only
with a linear search, whereas a geographical search must
locate points within a circular region. We therefore use a
different procedure: a point is selected on the map by
mouse click, this point defines the cluster, and the points
nearest to it are located. Approaches based on linear
search are not efficient for queries where geographical
location is important, such as finding petrol pumps,
hospitals and police stations within an area or close to a
place of interest. Here we use a map; clusters are formed
for the areas of the city on that map, and images of the
nearby locations are stored in a database for each cluster.
To search for a particular location, we locate its cluster,
which holds the information for multiple images, so the
location is found easily and in minimum time.
Keywords- data mining, data clustering, modified k-mean
algorithm, GIS
1. INTRODUCTION
1.1 Document clustering
Document clustering automatically groups related documents
into clusters. A fundamental problem in data mining,
knowledge discovery and pattern classification is how to
abstract data for knowledge discovery; this is the
clustering problem. A very large amount of data is
available in the real world, and it is very difficult to
access the useful information in such a huge database and
to deliver it where it is needed, within a time limit and
in the required form. Data mining is the tool for
extracting information from a huge database and presenting
it in human-readable form. Cluster analysis is an important
task in knowledge discovery and data mining. It aims to
group data on the basis of similarities and dissimilarities
among the data elements. The process can be performed in a
supervised, semi-supervised or unsupervised manner. A
cluster is a collection of data objects that are similar to
one another within the same cluster and dissimilar to the
objects in other clusters. Cluster analysis groups a set of
data objects into clusters. Clustering is unsupervised
classification, meaning that there are no predefined
classes. Different algorithms have been proposed which take
into account the nature of the data and the input
parameters in order to cluster the data. A geographic query
is composed of query keywords and a location. Given such a
query, a geographic search engine retrieves the documents
that are most textually and spatially relevant to the query
keywords and the location.
Fig. 1.1 - An example of a data set with a clear cluster
structure
1.2 Modified k-mean Clustering Algorithm
We first state the standard k-mean algorithm, which works
only with a linear search. The algorithm is as follows.
K-MEAN ALGORITHM
1. Place K points into the space represented by the
objects that are being clustered. These points represent
initial group centroids.
2. Assign each object to the group that has the closest
centroid.
3. When all objects have been assigned, recalculate the
positions of the K centroids.
4. Repeat Steps 2 and 3 until the centroids no longer
move. This produces a separation of the objects into
groups from which the metric to be minimized can be
calculated.
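As a minimal sketch of the four steps above, the following
Python code clusters a handful of 2-D points. The function
names kmeans and sse, the choice of initial centroids by
random sampling from the input, the max_iter cap and the
sample points are all assumptions made for illustration and
do not come from the original system.

```python
import math
import random


def kmeans(points, k, max_iter=100, seed=0):
    """Basic k-mean on 2-D points; returns (centroids, labels)."""
    rng = random.Random(seed)
    # Step 1: place K initial centroids (assumption: K distinct input points).
    centroids = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(max_iter):
        # Step 2: assign each object to the group with the closest centroid.
        labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        # Step 3: recalculate each centroid as the mean of its members.
        new_centroids = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                cx = sum(x for x, _ in members) / len(members)
                cy = sum(y for _, y in members) / len(members)
                new_centroids.append((cx, cy))
            else:
                # An empty group keeps its previous centroid.
                new_centroids.append(centroids[j])
        # Step 4: stop when the centroids no longer move.
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, labels


def sse(points, centroids, labels):
    """Sum of squared distances to the assigned centroids (the metric minimized)."""
    return sum(math.dist(p, centroids[lab]) ** 2 for p, lab in zip(points, labels))


# Hypothetical sample data: two well-separated groups of three points.
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (8.0, 8.0), (8.2, 7.9), (7.9, 8.1)]
centroids, labels = kmeans(points, k=2)
print(labels, round(sse(points, centroids, labels), 3))
```

Every pass of Step 2 compares each point with every
centroid, which is the linear search referred to above.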
In this GIS system we use the modified k-mean algorithm.
MODIFIED K-MEAN ALGORITHM
1. Enter the client request in the dropdown list.
2. Call the search function, which has three sub-functions:
police station, hospital and petrol pump.
3. After making the selection, click with the mouse on the
map where you want to search.
4. According to the mouse click, define the cluster center
and mark it red; mark the nearest points in the circular
region green.
5. Load the images nearest to the location.
6. If images are found, show them; otherwise show a blank.
7. Repeat the process.
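The steps above can be sketched roughly as follows. This is
only an illustration under stated assumptions: the Place
record, the search function signature, the use of pixel
coordinates with Euclidean distance, the fixed search
radius and the sample data are all hypothetical, since the
text does not specify how the original system stores points
or how large the circular region is.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Place:
    name: str
    category: str  # "hospital", "petrol pump" or "police station"
    x: float       # map coordinates in pixels (assumption)
    y: float
    images: list = field(default_factory=list)


def search(places, category, click, radius):
    """Treat the click as the cluster center and return nearby places.

    Returns the places of the requested category that lie within
    `radius` of the click, nearest first. An empty list corresponds
    to the "show a blank" case.
    """
    hits = [
        p for p in places
        if p.category == category and math.dist(click, (p.x, p.y)) <= radius
    ]
    return sorted(hits, key=lambda p: math.dist(click, (p.x, p.y)))


# Hypothetical sample data, invented for illustration only.
places = [
    Place("Hospital A", "hospital", 120, 80, ["hospital_a.jpg"]),
    Place("Hospital B", "hospital", 400, 300, ["hospital_b.jpg"]),
    Place("Pump A", "petrol pump", 130, 90, ["pump_a.jpg"]),
]

click = (125, 85)                                       # step 3: mouse click
nearby = search(places, "hospital", click, radius=50)   # steps 1, 2 and 4
images = [img for p in nearby for img in p.images]      # step 5
print(images if images else "blank")                    # step 6
```

Unlike the standard k-mean algorithm, this procedure does
not iterate to move centroids: the user's click fixes the
cluster center directly.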
1.3 Uses of Clustering
- Interpretability and usability
- Discovery of clusters with arbitrary shape
- Ability to deal with noisy data, incremental clustering
and insensitivity to input order, high dimensionality
- Data reduction. Summarization: preprocessing for
regression, PCA, classification and association analysis.
Compression: image processing (vector quantization)
- Prediction based on groups: cluster the data and find
characteristics/patterns for each group
- Finding K-nearest neighbors: localizing the search to one
or a small number of clusters
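The last use, localizing a nearest-neighbor search to one
cluster, can be illustrated with a short sketch. The
function name nearest_in_cluster and the sample centroids
and clusters are assumptions made for this example; the
result is approximate, because a true nearest neighbor that
lies just across a cluster boundary can be missed.

```python
import math


def nearest_in_cluster(query, centroids, clusters, k=1):
    """Search only the cluster whose centroid is closest to the query."""
    # Pick the cluster with the nearest centroid.
    j = min(range(len(centroids)), key=lambda i: math.dist(query, centroids[i]))
    # Rank only that cluster's members instead of the whole dataset.
    return sorted(clusters[j], key=lambda p: math.dist(query, p))[:k]


# Hypothetical sample data: two clusters with their centroids.
centroids = [(1.0, 1.0), (8.0, 8.0)]
clusters = [
    [(0.8, 1.1), (1.2, 0.9), (1.0, 1.4)],
    [(7.9, 8.2), (8.3, 7.7), (8.0, 8.0)],
]
result = nearest_in_cluster((7.5, 7.5), centroids, clusters, k=2)
print(result)
```

Only the members of one cluster are compared with the
query, so the cost depends on the cluster size rather than
on the size of the whole dataset.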
The steps of this procedure are shown in the following
flowchart.
[Flowchart: Start -> Search (Hospital / Petrol pump /
Police station) -> Mouse click on map as input ->
Search() function call -> According to the mouse click,
define the cluster center (red) and the circular nearest
points (green) -> Load images (shown in Fig. 3.2) ->
Show images / Do not show images -> End]
Fig. - Procedure of searching
2. THE PROPOSED MAP DATASET
Here we use the concept of clustering and form clusters for
the map dataset. A map is a geographical dataset and
contains a large amount of data. We have already discussed
several types of algorithm, each with its own functionality
and drawbacks. The k-mean algorithm is a document
clustering algorithm, but it cannot readily be applied to a
large dataset; we therefore use a cluster concept based on
nearby data points. MapReduce is a software framework that
allows developers to write programs that process massive
amounts of unstructured data in parallel across a
distributed cluster of processors or stand-alone computers.
Fig. 2.1 - Geographical map
Here the cluster centroid is defined by a click on the map.
The cluster and its nearest points are then selected, and
we show the cluster's nearest data points.
Fig. 3.1 - To locate a particular cluster and its nearest
data points
When we locate a particular location on the map, the
corresponding cluster is selected and filled in red, and
the nearby points are marked green. The hospitals, police
stations and petrol pumps are then assigned accordingly. If
we want to search for a petrol pump, the petrol pump
locations near that point are loaded, and the required
information is found easily.
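Marking points as "nearby" requires a distance measure,
which the text does not name. As one possible sketch, the
code below assumes that locations are stored as latitude
and longitude in degrees and uses the haversine
(great-circle) distance; if the map is handled as an image,
plain Euclidean distance in pixels would serve instead. The
function names, the 5 km radius and the sample points are
hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def color_points(center, points, radius_km):
    """Mark the clicked center red and the points within radius_km green."""
    colors = {"center": "red"}
    for name, (lat, lon) in points.items():
        near = haversine_km(center[0], center[1], lat, lon) <= radius_km
        colors[name] = "green" if near else "unmarked"
    return colors


# Hypothetical click near the centre of Jaipur and two invented points.
center = (26.9124, 75.7873)
points = {"point_a": (26.92, 75.79), "point_b": (27.50, 76.50)}
colors = color_points(center, points, radius_km=5.0)
print(colors)
```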
Fig. 2.2 - Define cluster
The primary task of a geographical information system is to
respond to the client's request. Here the whole dataset is
already stored cluster by cluster. We use the Jaipur map
dataset and show only a particular area of the city. Each
area is related to a cluster, and each cluster contains
images of petrol pumps, police stations and hospitals.
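The storage described above, where each cluster holds
images for the three categories, could be organized as in
the following sketch. The class name ClusterStore, its
methods, and the area and file names are assumptions made
for illustration; the text does not describe the actual
database layout.

```python
from collections import defaultdict

CATEGORIES = ("petrol pump", "police station", "hospital")


class ClusterStore:
    """Maps a cluster (map area) to image file names per category."""

    def __init__(self):
        # Each new cluster starts with an empty image list per category.
        self._images = defaultdict(lambda: {c: [] for c in CATEGORIES})

    def add_image(self, cluster, category, image):
        if category not in CATEGORIES:
            raise ValueError(f"unknown category: {category}")
        self._images[cluster][category].append(image)

    def images_for(self, cluster, category):
        """Return the stored images, or an empty list if none are stored."""
        if cluster not in self._images:
            return []
        return list(self._images[cluster].get(category, []))


# Hypothetical area and file names, invented for illustration only.
store = ClusterStore()
store.add_image("area_1", "hospital", "area_1_hospital_1.jpg")
store.add_image("area_1", "petrol pump", "area_1_pump_1.jpg")
print(store.images_for("area_1", "hospital"))
print(store.images_for("area_2", "hospital"))
```

A lookup then needs only the selected cluster and the
requested category, which matches the search procedure in
which the cluster is located first and its images are
loaded afterwards.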
3. IMPLEMENTATION
In this work we first design the map and then design the
clusters according to it. Each cluster is then assigned its
city location, and the data points near each cluster are
located. Here we use the Jaipur city map data. For each
particular location in Jaipur city, a cluster is located
and assigned its nearest data points.
Fig. 3.2 - Petrol pump locations near the selected cluster
4. CONCLUSION
This application is designed for a geographical map
dataset, but it can also be used for local searching. For
example, for a mall we can add a GPS system to this project
and find the position of a particular person within a wide
area. It is not limited to maps; because the technique is
fast, it can also be used over any wide area. This paper
presents a modified k-means clustering algorithm based on
the clustering problem with balancing constraints, and
reaches several conclusions. In addition, many numerical
computations are made to analyze and verify the performance
of the proposed algorithm.
ACKNOWLEDGMENT
A scholarly and quality work such as designing a project
can be accomplished only through the motivation, guidance
and inspiration of certain quarters, besides individual
effort. Let me express here my heartfelt gratitude to all
those who helped me at various stages of this study. I am
very thankful to Mr. Saurabh Ranjan Srivastava, Dept. of
CS, for giving us permission to take up this opportunity
and for providing all other necessary facilities.