Industrial Project (234313)
Final Presentation
“App Analyzer”
Deliver the right apps users want!
(VMware)
Students:
Edward Khachatryan & Elina Zharikov
Supervisors: Yoel Calderon, Yan Aksenfeld
The Problem
• The IT administrator doesn’t know which applications need to be managed
[Diagram: Mirage servers and single-instance stores; network-optimized synchronization and streaming; endpoint layers: base layer, application layer(s), drivers, user profile, user data, machine identity, apps not installed by Mirage]
Goals
• Find the optimal combination of Base and App layers for a given organization
• Produce reports for the administrator
[Diagram labels: Single Base Layer; Windows 7, Antivirus, Common Apps; Finance Apps, Finance Desktops; HR Apps, HR Desktops; IT Apps, IT Desktops]
Methodology
• Research clustering algorithms
• Connect to the Mirage database on SQL Server
• Parse UTF-encoded XML data
• Process and analyze the data
• Build custom reports
Methodology
• Research and choose the right set of tools
◦ Python libraries:
  ▪ scikit-learn for clustering algorithms
  ▪ lxml for parsing UTF-encoded XML
  ▪ SQLAlchemy for SQL interaction
  ▪ pandas for gluing it all together
◦ Microsoft SQL Report Builder for custom reports
◦ VMware Mirage web interface for the GUI
Achievements
• Quick and efficient data analysis: the desired results can be generated in just a few minutes
• User-friendly experience: a variety of reports can be produced in a matter of a few clicks
• Integration with the existing VMware Mirage platform
• A variety of parameters to customize the output
Examples
Live demonstration…
Conclusions
• DBSCAN is a fast clustering algorithm. It scales to large datasets and works well with Boolean vector data.
• Instead of the usual Euclidean distance, it is better to work with metrics intended for Boolean-valued vector spaces, such as Jaccard, Sokal-Sneath or Dice.
• Using open-source libraries saves a lot of valuable time.
• Microsoft SQL Report Builder is a great WYSIWYG tool for building custom reports.
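To illustrate the point about metrics, here is a small pure-Python sketch of the three Boolean dissimilarities named above, using the definitions commonly given for them (for example in SciPy's `scipy.spatial.distance`). The sample vectors are made up, and returning 0.0 for two all-False vectors is a convention chosen here, not something stated in the slides.

```python
import math


def _counts(u, v):
    """Return (tt, diff): positions where both are True, and where exactly one is."""
    tt = sum(1 for a, b in zip(u, v) if a and b)
    diff = sum(1 for a, b in zip(u, v) if bool(a) != bool(b))
    return tt, diff


def jaccard(u, v):
    tt, diff = _counts(u, v)
    return diff / (tt + diff) if tt + diff else 0.0


def dice(u, v):
    tt, diff = _counts(u, v)
    return diff / (2 * tt + diff) if tt + diff else 0.0


def sokal_sneath(u, v):
    tt, diff = _counts(u, v)
    return 2 * diff / (tt + 2 * diff) if tt + diff else 0.0


# Two devices sharing two of their three apps...
a = [1, 1, 1, 0, 0]
b = [1, 1, 0, 1, 0]
# ...and two devices with one app each and nothing in common.
c = [1, 0, 0, 0, 0]
d = [0, 1, 0, 0, 0]

# Euclidean distance cannot tell the two pairs apart; Jaccard can.
same_euclidean = math.dist(a, b) == math.dist(c, d)
jaccard_ab, jaccard_cd = jaccard(a, b), jaccard(c, d)
```

All three metrics ignore positions where both vectors are False, so applications that neither device has do not make the devices look similar.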
Progress Recap
• 31.3 – Kickoff meeting
• 31.3-12.4 – Research period: reading materials on clustering algorithms.
• 12.4-19.4 – Installing Microsoft SQL Server, restoring a VMware Mirage database, querying and parsing the data from the database.
• 19.4-26.4 – Creating a filtering module to clean up the raw application list: uniting applications by their name, product ID or upgrade code, and filtering out unimportant applications. Finalizing the criteria for Base Layer apps.
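The "uniting applications" step of the filtering module could look roughly like the sketch below. It is an assumption about how such a step might work, not the project's implementation: the record fields (`name`, `product_id`, `upgrade_code`) and the sample values are hypothetical, and a union-find structure is used so that records linked through any shared identifier end up in the same group.

```python
def unite_applications(records):
    """Group record indices that share a name, product ID or upgrade code (transitively)."""
    parent = list(range(len(records)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving keeps the trees shallow
            i = parent[i]
        return i

    first_seen = {}  # (field, normalized value) -> index of the first record carrying it
    for idx, rec in enumerate(records):
        for field in ("name", "product_id", "upgrade_code"):
            value = rec.get(field)
            if not value:
                continue  # missing identifiers never link records
            key = (field, value.strip().lower())
            if key in first_seen:
                parent[find(idx)] = find(first_seen[key])
            else:
                first_seen[key] = idx

    groups = {}
    for idx in range(len(records)):
        groups.setdefault(find(idx), []).append(idx)
    return sorted(groups.values())


# Made-up raw application list.
raw = [
    {"name": "Acme Reader 10", "product_id": "P-1", "upgrade_code": "U-A"},
    {"name": "Acme Reader 11", "product_id": "P-2", "upgrade_code": "U-A"},
    {"name": "acme reader 11", "product_id": None, "upgrade_code": None},
    {"name": "Other Tool", "product_id": "P-9", "upgrade_code": "U-Z"},
]
groups = unite_applications(raw)
```

Here the first two records are linked by their upgrade code and the third joins them through its name, so three raw entries collapse into one application.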
Progress Recap
• 26.4-11.5 – Focusing on 4 clustering algorithms (K-Means, Agglomerative, DBSCAN, Birch), testing various parameters and metrics on different databases.
• 12.5 – Midway meeting
• 12.5-19.5 – Continuing the aforementioned tests, focusing strictly on DBSCAN.
• 19.5-25.5 – Setting up and configuring a virtual machine running Windows Server with VMware Mirage and Microsoft SQL Server Reporting Services.
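A DBSCAN test of the kind described above might be set up as in this sketch with scikit-learn. The toy matrix, the `eps` and `min_samples` values and the function name are illustrative assumptions; the slides do not state which parameters the project settled on. `algorithm="brute"` is chosen here only to keep the neighbor search simple for a Boolean metric.

```python
import numpy as np
from sklearn.cluster import DBSCAN

# Toy device-by-application matrix (rows: devices, columns: apps); values are made up.
X = np.array(
    [
        [1, 1, 1, 0, 0, 0],
        [1, 1, 1, 0, 0, 0],
        [1, 1, 0, 0, 0, 0],
        [0, 0, 0, 1, 1, 1],
        [0, 0, 0, 1, 1, 1],
        [0, 0, 0, 0, 1, 1],
    ],
    dtype=bool,  # Boolean metrics such as Jaccard expect Boolean input
)


def cluster_devices(matrix, eps=0.4, min_samples=2):
    """Cluster Boolean rows with DBSCAN under the Jaccard metric; -1 marks noise."""
    model = DBSCAN(eps=eps, min_samples=min_samples, metric="jaccard", algorithm="brute")
    return model.fit_predict(matrix)


labels = cluster_devices(X)
```

Devices that receive the same label are candidates for sharing an App layer, while a label of -1 marks devices that DBSCAN treats as noise.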
Progress Recap
• 25.5-7.6
◦ Learning to use Microsoft SSRS, the Report Builder tool and the Mirage web interface.
◦ Moving the Python IDE and SQL databases to the virtual machine.
◦ Exporting our results to SQL instead of CSV and text files.
◦ Building a sample report.
• 7.6-17.6 – Building custom reports according to the given guidelines.
• 18.6-27.6 – Improving the reports’ appearance, fixing bugs, parameterizing the Python code.
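As an illustration of exporting results to SQL rather than to CSV or text files, the sketch below writes cluster labels into a table that a reporting tool could query. It is a stand-in under stated assumptions: the project used SQLAlchemy against Microsoft SQL Server, whereas this example uses the standard-library `sqlite3` module with an in-memory database so that it runs without a server. The table name, column names and sample labels are hypothetical.

```python
import sqlite3


def export_labels(conn, device_labels):
    """Write (device, cluster) rows into a results table, replacing any earlier run."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS cluster_results ("
        "device TEXT PRIMARY KEY, cluster INTEGER NOT NULL)"
    )
    conn.execute("DELETE FROM cluster_results")
    conn.executemany(
        "INSERT INTO cluster_results (device, cluster) VALUES (?, ?)",
        device_labels.items(),
    )
    conn.commit()


conn = sqlite3.connect(":memory:")  # stand-in for the real database connection
export_labels(conn, {"finance-01": 0, "finance-02": 0, "hr-01": 1})

# A report would typically aggregate the table, e.g. devices per cluster.
rows = conn.execute(
    "SELECT cluster, COUNT(*) FROM cluster_results GROUP BY cluster ORDER BY cluster"
).fetchall()
```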