Big Data and Complex
Networks Analytics
Timos Sellis, CSIT
Kathy Horadam, MGS
Big Data – What is it?
Most commonly accepted definition, by
Gartner (the 3 Vs)
“Big data is high-volume, high-velocity
and high-variety information assets that
demand cost-effective, innovative forms of
information processing for enhanced
insight and decision making.”
Big Data – some stats
• high-volume, high-velocity and high-variety
Every minute…
More than 2 million emails sent
100,000 tweets
571 websites added
250,000 items sold on Amazon
34,722 “likes”
$272,020 spent on web shopping
(http://www.domo.com/blog/blog/2012/06/08/how-much-data-iscreated-every-minute/)
Complex Networks – What is it?
Networks with significant topological features
that are common in real-world networks,
e.g. most technological, biological and
social networks
Rapidly expanding field
bringing together mathematics,
engineering, computer science,
sociology, epidemiology, physics, biology.
Big Data and Complex Network
Synergies
•Both share interesting properties
– Large scale (volume)
– Complexity (variety)
– Dynamics (velocity)
•Interesting analytics algorithms
•Many applications with both characteristics
(social networks, utility networks, security, etc)
Big Data - Research Issues (1)
• Mainstream
– Infrastructure and Architectures (New large
scale data architectures, Cloud architectures)
– Models (Data representation, storage, and
retrieval)
– Data Access (Query processing and
optimization, Privacy, Security)
Big Data - Research Issues (2)
•Complex Data Analytics
–Computational, mathematical, statistical, and
algorithmic techniques for modelling high
dimensional data, large graphs, and complex
(interrelated) data
–Learning, inference, prediction, and knowledge
discovery for large volumes of dynamic data sets
–Data retrieval and data mining to facilitate pattern
discovery, trend analysis and anomaly detection
–Dimensionality reduction, sparse data
Big Data - Research Issues (3)
•Highly Streaming Data
–Positional streams
–Social network data
–Mobile app data
–Game data
Big Data - Research Issues (4)
•Data Integration
–Findability and search
–Information fusion of multiple data sources
–Semantic integration
–Recommendation systems
Networks - Research Issues (1)
Analytics
Mathematical models of simpler networks do not
show the significant topological features.
– Network structure and community detection
– Knowledge discovery, especially of
characteristic small communities (motifs) in
large networks
– Bipartite networks
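As an illustration of motif discovery, the simplest small motif is the triangle (three mutually linked nodes). The following is a minimal sketch in plain Python, assuming an undirected network given as an edge list; the function name `count_triangles` and the toy network are hypothetical and not taken from the slides.

```python
from itertools import combinations


def count_triangles(edges):
    """Count triangles (3-node cliques) in an undirected simple graph."""
    # Build an adjacency map: node -> set of neighbours.
    adj = {}
    for u, v in edges:
        if u == v:
            continue  # ignore self-loops
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    triangles = 0
    for nbrs in adj.values():
        # Each pair of neighbours that are themselves linked closes a triangle.
        for a, b in combinations(nbrs, 2):
            if b in adj[a]:
                triangles += 1
    # Every triangle is found once from each of its three corners.
    return triangles // 3


# Hypothetical toy network: two triangles sharing the edge (b, c), plus a pendant node e.
toy_edges = [("a", "b"), ("b", "c"), ("a", "c"), ("b", "d"), ("c", "d"), ("d", "e")]
print(count_triangles(toy_edges))  # prints 2
```

This brute-force approach is only practical for small or sparse networks; large networks would need sampling or specialised graph libraries.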
Networks - Research Issues (2)
Dynamics
– Algorithm development: machine learning, high
dimensional data, large networks
– New topological, statistical techniques
– E.g. persistent homology: track connectivity changes
RMIT could be a national leader if we could develop this further
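To make "track connectivity changes" concrete, here is a rough sketch of the simplest (0-dimensional) case of persistent homology: edges are added in order of increasing weight, and we record the threshold at which two connected components merge. It assumes all nodes are present from threshold 0 and that each edge carries a weight such as a distance or timestamp; the function name and the sample data are hypothetical, and higher-dimensional features (loops, voids) are not covered.

```python
def zero_dim_persistence(nodes, weighted_edges):
    """Return the thresholds at which connected components merge (die).

    nodes: iterable of hashable node labels, all assumed born at threshold 0.
    weighted_edges: iterable of (u, v, weight) tuples.
    """
    parent = {n: n for n in nodes}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    deaths = []
    for u, v, w in sorted(weighted_edges, key=lambda e: e[2]):
        root_u, root_v = find(u), find(v)
        if root_u != root_v:
            parent[root_u] = root_v  # two components merge at threshold w
            deaths.append(w)
    return deaths


# Hypothetical example: two tight pairs that only join at a much larger threshold.
sample_nodes = [1, 2, 3, 4]
sample_edges = [(1, 2, 0.1), (3, 4, 0.2), (2, 3, 0.9), (1, 3, 1.0)]
deaths = zero_dim_persistence(sample_nodes, sample_edges)
surviving = len(sample_nodes) - len(deaths)  # components that never merge
print(deaths, surviving)
```

A large gap between consecutive merge thresholds (here between 0.2 and 0.9) suggests a connectivity structure that persists across scales, which is the kind of signal this technique is meant to expose.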
Networks - Research Issues (3)
Detection and Prediction
–Identification of influential or hidden nodes or
communities across networks
–Structural anomaly detection (via supervised or
unsupervised learning)
–Model transmission or flow through network
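One common way to rank nodes by influence is PageRank, shown here only as an illustrative sketch; the slides do not say which measure is intended. The code assumes a directed edge list, and the function name, the damping value of 0.85 and the toy network are assumptions made for this example.

```python
def pagerank(edges, damping=0.85, iterations=50):
    """Approximate PageRank scores by power iteration on a directed edge list."""
    nodes = sorted({n for edge in edges for n in edge})
    out_links = {n: [] for n in nodes}
    for u, v in edges:
        out_links[u].append(v)

    n_nodes = len(nodes)
    rank = {n: 1.0 / n_nodes for n in nodes}
    for _ in range(iterations):
        # Rank held by nodes without out-links is spread uniformly over all nodes.
        dangling = sum(rank[n] for n in nodes if not out_links[n])
        base = (1.0 - damping) / n_nodes + damping * dangling / n_nodes
        new_rank = {n: base for n in nodes}
        for u in nodes:
            for v in out_links[u]:
                new_rank[v] += damping * rank[u] / len(out_links[u])
        rank = new_rank
    return rank


# Hypothetical toy network: a, b and d all point to c, and c points back to a.
scores = pagerank([("a", "c"), ("b", "c"), ("d", "c"), ("c", "a")])
most_influential = max(scores, key=scores.get)
print(most_influential)
```

Nodes whose score is far out of line with their neighbours' scores could also be flagged as candidates for structural anomaly detection, although that would need a separate, domain-specific threshold.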
[Figure: data versus model fit, plotted by year from 1st June 2001 to 1st June 2011, with a fitting period followed by an extrapolation; correlation = 94%]
Networks - Research Issues (4)
•Location and Spatial Networks
– Prioritised habitats
Possible Research Themes (1)
• Situation Awareness applications (Disaster
Management, Fault detection)
• Resource Management applications (Ecology,
environment, power network management)
• Public Health applications (Epidemics, medical
records)
• Financial and Forensic applications (Fraud
detection, money laundering)
• Smart cities applications (Transport, Energy)
Possible Research Themes (2)
• Security applications (Biometrics, computer and
information security)
• Positioning Technologies applications
(Agriculture, Forest health, real-time tracks, large
mobile networks)
• Education (Learning analytics)
RMIT today
• High-interest, cutting-edge and well-funded research in:
• Large scale Data Integration – Data quality, etc
• Sensor networks – Data driven complex networks, Sensor
network data, Distributed Sensor Networks
• Complex Networks/Graphs – network/graph models and
structure detection, graph mining, network/graph analysis,
prediction, identification and security
• Positioning apps/technologies
• Power and Transport networks, network analysis for
detecting possible problems, streamed metering data, real
time analytics
RMIT today - Examples
[Figure: categories of insiders: former employees, current employees, contractors, trusted business partners, cloud providers]
• Anomaly detection
• Smart metering
• Money laundering
• Epidemic spread
• Biometric identification
RMIT tomorrow
• Foster collaboration between many disciplines
towards large scale information management. For
example, planners, designers and technologists
can collaborate on designing buildings fitted with
sensors using intelligent optimisation techniques.
• Plan for a major collaborative effort, like a CRC.
• Build long term partnerships with key international
and national public and private organizations.
Preliminary SWOT analysis
Strengths
1. Infrastructure/data management
2. Complex network dynamics
3. Location based services
4. Information retrieval
5. Optimization
6. Theoretical analysis
Weaknesses
1. No major results/history in the area
2. Big data and complex networks on its own is not recognised as an RMIT strength
Opportunities
1. NICTA funding potential for RMIT centre
2. Cover different application areas, compared to on-going activities
3. Identify a short term impact opportunity
4. Identify an opportunity that can attract an industry sector (e.g. logistics, energy and positioning/mobile applications)
Threats
1. A couple of CoE proposals submitted
2. Some other on-going efforts (CRCs, government CoE)
3. Fragmentation based on disciplines, due to cultural difference