Service Brief
Nominum Data Science
Analyzing 100 billion DNS transactions every day to discover and
validate emerging cyberthreats.
Nominum is a pioneer and global leader in DNS-based security and services
innovation, providing an integrated suite of DNS-based applications that are used
by service providers across the globe. Nominum Data Science performs powerful
data analysis on global DNS transactions daily, distinguishing new cyberthreats from
legitimate traffic. Threat intelligence is streamed live to service provider customers,
immediately blocking malicious domains to protect networks, subscribers and
business customers from phishing, ransomware, DDoS and other cyberthreats.
Unique Data Insights
To offer proactive protection, Nominum Data Science analyzes daily, weekly and
quarterly data sets to predict the next steps cybercriminals will take. The goal:
detect attack signals in the sea of DNS data, and validate known attack types
while simultaneously detecting new, unknown and unnamed malicious activity.
In addition to using commercial and public data sources, the team analyzes 100
billion queries daily from Nominum customers. Nominum works with more than 130
service providers in over 40 countries, resolving 1.7 trillion queries daily.
[Figure: data sources. Service provider data (DNS and proxy, mobile + fixed, subscribers + businesses = 100B queries daily), commercial data sources and public data sources; other labels: business customers, subscribers.]
Nominum Data Science is a global team of experts from a diverse range
of disciplines: internet security, machine learning, artificial intelligence,
natural language processing and neural networks. The team analyzes 100 billion
DNS queries every day, enabling real-time cyberthreat protection for networks,
subscribers and businesses.
The team analyzes over 100 billion queries daily
from global service providers. This, along with
third-party data, is examined through proprietary
methods to detect threats and stream threat
intelligence.
[Figure labels: Network; Data Science Methods (Anomaly Detection & Pattern Recognition); Streaming Threat Intelligence.]
nominum.com
Revealing Hidden Threats
One method the team employs is tracking domains generated by Domain Generation
Algorithms (DGAs). DNS-based attacks shift through complicated variations of
patterns to avoid detection. A common trait of these attacks is that they create
large spikes in queries and domain names, which often prompts Nominum
Data Science to step in and investigate further. For example, a DDoS attack may
include only a few domain names, but generate billions of queries that start and
stop unpredictably, causing enormous spikes in traffic. The catch is these attacks
will target both popular and obscure domains, making the attack difficult to verify
right away. But if we dive deeper into the data, evidence of an attack becomes clear.
To thwart these attacks, the team examines the following patterns to identify
candidates for further analysis:
• New domain names. Hundreds of thousands of new domain names might
appear each day; those with nonsensical names are of particular interest.
• Domain name lengths. An unusually high number of domains with, for example,
14, 17 or 37 characters, relative to a typical length distribution, may indicate
a malicious algorithm or machine-generated domains.
• Frequency of queries against a domain. Patterns may include an unusual
spike in the frequency of queries and in the number of clients querying domains.
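The three signals above can be illustrated with a minimal Python sketch. This is not Nominum's method, which the brief does not disclose. The function names, the Shannon-entropy heuristic for "nonsensical" names, the baseline length shares and every threshold are assumptions chosen purely for illustration.

```python
import math
from collections import Counter


def shannon_entropy(label: str) -> float:
    """Bits per character; random-looking labels tend to score higher."""
    counts = Counter(label)
    total = len(label)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def overrepresented_lengths(new_domains, baseline_share, ratio=3.0):
    """Return label lengths whose share among new domains is at least
    `ratio` times the baseline share (hypothetical threshold)."""
    lengths = Counter(len(d.split(".")[0]) for d in new_domains)
    total = sum(lengths.values())
    flagged = []
    for length, n in sorted(lengths.items()):
        expected = baseline_share.get(length, 0.01)  # assumed floor for unseen lengths
        if n / total >= ratio * expected:
            flagged.append(length)
    return flagged


def is_query_spike(history, current, z_threshold=3.0):
    """Flag `current` if it lies more than `z_threshold` standard deviations
    above the mean of `history` (per-interval query counts)."""
    mean = sum(history) / len(history)
    var = sum((x - mean) ** 2 for x in history) / len(history)
    std = math.sqrt(var) or 1.0  # fall back to 1.0 when the history is flat
    return (current - mean) / std > z_threshold
```

In practice each signal would only nominate a domain for further analysis, as the text describes; none of them is a verdict on its own.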
Clustering Algorithms Group Domain Names
Nominum Data Science has developed unique algorithms to detect anomalies. One
set of algorithms examines patterns to determine whether they match a specific profile of
known malicious activity. Another set of algorithms applies advanced machine
learning techniques to these “anomalous” names to find the malicious activity.
These clustering algorithms use attributes for each domain name to calculate a
vector and measure how closely it compares with every other name. This process
exposes subtle patterns that link different names to a single malware family. Names
with similar vectors form what are known as “clusters,” which are visualized by
projecting the multi-dimensional feature matrix into a two-dimensional space. Each dot
represents a domain, and the closer the dots are to each other, the more similar the
domains are.
[Figure: two-dimensional cluster plot. Nominum correlation technology identifies domain names with common characteristics. Cluster labels: Suppobox-c, Pykspa, Necurs, Necurs V2, New GOZ, Conficker-A, Dyre, Unknown len-12-18 com, Unknown len-16 com.]
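As a rough illustration of the idea of turning each name into a vector and comparing it with every other name, the sketch below uses three hypothetical features (scaled length, digit ratio, vowel ratio) and simple single-linkage grouping with a distance threshold. Both the features and the grouping rule are stand-ins chosen for brevity; the brief does not describe the actual attributes or algorithms, and the two-dimensional projection is not shown.

```python
import math
from itertools import combinations


def features(domain: str) -> tuple:
    """Hypothetical feature vector: scaled length, digit ratio, vowel ratio."""
    label = domain.split(".")[0].lower()
    n = len(label) or 1
    digits = sum(c.isdigit() for c in label) / n
    vowels = sum(c in "aeiou" for c in label) / n
    return (len(label) / 40.0, digits, vowels)


def cluster(domains, max_dist=0.15):
    """Single-linkage grouping: two names whose vectors lie within
    `max_dist` (assumed threshold) of each other share a cluster."""
    parent = {d: d for d in domains}

    def find(d):
        while parent[d] != d:
            parent[d] = parent[parent[d]]  # path compression
            d = parent[d]
        return d

    vecs = {d: features(d) for d in domains}
    for a, b in combinations(domains, 2):
        # Euclidean distance between feature vectors (math.dist, Python 3.8+)
        if math.dist(vecs[a], vecs[b]) <= max_dist:
            parent[find(a)] = find(b)

    groups = {}
    for d in domains:
        groups.setdefault(find(d), []).append(d)
    return sorted(groups.values(), key=len, reverse=True)
```

With this toy feature set, long vowel-free labels such as `xkqzvbnmwrtplk.com` and `qwrtpsdfghjklz.com` land in one group, while short dictionary-like names such as `news.com` and `shop.com` land in another.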
Core Domain Analysis Uncovers Cybercrime Patterns
Because a large percentage of all new domains are used for malicious purposes,
Nominum Data Science has a recipe to classify new domains. First, new domains
are filtered into a quarantine list or “gray list.” Then, additional classification algorithms
are used to make the distinction between gray-area domains and legitimate domains.
Next, domains are divided into two groups: those with resolved DNS queries and
those with unresolved queries.
This is important for threat classification: new, unresolved domains are usually
associated with botnet C&Cs. New, resolved domains are associated with phishing,
adware, malvertising and other types of attacks, which must be registered and
resolvable to perform their intended malicious function. We begin with one million
queries processed per second, then filter for new core domains only (usually 50-60 per
second). Nominum machine learning algorithms are applied, along with filtering and
clustering, to identify malicious domains. On average, four to five percent of domains
reach the end of the funnel, and are relayed to our streaming threat intelligence.
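To put the funnel figures in daily terms, the short calculation below scales the per-second rates to a full day. It assumes the rates are sustained around the clock and that the "four to five percent" applies to new core domains; neither assumption is stated explicitly in the brief, and the function names are illustrative.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def per_day(rate_per_second: float) -> float:
    """Scale a sustained per-second rate to a full day."""
    return rate_per_second * SECONDS_PER_DAY


def funnel_output_per_day(new_core_per_second: float, flagged_percent: float) -> float:
    """New core domains per day that reach the end of the funnel."""
    return per_day(new_core_per_second) * flagged_percent / 100


total_queries = per_day(1_000_000)   # queries entering the funnel per day
low = funnel_output_per_day(50, 4)   # lower bound of both ranges
high = funnel_output_per_day(60, 5)  # upper bound of both ranges
print(f"{total_queries:,.0f} queries/day")
print(f"{low:,.0f} to {high:,.0f} flagged domains/day")
```

Under these assumptions, one million queries per second corresponds to about 86.4 billion queries per day, and the funnel yields roughly 173,000 to 259,000 flagged domains per day.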
The Nominum Threat Dashboard provides
a real-time, inside view into the process of
detecting new malicious core domains. This
snapshot represents a single day of analysis.
Once the domains are classified into two mega-groups, Nominum Data Science
applies proprietary (unsupervised) machine learning algorithms to build smaller
clusters of domains, identifying subtle relationships between the cluster’s members
to glue them together. Each cluster is then labeled “known/named-malicious” or
“unknown/unnamed-malicious.” To do this, clusters are matched against up-to-date
third-party cyberintelligence data. If even a single domain in a cluster is mapped to a “known”
malicious domain, this elevates the maliciousness level of the entire cluster (what we
call “guilt by association”). The more domains we can map in a cluster to “known”
malicious domains, the higher our confidence is in the maliciousness of the cluster.
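The "guilt by association" rule can be sketched as follows. The scoring formula (the fraction of cluster members found in a known-malicious feed) and the label strings are assumptions made for illustration; the brief states only that one match raises the maliciousness level of the whole cluster and that more matches mean higher confidence.

```python
def cluster_confidence(cluster, known_malicious):
    """Fraction of cluster members present in a known-malicious feed."""
    if not cluster:
        return 0.0
    hits = sum(1 for domain in cluster if domain in known_malicious)
    return hits / len(cluster)


def label_cluster(cluster, known_malicious):
    """A single match marks the whole cluster; the score reflects how many
    members matched. Unmatched clusters need separate behavioral checks."""
    score = cluster_confidence(cluster, known_malicious)
    if score > 0:
        return "known/named-malicious", score
    return "unmatched", score
```

For example, a four-domain cluster with one member in the feed would be labeled known/named-malicious with a score of 0.25, and the score rises as more members are mapped.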
Finally, the “unknown/unnamed-malicious” cluster category contains groups that
do not match any known threat but still have enough bad characteristics to indicate
maliciousness. A cluster of unresolved domains, e.g., those with a similar string length,
is very likely malicious, even though the security industry has not yet identified
and named them. In this way, previously unidentified threats can be blocked
based on anomalous behavior.
Rapid Detection of Threats
Today’s sophisticated attacks routinely evade conventional after-the-fact technologies
such as firewalls and signature-based detection, so it’s essential to adopt new
methods that predictively neutralize these new threats. Nominum learns from internet
activity patterns to identify attacker infrastructure being staged for the next threat,
making it possible to predict and prevent attacks before they’re fully launched. For
example, C&C communications can be stopped before they do real harm.
This proactive protection requires a few things: an extremely large data set and
proprietary data analytics and visualization tools that allow Nominum to actively
anticipate and block cyberthreats. Human intelligence is combined with machine
learning to uncover new patterns. Statistical models categorize these patterns, detect
anomalies, and automatically identify known and emergent threats. Not only does
DNS see traffic very quickly, but the time to protection is also dramatically reduced
when advanced machine learning techniques are applied.
[Figure: time to detect. Predicted threats: 0 seconds; DDoS: 48 seconds; other threats: 14 minutes.]
Average time to detect threats is less than 15 minutes, including threats that are
previously unknown/unpublished.
Live Streaming Intelligence
Nominum threat lists are regularly updated, based on an extensive validation process.
Over 100,000 domains are added daily to the domain block list to guard against
fast-changing exploits. N2 ThreatAvert provides network protection against botnets,
tunneling and DDoS attacks, including amplification, while N2 Secure Consumer
and N2 Secure Business provide protection from ransomware, phishing and other
malware across all devices. By offering advanced, proactive protection from today’s
dynamic threats, service providers gain a unique competitive advantage.
ABOUT NOMINUM
Nominum provides an integrated suite of carrier-grade DNS-based cloud solutions
that enable fixed and mobile operators to enhance and protect their networks,
strengthen security for consumers and business subscribers, and offer innovative
value-added services. The result is improved service agility, increased revenue,
greater brand loyalty and a strong competitive advantage. More than 130 providers
in over 40 countries use Nominum software.
CORPORATE HEADQUARTERS
Nominum, Inc.
800 Bridge Parkway
Redwood City, CA 94065
+1 (650) 381-6000
[email protected]
© 2017 Nominum, Inc. Nominum, Vantio and N2 are trademarks of Nominum, Inc.