Performance Management
(Best Practices)
REF:www.cisco.com
Document ID 15115
Introduction
• Performance Management involves
optimization of network response time and
management of consistency and quality of
individual and overall network services
• The most important task is measuring
user/application response time.
• For most users, response time is the critical
performance success factor.
Background (1)
• Performance problems often correlate with
capacity of resources (CPU, RAM, Bandwidth).
– In networks, this is typically bandwidth and data
that must wait in queues before it can be
transmitted through the network.
– In voice applications, this wait time almost
certainly impacts users because factors such as
delay and jitter affect the quality of the voice call.
Performance management issues
• User performance
• Application performance
• Capacity planning
• Proactive fault management
• With newer applications such as voice and video,
performance management is key to success.
Performance management process
flow (1)
Develop a network management
concept of operation
Measure Performance
Perform a Proactive Fault Analysis
Performance management process
flow (2)
1. Develop a network management concept of
operation
– Define the required features: services and scalability
objectives
– Define availability and network management
objectives
– Define performance SLAs and Metrics
– Define SLA
Performance management process
flow (3)
2. Measure performance
– Gather network baseline data
– Measure availability
– Measure response time
– Measure accuracy
– Measure utilization
– Capacity planning
Performance management process
flow (4)
3. Perform a proactive fault analysis
– Use thresholds for proactive fault management
– Network management implementation
– Network operation metrics
Develop a network management
concept of operation
• The purpose of the concept of operations
document is to describe the overall desired
system characteristics from an operational
standpoint.
• It forms the basis of long-range operational
planning for network management and
operation.
• It also provides guidance for the development
of all subsequent definition documentation,
such as service level agreements.
Define the required features:
Services, Scalability Objectives
• Define service objectives:
– Describe what the network and its services are
supposed to achieve
– This step requires that you understand applications,
basic traffic flows, user and site counts, and required
network services.
• Define scalability objectives:
– Help network engineers design networks that meet
future growth requirements without running into
resource constraints (media capacity, number of
routes, and so on)
Define availability and network
management Objectives (1)
• Availability objectives define the level of
service needed (service level requirements)
• This helps to ensure the solution meets end
availability requirements
• It might lead you to
– Define a different class of service for each
availability requirement
– Add redundancy and support procedures where a
higher availability objective requires them
Define availability and network
management objectives (2)
• Define manageability objectives to ensure that
overall network management does not lack
management functionality
• It might lead you to
– Understand the processes and tools used by the
organization
– Uncover all important MIB or network tool
information required to support a potential network
– Provide the training required to support the new
network service
Define performance SLAs and Metrics
• Performance SLAs and metrics help define and
measure the performance of new network
solutions to ensure they meet performance
requirements.
• The performance SLAs should include the
average expected volume of traffic, peak
volume of traffic, average response time and
maximum response time allowed
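
As a rough illustration of the four figures listed above, the sketch below stores them in one record and compares measured samples against them. The class name `PerformanceSla`, the function `check_sla`, the units (Mbit/s and milliseconds), and the idea of flagging any figure that is exceeded are all assumptions made for this example; they are not taken from the source document.

```python
from dataclasses import dataclass
from statistics import mean
from typing import List, Sequence


@dataclass
class PerformanceSla:
    """Hypothetical record of the four figures a performance SLA should state."""
    avg_traffic_mbps: float   # average expected volume of traffic
    peak_traffic_mbps: float  # peak volume of traffic
    avg_response_ms: float    # average response time
    max_response_ms: float    # maximum response time allowed


def check_sla(sla: PerformanceSla,
              traffic_mbps: Sequence[float],
              response_ms: Sequence[float]) -> List[str]:
    """Return a description of every SLA figure the samples exceed."""
    deviations = []
    if mean(traffic_mbps) > sla.avg_traffic_mbps:
        deviations.append("average traffic above expected volume")
    if max(traffic_mbps) > sla.peak_traffic_mbps:
        deviations.append("peak traffic above expected peak")
    if mean(response_ms) > sla.avg_response_ms:
        deviations.append("average response time above SLA")
    if max(response_ms) > sla.max_response_ms:
        deviations.append("maximum response time above SLA")
    return deviations
```

For example, `check_sla(PerformanceSla(40, 90, 100, 250), [30, 35, 40], [80, 90, 300])` reports the two response-time deviations and no traffic deviation.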
Define SLAs (1)
• SLA (Service Level Agreement): customer (enterprise) side;
SLM (Service Level Management): provider side
• SLAs include definitions of problem types and severity,
and of help desk responsibilities
– Escalation path and time before escalation at each
support tier
– Time to start work on the problem
– Target time to close, based on priority
– Services to provide in the areas of capacity planning and
hardware replacement
Measure Performance
• Gather Network Baseline data
– Perform a baseline of the network before and
after a new solution deployment
– A typical router/switch baseline report includes
capacity issues related to CPU, memory, buffer,
link/media utilization, throughput
– Application baseline: bandwidth used by app per
time period
Measure availability
• Availability is the measure of time for
which a network system or application is
available to a user
– Coordinate the help desk phone calls with the
statistics collected from managed devices
– Check scheduled outages
– Etc
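
A minimal sketch of how availability could be turned into a percentage once outage minutes are known. The function name `availability_percent` and the choice to exclude scheduled outages from the measured period are assumptions for illustration; an organization's SLA may count scheduled outages differently.

```python
def availability_percent(period_minutes: float,
                         unscheduled_outage_minutes: float,
                         scheduled_outage_minutes: float = 0.0) -> float:
    """Percentage of the measured period during which the service was up.

    Assumption: scheduled outages are removed from the measured period,
    so they neither count as uptime nor as downtime.
    """
    measured = period_minutes - scheduled_outage_minutes
    if measured <= 0:
        raise ValueError("measured period must be positive")
    uptime = measured - unscheduled_outage_minutes
    return 100.0 * uptime / measured
```

With these assumptions, 43.2 minutes of unscheduled outage in a 30-day month (43,200 minutes) corresponds to roughly 99.9% availability.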
Measure Response Time
• Network response time is the time traffic requires to
travel between two points
• Simple level: pings from the network management
station to key points in the network (not accurate)
• Server-centric polling: SAA (Service Assurance Agent)
on a Cisco router measures response time to a
destination device
• Generate traffic that resembles the particular
application or technology of interest
Measure accuracy
• Accuracy is the measure of interface traffic
that does not result in errors and can be
expressed as a percentage
• Accuracy = 100 – error rate
• Error rate = (ifInErrors × 100) / (ifInUcastPkts + ifInNUcastPkts)
• Because these objects are counters, use the change in
each value over the measurement interval
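
The accuracy calculation can be sketched as follows. The function names are hypothetical, the arguments are assumed to be counter deltas (the change in each SNMP counter between two polls), and returning 0% error when no packets were received is a choice made for this example.

```python
def error_rate_percent(d_in_errors: int, d_in_ucast: int, d_in_nucast: int) -> float:
    """Inbound error rate as a percentage of packets received in the interval.

    All arguments are deltas of ifInErrors, ifInUcastPkts and ifInNUcastPkts
    between two polls.
    """
    total_packets = d_in_ucast + d_in_nucast
    if total_packets == 0:
        # Assumption: an idle interface is reported as error-free.
        return 0.0
    return d_in_errors * 100.0 / total_packets


def accuracy_percent(d_in_errors: int, d_in_ucast: int, d_in_nucast: int) -> float:
    """Accuracy = 100 - error rate."""
    return 100.0 - error_rate_percent(d_in_errors, d_in_ucast, d_in_nucast)
```

For instance, 5 errors against 900 unicast and 100 non-unicast packets gives an error rate of 0.5% and an accuracy of 99.5%.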
Measure Utilization (1)
• Utilization measures the use of a particular
resource over time
• It is expressed as a percentage that compares
the usage of a resource with its maximum
operational capacity
• High utilization is not necessarily bad
• A sudden jump in utilization can indicate an
abnormal condition
Measure Utilization (2)
• Input utilization =
(ΔifInOctets × 8 × 100) / ((time in seconds) × ifSpeed)
• Output utilization =
(ΔifOutOctets × 8 × 100) / ((time in seconds) × ifSpeed)
• Δ is the change in the counter over the interval
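
A sketch of the utilization calculation: octets transferred during the polling interval are converted to bits, scaled to a percentage, and divided by the interval length multiplied by the interface speed. The helper names are hypothetical, and the wrap handling assumes 32-bit counters that wrap at most once between polls.

```python
COUNTER32_MODULUS = 2 ** 32  # assumption: ifInOctets/ifOutOctets polled as 32-bit counters


def counter_delta(previous: int, current: int, modulus: int = COUNTER32_MODULUS) -> int:
    """Change in a counter between two polls, tolerating a single wrap."""
    return (current - previous) % modulus


def utilization_percent(prev_octets: int, curr_octets: int,
                        interval_seconds: float, if_speed_bps: float) -> float:
    """Utilization = (delta octets * 8 * 100) / (interval in seconds * ifSpeed)."""
    if interval_seconds <= 0 or if_speed_bps <= 0:
        raise ValueError("interval and interface speed must be positive")
    bits = counter_delta(prev_octets, curr_octets) * 8
    return bits * 100.0 / (interval_seconds * if_speed_bps)
```

The same function serves input and output utilization; only the counter that is polled (ifInOctets or ifOutOctets) differs. On fast links a short polling interval helps keep the counter from wrapping more than once.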
Capacity planning
• The following are potential areas for concern:
– CPU
– Backplane or I/O
– Memory
– Interface and pipe sizes
– Queuing, latency and jitter
– Speed and distance
– Application characteristics
Perform a Proactive fault analysis
• One method of performing fault management is
the use of the RMON alarm and event groups
• Another is a distributed management system that
enables polling at a local level, with aggregation
of data at a manager of managers
Use threshold for proactive fault
management (1/2)
• A threshold is a point of interest in a specific
data stream; an event is generated when the
threshold is triggered
• 2 classes of threshold for numeric data
– Continuous thresholds apply to continuous or time
series data, such as data stored in SNMP counters
or gauges
– Discrete thresholds apply to enumerated objects or
discrete numeric data, such as Boolean objects
Use threshold for proactive fault
management (2/2)
• 2 forms of continuous threshold
– Absolute: use with gauges
– Relative (delta): use with counters
• Steps to determine thresholds
– 1 Select the objects
– 2 Select the devices and interfaces
– 3 Determine the threshold values for each object or
interface
– 4 Determine the severity of the event generated by
each threshold
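
The difference between absolute and relative (delta) thresholds can be sketched as below. The `Threshold` record, its field names, the severity labels, and the "greater than" comparison are illustrative assumptions, not part of any particular management tool.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Threshold:
    """Hypothetical threshold definition for one monitored object."""
    name: str        # object being watched, e.g. a gauge or counter name
    limit: float     # threshold value
    severity: str    # severity of the event generated when triggered
    relative: bool   # True: compare the delta (counters); False: compare the value (gauges)


def evaluate(threshold: Threshold, previous: float, current: float) -> Optional[str]:
    """Return an event message if the threshold is exceeded, otherwise None."""
    value = current - previous if threshold.relative else current
    if value > threshold.limit:
        return f"{threshold.severity}: {threshold.name} = {value} exceeds {threshold.limit}"
    return None
```

A gauge such as CPU load would use `relative=False`, so a sample of 85 against a limit of 80 raises an event regardless of the previous sample. A counter such as ifInErrors would use `relative=True`, so only the increase since the last poll is compared with the limit.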
Network management implementation
• The organization should have an implemented
network management system.
• SNMP/RMON or other network management
system tools
Network operation metrics (1/2)
• Number of problems that occur, by call priority
• Minimum, maximum and average time to close
in each priority
• Breakdown of problems by problem type
(hardware, software crash, configuration,
power, user error)
Network operation metrics (2/2)
• Breakdown of time to close for each problem
type
• Availability by availability group or SLA
• How often you met or missed SLA
requirements
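
As an illustration of the first two operation metrics (problem counts and time to close per priority), the sketch below summarizes a list of closed tickets. The input format, a sequence of `(priority, hours_to_close)` pairs, and the function name are assumptions made for this example.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, Iterable, Tuple


def time_to_close_by_priority(
        tickets: Iterable[Tuple[int, float]]) -> Dict[int, Dict[str, float]]:
    """Count problems and report min, max and average time to close per priority."""
    grouped = defaultdict(list)
    for priority, hours_to_close in tickets:
        grouped[priority].append(hours_to_close)
    return {
        priority: {
            "count": len(hours),
            "min": min(hours),
            "max": max(hours),
            "avg": mean(hours),
        }
        for priority, hours in grouped.items()
    }
```

The same grouping could be keyed on problem type instead of priority to produce the breakdown of time to close for each problem type.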
Performance Management
Indicators
Indicators for performance management
(1/3)
• Performance indicators provide a mechanism by which
an organization can measure critical success factors.
• They include the following:
• Document the network management business
objectives
• Create detailed and measurable service level
objectives
Indicators for performance management
(2/3)
• Document the service level agreements (SLAs)
with charts or graphs that show how well these
agreements are met over time
• Collect a list of the variables for the baseline, such as
polling interval, network management overhead
incurred, possible trigger thresholds
– Also record whether the variable is used as a trigger for a
trap, and the trending analysis used against each variable
Indicators for performance management
(3/3)
• Hold a periodic meeting that reviews the analysis of the
baseline and trends.
• Have a what-if analysis methodology documented.
– This should include modeling and verification where
applicable
• When thresholds are exceeded, develop documentation
on the methodology used to increase network
resources.
– One item to document is the timeline required to put in
additional WAN bandwidth and a cost table
Document the network management
business objectives (1/3)
• This document is the organization's network
management strategy and should coordinate
the overall business goals of network
operations, engineering, design, other
business units and the end users.
• It enables the organization to carry out
long-range planning for network
management and operation.
Document the network management
business objectives (2/3)
• Identify a comprehensive plan with achievable
goals
• Identify each business service/application that
requires network support
• Identify the performance-based metrics needed
to measure service
Document the network management
business objectives (3/3)
• Plan the collection and distribution of the
performance metrics
• Identify the support needed for network
evaluation and user feedback
• Have documented, detailed and measurable
SLA objectives
Document the Service Level Agreements
• Before documenting the SLA, you must define the
service level objective metrics
• This document should be available to users so they
can evaluate it and provide feedback on the
variables needed to maintain the service
agreement level
• SLAs are living agreements
– What works today might become obsolete
tomorrow
Create a list of variables for the baseline
• This list includes items such as
– polling interval
– Network management overhead incurred
– Possible trigger thresholds
– Trending analysis used against each variable
– Router health
– Switch health
– Routing information
– Utilization
– delay
Review the baseline and trends
• Network management personnel should
hold meetings periodically (operational and
planning)
• These meetings should also include a review of the SLAs
Document a what-if analysis methodology
• A what-if analysis involves modeling and
verification of solutions.
• It includes the major questions, the
methodology, data sets and configuration files
• The main point is that the what-if analysis is an
experiment that someone else should be able to
recreate with the information provided in the
document
Document the methodology used to
increase network performance
• This document covers the timeline for adding WAN
bandwidth and a cost table for increasing the
bandwidth of a particular type of link
• It helps the organization understand how much time
and money it costs to increase the bandwidth
• Review this document periodically to ensure that it
remains up to date