J2SE and J2EE Performance
Best Practices, Tips And Techniques
Rima Patel Sriganesh
Member of Technical Staff
[email protected]
We've got the vision... Now... We've to execute!
Seoul Sun™ Tech Days FY 05
Objective
To know various ways of improving
performance of your J2SE and J2EE
applications
“Perspectives”
on Performance
Startup time
Peak/sustained performance
Perceived performance
Performance as per the user
Scalability
Readiness of application to handle
increasing load without requiring change
in design
When faced with bad performance...
Ask yourself these questions
Is it an external issue?
Database systems, messaging systems, etc.
Is it an operating environment issue?
Memory, disk, CPU, network
Is it an application issue?
Resulting from bad design or bad coding
Over-design is a bad design
Agenda
J2SE Performance Tuning
J2SE Platform Performance Today
JVM Tuning – Pre J2SE 5.0 Platform
Performance Enhancements in J2SE 5.0
Performance Tools in J2SE 5.0 Platform
J2EE Performance Tuning
Performance Tuning Guidelines for J2EE
Applications
The Process of J2EE Application Performance
Tuning
Tuning J2EE Cluster Performance
J2SE Platform Performance today
Performance Enhancements in JDK 1.3
Improved performance for readUTF and
writeUTF
Improved JScroll painting
Improved JTable performance
Paint coalescing
Frame resizing, and internal frame blitting
BigInteger performance improvements
Better Performing numeric operations
java.lang.StrictMath
Performance Enhancements in JDK 1.4
java.nio package
This new package provides for improved
performance in buffer management, scalable
network and file i/o, character set support and
regular expression matching
Java 2D
Java 2D enhancements, such as the new pipeline
architecture, in turn benefit Swing/JFC
performance.
Reflection
Many reflective operations (example:
Class.newInstance() ) have been rewritten for
higher performance
Performance Enhancements in JDK 1.4 (Contd)
Networking
Networking functionality in J2SDK 1.4.2 has
improved performance for HTTP Streaming
java.math
A new static method, probablePrime, has been added
to BigInteger, providing an efficient way of
generating numbers that are probably prime
Hotspot
Full-speed debugging has been added to the
Java HotSpot VM. The improved performance
allows long-running programs to be debugged
and tested more easily
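As an illustration of the java.math item on this slide, here is a minimal sketch of calling BigInteger.probablePrime. The class name, the helper name, and the 512-bit length are assumptions chosen for the example, not something the slides prescribe.

```java
import java.math.BigInteger;
import java.security.SecureRandom;

class ProbablePrimeDemo {
    // Returns a positive BigInteger of the requested bit length that is probably prime.
    static BigInteger randomPrime(int bitLength) {
        return BigInteger.probablePrime(bitLength, new SecureRandom());
    }

    public static void main(String[] args) {
        BigInteger p = randomPrime(512);
        System.out.println(p.bitLength() + "-bit probable prime: " + p);
        // isProbablePrime re-checks the result; the certainty of 100 is an arbitrary choice.
        System.out.println("passes re-check: " + p.isProbablePrime(100));
    }
}
```

The result is only "probably" prime: the API documents a very small chance that the returned value is composite, so code that needs a proven prime would require a separate check.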
Pre J2SE 5.0 JVM Tuning
JVM Tuning – Pre 5.0
Java HotSpot™ VM is highly tunable
Meets a broad range of requirements
Fine tune control of application behavior
Problem for Server Applications
Default JVM software behavior largely benefits
small client applications
Poor “out-of-the-box” performance for some
server applications.
There are many tuning parameters to test
What is the best strategy for tuning
server applications?
HotSpot™ VM Heap Layout – A Refresher
[Figure: HotSpot heap layout. Young Generation: Eden Space plus From Space and To Space, sized relative to each other by the Survivor Ratio (slide labels: 2Mb default, 64Kb default). Old Generation: Tenured Space (5Mb min, 44Mb max default). Permanent Generation: Permanent Space (4Mb default).]
JVM Memory Management – HotSpot
Generational GC divides the heap into multiple
areas
Young Generation
Eden – Java Objects are created in this nursery
Two Survivor Spaces – hold copies of surviving objects
Old Generation
Stores longer lived Java objects
Permanent Generation
Stores the JVM classes and method objects
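How Generational GC Works, and Heap Parameters
[Figure: heap layout annotated with the parameters -Xmx, -XX:MaxPermSize, -XX:NewRatio and -XX:SurvivorRatio. Regions shown: Eden, SS0, SS1, Tenured and Perm, with a "virtual" portion beside the young, Tenured and Perm areas.]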
“virtual” exists if -Xms and -Xmx are different
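To see how the generations described above appear in a running VM, the following sketch lists the memory pools through the java.lang.management API that arrived with J2SE 5.0. The class and helper names are assumptions for the example. Pool names depend on the VM and the collector in use, and VMs from Java 8 onward report Metaspace rather than a permanent generation.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;
import java.util.List;

class HeapLayoutDemo {
    // Returns the memory pools (eden, survivor, old generation, ...) of this VM.
    static List<MemoryPoolMXBean> pools() {
        return ManagementFactory.getMemoryPoolMXBeans();
    }

    public static void main(String[] args) {
        for (MemoryPoolMXBean pool : pools()) {
            MemoryUsage usage = pool.getUsage();
            if (usage == null) {
                continue; // the pool is no longer valid
            }
            // getMax() returns -1 when the maximum is undefined for a pool.
            String max = usage.getMax() < 0 ? "undefined" : (usage.getMax() / 1024) + "K";
            System.out.printf("%-32s %-16s committed=%dK max=%s%n",
                    pool.getName(), pool.getType(), usage.getCommitted() / 1024, max);
        }
        Runtime rt = Runtime.getRuntime();
        // maxMemory() reflects -Xmx; totalMemory() is the heap currently reserved.
        System.out.println("max heap=" + rt.maxMemory() / 1024 + "K, current heap="
                + rt.totalMemory() / 1024 + "K");
    }
}
```

Running it with different -Xms and -Xmx values is one way to observe the gap between current and maximum heap that the slide labels "virtual".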
JVM Tuning – Pre 5.0 (Contd)
Java software performance first steps for JDK 1.4.2 release
Server applications
Select the Server Compiler
java -server application
Turns on advanced compiler optimizations
Loop peeling and unrolling, inlining,
dead code elimination
Correctly size the overall heap
java -server -Xms512m -Xmx512m application
Set minimum and maximum sizes to same value
to avoid resizing overhead
Set heap large enough to minimize the frequency
of Full GCs
Java Software Performance Steps
Set young generation size, starting with ¼
overall heap size
java -server -Xmn128m -Xms512m -Xmx512m application
Start with ¼, but experiment with larger and
smaller values
Benchmarking performance flags (-XX:+AggressiveHeap)
set the young generation
size to 3/8th the overall heap size
Sizing young generation beyond ½ the overall
heap size using the default collector forces Full
GCs
Java Software Performance Steps (Contd)
Monitor JVM software using JVMstat tools
http://developers.sun.com/dev/coolstuff/jvmstat/
Most, but not all, jvmstat tools included in 1.5
More on Tools covered later in the talk
Profile Java Technology-Based Application
HPROF profiling tools
http://java.sun.com/j2se/1.4.2/docs/guide/jvmpi/jvmpi.html
Sun Studio 8 Performance Analyzer
http://wwws.sun.com/software/products/studio/index.html
Java HotSpot VM tuning options
New garbage collectors
Default
Incremental
Throughput
Parallel young generation collection
Improve scalability on multi-CPU server systems
Concurrent
-XX:+UseParallelGC
Turns on the throughput collector
Java HotSpot VM tuning options (Contd)
-XX:+AggressiveHeap
Performance flag for high-throughput applications
Inspects system resources
Sets various parameters to be optimal for
long-running, memory allocation-intensive jobs
Size of initial heap based on available memory
MIN(½ RAM, RAM–160M)
Automatically sizes generations
Young generation is set to 3/8th overall heap
Turns on throughput collector
WARNING: AggressiveHeap can change
at any time!
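The sizing rules on this slide can be worked through with simple arithmetic. The sketch below only applies the two formulas quoted above, MIN(½ RAM, RAM–160M) for the initial heap and 3/8 of the heap for the young generation. The class and method names are made up for the example, and the actual VM logic may differ, as the warning says.

```java
class AggressiveHeapMath {
    // Initial heap per the slide: MIN(1/2 RAM, RAM - 160M), in megabytes.
    static long initialHeapMb(long ramMb) {
        return Math.min(ramMb / 2, ramMb - 160);
    }

    // Young generation per the slide: 3/8 of the overall heap, in megabytes.
    static long youngGenMb(long heapMb) {
        return heapMb * 3 / 8;
    }

    public static void main(String[] args) {
        long ramMb = 4096; // assumed machine size for the example
        long heapMb = initialHeapMb(ramMb);
        System.out.println("RAM " + ramMb + "M -> heap " + heapMb
                + "M, young generation " + youngGenMb(heapMb) + "M");
    }
}
```

For a 4096M machine this gives a 2048M heap and a 768M young generation. On a small machine the second term wins: with 256M of RAM the heap would be 96M rather than 128M.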
Pre J2SE 5.0 Performance Tools
GC Portal
Analyses verbose:gc output
Reconstructs application behaviour
Mathematically modelled
Outputs summary details of how the GC
operated during application lifetime
Results can be used to tune VM
Free download
http://developer.java.sun.com/developer/technicalArticles/Programming/GCPortal/
Jvmstat
Experimental monitoring tool
Currently supports JDK 1.4.1 and above
Instrumentation of GC
Extracts information from JVM in
realtime
Lightweight
Non-intrusive
http://www.sun.com/developers/coolstuff/jvmstat
Jfluid
Research project at Sun Labs
Modified HotSpot VM
Dynamic bytecode instrumentation
GUI tool for simple interaction
Currently supports CPU and memory
profiling
http://research.sun.com/projects/jfluid
Why is Tuning important?
“Good” tuning: -server -Xms512m -Xmx512m
“Best” tuning: -server -Xms1600m -Xmx1600m -XX:+AggressiveHeap
[Chart: Hotspot Tuning Comparison. Series: Good, Best. Vertical axis: 0 to 50000. Horizontal axis: 1 to 16.]
J2SE 5.0 Performance Enhancements
J2SE 5.0 Performance Features
“Smart tuning”
Small systems performance optimizations
Client features
Class data sharing
Various JFC/Swing and Java 2D™ APIs
improvements
X86 Optimizations
Smart Tuning
JDK 5.0 release “Smart Tuning”
Forget everything you just heard!
Provide good “out-of-the-box” performance
without hand tuning
Examine host machine and configure
Hotspot appropriately
Dynamically adjust Java HotSpot VM
software environment at runtime
Simple tuning options based on application
requirements not JVM software internals
How “Smart Tuning” Works
Determine type of machine JVM
software is running on
Server machine
Larger heap
Parallel garbage collector
Server compiler
Client machine
Minimum heap
Serial garbage collector
Client compiler
Adaptive Heap Sizing policy
Effects of “Smart Tuning”
[Chart: “Out-of-the-Box” Performance Composite Summary. Series: JDK 1.4.2 FCS, JDK 1.5.0. Vertical axis: 0% to 225%.]
Source: Sun Microsystems
8 CPU 1.2Ghz Sun Fire v880
Solaris Next 64bit
Performance: Tuned vs. Smart Tuning
[Chart: Smart Tuning compared with Hand Tuning. Vertical axis: 0% to 120%.]
Source: Sun Microsystems
8 CPU 1.2Ghz Sun Fire v880
Solaris Next 64bit
Client Performance in 5.0
Class data sharing
Create a class archive to share among multiple
JVMs
Faster classload = faster startup time
Shared class archive = reduced footprint
JFC/Swing and Java 2D API improvements
Better image management
Font re-architecture
Java Native Interface overhead improvements
Client Performance in 5.0
[Chart: J2SE™ Technology Client Performance Improvements. Series: JDK 1.4.2, JDK 1.5.0. Categories: Startup, Footprint, Swing. Vertical axis: 0 to 150.]
Source: Sun Microsystems
Other Performance Enhancements in 5.0
StringBuilder class
StringBuilder is an unsynchronized counterpart of
StringBuffer, added for
performance enhancement.
You should replace StringBuffer with StringBuilder
wherever the buffer is not shared between threads
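A minimal sketch of the StringBuilder usage this slide recommends. The class and method names are illustrative assumptions. The buffer here is a local variable that no other thread can see, which is the case where dropping StringBuffer's synchronization is safe.

```java
class StringBuilderDemo {
    // Joins the parts with commas using a method-local, unsynchronized buffer.
    static String joinWithCommas(String[] parts) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < parts.length; i++) {
            if (i > 0) {
                sb.append(',');
            }
            sb.append(parts[i]);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(joinWithCommas(new String[] {"a", "b", "c"})); // prints a,b,c
    }
}
```

StringBuilder exposes the same append and toString methods as StringBuffer, so the change is usually a type substitution.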
Other Performance Enhancements in 5.0 (Contd)
Java 2D technology
Improved acceleration of BufferedImage objects,
support for hardware-accelerated rendering using
OpenGL, and improved text rendering
performance
Image I/O
Performance and memory usage enhancements
when reading and writing JPEG files
J2SE 5.0 Performance Tools
J2SE 5.0 Platform Performance Tools
JConsole
J2SE monitoring and management console
JMX-compliant graphical tool for monitoring JVMs –
both remote and local
JPS
JVM Process Status Tool
Lists the instrumented Hotspot VMs on a target system
JStat
JVM Statistics Monitoring Tool
Attaches to an instrumented Hotspot VM and
collects and logs performance stats
as per the command line options
J2SE 5.0 Platform Performance Tools (Contd)
JStatd
Launches an RMI server application that monitors
for the creation and termination of instrumented
Hotspot VMs and provides an interface to allow
remote monitoring tools to attach to JVMs running
on the local system
Summary
Understand that the virtual machine will
help you tune performance
Use profiling tools to find bottlenecks
Adapt HotSpot™ parameters to your
application
Always use the latest JRE
Sun is always improving Java™ performance
If you ever ask or wonder...
My application runs slowly. Why?
Why are there intermittent long pauses in my app?
Why does my app keep running out of file descriptors?
Why is my app not scaling with the number of
processors?
How do I profile heap usage?
Check out: The Java Performance FAQ @
http://java.sun.com/docs/hotspot/PerformanceFAQ.html
Resources
http://java.sun.com/j2se
http://java.sun.com/docs/hotspot/VMOptions.html
java.sun.com/blueprints/performance
java.sun.com/products/hotspot
research.sun.com/projects/jfluid
developers.sun.com/dev/coolstuff/jvmstat
developer.java.sun.com/developer/technicalArticles/Programming/GCPortal
J2EE Performance
Understand the J2EE stack...
Performance Nugget
The key to your performance
problem can lie in any of
these layers. Hence, you have
to be able to tune any of these
layers.
Covered in J2SE
Performance part
Out of scope.
However, we will
provide generic
guidelines.
Performance Tuning Guidelines
for J2EE Applications
General Performance Guidelines
Servlets and JavaServer Pages
Avoid class variables that are shared and modified across requests, and hence
the synchronized blocks of code needed to guard them in servlets
Session creation is expensive
Invalidate sessions no longer needed
Use the <%@ page session="false" %> directive to prevent
automatic session creation in JSP
Do not store large object graphs in HttpSession to
avoid forced Java serialization
Do not use HttpSession as a cache for transactional
data, instead use “read-only” entity beans, if provided
by your container, to access cached transactional data
General Performance Guidelines
Enterprise Java Beans
Cache EJB references to avoid JNDI lookups
Cache EJB home objects in the servlet's init()
Use setSessionContext() or ejbCreate() to cache bean
specific resources. Also release these resources in
ejbRemove() method.
Remove stateful session beans when they are not needed
to avoid passivation and hence, disk I/O
Use pass-by-reference for remote interfaces if possible
Sun Java System Application Server allows pass-by-reference
semantics in Sun deployment descriptor
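The advice to cache EJB references so that JNDI lookups are not repeated can be sketched as a small lookup cache. This is an illustration under assumptions, not a container API: the class name, the pluggable lookup function, and the name "ejb/OrderHome" are hypothetical, and the code assumes Java 8 or later. In a real application the function passed in would wrap something like an InitialContext lookup with its exception handling.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Function;

class LookupCache {
    private final ConcurrentMap<String, Object> cache = new ConcurrentHashMap<String, Object>();
    private final Function<String, Object> lookup;

    // The lookup function stands in for the expensive JNDI call.
    LookupCache(Function<String, Object> lookup) {
        this.lookup = lookup;
    }

    // Runs the expensive lookup only the first time a name is requested.
    Object get(String name) {
        return cache.computeIfAbsent(name, lookup);
    }

    public static void main(String[] args) {
        final int[] calls = {0};
        LookupCache homes = new LookupCache(name -> {
            calls[0]++; // count how often the expensive lookup really runs
            return "home-for-" + name;
        });
        homes.get("ejb/OrderHome");
        homes.get("ejb/OrderHome");
        System.out.println("lookups performed: " + calls[0]); // prints 1
    }
}
```

Creating one such cache in a servlet's init() and reusing it across requests matches the second bullet on this slide.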
General Performance Guidelines
Enterprise Java Beans (Contd.)
Make sure that rmic is generating EJB stubs without the
-nolocalstubs switch, so that the stubs are optimized for same-process clients and servers
This helps performance for applications where EJB
clients (Servlets/JSP) and EJBs are co-located in the
same JVM
The rmic options can be changed using the
administrative tools provided by your container
EJB Pooling and Caching
Pooling - Caching Matrix
Bean type                  Pooling   Caching
Stateless Session Beans    X
Stateful Session Beans               X
Entity beans (BMP/CMP)     X         X
Message-driven beans       X
Pooling
A pool consists of instances of beans of the same type w/o identity
Pooling enhances performance by saving the time that would otherwise be spent
creating bean class instances during the request cycle
Caching
Beans are cached when the number of concurrent users requesting
the services of beans exceeds that of the maximum allowable
number of bean instances
Pooling
Enterprise
Java Beans
Various settings
Steady pool size – specifies the initial and minimum number of
beans that must be maintained in a pool
Pool resize quantity – specifies the number of beans to be
created or deleted when the pool is being resized by the server
Maximum pool size – specifies the maximum pool size. Can
specify 0 to denote unbounded pool. CAUTION – JVM heap
may get filled with objects in the pool
Pool idle timeout – specifies the maximum time that a bean is
allowed to remain idle in the pool after which the bean is
destroyed
Pooling
Enterprise
Java Beans
Pool tuning tips
Ensure that initial and maximum values of pool size are
representative of normal and peak loads
Setting a very large initial or maximum value for pool size
might lead to ineffective use of system resources for
applications that do not have much concurrent load
Leading to huge garbage collection pauses
Setting a small initial or maximum value compared to typical
loads is going to cause a lot of object creation and object
destruction
Caching
Enterprise
Java Beans
Various settings
Cache resize quantity – specifies the number of beans to be
created or deleted when the cache is being serviced
Maximum cache size – specifies the maximum number of beans
in a cache. Value of 0 denotes unbounded cache.
Cache idle timeout – specifies the maximum time that a stateful
session bean or entity bean is allowed to be idle in the cache,
after which the bean will be passivated
Removal timeout
This setting is applicable only to stateful session beans
It specifies the maximum time period for which the stateful
session bean is allowed to remain passivated after which, its
state will be removed from persistent store.
Caching
Enterprise
Java Beans
Various settings (Contd.)
The client will not be able to access bean after its state is
removed
Victim selection policy
This setting is applicable only to stateful session beans
It specifies the algorithm for selecting victims to be removed
from the stateful session bean cache
Some of the popular algorithms – NRU, LRU, FIFO
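Of the victim selection policies named above, LRU is easy to sketch with the JDK's LinkedHashMap in access order. This is only an illustration of the policy: the class name, the capacity of 2, and the "cart" keys are assumptions, and a real container would passivate the victim's state rather than simply drop it.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class LruBeanCache<K, V> extends LinkedHashMap<K, V> {
    private final int maxSize;

    LruBeanCache(int maxSize) {
        super(16, 0.75f, true); // true = order entries by most recent access
        this.maxSize = maxSize;
    }

    // Called after each put; returning true evicts the least recently used entry.
    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxSize;
    }

    public static void main(String[] args) {
        LruBeanCache<String, String> cache = new LruBeanCache<String, String>(2);
        cache.put("cart-1", "state-1");
        cache.put("cart-2", "state-2");
        cache.get("cart-1");            // cart-1 becomes the most recently used
        cache.put("cart-3", "state-3"); // cache is full, so cart-2 is the victim
        System.out.println(cache.keySet()); // prints [cart-1, cart-3]
    }
}
```

A FIFO policy would result from passing false as the third constructor argument, since entries would then stay in insertion order.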
Caching
Enterprise
Java Beans
Tuning tips
It is good to be able to use beans from cache as much as
possible, theoretically, because:
A bean in cache represents ready state, i.e. the bean has
identity associated with it
Beans moving out of cache have to be passivated or
destroyed
Once passivated the bean has to be activated to come back
into the cache
Therefore, any request serviced using these cached beans
avoids overhead of creation, setting identity, and potentially,
activation.
Caching
Enterprise
Java Beans
Tuning tips (Contd.)
However, there are downsides to caching extensively
Memory consumed by all beans affects the heap available in
the VM
Increasing objects and memory taken by cache means
longer, and perhaps more frequent, full GC
Application server might run out of memory
At the same time, maintaining a small cache will lead to a lot of
passivation and activation of bean instances and hence,
serialization and de-serialization, thereby straining CPU cycles
and disk I/O
Note – A periodic cleaner will remove all beans in the cache
that have reached cache idle timeout period
Entity Bean Tuning Guidelines
General Guidelines
Provide a bigger cache for beans that are used more often than
for those that are used less. For example, an Order entity bean
may be accessed less often than an Item entity
bean
Entity bean cache and pool sizes should be larger as compared to
session beans (stateful/stateless) taking into consideration finder
methods that return large number of entity stubs
Use lazy loading if you do not need all data represented by your
entity the first time you access it to optimize memory as well as
network bandwidth consumption
To code BMP for lazy loading, put relevant SQLs in the
appropriate getXXX() methods.
For CMP, most of the containers support lazy loading
Entity Bean Tuning Guidelines
General Guidelines (Contd.)
Lazy Loading Caveat
If your client ends up accessing lazily loaded data more
frequently than you thought, your bean might end up making
multiple network round-trips
Closely monitor the data that your clients end up accessing
frequently and load that data while first loading the bean
Mark entity beans as read-only or read-mostly
Read-only – if your entity beans are never used for changing
database data, then you should consider marking them read-only
to avoid unnecessary ejbStore() calls at the end of each
transaction. This works great for data that never changes!
Entity Bean Tuning Guidelines
General Guidelines (Contd.)
Read-mostly – if your data changes infrequently, then mark
corresponding entity beans to be read-mostly. This will ensure
that container calls ejbLoad() on the bean after a specified
refresh timeout period
Entity Bean Tuning Guidelines
Commit Options
Tune commit options for entity beans (most application servers
support commit options B and C)
Commit option controls the action taken by the container
when the transaction in which bean participates is complete
Option B – when a transaction completes, the bean is kept in the
cache with its identity
This means the next invocation for the same primary key can
be serviced by the same instance from the cache (of course,
after calling ejbLoad() to synchronize state with the
database)
Entity Bean Tuning Guidelines
Commit Options (Contd.)
Option C – when a transaction completes, the bean's ejbPassivate()
is called and the bean is disassociated from its identity (primary
key). This bean is returned to the free pool.
This means that next invocation for the same primary key will
have to grab a bean instance from free pool, set the primary key
on this instance, and then call ejbActivate() on the instance.
Again, bean's ejbLoad() will be called to sync up with the
database
Entity Bean Tuning Guidelines
Difference between Commit Options
It is clear that option B avoids ejbActivate()/ejbPassivate() calls,
and hence, in most cases will offer better performance
However, if beans in the cache are rarely used, then commit option
C can perform better
This is because in option C the container puts the bean back into the free
pool, instead of keeping it in a cache where the hit ratio is low
Entity Bean Tuning Guidelines
How to choose between Commit Options B and C?
First take a look at cache-hits value using your EJB server
monitoring tools
If the cache-hits are high, compared to cache-misses, then
option B will work great
Otherwise, use commit option C
Note – You'll still have to tune cache initial, maximum and
resize quantities to achieve optimized performance
For instance you should maintain a larger pool of entity
beans when using commit option C, as compared to cache,
since pool is used more
Message-driven Bean Tuning Guidelines
Message-driven beans are pooled
All the stateless session bean pool settings and guidelines apply
For better throughput under high traffic conditions maintain a
large pool of MDB bean instances
When MDBs consume JMS messages
Select the right acknowledgment mode
Use AUTO_ACKNOWLEDGE mode when you do not want to receive
duplicates, thereby avoiding ineffective use of network bandwidth
However, in AUTO_ACKNOWLEDGE mode, the JMS engine makes sending
the acknowledgment a priority, hence throughput might suffer in
scenarios where a lot of JMS messages arrive and require
processing
To avoid this behavior from the JMS engine, use DUPS_OK_ACKNOWLEDGE
JDBC Optimizations
When dealing with large amounts of data, such as
searching a large table, use JDBC directly rather than
entity beans
If using JDBC 3.0, use CachedRowSet to get
disconnected rowset functionality
This boosts performance by not maintaining a
connection to the database while you are traversing the
large set of data
To ensure that connections are returned to the pool,
always close the connection after use
JDBC Optimizations (Contd.)
Use prepared SQL statements for high statement cache hit
ratios in BMP
Each open statement corresponds to an open cursor in
the database, therefore close statements in your BMP
once done using them
Use batch updates if your J2EE component issues multiple
updates to the database (via executeUpdate() API)
Turn auto commit off (setAutoCommit() API)
This way your transaction would be committed only
when commit() is explicitly issued
Not doing so will result in roundtrip to the database for
each executeUpdate() call
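The batch update and auto-commit advice on this slide can be sketched with the standard java.sql API. The table and column in the SQL string, and the class and method names, are hypothetical. The sketch assumes a plain JDBC connection whose transaction is controlled by the caller's code, not by the container.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.List;

class BatchInsert {
    // Hypothetical table and column, used only for illustration.
    static final String SQL = "INSERT INTO item (name) VALUES (?)";

    // Sends all rows as one batch and commits once, instead of once per row.
    static int[] insertAll(Connection con, List<String> names) throws SQLException {
        boolean previous = con.getAutoCommit();
        con.setAutoCommit(false); // commit explicitly, after the whole batch
        PreparedStatement ps = con.prepareStatement(SQL);
        try {
            for (String name : names) {
                ps.setString(1, name);
                ps.addBatch(); // queue the row locally
            }
            int[] counts = ps.executeBatch(); // submit the queued rows together
            con.commit();
            return counts;
        } catch (SQLException e) {
            con.rollback(); // undo the partial batch before reporting the failure
            throw e;
        } finally {
            ps.close(); // releases the statement and its database cursor
            con.setAutoCommit(previous);
        }
    }
}
```

The caller remains responsible for closing the connection so that it returns to the pool. How many round trips the batch actually saves depends on the JDBC driver.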
JDBC Optimizations (Contd.)
Use batch retrievals if your J2EE component issues SQLs that
will eventually fetch large amounts of data
By setting fetch size on the Statement to a large number,
driver will not have to go to database frequently to fetch data
You should set the fetch size taking into consideration your
system resources also
Use appropriate Statement API
Statement – to execute SQLs with no input/output parameters
PreparedStatement – to execute SQLs with input parameters
only
CallableStatement – to execute SQLs with input/output
parameters
EJB
Transaction
Tuning Guidelines
General Guidelines
A transaction should not encompass user input or user
think time to avoid resources being held unnecessarily for
long time
Container managed transactions usually provide better
performance
Declare explicitly transaction attributes such as
'NotSupported' or 'Never' for non-transactional EJB
methods
For large transaction chains use 'Required' to
ensure all EJB methods in the call chain use the same
transaction
Use the lowest cost locking available for the database for
your required level of transaction consistency
EJB
Transaction
Tuning Guidelines
General Guidelines (Contd.)
Use XA capable data sources only when two or more data
sources are going to be involved in the transaction
If a database participates in both global and local
transactions, register two connection pools – one for
global and one for local – and use appropriate pool in your
application
EJB Transaction Isolation Level Tuning
Guidelines
Isolation levels help maintain integrity of concurrently
accessed data
Isolation levels are set at the database connection (or
connection pool) level
Therefore, the given isolation level setting will apply
to all database access via that connection pool
Use lowest possible isolation level i.e.
READ_UNCOMMITTED for beans that represent
operations on data which are not critical from integrity
standpoint
Zero locking, Zero cost.
EJB Transaction Isolation Level Tuning
Guidelines (Contd.)
Use READ_COMMITTED for applications that always read the
data that is committed
Cost – database server locks the data, returns it to your
application, and then releases the lock on data
Use REPEATABLE_READ for applications that intend to always
read and re-read the same data
Cost – others can concurrently read the data you are accessing
but not modify it. Others can add new data.
Use SERIALIZABLE for applications that want to hold exclusive
access to data
Cost – others can not read or modify the data you are accessing
concurrently. Others' requests for data being accessed by your
application will be serialized by the database.
HTTP
Server
Tuning
Guidelines
Connection queue setting
Connection queue refers to number of sessions in the
queue, and the average delay before the connection is
accepted
Current, peak, limit settings typically
If the peak value is close to limit, increase
maximum connection limit, to avoid dropping
connections under heavy load
Acceptor threads setting
Acceptor threads accept connections and put them in
connection queue where they are then picked up by
worker threads
A good rule of thumb is to have one acceptor thread
per CPU in your system
HTTP
Server
Tuning
Guidelines
(Contd.)
Persistent (aka Keep-alive) connections setting
Persistent connections support the ability to send
multiple requests across a single HTTP session to
avoid overloading server with numerous connections
Tune number of threads in a keep-alive system
Tune keep-alive timeout to specify the number of
seconds before an idle keep-alive connection is closed
by the server
Tune maximum number of “waiting” keep-alive
connections in your server
HTTP
Server
Tuning
Guidelines
(Contd.)
Cache static content in order for server to handle requests
for static content quickly
Tune the maximum number of cache entries allowed in
server
Tune maximum age to ensure valid state of cached
information, while optimizing usage of cached content
Miscellaneous Tuning Tips
In the server CLASSPATH setting, avoid excessive
directories to improve class loading time.
Package application related classes into JAR file
Set recompile reload interval so as to prevent JSP
recompilation. In Sun Java System Application server this
value is -1
Deploy applications that do not contain EJBs as WAR
instead of EAR files
The Process of J2EE Application
Performance Tuning
Performance tuning process flow
Performance Issue Raised
Ask What Symptoms
Reproduce Issue
Monitor Application
Root Cause Analysis
Take Corrective Action
Performance not OK
Performance OK
Send the code into production
J2EE Stack Monitoring
Monitoring is the first step towards performance tuning
• UNIX stat tools
– vmstat, iostat, mpstat,
netstat, kstat
• UNIX proc tools
– pstack, pmap, prstat/top,
truss/strace
• Hardware counters
– cpustat, busstat
• Misc. tools
– SE Toolkit
J2EE Stack Monitoring
Monitoring is the first step towards performance tuning
• jvmstat tools
• Command line options
– -verbose:[gc|class|jni]
– -Xprof
– -XX options
• Serviceability Agent
• JSR 174 interfaces
• Jconsole
J2EE Stack Monitoring
Monitoring is the first step towards performance tuning
• JMX interfaces
– SNMP, CIM/WBEM
• Bytecode instrumentation
– Various commercial tools
• JVMPI, JVMTI, JVMDI tools
– Commercial and free
J2EE Stack Monitoring
Monitoring is the first step towards performance tuning
• Custom instrumentation
– Tracking method enter/exit
• Custom logging
– Special performance events
• Custom managed beans
– Remotely monitor
Root cause analysis
Associate an observed symptom of poor performance
to a possible root cause
Symptoms mostly include
Low CPU utilization at web tier, business tier or data tier
One of many processors busy
High system CPU time at one of the tiers
Unusual disk activity at data tier or business tier
Poor response time distribution
Once the symptom is identified, you might require
additional data gathering to decide upon the root
cause
Root cause analysis
Low CPU Utilization – Symptoms
Use any tool that provides CPU stats (vmstat, mpstat,
top, etc.) for checking CPU utilization
Symptoms of low CPU utilization
High idle time with just one busy CPU
Idle time not decreasing with increased load
Response times rapidly degrading with increased load
Elapsed Time Statistics
300.16  time (seconds)  100.00 %
224.39  idle time        74.76 %
 46.21  user time        15.39 %
 15.32  system time       5.11 %
 14.23  wait time         4.74 %
Root cause analysis
Low CPU Utilization – Causes
Application or application server threads are blocked
probably because
Waiting for resources
Synchronized code
Heap size too small
Look further into
Thread dumps
GC Logs
Monitor or profile with a more sophisticated tool for detailed
analysis
Root cause analysis
Low CPU Utilization – Resolution
Thread dump (Illustration)
"http8000-Processor14" ... waiting for monitor entry
- waiting to lock <0x6fda29d8> (a java.lang.Object)
"http8000-Processor9" ... waiting for monitor entry
- waiting to lock <0x6fda29d8> (a java.lang.Object)
"http8000-Processor6" ... waiting for monitor entry
- waiting to lock <0x6fda29d8> (a java.lang.Object)
"http8000-Processor2" ... waiting for monitor entry
- waiting to lock <0x6fda29d8> (a java.lang.Object)
"http8000-Processor1" ... waiting for monitor entry
- waiting to lock <0x6fda29d8> (a java.lang.Object)
Confirm thread dump results with profiler
Root cause analysis
Low CPU Utilization – Resolution
Gather statistics from application tier and data tier
Is application code highly synchronized?
Is disk I/O or network I/O unusual?
Monitor application server for busy JDBC
connections
Increase size of JDBC connection pool and see
what happens
Monitor or profile application with more sophisticated
tool
OptimizeIt, JProbe, PerformaSure, IntroScope, or
ServerTrace
Root cause analysis
One of Many Processors Busy – Symptoms
Primarily applies to Web and Application Server tiers
Symptom – one processor spending a lot of time in
user time while others sit idle
mpstat output
CPU  usr  sys  wt  idl
  0   91    0   0    9
  1    1    1   3   95
  2    0    1   4   94
  3    1    0   3   96
Root cause analysis
One of Many Processors Busy – Causes
What it means
System is synchronized on a single thread or resource
Look further into
GC
Command line options: -verbose:gc, -Xloggc
jvmstat tool
Possible causes
Frequent garbage collection
Frequent class compilation
Shared resource contention
Root cause analysis
One of Many Processors Busy – Resolution
-Xloggc output (Illustration)
[Full GC 792332K->412757K(1040896K), 8.9157 secs]
[Full GC 799898K->221096K(1040896K), 5.3018 secs]
Resolution
Increase JVM heap size
Use ParallelGC algorithm
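When reading -Xloggc output like the illustration above, it helps to pull out how much memory each Full GC reclaimed and how long it paused. The sketch below parses only the exact line format shown on this slide. The class name and the regular expression are assumptions for the example, and real GC logs vary by VM version and flags.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GcLogLine {
    // Matches lines such as: [Full GC 792332K->412757K(1040896K), 8.9157 secs]
    private static final Pattern FULL_GC = Pattern.compile(
            "\\[Full GC (\\d+)K->(\\d+)K\\((\\d+)K\\), ([0-9.]+) secs\\]");

    final long beforeK;
    final long afterK;
    final long totalK;
    final double pauseSecs;

    GcLogLine(long beforeK, long afterK, long totalK, double pauseSecs) {
        this.beforeK = beforeK;
        this.afterK = afterK;
        this.totalK = totalK;
        this.pauseSecs = pauseSecs;
    }

    // Returns the parsed line, or null when the text is not a Full GC entry.
    static GcLogLine parse(String line) {
        Matcher m = FULL_GC.matcher(line);
        if (!m.find()) {
            return null;
        }
        return new GcLogLine(Long.parseLong(m.group(1)), Long.parseLong(m.group(2)),
                Long.parseLong(m.group(3)), Double.parseDouble(m.group(4)));
    }

    long reclaimedK() {
        return beforeK - afterK;
    }

    public static void main(String[] args) {
        String[] lines = {
            "[Full GC 792332K->412757K(1040896K), 8.9157 secs]",
            "[Full GC 799898K->221096K(1040896K), 5.3018 secs]"
        };
        for (String line : lines) {
            GcLogLine gc = parse(line);
            System.out.println("reclaimed " + gc.reclaimedK() + "K of " + gc.totalK
                    + "K in " + gc.pauseSecs + " secs");
        }
    }
}
```

For the two lines on the slide this reports 379575K and 578802K reclaimed, with pauses of several seconds each, which is the kind of evidence that motivates the resolutions listed above.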
Root cause analysis
High System CPU Utilization – Symptoms
Symptoms
High system time and low idle time
High user or wait time is another issue
System time of about 10 is usually a problem
Sometimes for database systems, system time could be as
high as 40
Use processor/CPU monitoring tools such as vmstat,
mpstat, top, prstat to find out about the symptoms
procs    cpu
 r  b    us  sy  id  wa
 1  1    40   5  51   3
 2  0    32  57   8   3
 0  0    36  58   4   2
 0  0    30  45   2   3
Root cause analysis
High System CPU Utilization – Causes
What it means
Lots of calls into the operating system
Lots of I/O, socket creation, timestamping activities
Look further into
Application server thread dumps
Application server profiles
truss/strace of system calls
Possible causes
Mutex/Lock contention
Unbuffered I/O
Root cause analysis
High System CPU Utilization – Resolution
truss output (Illustration)
1742: read(5, "CF", 1) = 1
1742: read(5, "9F", 1) = 1
1742: read(5, "06", 1) = 1
1742: read(5, "8F", 1) = 1
1742: read(5, "A8", 1) = 1
1742: read(5, " q", 1) = 1
1742: read(5, " Z", 1) = 1
1742: read(5, " *", 1) = 1
1742: read(5, " (", 1) = 1
1742: read(5, "B6", 1) = 1
Resolution
Introduce buffered I/O
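The truss illustration shows one read() system call per byte. The following sketch reproduces the effect inside Java by counting how many calls reach the underlying stream with and without a BufferedInputStream. The counting wrapper, the class name, and the 10000-byte input are assumptions made for the demonstration; the counter only stands in for system calls, since the source here is an in-memory array.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

class BufferedReadDemo {
    // Counts calls that reach the wrapped stream, standing in for read() system calls.
    static class CountingStream extends FilterInputStream {
        int calls;

        CountingStream(InputStream in) {
            super(in);
        }

        @Override
        public int read() throws IOException {
            calls++;
            return super.read();
        }

        @Override
        public int read(byte[] b, int off, int len) throws IOException {
            calls++;
            return super.read(b, off, len);
        }
    }

    // Reads the data one byte at a time and reports how many calls hit the source.
    static int underlyingCalls(byte[] data, boolean buffered) throws IOException {
        CountingStream counter = new CountingStream(new ByteArrayInputStream(data));
        InputStream in = buffered ? new BufferedInputStream(counter) : counter;
        while (in.read() != -1) {
            // consume every byte, as the application in the truss output does
        }
        return counter.calls;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = new byte[10000];
        System.out.println("unbuffered calls: " + underlyingCalls(data, false));
        System.out.println("buffered calls:   " + underlyingCalls(data, true));
    }
}
```

The application code still reads byte by byte in both cases. Only the wrapper changes, which is why introducing buffering is usually a small edit at the point where the stream is opened.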
Root cause analysis
High Disk Utilization – Symptoms
Symptoms
Especially when the application server logs to a separate device
(/tmp usually), any other unusual disk activity should raise an alarm
High database service times (100s of ms)
High %b (above 30)
High Wait CPU % time
Use iostat or similar such disk monitoring tool
Device  kr/s  kw/s  svc_t  %w  %b
ssd4     1.6  69.6    8.5   0   4
Root cause analysis
High Disk Utilization – Causes
What it means
Time waiting for disk affects response time
Look further into
truss/strace of system calls
Application server statistics
Database server write statistics
Possible causes
Excessive logging
Stateful session bean passivation
Bad cache on database disk
Root cause analysis
High Disk Utilization – Resolution
Application server statistics (Illustration)
asadmin server.applications.MyApp.my_jar.MyBean.bean-cache.*
server.applications.MyApp.my_jar.myBean.bean-cache.numpassivationsuccess-count = 54581
Resolution
Increase stateful session bean cache size
Tuning J2EE Cluster Performance
Cluster
performance
factors
Latencies inherent in any server (regardless of
clustered or not)
Request processing time
Data retrieval (from EIS/RDBMS) time
Rendering/displaying data
Additional latencies due to clustering
Size, frequency, and efficiency of data transfer
Other costs of making data available across the cluster
Size of the application server cluster – sometimes the
cost of clustering overtakes the benefit
Additional tiers – load balancers and HA requirements:
How highly available stores are managed.
Cluster
performance
factors
Size, frequency and efficiency of data transfer
Size
How much of the data needs to be sent (10 M vs. 10 K)
Frequency
How frequently does the data need to be transferred (10
ms vs. 10 secs)
How efficient is the data transfer protocol
Memory replication protocol
JDBC to database
JMS messaging
Cluster
performance
factors
Size, frequency and efficiency of data transfer
Size
Keep data size/volume low – write only what matters
Avoid transferring complex object graphs
Frequency
How much latency can you take? Choose data transfer
frequency accordingly.
How efficient is the data transfer protocol
Memory replication protocol: tune the memory in terms
of replication efficiency
JDBC to database: tune the JDBC connections
JMS messaging: understand the cost involved in using
JMS brokers
Cluster
performance
factors
Other costs of making data available cross-cluster
Cost of serialization and de-serialization of data
HTTP sessions, state of stateful session beans,
etc.
Frequency of serialization
Size of backing store [memory / HA
store] also has an impact
Cluster
performance
factors
Other costs of making data available cross-cluster
Choose data structures with serialization in mind
Controlling frequency
Not all data needs to be stored all the time
Store the data only when it is modified
Size of backing store
Memory
Make sure that JVMs are sized appropriately (e.g. heap size, garbage collection
strategy)
Don't keep stale data – examine expiration strategy
DB backend store
Ensure database is sized properly
Ensure I/O is efficient
Finally...
There Are “Other” Factors
Involved For Good Performance
Good
Coding
Make yourself aware of basic coding best practices such as
Avoiding finalizers
Explicitly nulling variables
Not using System.gc()
Preloading classes in the background
Minimizing use of inner classes (reduces class loading time)
Using separate threads for event handling
Controlling serialization
Using buffered I/O
Using final for methods/variables that are invariant
Using string concatenations to create strings at compile time but
using string buffers to create strings at runtime
Using the smallest data type as per the requirement
Good
Design
Make effective use of design patterns
Some of the most relevant EJB design patterns wrt
performance are
Using data transfer objects (DTOs) to transfer data
between the EJB and the caller
Using a business delegate or a facade to aggregate the
use and interactions of multiple EJBs
POJOs or session beans as focal point for
operations over multiple EJBs
Using ServiceLocator pattern to unify and cache all the
resource/EJB lookups
Summary
J2EE application performance is
dependent upon many layers
Use monitoring tools at all levels
Application
Application server
JVM
System layer
Good performance is also equally
dependent on good coding
Resources
http://java.sun.com/performance
Mastering Enterprise JavaBeans
book for performance and design
best practices
http://javaperformancetuning.com
Rima Patel Sriganesh
Member of Technical Staff/Technology Evangelist
[email protected]
Sun™ Tech Days