Tuesday 8 September 2009

J2EE Application Performance Tuning Part 2

-: Caching objects in Hibernate to improve performance in J2EE Applications :-


What is caching?

The general concept of caching is that when an object is first read from external storage, a copy of it is kept in an area referred to as the cache. Subsequent reads can retrieve the object directly from the cache, which is faster than retrieving it from external storage again.
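As a minimal illustration of the idea (plain Java, with a hypothetical SimpleCacheDemo class standing in for any real cache framework), a cache is just a lookup that falls back to the slow external source on a miss and stores the result for later reads:

```java
import java.util.HashMap;
import java.util.Map;

public class SimpleCacheDemo {

    private final Map<String, String> cache = new HashMap<String, String>();
    private int externalReads = 0; // counts trips to "external storage"

    public String read(String key) {
        String value = cache.get(key);
        if (value == null) {                  // cache miss
            value = readFromExternalStorage(key);
            cache.put(key, value);            // keep a copy in the cache
        }
        return value;                         // cache hit on later calls
    }

    // Stand-in for a slow database or file read.
    private String readFromExternalStorage(String key) {
        externalReads++;
        return "value-of-" + key;
    }

    public int getExternalReads() { return externalReads; }
}
```

Reading the same key twice only touches "external storage" once; the second read is served from the in-memory map.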

Levels of caching in Hibernate

As a high performance O/R mapping framework, Hibernate supports the caching of persistent objects at different levels.

-<< First Level Caching >>-

In Hibernate, objects are by default cached with session scope. This kind of caching is called “first level caching”.

-<< Second Level Caching >>-

First level caching doesn't help when the same object needs to be read across different sessions. To enable this, one needs to turn on "second level caching" in Hibernate, i.e. set up object caches that are accessible across multiple sessions.
Second level caching can be applied to classes, to associations and collections, and to database query results.
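In Hibernate 3.x, for example, second level and query caching are switched on through configuration properties and a cache element in the mapping (the property names below are the standard Hibernate 3 ones; the Employees mapping is a hypothetical example):

```xml
<!-- hibernate.cfg.xml: enable the second-level and query caches -->
<property name="hibernate.cache.use_second_level_cache">true</property>
<property name="hibernate.cache.use_query_cache">true</property>
<property name="hibernate.cache.provider_class">
    org.hibernate.cache.EhCacheProvider
</property>

<!-- Employees.hbm.xml: mark the class as cacheable -->
<class name="Employees" table="EMPLOYEES">
    <cache usage="read-write"/>
</class>
```

Query results are additionally cached only for queries on which setCacheable(true) has been called.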

Caching frameworks for non-distributed and distributed J2EE application environments.

Hibernate supports many caching frameworks, such as EHCache, OSCache, SwarmCache, JBossCache and Terracotta.
  • In a non-distributed environment the EHCache framework is a good choice; it is also the default cache provider for Hibernate.

  • In a distributed environment a good choice would be Terracotta, a powerful open source framework that supports distributed caching and provides network-attached memory.


-: Identifying and dealing with memory leaks :-


Memory leaks can occur due to:

    - Logical flaws in the code.
    - System's architecture setup.
    - Application server's incompatibility with third party products.


In a large enterprise-scale application it is not always easy to identify memory leaks, so under certain circumstances one will need to run the application inside a memory profiler to identify them. "JProfiler" is one that is quite popular.

Some memory leak scenarios caused by erroneous code are as follows:

  • ResultSet and Statement objects created from pooled connections. When such a connection is closed it simply returns to the connection pool but doesn't close the ResultSet or Statement objects.

  • Collection elements not removed after use in the application.

  • Incorrect scoping of variables, i.e. if a variable is needed only inside a method but is declared as a member variable of a class, its lifetime is unnecessarily extended to that of the class instance, which holds up memory for a longer period.
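The scoping point can be demonstrated with a small, self-contained sketch (class and method names are made up for illustration): a collection held in a member variable keeps growing across calls, while a method-local collection becomes garbage as soon as the method returns:

```java
import java.util.HashMap;
import java.util.Map;

public class ScopingDemo {

    // Member variable: entries survive across calls and keep accumulating.
    private final Map<Integer, String> memberMap = new HashMap<Integer, String>();
    private int offset = 0;

    public void addAsMember(int count) {
        for (int i = 0; i < count; i++) {
            memberMap.put(offset + i, "row-" + (offset + i));
        }
        offset += count; // every call retains more objects
    }

    public int processLocally(int count) {
        // Local variable: eligible for garbage collection on return.
        Map<Integer, String> local = new HashMap<Integer, String>();
        for (int i = 0; i < count; i++) {
            local.put(i, "row-" + i);
        }
        return local.size(); // nothing is retained afterwards
    }

    public int retainedEntries() { return memberMap.size(); }
}
```

Repeated calls to addAsMember grow the retained set without bound, while processLocally leaves nothing behind no matter how often it runs.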

Some simple memory leak examples follow:

Example 1. Memory leak caused by
    ** collection elements not removed & incorrect scoping of variables **.


// The following code throws:
//
//     java.lang.OutOfMemoryError: Java heap space
//
// This is because the method MemoryLeakingMethod(HashMap emps)
// is invoked with a class variable as the method parameter, so
// the memory used by it cannot be reclaimed by the garbage
// collector between method executions unless it is nullified or
// the collection elements are removed. Multiple calls to the
// method with different variables will fill up the Java heap space.


public class MemoryLeakClass {

    private HashMap emp01, emp02, emp03...;
    // ...

    public static void main(String[] args) {

        MemoryLeakClass m = new MemoryLeakClass();
        m.MemoryLeakingMethod(m.emp01);

        try {
            Thread.sleep(10000);
            System.gc(); // trying to reclaim the memory used by m.emp01;
                         // not possible, because m.emp01 is a class
                         // variable with instance scope and maintains a
                         // strong reference. However, the memory will be
                         // reclaimed if a WeakHashMap is used instead of
                         // a HashMap.
        } catch (InterruptedException e) {
            e.printStackTrace();
        }

        m.MemoryLeakingMethod(m.emp02);
        m.MemoryLeakingMethod(m.emp03);
        // ... multiple executions eventually end in:
        // java.lang.OutOfMemoryError: Java heap space

    }

       -: Method: MemoryLeakingMethod(HashMap emps) :-

public void MemoryLeakingMethod(HashMap emps) {

    // The HashMap 'emps' passed to this method is a class variable.

    System.out.println("*** Memory leaking method *** Run: " + run++);

    try {
        for (int i = 0; i < 100000; i++) {
            Employees emp = new Employees();

            // populating the 'Employees' object from the ResultSet 'rs'
            emp.setName(rs.getString("name"));
            emp.setMeritalStatus(rs.getString("meritalStatus"));
            // ...

            // adding the Employees object to the HashMap class variable 'emps'
            emps.put(new Integer(i), emp);
        }
    } catch (SQLException e) {
        e.printStackTrace();
    } catch (java.text.ParseException e) {
        e.printStackTrace();
    }
}


*** << If the variable scoping cannot be changed, then using a WeakHashMap instead of a HashMap can solve this problem. This is because weak references are freed aggressively by the garbage collector, so the garbage collection code in the main method above will reclaim the memory between method executions. >> ***


The following change to the above code will prevent an OutOfMemoryError:

Change: private HashMap emp01,emp02,emp03...;
to:     private WeakHashMap emp01,emp02,emp03...;
(and import java.util.WeakHashMap accordingly)


-: WeakHashMap vs HashMap :-


A WeakHashMap is identical to a HashMap in terms of its functionality, except that its entries do not maintain strong references to their keys, so the garbage collector may remove a key from the WeakHashMap and subsequently collect the associated object. In other words, entries in a WeakHashMap behave like weakly referenced objects.
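The difference can be observed with a short, self-contained sketch. Note that garbage collection timing is JVM-dependent, so this polls System.gc() in a loop rather than relying on a single call:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.WeakHashMap;

public class WeakHashMapDemo {

    public static void main(String[] args) throws InterruptedException {
        Map<Object, String> strong = new HashMap<Object, String>();
        Map<Object, String> weak = new WeakHashMap<Object, String>();

        Object strongKey = new Object();
        Object weakKey = new Object();
        strong.put(strongKey, "held by HashMap");
        weak.put(weakKey, "held by WeakHashMap");

        // While we hold references to both keys, both entries are reachable.
        System.out.println(strong.size() + " " + weak.size());

        // Drop the only strong reference to the WeakHashMap key ...
        weakKey = null;

        // ... and nudge the collector. A single gc() call is not guaranteed
        // to clear the entry, so poll briefly.
        for (int i = 0; i < 50 && !weak.isEmpty(); i++) {
            System.gc();
            Thread.sleep(100);
        }

        // Typically the HashMap still pins its entry while the
        // WeakHashMap entry has been reclaimed.
        System.out.println(strong.size() + " " + weak.size());
    }
}
```

The HashMap itself holds a strong reference to strongKey, so its entry can never be collected this way; only the WeakHashMap lets go of its key.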

-: Serializable vs Externalizable :-


Serialization can be a slow process if you have a large object graph and the classes in the object graph contain a large number of variables. The Serializable interface by default serializes the state of all the classes forming the object graph.

Sometimes it may not be a requirement to serialize the state of all the classes/superclasses in the object graph. This can normally be achieved by declaring the unwanted class variables as transient. But what if this needs to be decided at runtime? The solution is to replace the Serializable implementation with Externalizable.

The Externalizable interface gives the implementing class full control over which state to keep and which to discard, and this can be decided conditionally at runtime using its two methods, readExternal and writeExternal. This complete control over marshalling and unmarshalling at runtime can result in improved application performance.
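A minimal sketch of this (the Employee class and its fields are hypothetical) writes only the state worth keeping in writeExternal and rebuilds the rest in readExternal:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.Externalizable;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectInputStream;
import java.io.ObjectOutput;
import java.io.ObjectOutputStream;

// Hypothetical class: only 'name' is worth persisting; 'sessionToken'
// is recreated at runtime, so writeExternal deliberately skips it.
public class Employee implements Externalizable {

    private String name;
    private String sessionToken;

    // Externalizable requires a public no-arg constructor.
    public Employee() { }

    public Employee(String name, String sessionToken) {
        this.name = name;
        this.sessionToken = sessionToken;
    }

    public void writeExternal(ObjectOutput out) throws IOException {
        out.writeUTF(name);      // we choose exactly what gets written
        // sessionToken is intentionally not written
    }

    public void readExternal(ObjectInput in)
            throws IOException, ClassNotFoundException {
        name = in.readUTF();
        sessionToken = "rebuilt-at-runtime"; // reconstructed, not read
    }

    public String getName() { return name; }
    public String getSessionToken() { return sessionToken; }

    public static byte[] marshal(Employee e) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        ObjectOutputStream oos = new ObjectOutputStream(bos);
        oos.writeObject(e);
        oos.close();
        return bos.toByteArray();
    }

    public static Employee unmarshal(byte[] bytes)
            throws IOException, ClassNotFoundException {
        ObjectInputStream ois =
                new ObjectInputStream(new ByteArrayInputStream(bytes));
        return (Employee) ois.readObject();
    }
}
```

Because the sessionToken never travels over the wire, the serialized form stays small and the decision about what to write can depend on any runtime condition inside writeExternal.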

*** A note of caution though: the methods readExternal and writeExternal are public methods, so one has to consider the security aspects of the code.