Tuesday, June 14, 2011

Garbage collection with Automatic Resource Management in Java 7

This post provides a brief overview of a new feature introduced in Java 7 called Automatic Resource Management, or ARM. The post delves into how ARM reduces the code that a developer has to write to reliably release the resources held by allocated objects.

One of the sweetest spots of programming in Java is the automatic handling of object de-allocation. In the Java world this is more popularly known as garbage collection; it basically means that developers do not have to worry about de-allocating the objects allocated by their code. As soon as a developer is finished with an object, they can drop all references to it and the object becomes eligible for garbage collection.

Garbage collection has a flip side to it, however. Unlike in C/C++, where the coder has complete control of memory allocation and de-allocation [malloc, free, new, delete etc], in Java the developer does not have significant control over the de-allocation of objects. The JVM manages the garbage collection of unused objects, and it is really up to the whims of the JVM when to run a cycle of garbage collection. True, there are method calls like System.gc() or Runtime.getRuntime().gc() that request a garbage collection, but these methods merely serve to remind the JVM that "maybe you need to run a garbage collection now, just a suggestion, no pressure!". The JVM is fully authorized to disregard such requests and runs garbage collection only when it sees fit. Hence in practice, developers are always advised not to build their program logic on the belief that System.gc() or Runtime.getRuntime().gc() will trigger a full garbage collection.

There is no denying how much good automatic garbage collection has done to enhance the productivity of developers. However, there are some corner cases where garbage collection is not sufficient to maintain a "clean" heap, free of unused objects, especially if the objects deal with some form of native resource that is served by the underlying operating system. These objects include, but are not limited to, IO streams and database connections. For these kinds of objects developers must release the resources explicitly. Typically this is done in the finally clause of a try-catch-finally block.

Let us look at a small example that closes an InputStream after finishing the processing of the stream:
InputStream in = null;

try
{
    in = new FileInputStream(new File("test.txt"));
    //do stuff with in
}
catch(IOException ie)
{
    //SOPs
}
finally
{
    //do cleanup
}
The above looks good and clean; however as soon as we try to close the input stream via in.close() in the finally block, we need to surround it with a try-catch block that catches the checked exception, IOException. Thus the code sample transforms to:
InputStream in = null;

try
{
    in = new FileInputStream(new File("test.txt"));
    //do stuff with in
}
catch(IOException ie)
{
    //SOPs
}
finally
{
    //in is still null if the constructor threw, so guard the close
    if (in != null)
    {
        try
        {
            in.close();
        }
        catch(IOException ioe)
        {
            //can't do anything about it
        }
    }
}
Now the above code looks bloated, and with multiple kinds of checked exceptions from different hierarchies, we need more catch clauses. Very soon the code becomes lengthy and difficult to maintain, not to mention that it loses the clean, no-nonsense look it started with.

But there is good news.

Java 7 makes this easier with the new try-with-resources form of the try block. With this feature we can avoid writing the finally block ourselves. This is how we do it:
try(InputStream in = new FileInputStream(new File("test.txt")))
{
    //do stuff with in
}
catch(IOException ie)
{
    //SOPs
}
The above block of code will do the cleanup part itself. This is made possible by the introduction of a new interface, java.lang.AutoCloseable, which defines a single method, void close() throws Exception. Any object whose class implements this interface can be closed automatically using the above syntax. The existing java.io.Closeable interface, which the IO streams implement, now extends AutoCloseable, which is why an InputStream works here without any change.
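To see that this is not limited to IO streams, here is a minimal sketch of a home-grown resource. TrackedResource and AutoCloseableDemo are hypothetical names made up for this illustration; the only JDK piece involved is the AutoCloseable interface itself.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical resource, for illustration only: it records what happens to it.
class TrackedResource implements AutoCloseable {
    private final String name;
    private final List<String> log;

    TrackedResource(String name, List<String> log) {
        this.name = name;
        this.log = log;
        log.add("open " + name);
    }

    void use() {
        log.add("use " + name);
    }

    // An implementation may narrow the throws clause; this close() throws nothing checked.
    @Override
    public void close() {
        log.add("close " + name);
    }
}

class AutoCloseableDemo {
    static List<String> run() {
        List<String> log = new ArrayList<>();
        try (TrackedResource r = new TrackedResource("r1", log)) {
            r.use();
        } // r.close() is called here, without an explicit finally block
        return log;
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints [open r1, use r1, close r1]
    }
}
```

Because close() here declares no checked exception, the try block needs no catch clause at all.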

The best part is that even if we initialize multiple AutoCloseable instances in the try() block, it will call the close() method on all of them, in the reverse order of their declaration, even if the close() method of one of the objects throws an exception.
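A small sketch of that behaviour follows. NoisyResource and MultiResourceDemo are hypothetical classes written for this post; the second resource deliberately fails in close() so we can check that the first one is still closed.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical resource that can be told to fail when it is closed.
class NoisyResource implements AutoCloseable {
    private final String name;
    private final boolean failOnClose;
    private final List<String> log;

    NoisyResource(String name, boolean failOnClose, List<String> log) {
        this.name = name;
        this.failOnClose = failOnClose;
        this.log = log;
    }

    @Override
    public void close() throws Exception {
        log.add("close " + name);
        if (failOnClose) {
            throw new Exception("close failed: " + name);
        }
    }
}

class MultiResourceDemo {
    static List<String> run() {
        List<String> log = new ArrayList<>();
        // Two resources in one try(); they are closed in reverse order: second, then first.
        try (NoisyResource first = new NoisyResource("first", false, log);
             NoisyResource second = new NoisyResource("second", true, log)) {
            log.add("body");
        } catch (Exception e) {
            // Reached only after both close() calls have been attempted.
            log.add("caught " + e.getMessage());
        }
        return log;
    }

    public static void main(String[] args) {
        // prints [body, close second, close first, caught close failed: second]
        System.out.println(run());
    }
}
```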

Coming to the handling of exceptions: if an IOException is thrown in our try block and another one is thrown in the implicit finally block [where the AutoCloseables are actually being close()d], the exception that propagates is the one thrown in the try block rather than the one from the implicit finally block.

However, we can still get the details of the implicit finally block's exception from Throwable.getSuppressed(), a method added in Java 7.
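The following sketch shows both exceptions in one place. FailingResource and SuppressedDemo are hypothetical names used only for this example; the resource always fails in close(), and the try block fails as well.

```java
import java.io.IOException;

// Hypothetical resource whose close() always fails.
class FailingResource implements AutoCloseable {
    @Override
    public void close() throws IOException {
        throw new IOException("from close()");
    }
}

class SuppressedDemo {
    // Returns { message of the primary exception, message of the suppressed one }.
    static String[] run() {
        try (FailingResource r = new FailingResource()) {
            throw new IOException("from try block");
        } catch (IOException e) {
            // e is the exception from the try block; the one thrown by close()
            // was attached to it as a suppressed exception.
            Throwable[] suppressed = e.getSuppressed();
            return new String[] { e.getMessage(), suppressed[0].getMessage() };
        }
    }

    public static void main(String[] args) {
        String[] messages = run();
        System.out.println("primary:    " + messages[0]); // from try block
        System.out.println("suppressed: " + messages[1]); // from close()
    }
}
```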

I think the Automatic Resource Management feature, or ARM, is a great addition to Java 7.

Happy coding!!

Friday, June 10, 2011

String Handling in Java

Strings are one of the most heavily used data types in the Java programming language. It is no wonder that much thought has gone into the design of the String data type. Strings are immutable, fast and have almost all the operations that we might need on a daily basis. Beyond the data type itself, the internal handling of Strings has also received much attention. The Java Virtual Machine has a brilliant way of handling and allocating Strings.

Now when we write String s; we are just declaring a reference to a String and no allocation has been done. When we say String s = "Swaranga"; a String object for the literal is created in the heap and s is made to refer to it. Now here is the catch: if we further say String s2 = "Swaranga"; then s2 will point to the same "Swaranga" object that s points to, because identical string literals are pooled [interned] and share a single object. This is safe because once the original "Swaranga" object is created its contents will never change; hence the same object can be handed to s2 as well. This saves a lot of memory in the heap. Note that this applies to literals; new String("Swaranga") always creates a separate object.
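A quick way to check this is to compare references with ==, which tests whether two variables point to the same object rather than whether the contents match. The class name StringPoolDemo below is made up for this sketch.

```java
class StringPoolDemo {
    // Returns the results of four reference/content comparisons.
    static boolean[] run() {
        String s = "Swaranga";
        String s2 = "Swaranga";              // same literal, same pooled object
        String s3 = new String("Swaranga");  // explicitly creates a distinct object

        return new boolean[] {
            s == s2,          // true:  both refer to the pooled literal
            s == s3,          // false: s3 is a separate object
            s.equals(s3),     // true:  the contents are equal
            s == s3.intern()  // true:  intern() returns the pooled object
        };
    }

    public static void main(String[] args) {
        // prints [true, false, true, true]
        System.out.println(java.util.Arrays.toString(run()));
    }
}
```

This is also why String contents should be compared with equals() and not with ==.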

The same concept is extended when we create sub strings from a String object. Consider this: String s = "Swaranga"; here a String object is created and s refers to it. Now if we write String s3 = s.substring(2, 5); then a new String object is created for "ara", but in the JDK implementations current at the time of writing the characters are NOT copied. Instead s3 shares the character array of "Swaranga" and only records an offset to its third character and a length. The reason is the same: since "Swaranga" can never be changed, it is safe for a sub string to share its characters. [Later JDK releases, starting with Java 7 update 6, copy the characters instead.]

In all of this we must remember that while it is impossible to change the contents of a String object once allocated, it is trivial to change the references. For instance, in the above example we can easily write s = s3; this does not change the contents of "Swaranga". It only makes s refer to the sub string that s3 refers to, which starts at the third character of "Swaranga", 'a'.
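Here is a small sketch of the sub string call and the reassignment described above. SubstringDemo and the extra variable original are introduced only for this illustration, so that we can still look at "Swaranga" after s has been reassigned.

```java
class SubstringDemo {
    // Returns { original, s, s3 } after taking a sub string and reassigning s.
    static String[] run() {
        String original = "Swaranga";
        String s = original;

        // Characters at indexes 2, 3 and 4: 'a', 'r', 'a'
        String s3 = s.substring(2, 5);

        // Only the reference changes; the "Swaranga" object is untouched.
        s = s3;

        return new String[] { original, s, s3 };
    }

    public static void main(String[] args) {
        // prints [Swaranga, ara, ara]
        System.out.println(java.util.Arrays.toString(run()));
    }
}
```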

For programmers coming from a C background, the statement String s = "Swaranga"; can be viewed as being similar to the C statement const char * cp = "Swaranga"; or the equivalent char const * cp = "Swaranga"; in both, the pointer can be re-pointed but the characters it points to cannot be modified through it.

Happy coding!

Thursday, June 9, 2011

Generic method to sort a Map according to Values

Most of us have used classes implementing java.util.Map at least once. Some of the most prominent ones are
  • HashMap
  • TreeMap
  • LinkedHashMap
Of these, TreeMap keeps its entries sorted by key, using either the natural order of the keys or a Comparator supplied at construction. Hence it is trivial to store key-value mappings ordered by the keys.

HashMap and LinkedHashMap, on the other hand, do not sort their entries. LinkedHashMap does however guarantee insertion order [or, if constructed that way, access order] when iterating over its entries. HashMap does not guarantee even that. And there are times when we want the map entries to be ordered according to the values instead of the keys.
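The difference in iteration order is easy to see with three entries. The class MapOrderingDemo and its helper methods are hypothetical names for this sketch; HashMap is left out because its iteration order is unspecified.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class MapOrderingDemo {
    // Inserts the same three entries into the given map.
    private static void fill(Map<String, Integer> map) {
        map.put("banana", 2);
        map.put("apple", 1);
        map.put("cherry", 3);
    }

    static List<String> treeMapKeys() {
        Map<String, Integer> map = new TreeMap<String, Integer>();
        fill(map);
        return new ArrayList<String>(map.keySet()); // sorted by key
    }

    static List<String> insertionOrderKeys() {
        Map<String, Integer> map = new LinkedHashMap<String, Integer>();
        fill(map);
        return new ArrayList<String>(map.keySet()); // order of the put() calls
    }

    static List<String> accessOrderKeys() {
        // The third constructor argument switches the map to access order.
        Map<String, Integer> map = new LinkedHashMap<String, Integer>(16, 0.75f, true);
        fill(map);
        map.get("banana"); // the most recently accessed entry moves to the end
        return new ArrayList<String>(map.keySet());
    }

    public static void main(String[] args) {
        System.out.println(treeMapKeys());        // [apple, banana, cherry]
        System.out.println(insertionOrderKeys()); // [banana, apple, cherry]
        System.out.println(accessOrderKeys());    // [apple, cherry, banana]
    }
}
```

None of the three orders depends on the values, which is the gap the rest of this post fills.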


So, following is a generic method that will sort a given map by its values. The method is generic and will work with any map, provided it has the following property:

The values of the map must have a natural order.

It basically means that the values must implement the java.lang.Comparable interface. The reason for this requirement is that, since this is a generic method, it has no other way of knowing how to sort the values. It must rely on their natural ordering. Of course the method can be overloaded to accept a Comparator instance so that, instead of the natural ordering, it sorts the entries according to some other contextual order.

Enough theory, here it goes:
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/**
 * Sort a map according to values.
 *
 * @param <K> the type of the keys of the map.
 * @param <V> the type of the values to sort according to.
 * @param mapToSort the map to sort.
 * @return a map sorted on the values.
 */
public static <K, V extends Comparable<? super V>> Map<K, V>
sortMapByValues(final Map<K, V> mapToSort)
{
    //copy the entries into a list so that they can be sorted
    List<Map.Entry<K, V>> entries =
        new ArrayList<Map.Entry<K, V>>(mapToSort.size());

    entries.addAll(mapToSort.entrySet());

    //order the entries by the natural order of their values
    Collections.sort(entries,
                     new Comparator<Map.Entry<K, V>>()
    {
        @Override
        public int compare(
               final Map.Entry<K, V> entry1,
               final Map.Entry<K, V> entry2)
        {
            return entry1.getValue().compareTo(entry2.getValue());
        }
    });

    //a LinkedHashMap keeps the entries in the order they are inserted
    Map<K, V> sortedMap = new LinkedHashMap<K, V>();

    for (Map.Entry<K, V> entry : entries)
    {
        sortedMap.put(entry.getKey(), entry.getValue());
    }

    return sortedMap;
}
Feel free to use it, twist it in whatever convoluted way you want.
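As one possible twist, here is a sketch of the Comparator overload mentioned earlier, together with a small usage example. The class name MapSortDemo, the two-argument overload and the sample scores are my own assumptions for illustration; the one-argument method here simply delegates to the overload, so it is a variation on the method above rather than a copy of it.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class MapSortDemo {
    // Hypothetical overload: sorts by values using the supplied Comparator.
    static <K, V> Map<K, V> sortMapByValues(final Map<K, V> mapToSort,
                                            final Comparator<? super V> valueOrder) {
        List<Map.Entry<K, V>> entries =
            new ArrayList<Map.Entry<K, V>>(mapToSort.entrySet());

        Collections.sort(entries, new Comparator<Map.Entry<K, V>>() {
            @Override
            public int compare(Map.Entry<K, V> entry1, Map.Entry<K, V> entry2) {
                return valueOrder.compare(entry1.getValue(), entry2.getValue());
            }
        });

        // LinkedHashMap preserves the sorted insertion order.
        Map<K, V> sortedMap = new LinkedHashMap<K, V>();
        for (Map.Entry<K, V> entry : entries) {
            sortedMap.put(entry.getKey(), entry.getValue());
        }
        return sortedMap;
    }

    // Natural-order variant, expressed in terms of the overload above.
    static <K, V extends Comparable<? super V>> Map<K, V> sortMapByValues(
            final Map<K, V> mapToSort) {
        return sortMapByValues(mapToSort, new Comparator<V>() {
            @Override
            public int compare(V value1, V value2) {
                return value1.compareTo(value2);
            }
        });
    }

    public static void main(String[] args) {
        Map<String, Integer> scores = new HashMap<String, Integer>();
        scores.put("carol", 72);
        scores.put("alice", 91);
        scores.put("bob", 85);

        // prints {carol=72, bob=85, alice=91}
        System.out.println(sortMapByValues(scores));
        // prints {alice=91, bob=85, carol=72}
        System.out.println(sortMapByValues(scores, Collections.<Integer>reverseOrder()));
    }
}
```

Both variants return a new map and leave the map passed in untouched.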

Happy coding!