Friday, June 10, 2011

String Handling in Java

Strings are one of the most heavily used data types in the Java programming language, so it is no wonder that much thought has gone into the design of the String class. Strings are immutable, fast, and offer almost every operation we might need on a daily basis. Beyond the type itself, the way the runtime handles Strings internally has also received a great deal of attention, and the Java Virtual Machine has a clever way of allocating and sharing them.

When we write String s; we only declare a reference; no String object is allocated. When we write String s = "Swaranga"; the reference s is made to point to a String object holding "Swaranga". Because "Swaranga" is a literal, that object lives in the JVM's string pool (literals are interned). Now here is the catch: if we further write String s2 = "Swaranga"; then s2 will point to the same "Swaranga" object as s. The Java Language Specification guarantees this for string literals and other compile-time constant strings; strings built at run time, for example with new String(...) or by concatenating variables, are separate objects unless intern() is called on them. The sharing is safe because a String's contents can never change after it is created, and it saves memory in the heap.
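The sharing of literals described above can be observed by comparing references with ==. The following is a minimal sketch; the class name StringLiteralDemo and its helper methods are made up for illustration.

```java
public class StringLiteralDemo {

    // Two identical literals resolve to the same pooled String object.
    static boolean literalsShareObject() {
        String s = "Swaranga";
        String s2 = "Swaranga";
        return s == s2; // reference comparison, not content comparison
    }

    // new String(...) always allocates a separate object with equal contents;
    // intern() maps it back to the pooled instance.
    static boolean explicitCopyIsDistinct() {
        String s = "Swaranga";
        String copy = new String(s);
        return copy != s && copy.equals(s) && copy.intern() == s;
    }

    public static void main(String[] args) {
        System.out.println(literalsShareObject());    // prints true
        System.out.println(explicitCopyIsDistinct()); // prints true
    }
}
```

In everyday code, equals() remains the right way to compare String contents; == is used here only to make the object identity visible.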

A similar idea applied when creating substrings of a String object. Consider String s = "Swaranga"; followed by String s3 = s.substring(2, 5); (note the lowercase method name substring). The call does return a new String object, "ara", but in the JDKs current when this post was written that object did not get its own copy of the characters. Instead it shared the character array of "Swaranga" and simply recorded an offset (2, the third character) and a length (3). The reasoning is the same: since the contents of "Swaranga" can never change, it is safe for the substring to share them. Since Java 7 update 6, substring copies the characters into a fresh array instead, so this sharing no longer takes place.
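The visible behaviour of substring can be checked with a short sketch. It deliberately shows only what is observable from Java code, since whether the backing array is shared depends on the JDK version; the class name SubstringDemo and its helpers are illustrative only.

```java
public class SubstringDemo {

    // substring(2, 5) takes the characters at indices 2, 3 and 4.
    static String middle() {
        String s = "Swaranga";
        return s.substring(2, 5);
    }

    // The result is a different String object, and the original is untouched.
    static boolean originalUnchanged() {
        String s = "Swaranga";
        String s3 = s.substring(2, 5);
        return s3 != s && s.equals("Swaranga");
    }

    public static void main(String[] args) {
        System.out.println(middle());            // prints ara
        System.out.println(originalUnchanged()); // prints true
    }
}
```

On a JDK where substrings share the parent's array, wrapping the result as new String(s.substring(2, 5)) was a common way to obtain an independent copy; on newer JDKs substring already returns one.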

In all of this we must remember that while it is impossible to change the contents of a String object once it has been created, it is trivial to change which object a reference points to. For instance, in the above example we can write s = s3; this does not change the contents of "Swaranga". It only makes s refer to the String "ara", which begins at the third character of "Swaranga" ('a'), instead of to "Swaranga" itself.
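The difference between changing a reference and changing an object can be sketched as follows. The extra variable original is an assumption added here so that the first object stays reachable for inspection; the class name ReassignDemo is likewise illustrative.

```java
public class ReassignDemo {

    // Returns { value now seen through s, value of the original object }.
    static String[] reassign() {
        String s = "Swaranga";
        String original = s;           // second reference to the same object
        String s3 = s.substring(2, 5); // "ara"
        s = s3;                        // only the reference s changes
        return new String[] { s, original };
    }

    public static void main(String[] args) {
        String[] result = reassign();
        System.out.println(result[0]); // prints ara
        System.out.println(result[1]); // prints Swaranga
    }
}
```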

For programmers coming from a C background, the statement String s = "Swaranga"; can be viewed as similar to the C statement const char * cp = "Swaranga"; or the equivalent char const * cp = "Swaranga"; in both cases the pointer can be reassigned, but the characters it points to cannot be modified through it.

Happy coding!

1 comment:

  1. Strings are very special in Java, which means you need to know the basics, such as the String pool, how String caches its hash code, and why String is immutable. Another point I observed while working with substrings is that they keep a reference to the original String's character array and can cause a memory leak even when there is no active reference to the original String. You can read more in my post How SubString works in Java.