phanikumar's posterous

Feb 16 / 2:54pm

Understanding JVM Internals

Every developer who uses Java knows that Java bytecode runs in a JRE (Java Runtime Environment). The most important element of the JRE is the Java Virtual Machine (JVM), which analyzes and executes Java bytecode. Java developers do not need to know how the JVM works; many great applications and libraries have been developed without their authors understanding the JVM deeply. However, if you understand the JVM, you will understand Java better, and you will be able to solve problems that look simple but turn out to be hard to crack.

Thus, in this article I will explain how JVM works, its structure, how it executes Java bytecode, the order of execution, examples of common mistakes and their solutions, as well as the new features in Java SE 7 Edition.

Virtual Machine

The JRE is composed of the Java API and the JVM. The role of the JVM is to read the Java application through the Class Loader and execute it along with the Java API.

A virtual machine (VM) is a software implementation of a machine (i.e. a computer) that executes programs like a physical machine. Originally, Java was designed to run based on a virtual machine separated from a physical machine for implementing WORA (Write Once Run Anywhere), although this goal has been mostly forgotten. Therefore, the JVM runs on all kinds of hardware to execute the Java Bytecode without changing the Java execution code.

The features of JVM are as follows:

  • Stack-based virtual machine: The most popular computer architectures such as Intel x86 Architecture and ARM Architecture run based on a register. However, JVM runs based on a stack.
  • Symbolic reference: All types (class and interface) except for primitive data types are referred to through symbolic reference, instead of through explicit memory address-based reference. 
  • Garbage collection: A class instance is explicitly created by the user code and automatically destroyed by garbage collection.
  • Guarantees platform independence by clearly defining the primitive data type: A traditional language such as C/C++ has different int type size according to the platform. The JVM clearly defines the primitive data type to maintain its compatibility and guarantee platform independence.
  • Network byte order: The Java class file uses the network byte order. To maintain platform independence between the little endian used by Intel x86 Architecture and the big endian used by the RISC Series Architecture, a fixed byte order must be kept. Therefore, JVM uses the network byte order, which is used for network transfer. The network byte order is the big endian.
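The last two features can be observed from plain Java. The following is a minimal sketch, not part of the original example; the class name ByteOrderDemo and its helper methods are made up for illustration. It relies on java.nio.ByteBuffer, whose default byte order is big endian, and on the Integer.SIZE and Long.SIZE constants, which are the same on every platform.

```java
import java.nio.ByteBuffer;

// Hypothetical demo class: shows big-endian (network) byte order and fixed primitive sizes.
final class ByteOrderDemo {
    /** Serializes an int most-significant byte first, as the class file format does. */
    static byte[] bigEndianBytes(int value) {
        // A freshly allocated ByteBuffer uses big-endian order by default.
        return ByteBuffer.allocate(4).putInt(value).array();
    }

    /** Formats bytes as lowercase hex pairs separated by spaces. */
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(hex(bigEndianBytes(0xCAFEBABE)));
        // The width of int and long is fixed by the specification, not by the platform.
        System.out.println("int bits: " + Integer.SIZE + ", long bits: " + Long.SIZE);
    }
}
```

The most significant byte comes first regardless of whether the program runs on a little-endian or a big-endian CPU.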

Sun Microsystems developed Java. However, any vendor can develop and provide a JVM by following the Java Virtual Machine Specification. For this reason, there are various JVMs, including Oracle Hotspot JVM and IBM JVM. The Dalvik VM in Google's Android operating system is a kind of JVM, though it does not follow the Java Virtual Machine Specification. Unlike Java VMs, which are stack machines, the Dalvik VM is a register-based architecture. Java bytecode is therefore converted into the register-based instruction set used by the Dalvik VM.

Java bytecode

To implement WORA, the JVM uses Java bytecode, an intermediate language between Java (the user language) and the machine language. Java bytecode is the smallest unit in which Java code is deployed.

Before explaining the Java bytecode, let's take a look at it. This case is a summary of a real example that occurred during development.

Symptom

An application that had been running successfully no longer runs after a library update, and returns the following error.

Exception in thread "main" java.lang.NoSuchMethodError: com.nhn.user.UserAdmin.addUser(Ljava/lang/String;)V
    at com.nhn.service.UserService.add(UserService.java:14)
    at com.nhn.service.UserService.main(UserService.java:19)

The application code is as follows, and no changes to it have been made.

// UserService.java

public void add(String userName) {
    admin.addUser(userName);
}

The updated library source code and the original source code are as follows.

// UserAdmin.java - Updated library source code

public User addUser(String userName) {
    User user = new User(userName);
    User prevUser = userMap.put(userName, user);
    return prevUser;
}

// UserAdmin.java - Original library source code

public void addUser(String userName) {
    User user = new User(userName);
    userMap.put(userName, user);
}

In short, the addUser() method which has no return value has been changed to a method that returns the User class instance. However, the application code has not been changed, since it does not use the return value of the addUser() method.

At first glance, the com.nhn.user.UserAdmin.addUser() method seems to still exist, but if so, why does NoSuchMethodError occur?

Reasons

The reason is that the application code has not been recompiled against the new library. In the source, the application code appears to invoke the method regardless of its return value. However, the compiled class file records exactly which method to invoke, including its return type.

You will see this through the following error message.

java.lang.NoSuchMethodError: com.nhn.user.UserAdmin.addUser(Ljava/lang/String;)V

NoSuchMethodError has occurred since the "com.nhn.user.UserAdmin.addUser(Ljava/lang/String;)V" method could not be found. Take a look at "Ljava/lang/String;" and the last "V". In the expression of Java Bytecode, "L<classname>;" is a class instance. This means that the addUser() method takes one java/lang/String object as a parameter. In the library of this case, the parameter has not been changed, so this part is normal. The last "V" of the message stands for the return value of the method. In the expression of Java Bytecode, "V" means that it has no return value. In short, the error message means that a com.nhn.user.UserAdmin.addUser method that takes one java.lang.String object as a parameter and has no return value could not be found.

Since the application code was compiled against the previous library, the class file states that a method that returns "V" should be invoked. However, in the changed library, the method that returns "V" no longer exists; a method that returns "Lcom/nhn/user/User;" has replaced it. Therefore, a NoSuchMethodError occurred.
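The same rule, that the return type is part of what gets looked up, can be reproduced at runtime with the java.lang.invoke API, which resolves methods by name plus full method type. The following is only a sketch under stated assumptions: the nested User and UserAdmin classes are simplified stand-ins for the library classes in the example, and DescriptorLookupDemo and hasAddUser are hypothetical names.

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.util.HashMap;
import java.util.Map;

// Hypothetical demo class: method lookup by name AND full type, including the return type.
final class DescriptorLookupDemo {
    // Simplified stand-in for the library's User class.
    static final class User {
        final String name;
        User(String name) { this.name = name; }
    }

    // Simplified stand-in for the UPDATED library class: addUser now returns User.
    static final class UserAdmin {
        private final Map<String, User> userMap = new HashMap<>();
        public User addUser(String userName) {
            return userMap.put(userName, new User(userName));
        }
    }

    /** Returns true only if UserAdmin declares addUser(String) with exactly this return type. */
    static boolean hasAddUser(Class<?> returnType) {
        MethodType type = MethodType.methodType(returnType, String.class);
        try {
            MethodHandles.lookup().findVirtual(UserAdmin.class, "addUser", type);
            return true;
        } catch (NoSuchMethodException | IllegalAccessException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("(Ljava/lang/String;)V found: " + hasAddUser(void.class));
        System.out.println("(Ljava/lang/String;)L...User; found: " + hasAddUser(User.class));
    }
}
```

Looking up the void variant fails even though a method named addUser with a String parameter exists, which mirrors what the old application class file experiences against the new library.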

Note

The error occurred because the developer did not recompile the application against the new library. However, in this case, the library provider is mostly responsible. The public method originally had no return value, but it was later changed to return a User class instance. This is an obvious method signature change, which means that the backward compatibility of the library has been broken. Therefore, the library provider should have notified the users that the method had been changed.

Let's go back to the Java Bytecode. Java Bytecode is the essential element of the JVM. The JVM is an emulator that executes Java Bytecode. Unlike a C/C++ compiler, the Java compiler does not convert the high-level language directly to machine language (direct CPU instructions); it converts the Java language that the developer understands to the Java Bytecode that the JVM understands. Since Java bytecode has no platform-dependent code, it is executable on any hardware where the JVM (more accurately, a JRE of the same profile) has been installed, even when the CPU or OS is different (a class file developed and compiled on a Windows PC can be executed on a Linux machine without any change). The size of the compiled code is almost identical to the size of the source code, making it easy to transfer and execute the compiled code via the network.

The class file itself is a binary file that cannot be understood by a human. To manage this file, JVM vendors provide javap, the disassembler. The result of using javap is called Java assembly. In the above case, the Java assembly below is obtained by disassembling the UserService.add() method of the application code with the javap -c option.

public void add(java.lang.String);
  Code:
   0:   aload_0
   1:   getfield        #15; //Field admin:Lcom/nhn/user/UserAdmin;
   4:   aload_1
   5:   invokevirtual   #23; //Method com/nhn/user/UserAdmin.addUser:(Ljava/lang/String;)V
   8:   return

In this Java assembly, the addUser() method is invoked by the fourth instruction, "5: invokevirtual #23;". This means that the method corresponding to index 23 of the constant pool should be invoked. The javap program annotates the instruction with the method found at that index. invokevirtual is the OpCode (operation code) of the most basic instruction that invokes a method in the Java Bytecode. For reference, there are four OpCodes that invoke a method in the Java Bytecode: invokeinterface, invokespecial, invokestatic, and invokevirtual (Java SE 7 adds a fifth, invokedynamic). The meaning of each OpCode is as follows.

  • invokeinterface: Invokes an interface method
  • invokespecial: Invokes an initializer, private method, or superclass method
  • invokestatic: Invokes static methods
  • invokevirtual: Invokes instance methods

The instruction set of Java Bytecode consists of OpCode and Operand. The OpCode such as invokevirtual requires a 2-byte Operand.
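To connect the four OpCodes to source code, here is a small sketch whose comments note which invoke instruction javac is expected to emit for each call. The class InvokeKindsDemo and its methods are made up for illustration; running "javap -c" on the compiled class is the way to confirm the actual instructions on a given compiler.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class: each call site below maps to one of the invoke OpCodes.
final class InvokeKindsDemo {
    static int twice(int x) {
        return x * 2;
    }

    static String run() {
        List<String> list = new ArrayList<>();   // constructor call: invokespecial ArrayList.<init>
        list.add("a");                           // call through an interface type: invokeinterface List.add
        StringBuilder sb = new StringBuilder();  // constructor call: invokespecial StringBuilder.<init>
        sb.append(list.size());                  // invokeinterface List.size, then invokevirtual StringBuilder.append
        sb.append(twice(21));                    // invokestatic InvokeKindsDemo.twice, then invokevirtual append
        return sb.toString();                    // ordinary instance method: invokevirtual
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```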

By compiling the application code above with the updated library and then disassembling it, the following result will be obtained.

public void add(java.lang.String);
  Code:
   0:   aload_0
   1:   getfield        #15; //Field admin:Lcom/nhn/user/UserAdmin;
   4:   aload_1
   5:   invokevirtual   #23; //Method com/nhn/user/UserAdmin.addUser:(Ljava/lang/String;)Lcom/nhn/user/User;
   8:   pop
   9:   return

You can see that the method corresponding to index 23 has changed to the method that returns "Lcom/nhn/user/User;".

In the disassembled result above, what does the number in front of the code mean?

It is the byte offset of the instruction within the method's code. Perhaps this is the reason why the code executed by the JVM is called Java "Byte"code. In short, bytecode instruction OpCodes such as aload_0, getfield, and invokevirtual are each expressed as a 1-byte value (aload_0 = 0x2a, getfield = 0xb4, invokevirtual = 0xb6). Therefore, the maximum number of Java Bytecode instruction OpCodes is 256.

OpCodes such as aload_0 and aload_1 do not need any Operand. Therefore, the byte after aload_0 is the OpCode of the next instruction. However, getfield and invokevirtual need a 2-byte Operand. Therefore, the instruction that follows getfield, which sits at byte 1, is written at byte 4, skipping the two Operand bytes. The bytecode of the recompiled add() method, shown through a Hex Editor, is as follows.

2a b4 00 0f 2b b6 00 17 57 b1
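The relationship between these ten bytes and the javap listing can be sketched as a tiny disassembler. This is an illustrative sketch only: MiniDisassembler is a made-up class that knows just the six OpCodes appearing in this example (0x57 is pop and 0xb1 is return), not the full instruction set.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class: decodes only the six OpCodes used in the add() example.
final class MiniDisassembler {
    static List<String> disassemble(int[] code) {
        List<String> out = new ArrayList<>();
        int pc = 0; // byte offset of the current instruction
        while (pc < code.length) {
            int opcode = code[pc];
            switch (opcode) {
                case 0x2a: out.add(pc + ": aload_0"); pc += 1; break;
                case 0x2b: out.add(pc + ": aload_1"); pc += 1; break;
                case 0x57: out.add(pc + ": pop");     pc += 1; break;
                case 0xb1: out.add(pc + ": return");  pc += 1; break;
                case 0xb4: // getfield: 1-byte OpCode + 2-byte constant pool index
                    out.add(pc + ": getfield #" + ((code[pc + 1] << 8) | code[pc + 2]));
                    pc += 3;
                    break;
                case 0xb6: // invokevirtual: 1-byte OpCode + 2-byte constant pool index
                    out.add(pc + ": invokevirtual #" + ((code[pc + 1] << 8) | code[pc + 2]));
                    pc += 3;
                    break;
                default:
                    throw new IllegalArgumentException("OpCode not covered by this sketch: 0x"
                            + Integer.toHexString(opcode));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        int[] code = {0x2a, 0xb4, 0x00, 0x0f, 0x2b, 0xb6, 0x00, 0x17, 0x57, 0xb1};
        for (String line : disassemble(code)) {
            System.out.println(line);
        }
    }
}
```

The printed offsets (0, 1, 4, 5, 8, 9) match the numbers in front of each instruction in the javap output, because each instruction advances the offset by its OpCode byte plus its Operand bytes.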

In the Java Bytecode, the class instance is expressed as "L<classname>;" and void is expressed as "V". In this way, other types have their own expressions. The table below summarizes the expressions.

Table 1: Type Expression in Java Bytecode

Java Bytecode   Type        Description
B               byte        signed byte
C               char        Unicode character
D               double      double-precision floating-point value
F               float       single-precision floating-point value
I               int         integer
J               long        long integer
L<classname>;   reference   an instance of class <classname>
S               short       signed short
Z               boolean     true or false
[               reference   one array dimension

The table below shows examples of Java Bytecode expressions.

Table 2: Examples of Java Bytecode Expressions

Java Code                                     Java Bytecode Expression
double d[][][];                               [[[D
Object mymethod(int i, double d, Thread t)    (IDLjava/lang/Thread;)Ljava/lang/Object;
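Both expressions in Table 2 can be reproduced from Java itself. The sketch below uses two standard calls: Class.getName(), which returns the descriptor form for array classes of primitives, and MethodType.toMethodDescriptorString() from java.lang.invoke. The class name DescriptorDemo is made up for illustration.

```java
import java.lang.invoke.MethodType;

// Hypothetical demo class: prints the descriptor strings from Table 2.
final class DescriptorDemo {
    /** Descriptor of the field type double[][][]. */
    static String fieldDescriptor() {
        return double[][][].class.getName();
    }

    /** Descriptor of a method "Object mymethod(int, double, Thread)". */
    static String methodDescriptor() {
        // First argument is the return type, the rest are the parameter types.
        return MethodType
                .methodType(Object.class, int.class, double.class, Thread.class)
                .toMethodDescriptorString();
    }

    public static void main(String[] args) {
        System.out.println(fieldDescriptor());
        System.out.println(methodDescriptor());
    }
}
```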

For more details, see "4.3 Descriptors" in "The Java Virtual Machine Specification, Second Edition". For various Java Bytecode instruction sets, see "6. The Java Virtual Machine Instruction Set" in "The Java Virtual Machine Specification, Second Edition".

Class File Format

Before explaining the Java class file format, let's review an example that frequently occurs in Java Web applications.

Symptom

When writing and executing JSP on Tomcat, the JSP did not run, and the following error occurred.

Servlet.service() for servlet jsp threw exception org.apache.jasper.JasperException: Unable to compile class for JSP Generated servlet error:
The code of method _jspService(HttpServletRequest, HttpServletResponse) is exceeding the 65535 bytes limit

Reasons

The error message above varies slightly depending on the Web application server, but the cause is always the same: the 65535 byte limit. The 65535 byte limit is one of the JVM limitations, and stipulates that the size of one method cannot be more than 65535 bytes.

I will explain the meaning of the 65535 byte limit, and why it has been set, in more detail.

The branch/jump instructions used in the Java Bytecode are "goto" and "jsr".

goto [branchbyte1] [branchbyte2]
jsr [branchbyte1] [branchbyte2]

Both take a 2-byte signed branch offset as their Operand, so the distance a single branch can cover is bounded by what 16 bits can express (-32768 to +32767 relative to the instruction). However, to support longer branches, Java Bytecode also provides "goto_w" and "jsr_w", which take a 4-byte signed branch offset.

goto_w [branchbyte1] [branchbyte2] [branchbyte3] [branchbyte4]
jsr_w [branchbyte1] [branchbyte2] [branchbyte3] [branchbyte4]

With these two, a branch can reach beyond the 16-bit range, so the branch instructions alone do not force the 65535 byte limit on a Java method. However, due to various other limits of the Java class file format, a Java method still cannot exceed 65535 bytes. To look at those other limits, I will briefly explain the class file format.
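How the branch bytes combine into a signed offset can be sketched in a few lines. This is an illustration only; BranchOffsetDemo and its method names are made up, and each branchbyte is assumed to be passed as an unsigned value from 0 to 255.

```java
// Hypothetical demo class: assembles signed branch offsets from unsigned Operand bytes.
final class BranchOffsetDemo {
    /** goto / jsr: two Operand bytes form a signed 16-bit offset. */
    static int offset16(int branchbyte1, int branchbyte2) {
        // The cast to short reinterprets the 16-bit pattern as a signed value.
        return (short) ((branchbyte1 << 8) | branchbyte2);
    }

    /** goto_w / jsr_w: four Operand bytes form a signed 32-bit offset. */
    static int offset32(int b1, int b2, int b3, int b4) {
        return (b1 << 24) | (b2 << 16) | (b3 << 8) | b4;
    }

    public static void main(String[] args) {
        System.out.println(offset16(0x7f, 0xff));             // largest forward 16-bit branch
        System.out.println(offset16(0xff, 0xfe));             // a short backward branch
        System.out.println(offset32(0x00, 0x01, 0x00, 0x00)); // beyond the 16-bit range
    }
}
```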

The outline of a Java class file is as follows: 

ClassFile {
    u4 magic;
    u2 minor_version;
    u2 major_version;
    u2 constant_pool_count;
    cp_info constant_pool[constant_pool_count-1];
    u2 access_flags;
    u2 this_class;
    u2 super_class;
    u2 interfaces_count;
    u2 interfaces[interfaces_count];
    u2 fields_count;
    field_info fields[fields_count];
    u2 methods_count;
    method_info methods[methods_count];
    u2 attributes_count;
    attribute_info attributes[attributes_count];
}

The above is included in "4.1. The ClassFile Structure" of "The Java Virtual Machine Specification, Second Edition".

The first 16 bytes of the UserService.class file disassembled earlier are shown as follows in the Hex Editor.

ca fe ba be 00 00 00 32 00 28 07 00 02 01 00 1b

With these values in mind, let's walk through the class file format.

  • magic: The first 4 bytes of the class file are the magic number. This is a pre-specified value to distinguish the Java class file. As shown in the Hex Editor above, the value is always 0xCAFEBABE. In short, when the first 4 bytes of a file are 0xCAFEBABE, it can be regarded as a Java class file. This is a kind of "witty" magic number related to the name "Java".
  • minor_version, major_version: The next 4 bytes indicate the class version. As the value in the UserService.class file is 0x00000032, the class version is 50.0. The version of a class file compiled by JDK 1.6 is 50.0, and the version of a class file compiled by JDK 1.5 is 49.0. The JVM must maintain backward compatibility with class files compiled for a lower version than itself. On the other hand, when a higher-version class file is executed on a lower-version JVM, java.lang.UnsupportedClassVersionError occurs.
  • constant_pool_count, constant_pool[]: Next to the version, the class-type constant pool information is described. This is the information included in the Runtime Constant Pool area, which will be explained later. While loading the class file, the JVM includes the constant_pool information in the Runtime Constant Pool area of the method area. As the constant_pool_count of the UserService.class file is 0x0028, you can see that the constant_pool has (40-1) indexes, 39 indexes.
  • access_flags: This is the flag that shows the modifier information of a class; in other words, it shows whether the class is public, final, or abstract, and whether it is an interface.
  • this_class, super_class: The index in the constant_pool for the class corresponding to this and super, respectively.
  • interfaces_count, interfaces[]: The number of interfaces implemented by the class, and the index in the constant_pool for each interface.
  • fields_count, fields[]: The number of fields and the field information of the class. The field information includes the field name, type information, modifier, and index in the constant_pool.
  • methods_count, methods[]: The number of methods in a class and the methods information of the class. The methods information includes the methods name, type and number of the parameters, return type, modifier, index in the constant_pool, execution code of the method, and exception information.
  • attributes_count, attributes[]: The attribute_info structure has various attributes. For field_info or method_info, attribute_info is used.
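The first few fields can be read back from the 16 header bytes shown above. The following is a minimal sketch, with ClassHeaderDemo as a made-up class name; it uses java.nio.ByteBuffer, which reads big-endian values by default, and it parses only magic, the version, and constant_pool_count.

```java
import java.nio.ByteBuffer;

// Hypothetical demo class: parses the start of the ClassFile structure.
final class ClassHeaderDemo {
    // The first 16 bytes of UserService.class as shown in the article's hex dump.
    static final byte[] HEADER = {
        (byte) 0xca, (byte) 0xfe, (byte) 0xba, (byte) 0xbe, // u4 magic
        0x00, 0x00,                                         // u2 minor_version
        0x00, 0x32,                                         // u2 major_version
        0x00, 0x28,                                         // u2 constant_pool_count
        0x07, 0x00, 0x02,                                   // start of constant_pool (not parsed here)
        0x01, 0x00, 0x1b
    };

    static String describe(byte[] classFileStart) {
        ByteBuffer buf = ByteBuffer.wrap(classFileStart); // big-endian by default
        int magic = buf.getInt();
        if (magic != 0xCAFEBABE) {
            throw new IllegalArgumentException("not a Java class file");
        }
        int minor = buf.getShort() & 0xffff; // mask to treat u2 as unsigned
        int major = buf.getShort() & 0xffff;
        int constantPoolCount = buf.getShort() & 0xffff;
        return "version " + major + "." + minor
                + ", constant_pool_count " + constantPoolCount
                + " (" + (constantPoolCount - 1) + " entries)";
    }

    public static void main(String[] args) {
        System.out.println(describe(HEADER));
    }
}
```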

The javap program briefly shows the class file format in a format that users can read. When UserService.class is analyzed using the "javap -verbose" option, the following contents are printed.

Compiled from "UserService.java"

public class com.nhn.service.UserService extends java.lang.Object
  SourceFile: "UserService.java"
  minor version: 0
  major version: 50
  Constant pool:
const #1 = class        #2;     //  com/nhn/service/UserService
const #2 = Asciz        com/nhn/service/UserService;
const #3 = class        #4;     //  java/lang/Object
const #4 = Asciz        java/lang/Object;
const #5 = Asciz        admin;
const #6 = Asciz        Lcom/nhn/user/UserAdmin;;
// … omitted - constant pool continued …

{
// … omitted - method information …

public void add(java.lang.String);
  Code:
   Stack=2, Locals=2, Args_size=2
   0:   aload_0
   1:   getfield        #15; //Field admin:Lcom/nhn/user/UserAdmin;
   4:   aload_1
   5:   invokevirtual   #23; //Method com/nhn/user/UserAdmin.addUser:(Ljava/lang/String;)Lcom/nhn/user/User;
   8:   pop
   9:   return
  LineNumberTable:
   line 14: 0
   line 15: 9
  LocalVariableTable:
   Start  Length  Slot  Name   Signature
   0      10      0    this       Lcom/nhn/service/UserService;
   0      10      1    userName       Ljava/lang/String;
// … Omitted - Other method information …
}

Due to a lack of space, I have extracted some parts from the entire printout. The entire printout shows you the various information included in the constant pool and the contents of each method.

The 65535 byte limit on method size is related to the contents of the method_info struct. The method_info struct has the Code, LineNumberTable, and LocalVariableTable attributes, as shown in the "javap -verbose" output above. The values that the LineNumberTable, LocalVariableTable, and exception_table in the Code attribute use to refer to positions in the code are all fixed at 2 bytes. Because these tables cannot address a position beyond what 2 bytes can express, the method size is limited to 65535 bytes.

Many people have complained about the method size limit, and the JVM specification states that it "may be expandable later." However, no explicit move toward improvement has been made so far. Considering that, under the JVM specification, almost the same contents as the class file are loaded into the method area, it will be very difficult to expand the method size while maintaining backward compatibility.

What will happen if an incorrect class file is created because of a Java compiler error? Or, what if a class file is corrupted by errors during network transfer or file copy?

To prepare for such cases, a class file is verified through a very strict process when the class loader loads it. The JVM specification explicitly details the process.

Note

How can we verify that the JVM successfully executes the class file verification process? How can we verify that various JVMs from various JVM vendors satisfy the JVM specification? For verification, Oracle provides a test tool, the TCK (Technology Compatibility Kit). The TCK verifies conformance to the JVM specification by executing tens of thousands of tests, including many class files that are incorrect in various ways. Only after passing the TCK can a JVM be called a JVM.

Alongside the TCK, there is the JCP (Java Community Process; http://jcp.org), through which new Java technical specifications, as well as the Java specifications themselves, are proposed. For the JCP, a specification document, a reference implementation (RI), and a TCK for a proposed JSR (Java Specification Request) must be completed to complete the JSR. Users who want to use a new Java technology proposed as a JSR should either license the implementation from the RI provider, or implement it themselves and test the implementation with the TCK.

JVM Structure

The code written in Java is executed by following the process shown in the figure below. 

 java-code-execution-process.png

Figure 1: Java Code Execution Process.

A class loader loads the compiled Java Bytecode to the Runtime Data Areas, and the execution engine executes the Java Bytecode.

Class Loader

Java provides a dynamic load feature; it loads and links the class when it refers to a class for the first time at runtime, not compile time. JVM's class loader executes the dynamic load. The features of Java class loader are as follows:

  • Hierarchical Structure: Class loaders in Java are organized into a hierarchy with a parent-child relationship. The Bootstrap Class Loader is the parent of all class loaders.
  • Delegation mode: Based on the hierarchical structure, load is delegated between class loaders. When a class is loaded, the parent class loader is checked first to determine whether or not it has the class. If the parent class loader has the class, that class is used. If not, the class loader that received the request loads the class.
  • Visibility limit: A child class loader can find the class in the parent class loader; however, a parent class loader cannot find the class in the child class loader.
  • Unload is not allowed: A class loader can load a class but cannot unload it. Instead of unloading, the current class loader can be deleted, and a new class loader can be created.

Each class loader has its namespace that stores the loaded classes. When a class loader loads a class, it searches the class based on FQCN (Fully Qualified Class Name) stored in the namespace to check whether or not the class has been already loaded. Even if the class has an identical FQCN but a different namespace, it is regarded as a different class. A different namespace means that the class has been loaded by another class loader.
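The namespace rule can be demonstrated by defining the same class file in two separate class loaders. The sketch below is built on assumptions: LoaderNamespaceDemo, ByteLoader, and the class name demo.Tiny are all made up, and the class file is a hand-written minimal one (version 50.0, no fields, no methods) assembled according to the ClassFile structure described earlier rather than produced by javac.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;

// Hypothetical demo class: one FQCN, two class loaders, two distinct classes.
final class LoaderNamespaceDemo {
    /** Minimal user-defined class loader that defines a class directly from bytes. */
    static final class ByteLoader extends ClassLoader {
        Class<?> define(String name, byte[] classFile) {
            return defineClass(name, classFile, 0, classFile.length);
        }
    }

    /** Hand-assembles a minimal class file for an empty public class extending Object. */
    static byte[] tinyClassFile(String internalName) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes); // writes big-endian
            out.writeInt(0xCAFEBABE);                          // magic
            out.writeShort(0);                                 // minor_version
            out.writeShort(50);                                // major_version
            out.writeShort(5);                                 // constant_pool_count (4 entries + 1)
            out.writeByte(7); out.writeShort(2);               // #1 Class -> #2
            out.writeByte(1); out.writeUTF(internalName);      // #2 Utf8 (2-byte length + bytes)
            out.writeByte(7); out.writeShort(4);               // #3 Class -> #4
            out.writeByte(1); out.writeUTF("java/lang/Object"); // #4 Utf8
            out.writeShort(0x0021);                            // access_flags: public, super
            out.writeShort(1);                                 // this_class
            out.writeShort(3);                                 // super_class
            out.writeShort(0);                                 // interfaces_count
            out.writeShort(0);                                 // fields_count
            out.writeShort(0);                                 // methods_count
            out.writeShort(0);                                 // attributes_count
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new IllegalStateException(e); // not expected for an in-memory stream
        }
    }

    /** Defines the same bytes under the same name in two independent loaders. */
    static Class<?>[] loadTwice() {
        byte[] classFile = tinyClassFile("demo/Tiny");
        Class<?> first = new ByteLoader().define("demo.Tiny", classFile);
        Class<?> second = new ByteLoader().define("demo.Tiny", classFile);
        return new Class<?>[] {first, second};
    }

    public static void main(String[] args) {
        Class<?>[] classes = loadTwice();
        System.out.println("same name:  " + classes[0].getName().equals(classes[1].getName()));
        System.out.println("same class: " + (classes[0] == classes[1]));
    }
}
```

The two Class objects share an FQCN but belong to different class loaders, so the JVM treats them as different classes.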

The following figure illustrates the class loader delegation model.

 class-loader-delegation-model.png

Figure 2: Class Loader Delegation Model.

When a class loader is requested for class load, it checks whether or not the class exists in the class loader cache, the parent class loader, and itself, in the order listed. In short, it checks whether or not the class has been loaded in the class loader cache. If not, it checks the parent class loader. If the class is not found in the bootstrap class loader, the requested class loader searches for the class in the file system.

  • Bootstrap class loader: This is created when running the JVM. It loads Java APIs, including object classes. Unlike other class loaders, it is implemented in native code instead of Java.
  • Extension class loader: It loads the extension classes excluding the basic Java APIs. It also loads various security extension functions.
  • System class loader: While the bootstrap class loader and the extension class loader load the JVM components, the system class loader loads the application classes. It loads the classes found in the $CLASSPATH specified by the user.
  • User-defined class loader: This is a class loader that an application user directly creates on the code.

Frameworks such as Web application server (WAS) use it to make Web applications and enterprise applications run independently. In other words, this guarantees the independence of applications through class loader delegation model. Such a WAS class loader structure uses a hierarchical structure that is slightly different for each WAS vendor.
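The parent chain can be inspected at runtime with ClassLoader.getParent(). The following sketch uses a made-up class name, LoaderChainDemo. The concrete loader class names it prints depend on the JVM version and vendor, so no particular output is assumed here; the one fixed point is that the bootstrap class loader is represented as null.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class: walks from a class loader up to the bootstrap class loader.
final class LoaderChainDemo {
    static List<String> chainFrom(ClassLoader loader) {
        List<String> names = new ArrayList<>();
        for (ClassLoader cl = loader; cl != null; cl = cl.getParent()) {
            names.add(cl.getClass().getName());
        }
        // The bootstrap class loader is native code and appears as null in Java.
        names.add("bootstrap (null)");
        return names;
    }

    public static void main(String[] args) {
        // Start from the system class loader, which loads classes from the class path.
        for (String name : chainFrom(ClassLoader.getSystemClassLoader())) {
            System.out.println(name);
        }
        // Core API classes such as String are loaded by the bootstrap class loader.
        System.out.println("String loader: " + String.class.getClassLoader());
    }
}
```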

If a class loader finds an unloaded class, the class is loaded and linked by following the process illustrated below.

 class-load-stage.png

Figure 3: Class Load Stage.

Each stage is described as follows.

  • Loading: A class is obtained from a file and loaded to the JVM memory.
  • Verifying: Check whether or not the read class is configured as described in the Java Language Specification and JVM specifications. This is the most complicated test process of the class load processes, and takes the longest time. Most cases of the JVM TCK test cases are to test whether or not a verification error occurs by loading wrong classes.
  • Preparing: Prepare a data structure that assigns the memory required by classes and indicates the fields, methods, and interfaces defined in the class.
  • Resolving: Change all symbolic references in the constant pool of the class to direct references.
  • Initializing: Initialize the class variables to proper values. Execute the static initializers and initialize the static fields to the configured values.
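The separation between loading and initializing can be made visible with Class.forName(name, false, loader), which loads a class without running its static initializer. This is a sketch only; InitOrderDemo and its nested Lazy class are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class: loading a class does not by itself run its static initializer.
final class InitOrderDemo {
    static final List<String> LOG = new ArrayList<>();

    static final class Lazy {
        static {
            LOG.add("Lazy initialized"); // runs once, during the Initializing stage
        }
        static int value() {
            return 42;
        }
    }

    static List<String> run() {
        LOG.clear();
        LOG.add("before load");
        try {
            // 'false' asks for loading without initialization.
            Class<?> loaded = Class.forName(Lazy.class.getName(), false,
                    InitOrderDemo.class.getClassLoader());
            LOG.add("loaded " + loaded.getSimpleName());
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
        int v = Lazy.value(); // first active use triggers initialization
        LOG.add("value " + v);
        return new ArrayList<>(LOG);
    }

    public static void main(String[] args) {
        for (String line : run()) {
            System.out.println(line);
        }
    }
}
```

On the first call in a JVM, the "Lazy initialized" entry appears after the "loaded Lazy" entry, which shows that initialization was deferred until the static method was actually used.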

The JVM specification defines these tasks. However, it allows flexibility in when each of them is executed.

Runtime Data Areas

 runtime-data-access-configuration.png

Figure 4: Runtime Data Areas Configuration.

Runtime Data Areas are the memory areas assigned when the JVM program runs on the OS. The runtime data areas can be divided into 6 areas. Of the six, one PC Register, JVM Stack, and Native Method Stack are created for one thread. Heap, Method Area, and Runtime Constant Pool are shared by all threads.

  • PC register: One PC (Program Counter) register exists for one thread, and is created when the thread starts. PC register has the address of a JVM instruction being executed now.
  • JVM stack: One JVM stack exists for one thread, and is created when the thread starts. It is a stack that stores structs called Stack Frames. The only operations the JVM performs on the JVM stack are pushing and popping stack frames. If an exception occurs, each line of the stack trace printed by a method such as printStackTrace() represents one stack frame.

 jvm-stack-configuration.png

Figure 5: JVM Stack Configuration.

Stack frame: One stack frame is created whenever a method is executed in the JVM, and the stack frame is added to the JVM stack of the thread. When the method ends, the stack frame is removed. Each stack frame holds references to the local variable array, the Operand stack, and the runtime constant pool of the class to which the executing method belongs. The sizes of the local variable array and the Operand stack are determined at compile time. Therefore, the size of the stack frame is fixed for each method.

Local variable array: It has an index starting from 0. For an instance method, index 0 holds the reference to the class instance to which the method belongs (this), and the parameters passed to the method are stored from index 1. For a static method, which has no this, the parameters start at index 0. After the method parameters, the local variables of the method are stored.

Operand stack: The actual workspace of a method. Each method exchanges data between the Operand stack and the local variable array, and pushes or pops the results of other method invocations. The space the Operand stack needs can be determined during compiling, so its size is also fixed at compile time.

  • Native method stack: A stack for native code written in a language other than Java. In other words, it is a stack used to execute C/C++ codes invoked through JNI (Java Native Interface). According to the language, a C stack or C++ stack is created.
  • Method area: The method area is shared by all threads, created when the JVM starts. It stores runtime constant pool, field and method information, static variable, and method bytecode for each of the classes and interfaces read by the JVM. The method area can be implemented in various formats by JVM vendor. Oracle Hotspot JVM calls it Permanent Area or Permanent Generation (PermGen). The garbage collection for the method area is optional for each JVM vendor.
  • Runtime constant pool: An area that corresponds to the constant_pool table in the class file format. This area is included in the method area; however, it plays the most core role in JVM operation. Therefore, the JVM specification separately describes its importance. As well as the constant of each class and interface, it contains all references for methods and fields. In short, when a method or field is referred to, the JVM searches the actual address of the method or field on the memory by using the runtime constant pool.
  • Heap: A space that stores instances or objects, and is a target of garbage collection. This space is most frequently mentioned when discussing issues such as JVM performance. JVM vendors can decide how to configure the heap, and even whether to collect garbage at all.
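The statement that each stack trace line represents one stack frame can be checked with a small recursion. The sketch below uses a made-up class, StackDepthDemo; it counts the elements of a stack trace captured at the bottom of a recursive call chain.

```java
// Hypothetical demo class: each nested method call adds one frame to the thread's JVM stack.
final class StackDepthDemo {
    /** Recurses n times, then reports how many frames the current thread's stack holds. */
    static int framesAtDepth(int n) {
        if (n == 0) {
            // Each element of the stack trace corresponds to one stack frame.
            return new Throwable().getStackTrace().length;
        }
        return framesAtDepth(n - 1);
    }

    public static void main(String[] args) {
        int base = framesAtDepth(0);
        int deeper = framesAtDepth(5);
        System.out.println("extra frames for 5 nested calls: " + (deeper - base));
    }
}
```

Because the absolute frame count depends on how the program was started, the sketch compares two measurements taken from the same call site rather than printing a raw count.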

Let's go back to the disassembled bytecode we discussed previously. 

public void add(java.lang.String);
  Code:
   0:   aload_0
   1:   getfield        #15; //Field admin:Lcom/nhn/user/UserAdmin;
   4:   aload_1
   5:   invokevirtual   #23; //Method com/nhn/user/UserAdmin.addUser:(Ljava/lang/String;)Lcom/nhn/user/User;
   8:   pop
   9:   return

Comparing the disassembled code with the x86 assembly code that we sometimes see, the two have a similar format built around OpCodes; however, Java Bytecode does not write a register name, memory address, or offset in the Operand. As described before, the JVM uses a stack. Therefore, unlike the x86 architecture, it does not use registers, and it uses index numbers such as 15 and 23 instead of memory addresses since it manages the memory by itself. The 15 and 23 are indexes into the constant pool of the current class (here, the UserService class). In short, the JVM creates a constant pool for each class, and the pool stores the reference to the actual target.

Each row of the disassembled code is interpreted as follows.

  • aload_0: Add the #0 index of the local variable array to the Operand stack. For an instance method, the #0 index of the local variable array is always this, the reference to the current class instance.
  • getfield #15: Pop the object reference (this) from the Operand stack, and add the value of the field referred to by the #15 index of the current class constant pool. Here the UserAdmin admin field is added. Since the admin field is a class instance, a reference is added.
  • aload_1: Add the #1 index of the local variable array to the Operand stack. From the #1 index onward, the local variable array holds the method parameters. Therefore, the reference to the String userName passed when invoking add() is added.
  • invokevirtual #23: Invoke the method corresponding to the #23 index in the current class constant pool. At this time, the reference added by getfield and the parameter added by aload_1 are popped from the Operand stack and passed to the invoked method. When the method invocation is completed, its return value is added to the Operand stack.
  • pop: Pop the return value of the invokevirtual invocation from the Operand stack. The code compiled against the previous library does not have this instruction: the previous method had no return value, so there was nothing to pop from the stack.
  • return: Complete the method.

The following figure will help you understand the explanation.

example-of-java-bytecode-loaded-on-runtime-data-areas.png

Figure 6: Example of Java Bytecode Loaded on Runtime Data Areas.

For reference, in this method, no local variable array has been changed. So the figure above displays the changes in Operand stack only. However, in most cases, local variable array is also changed. Data transfer between the local variable array and the Operand stack is made by using a lot of load instructions (aload, iload) and store instructions (astore, istore). 

In this figure, we have checked the brief description of the runtime constant pool and the JVM stack. When the JVM runs, each class instance will be assigned to the heap, and class information including User, UserAdmin, UserService, and String will be stored in the method area.
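The Operand stack changes described for add() can also be traced with a toy simulation. This is only a sketch of the bookkeeping, not of how a JVM is implemented: OperandStackDemo is a made-up class, and strings stand in for the real object references.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical demo class: replays the six instructions of add() on a simulated Operand stack.
final class OperandStackDemo {
    /** Returns the Operand stack depth after each of the six instructions. */
    static List<Integer> traceDepths() {
        // Local variable array: index 0 is this, index 1 is the userName parameter.
        Object[] locals = {"this (UserService)", "userName"};
        Deque<Object> stack = new ArrayDeque<>();
        List<Integer> depths = new ArrayList<>();

        stack.push(locals[0]);                         // aload_0
        depths.add(stack.size());

        Object self = stack.pop();                     // getfield #15: pops the object reference...
        stack.push("admin field of " + self);          // ...and pushes the field value
        depths.add(stack.size());

        stack.push(locals[1]);                         // aload_1
        depths.add(stack.size());

        Object argument = stack.pop();                 // invokevirtual #23: pops the argument...
        Object receiver = stack.pop();                 // ...and the receiver,
        stack.push("User returned by addUser(" + argument + ") on " + receiver); // pushes the result
        depths.add(stack.size());

        stack.pop();                                   // pop: discards the unused return value
        depths.add(stack.size());

        depths.add(stack.size());                      // return: nothing left on the stack
        return depths;
    }

    public static void main(String[] args) {
        System.out.println(traceDepths());
    }
}
```

The deepest point of the trace is 2, which agrees with the "Stack=2" value that "javap -verbose" printed for this method.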

Execution Engine

The bytecode that is assigned to the runtime data areas in the JVM via the class loader is executed by the execution engine. The execution engine reads the Java bytecode one instruction at a time, much like a CPU executing machine instructions one by one. Each bytecode instruction consists of a 1-byte OpCode and, optionally, additional Operands. The execution engine fetches one OpCode, performs its task with the Operands, and then executes the next OpCode.
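The fetch-and-execute loop described here can be sketched in a few lines of Java. This is only a toy illustration, not how any real JVM is implemented: the class MiniInterpreter and its run method are hypothetical names, and only five opcodes are handled (iconst_1 = 0x04, iconst_2 = 0x05, bipush = 0x10, iadd = 0x60, ireturn = 0xAC).

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Toy interpreter for a handful of JVM opcodes (hypothetical class, illustration only).
class MiniInterpreter {
    static final int ICONST_1 = 0x04;
    static final int ICONST_2 = 0x05;
    static final int BIPUSH   = 0x10; // followed by a 1-byte signed operand
    static final int IADD     = 0x60;
    static final int IRETURN  = 0xAC;

    static int run(byte[] code) {
        Deque<Integer> operandStack = new ArrayDeque<>();
        int pc = 0; // index of the next opcode to fetch
        while (pc < code.length) {
            int opcode = code[pc++] & 0xFF; // every opcode is exactly one byte
            switch (opcode) {
                case ICONST_1: operandStack.push(1); break;
                case ICONST_2: operandStack.push(2); break;
                case BIPUSH:   operandStack.push((int) code[pc++]); break;
                case IADD: {
                    int right = operandStack.pop();
                    int left = operandStack.pop();
                    operandStack.push(left + right);
                    break;
                }
                case IRETURN:  return operandStack.pop();
                default:
                    throw new IllegalStateException("Unsupported opcode: " + opcode);
            }
        }
        throw new IllegalStateException("Code ended without ireturn");
    }

    public static void main(String[] args) {
        // iconst_1, bipush 40, iadd, ireturn
        byte[] code = {0x04, 0x10, 40, 0x60, (byte) 0xAC};
        System.out.println(run(code)); // prints 41
    }
}
```

Note that bipush is the only instruction here that carries an Operand in the bytecode stream; the others take their inputs from the Operand stack.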

However, Java bytecode is written in a form that is still relatively readable to humans, rather than in the language that the machine directly executes. Therefore, the execution engine must convert the bytecode into a language that the machine can execute inside the JVM. The bytecode can be converted in one of two ways.

  • Interpreter: Reads, interprets and executes the bytecode instructions one by one. Because it handles one instruction at a time, each bytecode instruction is interpreted quickly, but overall execution of the interpreted result is slow. This is the typical disadvantage of an interpreted language. The 'language' called bytecode basically runs in an interpreter.
  • JIT (Just-In-Time) compiler: The JIT compiler has been introduced to compensate for the disadvantages of the interpreter. The execution engine runs as an interpreter first, and at the appropriate time, the JIT compiler compiles the entire bytecode to change it to native code. After that, the execution engine no longer interprets the method, but directly executes using native code. Execution in native code is much faster than interpreting instructions one by one. The compiled code can be executed quickly since the native code is stored in the cache. 

However, it takes more time for the JIT compiler to compile the code than for the interpreter to interpret it instruction by instruction. If the code is to be executed just once, it is better to interpret it than to compile it. For this reason, JVMs that use a JIT compiler internally check how frequently each method is executed and compile a method only when its frequency exceeds a certain level.
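As a rough sketch of this frequency check, the following Java class counts invocations per method and reports when a method crosses a threshold. The names HotMethodCounter, compileThreshold and recordInvocation are hypothetical, and real JVMs use considerably more elaborate heuristics; this only illustrates the idea of "interpret first, compile once hot".

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical invocation counter; not taken from any real JVM.
class HotMethodCounter {
    private final int compileThreshold;
    private final Map<String, Integer> invocationCounts = new HashMap<>();

    HotMethodCounter(int compileThreshold) {
        this.compileThreshold = compileThreshold;
    }

    /** Records one invocation and returns true once the method counts as "hot". */
    boolean recordInvocation(String methodName) {
        Integer previous = invocationCounts.get(methodName);
        int count = (previous == null ? 0 : previous) + 1;
        invocationCounts.put(methodName, count);
        // Below the threshold the method would keep being interpreted;
        // at or above it, a JIT compiler would be asked to compile it.
        return count >= compileThreshold;
    }
}
```

With a threshold of 3, the first two calls to recordInvocation("add") return false and the third returns true, while other methods keep their own independent counts.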

Figure 7: Java Compiler and JIT Compiler.

How the execution engine runs is not defined in the JVM specifications. Therefore, JVM vendors improve their execution engines using various techniques, and introduce various types of JIT compilers. 

Most JIT compilers run as shown in the figure below: 

Figure 8: JIT Compiler.

The JIT compiler converts the bytecode to an intermediate-level expression, IR (Intermediate Representation), to execute optimization, and then converts the expression to native code.

Oracle Hotspot VM uses a JIT compiler called Hotspot Compiler. It is called Hotspot because the compiler uses profiling to find the 'hotspots' that most urgently need compiling, and then compiles those hotspots to native code. If a method whose bytecode has been compiled is no longer frequently invoked, in other words, if the method is no longer a hotspot, the Hotspot VM removes the native code from the cache and runs the method in interpreter mode again. The Hotspot VM is divided into the Server VM and the Client VM, and the two VMs use different JIT compilers.

Figure 9: Hotspot Client VM and Server VM.

The client VM and the server VM use an identical runtime; however, they use different JIT compilers, as shown in the above figure. The Advanced Dynamic Optimizing Compiler used by the server VM applies more complex and diverse performance optimization techniques.

In addition to the JIT compiler, IBM JVM has provided an AOT (Ahead-Of-Time) compiler since IBM JDK 6. With it, multiple JVMs share native code compiled into a shared cache. In short, code that has already been compiled by the AOT compiler can be used by another JVM without being compiled again. In addition, IBM JVM provides a fast way of execution by pre-compiling code to the JXE (Java EXecutable) file format using the AOT compiler.

Most Java performance improvement is accomplished by improving the execution engine. As well as the JIT compiler, various optimization techniques are being introduced so the JVM performance can be continuously improved. The biggest difference between the initial JVM and the latest JVM is the execution engine.

The Hotspot compiler was introduced in Oracle Hotspot VM in version 1.3, and a JIT compiler was introduced in the Dalvik VM in Android 2.2.

Note

The approach of introducing an intermediate language such as bytecode, executing it on a VM, and improving performance with a JIT compiler is also common in other languages that use intermediate languages. For Microsoft's .NET, the CLR (Common Language Runtime), a kind of VM, executes a kind of bytecode called CIL (Common Intermediate Language). The CLR provides an AOT compiler as well as a JIT compiler. Therefore, if source code is written in C# or VB.NET and compiled, the compiler creates CIL, and the CIL is executed on the CLR with the JIT compiler. The CLR uses garbage collection and runs as a stack machine, like the JVM.

The Java Virtual Machine Specification, Java SE 7 Edition

On 28th July, 2011, Oracle released Java SE 7 and updated the JVM specification to the Java SE 7 version. After "The Java Virtual Machine Specification, Second Edition" was released in 1999, it took 12 years for the updated version to appear. The updated version includes various changes and modifications accumulated over those 12 years, and describes the specification more clearly. In addition, it reflects the contents of "The Java Language Specification, Java SE 7 Edition" released with Java SE 7. The major changes can be summarized as follows:

  • Generics introduced from Java SE 5.0, supporting variable argument method
  • Bytecode verification process technique changed since Java SE 6
  • Added invokedynamic instruction and related class file formats for supporting dynamic type languages
  • Deleted the description of the concept of the Java language itself and referred reader to the Java language specifications
  • Deleted the description on Java Thread and Lock, and transferred these to the Java language specifications

The biggest of these changes is the addition of the invokedynamic instruction. This means that the JVM's internal instruction set was changed, because from Java SE 7 the JVM supports not only the Java language but also dynamically typed languages, such as scripting languages, whose types are not fixed. OpCode 186, which had not been used previously, has been assigned to the new invokedynamic instruction, and new contents have been added to the class file format to support invokedynamic.

The version of the class file created by the Java compiler of Java SE 7 is 51.0. The version of Java SE 6 is 50.0. Much of the class file format has been changed. Therefore, class files with version 51.0 cannot be executed in the Java SE 6 JVM. 
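The version numbers mentioned here are stored at the start of every class file: a 4-byte magic number (0xCAFEBABE), then a 2-byte minor version and a 2-byte major version. A minimal sketch of reading them is shown below; the class name ClassFileVersion and its read method are made up for this example.

```java
import java.nio.ByteBuffer;

// Hypothetical helper that extracts "major.minor" from the class file header.
class ClassFileVersion {
    static String read(byte[] classFileBytes) {
        // ByteBuffer is big-endian by default, which matches the class file format.
        ByteBuffer in = ByteBuffer.wrap(classFileBytes);
        int magic = in.getInt();
        if (magic != 0xCAFEBABE) {
            throw new IllegalArgumentException("Not a class file");
        }
        int minor = in.getShort() & 0xFFFF; // minor_version comes first
        int major = in.getShort() & 0xFFFF; // then major_version
        return major + "." + minor;
    }

    public static void main(String[] args) {
        // Header bytes as a Java SE 7 compiler would write them.
        byte[] header = {(byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE, 0, 0, 0, 51};
        System.out.println(read(header)); // prints 51.0
    }
}
```

Passing the first eight bytes of a real SwitchTest.class compiled with the Java SE 7 javac should give the same result.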

Despite these various changes, the 65535 byte limit of the Java method has not been removed. Unless the JVM class file format is innovatively changed, it may not be removed in the future.

For reference, the Oracle Java SE 7 VM supports G1, a new garbage collector; however, this is specific to the Oracle JVM. The JVM itself does not restrict the type of garbage collection, so the JVM specification does not describe it.

String in switch Statements

Java SE 7 adds various syntax and features. However, compared to the many changes in the Java SE 7 language, there are not so many changes in the JVM. So, how are the new features of Java SE 7 implemented? We will see how String in switch Statements (the ability to use a string as the comparison value of a switch() statement) has been implemented in Java SE 7 by disassembling it.

For example, the following code has been written.

// SwitchTest
public class SwitchTest {
    public int doSwitch(String str) {
        switch (str) {
        case "abc":        return 1;
        case "123":        return 2;
        default:         return 0;
        }
    }
}

Since this is a new feature of Java SE 7, it cannot be compiled using the Java compiler for Java SE 6 or lower versions. Compile it using the javac of Java SE 7. The following is the compiled result printed by javap -c.

C:\Test>javap -c SwitchTest.class
Compiled from "SwitchTest.java"
public class SwitchTest {
  public SwitchTest();
    Code:
       0: aload_0
       1: invokespecial #1                  // Method java/lang/Object."<init>":()V
       4: return

  public int doSwitch(java.lang.String);
    Code:
       0: aload_1
       1: astore_2
       2: iconst_m1
       3: istore_3
       4: aload_2
       5: invokevirtual #2                  // Method java/lang/String.hashCode:()I
       8: lookupswitch  { // 2
                 48690: 50
                 96354: 36
               default: 61
          }
      36: aload_2
      37: ldc           #3                  // String abc
      39: invokevirtual #4                  // Method java/lang/String.equals:(Ljava/lang/Object;)Z
      42: ifeq          61
      45: iconst_0
      46: istore_3
      47: goto          61
      50: aload_2
      51: ldc           #5                  // String 123
      53: invokevirtual #4                  // Method java/lang/String.equals:(Ljava/lang/Object;)Z
      56: ifeq          61
      59: iconst_1
      60: istore_3
      61: iload_3
      62: lookupswitch  { // 2
                     0: 88
                     1: 90
               default: 92
          }
      88: iconst_1
      89: ireturn
      90: iconst_2
      91: ireturn
      92: iconst_0
      93: ireturn

The resulting bytecode is significantly longer than the Java source code. First, you can see that the lookupswitch instruction has been used for the switch() statement in Java bytecode. However, two lookupswitch instructions have been used, not just one. When disassembling a switch() statement that takes an int, only one lookupswitch instruction is used. This means that the switch() statement has been divided into two statements to process the string. See the annotations of the #5, #39, and #53 byte instructions to see how the switch() statement has processed the string.

In the #5 and #8 bytes, the hashCode() method is executed first, and then switch(int) is executed on its result. Inside the braces of the lookupswitch instruction, execution branches to a different location according to the hashCode result value. The hashCode result value of the String "abc" is 96354, so execution moves to the #36 byte. The hashCode result value of the String "123" is 48690, so execution moves to the #50 byte.
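The two hash values in the lookupswitch table can be checked by hand. The sketch below recomputes them with the formula documented for String.hashCode(), s[0]*31^(n-1) + ... + s[n-1]; the class name StringHashCheck is only for illustration.

```java
// Recomputes String.hashCode() manually to confirm the lookupswitch keys.
class StringHashCheck {
    static int hash(String s) {
        int h = 0;
        for (int i = 0; i < s.length(); i++) {
            h = 31 * h + s.charAt(i); // same recurrence as String.hashCode()
        }
        return h;
    }

    public static void main(String[] args) {
        System.out.println(hash("abc") + " " + "abc".hashCode()); // prints 96354 96354
        System.out.println(hash("123") + " " + "123".hashCode()); // prints 48690 48690
    }
}
```

For "abc" this is 97*961 + 98*31 + 99 = 96354, and for "123" it is 49*961 + 50*31 + 51 = 48690, matching the keys in the disassembled output.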

In the #36, #37, #39, and #42 bytes, you can see that the value of the str variable received as an argument is compared with the String "abc" using the equals() method. If they are equal, '0' is stored in the #3 index of the local variable array, and execution jumps to the #61 byte.

In the same way, in the #50, #51, #53, and #56 bytes, you can see that the value of the str variable received as an argument is compared with the String "123" using the equals() method. If they are equal, '1' is stored in the #3 index of the local variable array, and execution jumps to the #61 byte.

In the #61 and #62 bytes, the value of the #3 index of the local variable array, i.e., '0', '1', or any other value, is used by the second lookupswitch to branch.

In other words, the value of the str variable passed as the switch() argument in the Java code is compared using the hashCode() method and the equals() method. switch() is then executed on the resulting int value.
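Written back as Java source, the disassembled logic looks roughly like the following. This is a hand-written approximation based on the javap output above, not actual compiler output; the class name SwitchDesugared and the variable name index are assumptions made for readability.

```java
// Approximate Java equivalent of what javac generates for switch (str).
class SwitchDesugared {
    static int doSwitch(String str) {
        int index = -1;                 // iconst_m1, istore_3
        switch (str.hashCode()) {       // first lookupswitch, on the hash value
            case 96354:                 // "abc".hashCode()
                if (str.equals("abc")) {
                    index = 0;
                }
                break;
            case 48690:                 // "123".hashCode()
                if (str.equals("123")) {
                    index = 1;
                }
                break;
            default:
                break;
        }
        switch (index) {                // second lookupswitch, on the case index
            case 0:  return 1;
            case 1:  return 2;
            default: return 0;
        }
    }
}
```

The equals() check after the hash lookup is what keeps the result correct when two different strings happen to share a hash value.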

As this result shows, the compiled bytecode uses nothing beyond the previous JVM specification. The new Java SE 7 feature, String in switch, is processed by the Java compiler, not by the JVM itself. Other new features of Java SE 7 are likewise processed by the Java compiler.

Conclusion

I don't think that we need to review how Java has been developed to use Java well. So many Java developers develop great applications and libraries without understanding JVM deeply. However, if you understand JVM, you will understand Java more, and it will be helpful to solve the problems like the case we have reviewed here.

Beyond what is described here, the JVM has many other features and technologies. The JVM specification is deliberately flexible, so that vendors can apply various technologies to provide better performance. In particular, garbage collection is a technique used by most languages that offer VM-like usability, and it is one of the most actively advancing areas in terms of performance. However, as it has been discussed in many more prominent studies, I did not explain it in depth in this article.

For Korean speakers, if you need more information on the internal structure of JVM, I recommend you to refer to "Java Performance Fundamental" (Hando Kim, Seoul, EXEM, 2009). The book is written in Korean so it is easy to read. I have referenced this book as well as the JVM specifications to write this article. For English speaking readers, there should be many books covering Java Performance topic.

By Se Hoon Park, Messaging Platform Development Team, NHN Corporation.

Filed under  //  JVM   Java  
Dec 27 / 3:25pm

A List of Apps to Install on Your New iPad

We can imagine that a fair share of iPads got unwrapped this morning, and the first thing you’re going to want to do is switch it on and get a few awesome apps on there. Last year, we gave you a long list of apps that will help you get started with all of the popular essentials.

This year, we’ve decided to do the same thing, adding a ton of great apps to the list that you’ll want to get on to your iPad straight away. The list includes apps to handle your photos and videos, music apps, productivity apps, a few essentials to meet all of your social media needs, and of course a few games to keep you entertained.

Media

Snapseed - If you’re a photography buff and are finally going to get that iPad you always wanted, we’d recommend getting yourself an iPad SD Card Reader from Photojojo. This makes it easier than ever to get your photos straight from your camera onto your iPad, and it’s half the price of Apple’s iPad Camera Connection Kit. Once the photos are transferred, you’re going to want to be able to edit, process and share your photos. The best app we’ve come across that does this is Snapseed, which we’ve previously reviewed here. The $4.99 app is worth every cent. It comes with easy-to-use autocorrect features which will adjust your images at the click of a button, or if you want more control over your image’s final look, you can manually adjust basics like sharpness, brightness and contrast, and use some pretty cool looking filters to give your photos a grunge or vintage look. Snapseed is controlled entirely by swiping your image either right to left or up and down.

Pandora - For streaming random music or personalized radio stations, Pandora is a great option for those of you who are lucky enough to live in the right country. The iPad app makes it easy to access all of your Pandora radio stations when you log in, and you can rate, skip and pause the music.

MusicTandem - For those of you who don’t have access to Pandora, check out the $0.99 app MusicTandem, which we reviewed here. With the app, you can create personalized radio stations based on artist, genre or tag. When selecting music based on a specific artist, you can choose to play music only by that singer or band, or to play a variety of music similar to that one artist. Our one complaint when it comes to MusicTandem is that it doesn’t play in the background which is a much-needed feature. Another music app which certainly deserves an honourable mention for discovering new music is MusicHunter, which we’ve reviewed in-depth here, but it’s worth noting that the app is now completely free.

Shazam - Another essential app for music buffs is Shazam. The free app comes with unlimited tagging, so if you hear a song on the radio or TV and want to know what it is, just whip out your new iPad, hit tag, and let the magic of Shazam do the rest for you.

Showyou - If you want to watch online videos on your iPad, a great way to do that is with the free app Showyou. You can connect the app to your YouTube, Vimeo, Twitter, Facebook and Tumblr accounts, making it easy to find interesting videos, and instantly share them with your friends and followers. Showyou is not only fully searchable if you’re looking for a specific video to watch, but it also comes pre-loaded with grids featuring videos on various topics and from various sources, with everything from Al Jazeera to Reddit TV. If you want to save a video to watch later, all you have to do is sign up for a free account.

FlexPlayer – Since VLC was pulled from the iTunes App Store, the best free alternative available now is FlexPlayer. You don’t have to convert your videos before transferring them to your iPad. Just hook your iPad up to your computer, fire up iTunes and copy your movie files to your iPad. FlexPlayer is a slick video player, and add to that the fact that it’s ad-free, there’s really nothing else that you need to watch your movies and TV shows on your iPad.

Social Media

Zite - We’re big fans of Zite here at The Next Web, and reviewed it when it first launched here. Zite is one of the best ways to get your daily news fix with minimal effort on your part. All you have to do is plug in your preferred topics and the app will do the rest for you. And the more you use it, the better it becomes at making recommendations based on your personal preferences. Zite’s UI is best bar none. It’s sleek and minimal, showcases the articles you’re reading beautifully, and makes it easy to share posts on your favourite social networks or save them for reading later.

Blogsy - You might not get as much writing done as you’d expect using the iPad, but if you’re really serious about it, investing in a Bluetooth keyboard to go alongside your iPad will make life much easier. So which apps should you download if you’re serious about your writing? The first app a blogger will probably want to download is Blogsy. The $4.99 app supports WordPress, Blogger and Posterous, while also allowing you to access your Flickr, Picasa and YouTube accounts. Blogsy makes it easy to write up and share media-rich posts straight from your iPad, allowing you to upload images and drag-and-drop videos, and also comes with text formatting.

Verbs - Verbs is a pretty slick chat app for the iPad with Google Talk, AIM, Facebook and MobileMe support, and allows you to add multiple accounts. For easy photo-sharing you can also connect your Cloud or Droplr accounts. Normally a paid app, you can get it free for now, but the free version does have its limits. The paid upgrade at $4.99 will get you push notifications, whereas the free version will only notify you of messages for the first 10 minutes that the app runs in the background.

StumbleUpon -  When StumbleUpon finally launched an updated iPad app this year, one of our observations when reviewing it was that the iPad was made for this kind of site. And the app certainly lives up to that. We should warn you though, if you’re a fan of StumbleUpon, installing this app on your new iPad will suck you right in and make it hard to put your iPad down.

PhotoSync - For all of your photo uploading needs, rather than download a different app for each account, PhotoSync supports Dropbox, Picasa, Facebook, SmugMug and Flickr. Select multiple photos to upload simultaneously to all of your accounts. You have extensive control over how each account is configured – from privacy settings, folders and more.

500px - Want to browse gorgeous photos on your brand new iPad? Then the first app you need to download is the official 500px iPad app, which we reviewed here. The quality of photography on 500px is pretty impressive, and where better to check it out than on the iPad. If you have a 500px account, you can log in to follow other users and add their photos to your favourites, but unfortunately, you can’t actually upload any images to the site from your iPad using this free app. If you want that capability, you’re going to have to opt for the $0.99 app, PhotoStackr for 500px.

Productivity

Polkast - If you prefer not to have to upload files to the cloud, Polkast is a great alternative. The only drawback is that you’ll have to stay logged in to your computer in order to access your files. On the other hand, there are no complicated settings to fiddle with in order to get the connection to work. Simply install the app on your Mac or Windows computer, install the app on your iPad, and you’re good to go. You can play videos and music straight from the app, or download the files to your iPad. If you modify files on your iPad, you can save the new file back to your computer right from your iPad.

Wunderlist - If you’re looking for a multiplatform productivity app to keep track of your task list, look no further than Wunderlist. The app is completely free regardless of which OS you use, and also syncs to the cloud, so you’ll be sure never to lose your task list, and have access to it on all of your mobile devices and on your computer as well. It’s no surprise the app has over 1 million users now.

iA Writer – While Blogsy has you covered for most of your online writing needs, if you need something to save notes locally on your iPad, we’ll tell you right now, the native notes app that the iPad ships with simply isn’t going to cut it. The $1.99 app iA Writer is a great option because it comes packed with some pretty decent features without locking your content on your iPad. You can sync all of your content on the distraction-free iPad writing app with its Mac counterpart, using Dropbox or iCloud.  For those of you who don’t use Macs, you can simply take advantage of the Dropbox backup to get your documents from your iPad to your computer.

Switch - If you know for a fact that your new iPad is going to become a form of entertainment for the whole family, you’ll be disappointed to find that you can’t password protect your apps or create user accounts. The closest you’ll come is to use the $4.99 app Switch which allows you to create browser-related accounts. With Switch, each user can password protect their browsing, bookmarks and logged-in accounts. Of course this applies only to the browser app and doesn’t affect any other apps on your iPad.

Utilities

Flip Clock HD - Your new iPad is going to look great sitting on your bedside table so you might as well put it to good use. With Flip Clock HD you can turn your iPad into a stylish clock which not only lets you keep track of the time and date, but you can also check out your local weather at a glance. Using the app as an alarm clock will give you the option of waking up to tracks on your iPad. When using the app as an alarm clock you will have to keep it open, since it doesn’t work in the background, but the screen does dim if you leave it running without touching the screen.

Skitch - If you’ve been using Skitch on your Mac, you’ll be happy to know that it’s available on the iPad as well. The free app is the best option available for annotating your images and screenshots and instantly saving them to your Evernote account.

Games

The Sims 3 - If you’re looking for a game to keep you busy for endless hours, Sims might be the right game for you. The best part is that EA has made a really decent free version available so you can try it out first. Although right now the full version is available for just $0.99 so get it on sale while you can.

Temple Run - Before downloading Temple Run, be warned. It’s extremely addictive. The current version of the free iPad game has been rated over 55,000 times and is standing strong on a 5 star rating. And for good reason. The 3D adventure will have you making your way through a maze, collecting coins, and swiping left and right in the high speed game.

Machinarium - You might be surprised to find that the $4.99 app Machinarium, which only works on the iPad 2, was created using Adobe Flash. The game has received rave reviews, and with the amazing graphics and music, you’ll understand why. The game is a gorgeous experience of puzzles slipped into a point-and-click adventure, and makes for a fun and challenging experience if you have a bit of time to kill.

The Next Web

Of course, we also recommend The Next Web’s iPad app which makes it easy to keep up with all the latest tech news, using a beautiful app designed specifically for Apple’s tablet.

What are your favourite must-have iPad apps? Let us know in the comments.

Filed under  //  apple   apps   ipad   mobile   tablet  
Nov 23 / 11:29pm

Introduction to Hibernate framework. Hibernate Architecture tutorial

Hibernate was started in 2001 by Gavin King as an alternative to using EJB2-style entity beans. Its mission back then was to simply offer better persistence capabilities than offered by EJB2 by simplifying the complexities and allowing for missing features.

Early in 2003, the Hibernate development team began Hibernate2 releases which offered many significant improvements over the first release.

JBoss, Inc. (now part of Red Hat) later hired the lead Hibernate developers and worked with them in supporting Hibernate. Hibernate is part of the JBoss Enterprise Middleware System (JEMS) suite of products; JBoss is a division of Red Hat.

1. Introduction

Hibernate is an Object-relational mapping (ORM) tool. Object-relational mapping or ORM is a programming method for mapping the objects to the relational model where entities/classes are mapped to tables, instances are mapped to rows and attributes of instances are mapped to columns of table.

A “virtual object database” is created that can be used from within the programming language.

Hibernate is a persistence framework which is used to persist data from Java environment to database. Persistence is a process of storing the data to some permanent medium and retrieving it back at any point of time even after the application that had created the data ended.

2. Hibernate Architecture

hibernate-architecture-mini

The above diagram shows minimal architecture of Hibernate. It creates a layer between Database and the Application. It loads the configuration details like Database connection string, entity classes, mappings etc.

Hibernate creates persistent objects which synchronize data between application and database.

hibernate-architecture-compre

The above diagram shows a comprehensive architecture of Hibernate. In order to persist data to a database, Hibernate creates an instance of an entity class (a Java class mapped to a database table). This object is called a transient object, as it is not yet associated with a session and not yet persisted to a database. To persist the object to the database, an instance of the SessionFactory interface is created. SessionFactory is a singleton instance which implements the Factory design pattern. SessionFactory loads the hibernate.cfg.xml file (the Hibernate configuration file; more details in the following section) and, with the help of TransactionFactory and ConnectionProvider, applies all the configuration settings to a database.

Each database connection in Hibernate is created by creating an instance of Session interface. Session represents a single connection with database. Session objects are created from SessionFactory object.

Hibernate also provides built-in Transaction APIs which abstract the application away from the underlying JDBC or JTA transaction. Each transaction represents a single atomic unit of work. One Session can span multiple transactions.

2.1 SessionFactory (org.hibernate.SessionFactory)

A thread-safe, immutable cache of compiled mappings for a single database. A factory for org.hibernate.Session instances. A client of org.hibernate.connection.ConnectionProvider. Optionally maintains a second level cache of data that is reusable between transactions at a process or cluster level.

2.2 Session (org.hibernate.Session)

A single-threaded, short-lived object representing a conversation between the application and the persistent store. Wraps a JDBC java.sql.Connection. Factory for org.hibernate.Transaction. Maintains a first level cache of the application’s persistent objects and collections; this cache is used when navigating the object graph or looking up objects by identifier.

2.3 Persistent objects and collections

Short-lived, single threaded objects containing persistent state and business function. These can be ordinary JavaBeans/POJOs. They are associated with exactly one org.hibernate.Session. Once the org.hibernate.Session is closed, they will be detached and free to use in any application layer (for example, directly as data transfer objects to and from presentation).

2.4 Transient and detached objects and collections

Instances of persistent classes that are not currently associated with a org.hibernate.Session. They may have been instantiated by the application and not yet persisted, or they may have been instantiated by a closed org.hibernate.Session.

2.5 Transaction (org.hibernate.Transaction)

(Optional) A single-threaded, short-lived object used by the application to specify atomic units of work. It abstracts the application from the underlying JDBC, JTA or CORBA transaction. A org.hibernate.Session might span several org.hibernate.Transactions in some cases. However, transaction demarcation, either using the underlying API or org.hibernate.Transaction, is never optional.

2.6 ConnectionProvider (org.hibernate.connection.ConnectionProvider)

(Optional) A factory for, and pool of, JDBC connections. It abstracts the application from underlying javax.sql.DataSource or java.sql.DriverManager. It is not exposed to application, but it can be extended and/or implemented by the developer.

2.7 TransactionFactory (org.hibernate.TransactionFactory)

(Optional) A factory for org.hibernate.Transaction instances. It is not exposed to the application, but it can be extended and/or implemented by the developer.

3. Hibernate Configuration

Hibernate configuration is managed by an instance of org.hibernate.cfg.Configuration. An instance of org.hibernate.cfg.Configuration represents an entire set of mappings of an application’s Java types to an SQL database. The org.hibernate.cfg.Configuration is used to build an immutable org.hibernate.SessionFactory. The mappings are compiled from various XML mapping files or from Java 5 Annotations.

Hibernate provides the following types of configuration:

  1. hibernate.cfg.xml – A standard XML file which contains hibernate configuration and which resides in root of application’s CLASSPATH
  2. hibernate.properties – A Java compliant property file which holds key value pair for different hibernate configuration strings.
  3. Programmatic configuration – This is the manual approach. The configuration can be defined in Java class.

3.1 hibernate.cfg.xml

This is an alternate way of configuring Hibernate. The hibernate.cfg.xml file is a standard XML file which contains all the configuration parameters, such as the database connection and class mappings. This file needs to be placed in the root of the application’s CLASSPATH.

Below is the sample hibernate.cfg.xml file:

<?xml version='1.0' encoding='utf-8'?>
<!DOCTYPE hibernate-configuration PUBLIC
    "-//Hibernate/Hibernate Configuration DTD//EN"
    "http://www.hibernate.org/dtd/hibernate-configuration-3.0.dtd">

<hibernate-configuration>

    <!-- a SessionFactory instance listed as /jndi/name -->
    <session-factory
        name="java:hibernate/SessionFactory">

        <!-- properties -->
        <property name="connection.datasource">java:/comp/env/jdbc/MyEmployeeDB</property>
        <property name="dialect">org.hibernate.dialect.MySQLDialect</property>
        <property name="show_sql">false</property>
        <property name="transaction.factory_class">
            org.hibernate.transaction.JTATransactionFactory
        </property>
        <property name="jta.UserTransaction">java:comp/UserTransaction</property>

        <!-- mapping files -->
        <mapping resource="net/viralpatel/hibernate/Employee.hbm.xml"/>
        <mapping resource="net/viralpatel/hibernate/Department.hbm.xml"/>

        <!-- cache settings -->
        <class-cache class="net.viralpatel.hibernate.Employee" usage="read-write"/>
        <class-cache class="net.viralpatel.hibernate.Department" usage="read-only"/>
        <collection-cache collection="net.viralpatel.hibernate.Department.employees" usage="read-write"/>

    </session-factory>

</hibernate-configuration>

Once the hibernate.cfg.xml file is created and placed in the root of the application’s CLASSPATH, it can be loaded in Hibernate using the following API.

SessionFactory sessionFactory = new Configuration().configure().buildSessionFactory();

The above code loads the default hibernate.cfg.xml file and all the configuration specified in it.

If you want to override the default naming convention and use your own configuration file, such as “employeedb.cfg.xml”, the following API can be used:

SessionFactory sf = new Configuration()
    .configure("employeedb.cfg.xml")
    .buildSessionFactory();


Note: Both hibernate.cfg.xml and hibernate.properties files can be provided simultaneously in an application. In this case hibernate.cfg.xml gets precedence over hibernate.properties.

3.2 hibernate.properties

This is the easiest way to get started with Hibernate. Create a file named hibernate.properties and place it in the root of your application’s CLASSPATH.

Below is the sample hibernate.properties file:

hibernate.connection.driver_class=com.mysql.jdbc.Driver
hibernate.connection.url= jdbc:mysql://localhost:3306/employee
hibernate.connection.username=root
hibernate.connection.password=swordfish
hibernate.connection.pool_size=1
hibernate.transaction.factory_class = \
    org.hibernate.transaction.JTATransactionFactory
hibernate.transaction.manager_lookup_class = \
    org.hibernate.transaction.JBossTransactionManagerLookup
hibernate.dialect = org.hibernate.dialect.MySQLDialect
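
Two details of the properties format are used in the sample: a trailing backslash continues a value on the next line, and whitespace around the = sign is ignored. The sketch below is only an illustration with the standard java.util.Properties loader; it does not involve Hibernate, and the class name PropertiesFormatDemo is made up for the example.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical demo class: parses a fragment of the sample above with the
// standard java.util.Properties loader to show how the format is read.
class PropertiesFormatDemo {

    // Parses properties text; a StringReader does not fail in practice,
    // so the checked IOException is rethrown unchecked.
    static Properties parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    // Three entries copied from the sample file, including one continuation.
    static String sample() {
        return String.join("\n",
            "hibernate.connection.url= jdbc:mysql://localhost:3306/employee",
            "hibernate.transaction.factory_class = \\",
            "    org.hibernate.transaction.JTATransactionFactory",
            "hibernate.dialect = org.hibernate.dialect.MySQLDialect");
    }

    public static void main(String[] args) {
        Properties props = parse(sample());
        // The continuation line is joined and its leading whitespace dropped.
        System.out.println(props.getProperty("hibernate.transaction.factory_class"));
        // Whitespace around '=' is part of neither the key nor the value.
        System.out.println(props.getProperty("hibernate.dialect"));
    }
}
```

Running main prints the factory class name and the dialect class name without any surrounding whitespace.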

For a detailed description of the configuration parameters, refer to the article Hibernate configuration properties.

3.3 Programmatic configuration

We can obtain an org.hibernate.cfg.Configuration instance by instantiating it directly and specifying XML mapping documents. If the mapping files are in the classpath, use addResource(). For example:

Configuration cfg = new Configuration()
    .addResource("Employee.hbm.xml")
    .addResource("Department.hbm.xml");

An alternative way is to specify the mapped class and allow Hibernate to find the mapping document for you:

Configuration cfg = new Configuration()
    .addClass(net.viralpatel.hibernate.Employee.class)
    .addClass(net.viralpatel.hibernate.Department.class);

Hibernate will then search for mapping files named /net/viralpatel/hibernate/Employee.hbm.xml and /net/viralpatel/hibernate/Department.hbm.xml in the classpath. This approach eliminates any hardcoded filenames.
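
The naming convention can be pictured as a simple string transformation on the fully qualified class name. The helper below is a hypothetical illustration of that convention only; it is not Hibernate's own code, and the class name MappingResourceName is invented for the example.

```java
// Hypothetical helper illustrating the naming convention described above:
// dots in the fully qualified class name become slashes, and ".hbm.xml"
// is appended. This is a sketch, not Hibernate's implementation.
class MappingResourceName {

    static String forClass(Class<?> mappedClass) {
        return "/" + mappedClass.getName().replace('.', '/') + ".hbm.xml";
    }

    public static void main(String[] args) {
        // Uses a JDK class so the example runs without any mapped entities.
        System.out.println(forClass(java.util.ArrayList.class));
    }
}
```

Applied to net.viralpatel.hibernate.Employee, the same transformation gives the path quoted above.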

An org.hibernate.cfg.Configuration also allows you to specify configuration properties. For example:

Configuration cfg = new Configuration()
    .addClass(net.viralpatel.hibernate.Employee.class)
    .addClass(net.viralpatel.hibernate.Department.class)
    .setProperty("hibernate.dialect", "org.hibernate.dialect.MySQLInnoDBDialect")
    .setProperty("hibernate.connection.datasource", "java:comp/env/jdbc/test")
    .setProperty("hibernate.order_updates", "true");

4. Building a SessionFactory

Once the instance of org.hibernate.cfg.Configuration has been created using any of the above methods, a SessionFactory instance can be built as follows:

SessionFactory sessions = cfg.buildSessionFactory();

Hibernate does allow your application to instantiate more than one org.hibernate.SessionFactory. This is useful if you are using more than one database.

5. Getting Session instance

As noted above, a Session represents a communication channel between the database and the application. Each Session acts as a factory for transactions. A Session can be created from the SessionFactory as follows:

Session session = sessions.openSession(); // get a new Session

Thus, in this article we saw an overview of Hibernate ORM and its architecture. We also noted its different components, such as SessionFactory, TransactionFactory and Session, and the APIs used to instantiate these objects in your application.

In the next tutorial, we will write a Hello World Hibernate program using both XML-file-based configuration and annotations.

Stay tuned! :)

Filed under  //  Hibernate   Java  
Nov 21 / 2:41pm

Android Developers Blog: Multithreading For Performance

[This post is by Gilles Debunne, an engineer in the Android group who loves to get multitasked. — Tim Bray]

A good practice in creating responsive applications is to make sure your main UI thread does the minimum amount of work. Any potentially long task that may hang your application should be handled in a different thread. Typical examples of such tasks are network operations, which involve unpredictable delays. Users will tolerate some pauses, especially if you provide feedback that something is in progress, but a frozen application gives them no clue.

In this article, we will create a simple image downloader that illustrates this pattern. We will populate a ListView with thumbnail images downloaded from the internet. Creating an asynchronous task that downloads in the background will keep our application fast.

An Image downloader

Downloading an image from the web is fairly simple, using the HTTP-related classes provided by the framework. Here is a possible implementation:

static Bitmap downloadBitmap(String url) {
    final AndroidHttpClient client = AndroidHttpClient.newInstance("Android");
    final HttpGet getRequest = new HttpGet(url);

    try {
        HttpResponse response = client.execute(getRequest);
        final int statusCode = response.getStatusLine().getStatusCode();
        if (statusCode != HttpStatus.SC_OK) {
            Log.w("ImageDownloader", "Error " + statusCode + " while retrieving bitmap from " + url);
            return null;
        }

        final HttpEntity entity = response.getEntity();
        if (entity != null) {
            InputStream inputStream = null;
            try {
                inputStream = entity.getContent();
                final Bitmap bitmap = BitmapFactory.decodeStream(inputStream);
                return bitmap;
            } finally {
                if (inputStream != null) {
                    inputStream.close();
                }
                entity.consumeContent();
            }
        }
    } catch (Exception e) {
        // Could provide a more explicit error message for IOException or IllegalStateException
        getRequest.abort();
        // The exception itself is passed so that its stack trace is logged
        Log.w("ImageDownloader", "Error while retrieving bitmap from " + url, e);
    } finally {
        if (client != null) {
            client.close();
        }
    }
    return null;
}

A client and an HTTP request are created. If the request succeeds, the response entity stream containing the image is decoded to create the resulting Bitmap. Your application's manifest must request the INTERNET permission to make this possible.

Note: a bug in the previous versions of BitmapFactory.decodeStream may prevent this code from working over a slow connection. Decode a new FlushedInputStream(inputStream) instead to fix the problem. Here is the implementation of this helper class:

static class FlushedInputStream extends FilterInputStream {
    public FlushedInputStream(InputStream inputStream) {
        super(inputStream);
    }

    @Override
    public long skip(long n) throws IOException {
        long totalBytesSkipped = 0L;
        while (totalBytesSkipped < n) {
            long bytesSkipped = in.skip(n - totalBytesSkipped);
            if (bytesSkipped == 0L) {
                // skip() made no progress: try to read a single byte instead
                int b = read();
                if (b < 0) {
                    break;  // we reached EOF
                } else {
                    bytesSkipped = 1; // we read one byte
                }
            }
            totalBytesSkipped += bytesSkipped;
        }
        return totalBytesSkipped;
    }
}

This ensures that skip() actually skips the provided number of bytes, unless we reach the end of file.
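
To see the effect outside Android, the sketch below wraps a stream whose skip() never makes progress and compares the plain and the flushed behavior. StubbornStream and SkipDemo are hypothetical names invented for this illustration, and the helper class is repeated in compact form so that the example runs on a plain JVM.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

class SkipDemo {

    // Hypothetical stand-in for a slow connection: skip() always gives up.
    static class StubbornStream extends FilterInputStream {
        StubbornStream(InputStream in) { super(in); }
        @Override public long skip(long n) { return 0L; }
    }

    // Same logic as the FlushedInputStream helper shown above.
    static class FlushedInputStream extends FilterInputStream {
        FlushedInputStream(InputStream in) { super(in); }
        @Override public long skip(long n) throws IOException {
            long total = 0L;
            while (total < n) {
                long skipped = in.skip(n - total);
                if (skipped == 0L) {
                    if (read() < 0) {
                        break;       // end of stream
                    }
                    skipped = 1;     // one byte was consumed by read()
                }
                total += skipped;
            }
            return total;
        }
    }

    // Returns { bytes skipped by the plain stream, bytes skipped when wrapped }.
    static long[] skipBoth(int available, long toSkip) {
        try {
            InputStream plain =
                new StubbornStream(new ByteArrayInputStream(new byte[available]));
            InputStream flushed = new FlushedInputStream(
                new StubbornStream(new ByteArrayInputStream(new byte[available])));
            return new long[] { plain.skip(toSkip), flushed.skip(toSkip) };
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        long[] result = skipBoth(10, 4);
        System.out.println("plain: " + result[0] + ", flushed: " + result[1]);
    }
}
```

With 10 bytes available and 4 to skip, the plain stream skips nothing while the wrapped one skips all 4; if fewer bytes are available than requested, the wrapped stream stops at the end of the data.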

If you were to directly use this method in your ListAdapter's getView method, the resulting scrolling would be unpleasantly jaggy. Each display of a new view has to wait for an image download, which prevents smooth scrolling.

Indeed, this is such a bad idea that the AndroidHttpClient does not allow itself to be started from the main thread. The above code will display "This thread forbids HTTP requests" error messages instead. Use the DefaultHttpClient instead if you really want to shoot yourself in the foot.

Introducing asynchronous tasks

The AsyncTask class provides one of the simplest ways to fire off a new task from the UI thread. Let's create an ImageDownloader class which will be in charge of creating these tasks. It will provide a download method which will assign an image downloaded from its URL to an ImageView:

public class ImageDownloader {

    public void download(String url, ImageView imageView) {
        BitmapDownloaderTask task = new BitmapDownloaderTask(imageView);
        task.execute(url);
    }

    /* class BitmapDownloaderTask, see below */
}

The BitmapDownloaderTask is the AsyncTask which will actually download the image. It is started using execute, which returns immediately. This makes the download method really fast, which is the whole point, since it will be called from the UI thread. Here is the implementation of this class:

class BitmapDownloaderTask extends AsyncTask<String, Void, Bitmap> {
    private String url;
    private final WeakReference<ImageView> imageViewReference;

    public BitmapDownloaderTask(ImageView imageView) {
        imageViewReference = new WeakReference<ImageView>(imageView);
    }

    @Override
    // Actual download method, run in the task thread
    protected Bitmap doInBackground(String... params) {
        // params comes from the execute() call: params[0] is the url.
        // It is kept in the url field, which cancelPotentialDownload reads
        // later in this article to detect a download of the same URL.
        url = params[0];
        return downloadBitmap(url);
    }

    @Override
    // Once the image is downloaded, associates it to the imageView
    protected void onPostExecute(Bitmap bitmap) {
        if (isCancelled()) {
            bitmap = null;
        }

        if (imageViewReference != null) {
            ImageView imageView = imageViewReference.get();
            if (imageView != null) {
                imageView.setImageBitmap(bitmap);
            }
        }
    }
}

The doInBackground method is the one which is actually run by the task in its own background thread. It simply uses the downloadBitmap method we implemented at the beginning of this article.

onPostExecute is run in the calling UI thread when the task is finished. It takes the resulting Bitmap as a parameter, which is simply associated with the imageView that was provided to download and was stored in the BitmapDownloaderTask. Note that this ImageView is stored as a WeakReference, so that a download in progress does not prevent a killed activity's ImageView from being garbage collected. This explains why we have to check that both the weak reference and the imageView are not null (i.e. were not collected) before using them in onPostExecute.
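
The null check on the weakly referenced view can be tried on a plain JVM. The sketch below is only an illustration: Target is a hypothetical stand-in for ImageView, and clear() is called by hand to mimic what the garbage collector does, since collection cannot be forced reliably.

```java
import java.lang.ref.WeakReference;

class WeakTargetDemo {

    // Hypothetical stand-in for an ImageView.
    static class Target {
        String shown;
    }

    // Mirrors the check in onPostExecute: the result is delivered only if
    // the target behind the weak reference still exists.
    static boolean deliver(WeakReference<Target> ref, String value) {
        Target target = ref.get(); // null once the referent has been cleared
        if (target == null) {
            return false;
        }
        target.shown = value;
        return true;
    }

    public static void main(String[] args) {
        Target target = new Target();
        WeakReference<Target> ref = new WeakReference<Target>(target);

        System.out.println(deliver(ref, "first image"));  // target still reachable

        ref.clear(); // simulates the collector clearing the reference
        System.out.println(deliver(ref, "second image")); // nothing to update
    }
}
```

The second delivery is silently dropped, which is the behavior we want when the activity that owned the view is gone.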

This simplified example illustrates the use of an AsyncTask, and if you try it, you'll see that these few lines of code dramatically improve the performance of the ListView, which now scrolls smoothly. Read Painless threading for more details on AsyncTasks.

However, a ListView-specific behavior reveals a problem with our current implementation. Indeed, for memory efficiency reasons, ListView recycles the views that are displayed when the user scrolls. If one flings the list, a given ImageView object will be used many times. Each time it is displayed the ImageView correctly triggers an image download task, which will eventually change its image. So where is the problem? As with most parallel applications, the key issue is in the ordering. In our case, there's no guarantee that the download tasks will finish in the order in which they were started. The result is that the image finally displayed in the list may come from a previous item, which simply happened to have taken longer to download. This is not an issue if the images you download are bound once and for all to given ImageViews, but let's fix it for the common case where they are used in a list.

Handling concurrency

To solve this issue, we should remember the order of the downloads, so that the last started one is the one that will effectively be displayed. It is indeed sufficient for each ImageView to remember its last download. We will add this extra information in the ImageView using a dedicated Drawable subclass, which will be temporarily bound to the ImageView while the download is in progress. Here is the code of our DownloadedDrawable class:

static class DownloadedDrawable extends ColorDrawable {
    private final WeakReference<BitmapDownloaderTask> bitmapDownloaderTaskReference;

    public DownloadedDrawable(BitmapDownloaderTask bitmapDownloaderTask) {
        super(Color.BLACK);
        bitmapDownloaderTaskReference =
            new WeakReference<BitmapDownloaderTask>(bitmapDownloaderTask);
    }

    public BitmapDownloaderTask getBitmapDownloaderTask() {
        return bitmapDownloaderTaskReference.get();
    }
}

This implementation is backed by a ColorDrawable, which will result in the ImageView displaying a black background while its download is in progress. One could use a “download in progress” image instead, which would provide feedback to the user. Once again, note the use of a WeakReference to limit object dependencies.

Let's change our code to take this new class into account. First, the download method will now create an instance of this class and associate it with the imageView:

public void download(String url, ImageView imageView) {
    if (cancelPotentialDownload(url, imageView)) {
        BitmapDownloaderTask task = new BitmapDownloaderTask(imageView);
        DownloadedDrawable downloadedDrawable = new DownloadedDrawable(task);
        imageView.setImageDrawable(downloadedDrawable);
        task.execute(url);
    }
}

The cancelPotentialDownload method will stop any download in progress on this imageView, since a new one is about to start. Note that this is not sufficient to guarantee that the newest download is always displayed, since the earlier task may already have finished and be waiting in its onPostExecute method, which may still be executed after that of the new download.

private static boolean cancelPotentialDownload(String url, ImageView imageView) {
    BitmapDownloaderTask bitmapDownloaderTask = getBitmapDownloaderTask(imageView);

    if (bitmapDownloaderTask != null) {
        String bitmapUrl = bitmapDownloaderTask.url;
        if ((bitmapUrl == null) || (!bitmapUrl.equals(url))) {
            bitmapDownloaderTask.cancel(true);
        } else {
            // The same URL is already being downloaded.
            return false;
        }
    }
    return true;
}

cancelPotentialDownload uses the cancel method of the AsyncTask class to stop the download in progress. It returns true most of the time, so that the download can be started in download. The only case in which we don't want this to happen is when a download of the same URL is already in progress, in which case we let it continue. Note that with this implementation, if an ImageView is garbage collected, its associated download is not stopped. A RecyclerListener might be used for that.

This method uses a helper getBitmapDownloaderTask function, which is pretty straightforward:

private static BitmapDownloaderTask getBitmapDownloaderTask(ImageView imageView) {
    if (imageView != null) {
        Drawable drawable = imageView.getDrawable();
        if (drawable instanceof DownloadedDrawable) {
            DownloadedDrawable downloadedDrawable = (DownloadedDrawable)drawable;
            return downloadedDrawable.getBitmapDownloaderTask();
        }
    }
    return null;
}

Finally, onPostExecute has to be modified so that it will bind the Bitmap only if this ImageView is still associated with this download process:

if (imageViewReference != null) {
    ImageView imageView = imageViewReference.get();
    BitmapDownloaderTask bitmapDownloaderTask = getBitmapDownloaderTask(imageView);
    // Change bitmap only if this process is still associated with it
    if (this == bitmapDownloaderTask) {
        imageView.setImageBitmap(bitmap);
    }
}

With these modifications, our ImageDownloader class provides the basic services we expect from it. Feel free to use it or the asynchronous pattern it illustrates in your applications to ensure their responsiveness.
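
The idea of binding a result only if its task is still the current one can also be tried outside Android. The sketch below is an analogy, not the article's code: a hypothetical Slot class stands in for the recycled ImageView, plain threads stand in for AsyncTask, and a latch forces the older request to finish last so that the race is reproducible.

```java
import java.util.concurrent.CountDownLatch;

// Analogy only: Slot plays the role of a recycled ImageView.
class LastRequestWins {

    static class Slot {
        private final boolean checkToken;
        private Object currentToken; // identifies the most recently started request
        private String shown;        // what the "view" currently displays

        Slot(boolean checkToken) { this.checkToken = checkToken; }

        // Called when the slot is reused for a new item, like download().
        synchronized Object startRequest() {
            currentToken = new Object();
            return currentToken;
        }

        // Called when a request finishes, like onPostExecute().
        synchronized void deliver(Object token, String value) {
            if (!checkToken || token == currentToken) {
                shown = value; // stale results are ignored when checking
            }
        }

        synchronized String shown() { return shown; }
    }

    // Starts a slow request, then a fast one on the same slot, and makes the
    // slow one finish last. Returns what the slot shows at the end.
    static String simulate(boolean checkToken) {
        final Slot slot = new Slot(checkToken);
        final CountDownLatch fastDone = new CountDownLatch(1);

        final Object first = slot.startRequest();
        Thread slow = new Thread(() -> {
            try {
                fastDone.await(); // finish only after the newer request
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
            slot.deliver(first, "image for row 1");
        });

        final Object second = slot.startRequest();
        Thread fast = new Thread(() -> {
            slot.deliver(second, "image for row 2");
            fastDone.countDown();
        });

        slow.start();
        fast.start();
        try {
            slow.join();
            fast.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return slot.shown();
    }

    public static void main(String[] args) {
        System.out.println("without check: " + simulate(false));
        System.out.println("with check:    " + simulate(true));
    }
}
```

Without the check the slot ends up showing the image of the older row; with it, the late result is dropped and the newer image stays.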

Demo

The source code of this article is available online on Google Code. You can switch between and compare the three different implementations that are described in this article (no asynchronous task, no bitmap to task association and the final correct version). Note that the cache size has been limited to 10 images to better demonstrate the issues.

Future work

This code was simplified to focus on its parallel aspects and many useful features are missing from our implementation. The ImageDownloader class would first clearly benefit from a cache, especially if it is used in conjunction with a ListView, which will probably display the same image many times as the user scrolls back and forth. This can easily be implemented using a Least Recently Used cache backed by a LinkedHashMap of URL to Bitmap SoftReferences. A more involved cache mechanism could also rely on local disk storage of the images. Thumbnail creation and image resizing could also be added if needed.
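
As a rough sketch of such a cache, the class below combines an access-ordered LinkedHashMap with SoftReference values. It is written against the plain JDK with a generic value type so that it can be tried without Android; the class name SoftLruCache and its methods are assumptions made for this illustration, not part of the sample code.

```java
import java.lang.ref.SoftReference;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical LRU cache keyed by URL. With V = Bitmap it would hold the
// downloaded images; values are softly referenced so they can be reclaimed
// when memory runs low.
class SoftLruCache<V> {
    private final Map<String, SoftReference<V>> map;

    SoftLruCache(final int capacity) {
        // accessOrder = true keeps entries ordered from least to most
        // recently used, which is what an LRU policy needs.
        map = new LinkedHashMap<String, SoftReference<V>>(capacity, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, SoftReference<V>> eldest) {
                return size() > capacity; // evict the least recently used entry
            }
        };
    }

    synchronized void put(String url, V value) {
        map.put(url, new SoftReference<V>(value));
    }

    synchronized V get(String url) {
        SoftReference<V> ref = map.get(url);
        if (ref == null) {
            return null;
        }
        V value = ref.get();
        if (value == null) {
            map.remove(url); // the value was reclaimed by the garbage collector
        }
        return value;
    }

    synchronized int size() {
        return map.size();
    }

    public static void main(String[] args) {
        SoftLruCache<String> cache = new SoftLruCache<String>(2);
        cache.put("url1", "image 1");
        cache.put("url2", "image 2");
        cache.get("url1");            // url1 becomes the most recently used
        cache.put("url3", "image 3"); // evicts url2, the least recently used
        System.out.println(cache.get("url2"));
    }
}
```

A download method could consult get(url) first and start a task only on a miss; the methods are synchronized because the cache would be touched from both the UI thread and the task threads.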

Download errors and time-outs are correctly handled by our implementation, which will return a null Bitmap in these cases. One may want to display an error image instead.

Our HTTP request is pretty simple. One may want to add parameters or cookies to the request as required by certain web sites.

The AsyncTask class used in this article is a really convenient and easy way to defer some work from the UI thread. You may want to use the Handler class to have finer control over what you do, such as controlling the total number of download threads running in parallel.
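
If all you need is a cap on the number of simultaneous downloads, one option besides Handler is a fixed-size thread pool from java.util.concurrent. This is a different technique from the Handler approach mentioned above, shown here only as a sketch on a plain JVM; the class name BoundedDownloads and the short sleep that stands in for a download are made up for the example.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class BoundedDownloads {

    // Runs taskCount simulated downloads on a pool of poolSize threads.
    // Returns { highest number of tasks seen running at once, tasks completed }.
    static int[] run(int poolSize, int taskCount) {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        final AtomicInteger running = new AtomicInteger();
        final AtomicInteger peak = new AtomicInteger();
        final AtomicInteger completed = new AtomicInteger();

        for (int i = 0; i < taskCount; i++) {
            pool.execute(() -> {
                int now = running.incrementAndGet();
                peak.accumulateAndGet(now, Math::max); // record the high-water mark
                try {
                    Thread.sleep(10); // stands in for the actual download
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
                running.decrementAndGet();
                completed.incrementAndGet();
            });
        }

        pool.shutdown(); // no new tasks; queued ones still run
        try {
            pool.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return new int[] { peak.get(), completed.get() };
    }

    public static void main(String[] args) {
        int[] result = run(2, 8);
        System.out.println("peak parallel tasks: " + result[0]
            + ", completed: " + result[1]);
    }
}
```

Extra tasks wait in the pool's queue until a thread is free, so the number of downloads in flight never exceeds the pool size.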

Filed under  //  android