Coding for performance and avoiding garbage collection in Android

Source code (LICENSE)

Post updates:

  • [December 18th, 2011] added TEST 10: OBJECT POOL

You know how much important it is to always keep in mind the performance of your application when developing for mobile devices with limited resources. This is important especially for game developers that need to reach two goals:

  • write high performance code
  • avoid having the garbage collector work too much because it might have a negative impact on the user experience (you don’t want the users to see small pauses while playing a game)

In this post I’m going to show some tests I made for the Android platform. You can repeat the same tests on your own downloading the test app source code through the link on top of this post.

Here is my test environment:

  • Samsung Galaxy S II
  • Android 2.3.3 (Gingerbread)
  • Airplane mode on (to avoid network connections that could affect the test results)

NOTES:

  • these tests are not meant to be a 100% exact measure of performance differences between different solutions, they are made to show and verify what could be the best implementation choice depending on the context and they give only a rough measure of the difference in performance (specific to the test cases); the final conclusion is still true for similar implementation cases so a faster implementation is always faster, maybe not as much faster as shown in a test case, but still faster;
  • these tests are made for the Android platform, so a common good practice for a Java virtual machine might not be valid for the Dalvik virtual machine.

The structure of this post is simple. A series of tests will be described giving information about what is the best choice for a software developer and why. Are you ready? Let’s go!


TEST 1: FOR LOOP OVER ARRAY

When writing a for loop over arrays, the common way of implementing it is use the length property of the array in the termination condition. But instead of asking the array its length on every single iteration of the loop, we could just read it before entering the loop and cache it in a local variable.

What we test: is there a difference between using a local variable to cache the length of an array instead of using the length property in a termination condition of a for loop?


Standard implementation

// arr is an int[] array

for (int i = 0; i < arr.length; i++)
{
	// Do something
}


Optimized implementation

// arr is an int[] array

int arrLength = arr.length;

for (int i = 0; i < arrLength; i++)
{
	// Do something
}


Test results

The results show the average time taken to loop over an array with length = 3000000. The loop is repeated 100 times to check the average time taken for each complete loop.

  Average loop time Result
Standard 58 ms  
Optimized 43 ms 26% faster than standard


Conclusion

Caching the array length in a local variable is faster than reading the value for every single loop iteration. If you know that the length of an array will never change during the loop execution, then caching its value is a good idea to improve the execution speed.


TEST 2: FOR LOOP OVER ARRAYLIST

Like TEST 1, when writing a for loop over ArrayList, the common way of implementing it is use the size method of the ArrayList in the termination condition. But instead of asking the ArrayList its size on every single iteration of the loop, we could just read it before entering the loop and cache it in a local variable.

What we test: is there a difference between using a local variable to cache the size of an ArrayList instead of using the size method in a termination condition of a for loop?


Standard implementation

// arr is an ArrayList<Object>

for (int i = 0; i < arr.size(); i++)
{
	// Do something
}


Optimized implementation

// arr is an ArrayList<Object>

int arrLength = arr.size();

for (int i = 0; i < arrLength; i++)
{
	// Do something
}


Test results

The results show the average time taken to loop over an ArrayList with size = 3000000. The loop is repeated 100 times to check the average time taken for each complete loop.

  Average loop time Result
Standard 262 ms  
Optimized 233 ms 11% faster than standard


Conclusion

Caching the ArrayList size in a local variable is faster than reading the value for every single loop iteration. If you know that the size of an ArrayList will never change during the loop execution, then caching its value is a good idea to improve the execution speed. An ArrayList is generally slower than a simple array because of the method invocations involved in reading and writing values from and to the collection. Different collections have different performances (think about the access time to the elements of a HashMap), so different tests are necessary to test them.


TEST 3: VARIABLE FIELD ACCESS

If you add a variable field to a class, you can choose to make it public so other classes can read and write its value directly, or you can choose to implement getter and setter accessor methods and make the field private or protected.

What we test: is there a difference between accessing a variable field through getter and setter methods instead of accessing it directly?


Getter and setter methods

private long var;

public long getVar()
{
	return var;
}

public void setVar(long value)
{
	var = value;
}


Direct access

public long var;


Test results

The results show the average time taken to read and write the value of var in a loop executed 3000000 times. The loop is repeated 100 times to check the average time taken for each complete loop.

  Average loop time Result
Getter and setter 96 ms  
Direct access 60 ms 37% faster than getter and setter


Conclusion

Accessing directly a field of a class instance is always faster than getting and setting its value through accessor methods. If the performance is very important and you don’t need to perform any check while reading or writing the value of the field, then making it public and accessing it directly highly increases the speed (methods invocations are slow).


TEST 4: VARIABLE VS CONSTANT

A value can be stored in a variable field or in a constant field. If a value is expected to never change, storing it in a constant field is not only a good practice, but it also improves the speed of the implementation. Here we are going to verify this and we are going to test whether the use of a static field makes a difference in terms of performance.

What we test: is there a difference between reading a value of a variable field compared to the one of a constant field and does declaring it as static make a difference in terms of performance?


Variable field

public int var;


Static variable field

public static int var;


Constant field

public final int CONST;


Static constant field

public static final int CONST;


Test results

The results show the average time taken to read the value of var or CONST in a loop executed 5000000 times. The loop is repeated 500 times to check the average time taken for each complete loop.

  Average loop time Result
Variable field 63 ms  
Static variable field 67 ms slightly slower than variable
Constant field 55 ms faster than variable or static variable;
no difference between constant and static constant
Static constant field 55 ms faster than variable or static variable;
no difference between constant and static constant


Conclusion

If you know that a value will never change, then you should declare it as constant through the final keyword. Using constants instead of variables is always faster because the compiler can replace their value in the compiled code to avoid the variable field lookup time. In this test we also see that a static variable is slightly slower than an instance variable probably due to the different lookup procedure used by the virtual machine at runtime.


TEST 5: METHOD INVOCATIONS

To have a better structure of a software implementation that helps reuse and maintainability, it’s good to create a modular architecture with multiple classes and methods that solve specific tasks instead of writing very long methods with repeated pieces of code that solve exactly the same problem. This leads anyway to more method invocations instead of having all the necessary code in a single method to complete the task. We want to see what’s the impact of multiple method invocations instead of a single invocation.

What we test: how is the performance affected by multiple method invocations instead of a single invocation to complete the same task?


Multiple method invocations

private void execMethod()
{
	execMethod1();
}

private void execMethod1()
{
	execMethod2();
}

private void execMethod2()
{
	execMethod3();
}

private void execMethod3()
{
	// Do something
}


Single method invocation

private void execMethod()
{
	// Do something
}


Test results

The results show the average time taken to execute execMethod inside a loop repeated 3000000 times. The test is repeated 100 times to check the average time taken for each complete loop.

  Average loop time Result
Multiple method invocations 656 ms  
Single method invocation 175 ms 73% faster than multiple method invocations


Conclusion

Multiple method invocations can really slow down the execution speed, so if you need to write high performance code, you should keep in mind that sometimes it’s more important to put all the necessary code in a single method instead of having the same code spread all over different methods or classes. The code structure will not be optimal, but the performance benefit is notable.


TEST 6: POLYMORPHIC METHODS

Like we already said in TEST 5, to have a better structure of a software implementation that helps reuse and maintainability, it’s good to create a modular architecture with multiple classes so we can have the ability to implement polymorphic methods that execute a task in a different way depending on the specific class implementation. This is obtained through class inheritance and methods overriding. In this test we want to see what’s the impact of methods overriding on performance.

What we test: how is the performance affected by methods overriding?


Classes inheritance and methods overriding

public class PolyClass1
{
	public void execMethod()
	{
		// Do something
	}
}

public class PolyClass2 extends PolyClass1
{
	@Override
	public void execMethod()
	{
		super.execMethod();
	}
}

public class PolyClass3 extends PolyClass2
{
	@Override
	public void execMethod()
	{
		super.execMethod();
	}
}

public class PolyClass4 extends PolyClass3
{
	@Override
	public void execMethod()
	{
		super.execMethod();
	}
}

PolyClass4 polyClass = new PolyClass4();
polyClass.execMethod();


Direct method invocation

public class PolyClass1
{
	public void execMethod()
	{
		// Do something
	}
}

PolyClass1 polyClass = new PolyClass1();
polyClass.execMethod();


Test results

The results show the average time taken to execute execMethod inside a loop repeated 3000000 times. The test is repeated 100 times to check the average time taken for each complete loop.

  Average loop time Result
Classes inheritance and methods overriding 674 ms  
Direct method invocation 195 ms 71% faster than methods overriding


Conclusion

Classes inheritance and polymorphism are good to improve the source code structure, but the cost is the reduced performance compared to direct methods invocations. This is basically the same thing we saw in TEST 5 so we didn’t expect something different because each method in a subclass calls the corresponding one in the superclass (a method invocation). Since overriding a method is done to create a different version (with a different behavior) of the same method in a specific class, if you absolutely need high performance you could implement different methods that do something in a different way and call them directly instead of creating inherited classes with overridden methods to have the same result (of course some code will be repeated in this case).


TEST 7: VIRTUAL VS STATIC METHODS

If you declare a method in Java, it is virtual by default. In case it doesn’t need to access other instance methods or fields, it should be declared as static not only as a good programming practice (you clearly say that the method will not modify anything in a class instance), but also to improve performance.

What we test: is there a difference between virtual and static methods in terms of performance?


Virtual method

public void virtualMethod()
{
	// Do something
}


Static method

public static void staticMethod()
{
	// Do something
}


Test results

The results show the average time taken to execute virtualMethod or staticMethod inside a loop repeated 3000000 times. The test is repeated 100 times to check the average time taken for each complete loop.

  Average loop time Result
Virtual method 152 ms  
Static method 137 ms 10% faster than virtual method


Conclusion

Static methods are faster than virtual methods and since there are no drawbacks in declaring a method as static if it doesn’t use fields or methods specific to a class instance, you should always do that when you can.


TEST 8: ITERATION OVER ARRAY

Starting with Java 1.5, there’s an easy syntax to implement an iteration over arrays or collections instead of hand-writing the usual for loop. This gives performance improvements compared to some hand-written versions of the iteration, while it is indistinguishable from others.

What we test: what’s the difference between using an iterator over an array instead of a hand-written loop in terms of performance and objects allocation?


Iterator

// arr is an Object[] array declared as a class field

for (Object obj : arr)
{
	// Do something
}


Manual iteration

// arr is an Object[] array declared as a class field

int arrLength = arr.length;

for (int i = 0; i < arrLength; i++)
{
	// Do something with arr
}


Manual iteration with local array

// arr is an Object[] array declared as a class field

// We declare a variable localArr that is local to the
// method that implements the iteration (we avoid
// accessing the arr class field inside the loop).
Object[] localArr = arr;

int arrLength = localArr.length;

for (int i = 0; i < arrLength; i++)
{
	// Do something with localArr
}


Test results

The results show the average time taken to loop over an array with length = 5000000. The test is repeated 500 times to check the average time taken for each complete loop. It is also shown the number of objects allocated in the different loop implementations.

  Average loop time Objects allocation count Result
Iterator 90 ms 0 the same as manual iteration with local array
Manual iteration 102 ms 0 13% slower than iterator or manual iteration with local array
Manual iteration with local array 90 ms 0 the same as iterator


Conclusion

To understand this, you should take a look at the “Designing for Performance” document in the Android developers guide (look at “Use Enhanced For Loop Syntax”). The Iterator case is translated by the compiler in a way identical to the Manual iteration with local array case, that’s why the performance is the same. With simple arrays there is no difference between a hand-written optimized loop and the iteration syntax, so you should always prefer the latter to improve the code readability. In every case there are no objects allocations so we have no problems with the garbage collector. In this test we also see a performance improvement if we store a reference to the array locally in the method that executes the loop instead of accessing directly the array as a class instance field. This is another thing to keep in mind to improve performance (the lookup time for the array variable is reduced).


TEST 9: ITERATION OVER ARRAYLIST

We want to try the same test we did in TEST 8, but with an ArrayList instead of a simple array. When using the iteration syntax introduced in Java 1.5 with collections (like ArrayList) instead of simple arrays, there’s a difference in the compiled code. In this case an Iterator object is used to implement the iteration so we must also consider the extra objects allocation compared to a hand-written loop without the iteration syntax.

What we test: what’s the difference between using an iterator over an ArrayList instead of a hand-written loop in terms of performance and objects allocation?


Iterator

// arr is an ArrayList<Object> collection declared as a class field

for (Object obj : arr)
{
	// Do something
}


Manual iteration

// arr is an ArrayList<Object> collection declared as a class field

int arrLength = arr.size();

for (int i = 0; i < arrLength; i++)
{
	// Do something with arr
}


Manual iteration with local ArrayList

// arr is an ArrayList<Object> collection declared as a class field

// We declare a variable localArr that is local to the
// method that implements the iteration (we avoid
// accessing the arr class field inside the loop).
ArrayList<Object> localArr = arr;

int arrLength = localArr.size();

for (int i = 0; i < arrLength; i++)
{
	// Do something with localArr
}


Test results

The results show the average time taken to loop over an ArrayList with size = 3000000. The test is repeated 100 times to check the average time taken for each complete loop. It is also shown the number of objects allocated in the different loop implementations.

  Average loop time Objects allocation count Result
Iterator 481 ms 100 slower than both manual iterations and 100 objects allocated (an Iterator instance for each test repetition)
Manual iteration 246 ms 0 49% faster than iterator and 5% slower than manual iteration with local ArrayList
Manual iteration with local ArrayList 235 ms 0 51% faster than iterator and 5% faster than manual iteration


Conclusion

You might be surprised to see that the Iterator performance is much slower than a hand-written iteration loop, but the reason is simple: to implement the Iterator syntax, the compiler uses an Iterator object, so the loop is made invoking its hasNext and next methods and you know from our previous tests that method invocations are slow. A hand-written loop just uses a standard for syntax to implement the loop and doesn’t need to invoke any method resulting in a much faster execution. There’s another problem with the Iterator syntax: a new Iterator object is allocated for each complete loop execution and this means more objects to collect for the garbage collector. This can lead to a poor performance in games for example, because there could be many small pauses every time the garbage collector executes while the user is playing, making the overall experience not really great. You should choose the Iterator syntax only for applications that don’t need high performance code or continuous rendering (like games) because it helps to keep the source code cleaner and easier to read, but you should always prefer hand-written iterations when developing games or other similar applications. This test has been made with an ArrayList collection and might give different results with other collections, but it shows that even though the Iterator syntax is the same as the one for a standard array (like we saw in TEST 8), the compiled code is much different and we must consider the drawbacks of this syntax while dealing with collections.


TEST 10: OBJECT POOL

In order to reduce the work of the garbage collector, all you can do is reduce the amount of objects you create and recycle the instances you’ve already created. You can do this with the implementation of the Object Pool design pattern. There are different ways to implement an Object Pool, depending on the features you need, and a possible implementation is the one you can find in my other post titled “Recycling objects in Android with an Object Pool to avoid garbage collection“. That will be the implementation used in this test.

What we test: what’s the impact of an Object Pool in terms of performance and objects allocation compared to the standard object creation without recycling the instances?


Standard object creation

// Example of a standard object creation of an android.graphics.Point instance

Point point = new Point(x, y);


Object Pool

// Example of a creation of an android.graphics.Point instance through an Object Pool

// Object Pool initialization with a maximum capacity or NEEDED_POINTS_COUNT instances.
// The PointPoolObjectFactory is a factory class that creates PointPoolObject
// instances.
ObjectPool pointsPool = new ObjectPool(new PointPoolObjectFactory(), NEEDED_POINTS_COUNT);

// The PointPoolObject class extends android.graphics.Point and implements the
// interface needed to make it work with the Object Pool.
PointPoolObject point = (PointPoolObject)pointsPool.newObject();
point.x = x;
point.y = y;

// When the point instance is not needed anymore, we put it back in the Object Pool
pointsPool.freeObject(point);


Test results

The results show the average time taken to create and initialize 1000 Point instances inside a loop repeated 100 times to simulate a situation where we need 1000 Point instances to be ready to use at the same time and we need all of them 100 times during our application execution (note: with the Object Pool we also need to free the instances and store them back in the pool, while without the pool that’s the job of the garbage collector). The test is repeated 100 times to check the average time taken for each complete loop. It is also shown the number of objects allocated in the different loop implementations.

  Average loop time Objects allocation count Result
Standard object creation 180 ms 10000000 slower than the Object Pool implementation and 10000000 objects allocated (the total time varies depending on the speed of the garbage collector execution)
Object Pool 47 ms 1000 faster than the standard object creation (the garbage collector doesn’t need to work) and only 1000 objects allocated


Conclusion

As you can see, the Object Pool allows us to allocate only the total amount of Point instances that are needed at the same time (1000) while with a standard object creation we would allocate the objects multiple times without reusing them (1000 points x 100 times x 100 repetitions of the test = 10000000 instances). Of course we could find a strategy to recycle the points even without the Object Pool, but the reason to have an Object Pool is to make this simple and standardized across the source code of our application. As you surely noticed, the test execution takes even less time with the Object Pool and the reason for this is the garbage collector: with the Object Pool and only 1000 instances created, the garbage collector doesn’t need to work at all, while with the standard object creation it needs to collect many instances and it executes a lot of times making everything slower (you can see this if you try to run the test and check in the LogCat view of the DDMS perspective of Eclipse).

That’s it. This is of course only a subset of what could be tested and more tests could be added, but it covers most of the common scenarios that you have to face every time you need to make a decision about the implementation of an application.

When implementing applications (like games) that need to avoid garbage collection because of the small pauses it introduces every time the garbage collector executes, all you can do is avoid to keep on creating objects and try to reuse as much as possible the objects instances you already have. This can be done in different ways (like through Object Pools as we saw in TEST 10) and depends a lot on the specific implementation and application context, but it’s important to know the cases where you don’t create objects directly and you don’t realize that they are created anyway (like we saw with the Iterator syntax in TEST 9). It can be hard to discover such a hidden source of objects creation when you’ve already completed your source code implementation, so it’s better to know it in advance before any implementation decision.

If you think that something is wrong with these tests or you would like me to implement new ones, feel free to write a comment to this post. Every feedback is appreciated.

Comments

Jeff Akpunonu
Reply

Very well written article, good job!

m haneef
Reply

awesome! especially the object pool pattern & array-list vs array object looping. very usefu;

Deepthi
Reply

Very useful .. Thank you !

vishnu
Reply

Very well written article, good job!

maciek
Reply

Good article, After this my thinking about Java as an sh*** is more obvious then before. And I will create game using JNI , Java can be only poor(maybe not poor) “interface” for real programs, and games. Games in Java are majestic things for people who can’t C and assembler. That’s why I’m happy that most of universities are learning more Java then ANSI C.
but article is real good. Thanks man

SuperHich
Reply

Hi Andrea,
I really like your article and it’s sooo useful & of course I recommand it for all …

thanks a lot :)

Khawar Raza
Reply

Nice explanation…

to
Reply

Hello!

Amit Android
Reply

Nice Job buddy ….. it will help

ArthurFem
Reply

- ケイトスペード 財布 新作 2014 smart marketing strategy.

Leave a comment

name*

email* (not published)

website