Detecting memory leaks with Java VisualVM

Today we will discuss memory leaks: what they are, what causes them, and how to find them using a tool called Java VisualVM.

What is a memory leak

First we need to understand the heap. When a Java application starts, a configurable amount of dynamic memory is reserved for its process. This memory is shared between all threads, and every object created in the application lives at some address inside it. This memory is called the heap. On platforms like the JVM the heap is managed automatically: once nothing references an object anymore (there is no way to reach it through any variable), it becomes eligible for collection. The JVM runs the Garbage Collector on its own schedule to deallocate any memory that is no longer reachable from the application.
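
The reachability rule can be observed with a small sketch. It is not part of the demo application; the class name ReachabilityDemo is made up for illustration, and a WeakReference is used only as a way to ask whether an object is still in memory without keeping it alive.

```java
import java.lang.ref.WeakReference;

final class ReachabilityDemo {

    // A strong reference in a static field keeps the object reachable.
    static Object strong;

    /** Returns true if the object is still in memory while 'strong' points at it. */
    static boolean isAliveWhileReferenced() {
        strong = new Object();
        WeakReference<Object> weak = new WeakReference<>(strong);
        System.gc(); // only a hint, but a strongly reachable object is never collected
        return weak.get() != null;
    }

    public static void main(String[] args) {
        System.out.println("alive while referenced: " + isAliveWhileReferenced());

        WeakReference<Object> weak = new WeakReference<>(strong);
        strong = null; // remove the last strong reference
        System.gc();   // request a collection; the JVM is free to postpone it
        System.out.println("alive after dropping the reference: " + (weak.get() != null));
    }
}
```

The first line always prints true. The second usually prints false, but that depends on when the JVM decides to collect, which is exactly why leaked memory can only be judged after a GC has actually run.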

So a memory leak is a chunk of memory that is never deallocated because something still references it. There can be many reasons for this, for example a method or block that never returns, so its frame stays on the stack (each thread has its own stack, which holds the frames of the running methods with their local variables and object references, piled up in LIFO order). The worst case is when the forgotten references are collections that keep growing during the life of the application until they consume all the available memory.

Because Java applications run on the JVM, any memory leaked by the application is released when the JVM process exits. The application does not take memory directly from the OS, the JVM does. This ensures that all memory is given back after the application exits, even when the application fails.

Luckily, there is a great tool for investigating memory leaks called Java VisualVM. It comes with the JDK (up to JDK 8; for newer JDKs it is a separate download).

Dummy application implementation

I prepared a simple dummy application to simulate a memory leak. It runs on Spring MVC: we can create resources with POST /resources, list the resources collection with GET /resources, and clear all resources with DELETE /resources.

The resourceDb collection acts as a persistence store for all resources. The ids Set holds all existing IDs, which the application generates randomly when the POST method is called.

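import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Random;
import java.util.Set;
import java.util.stream.Collectors;

import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RequestMethod;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;

@RestController
@RequestMapping("/resources") // the path goes here; the value of @RestController is only a bean name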
public class ResourceController {

    public static List<Resource> resourceDb = new ArrayList<>(); // simulates persistence store
    public static Set<Integer> ids = new HashSet<>();  // stores IDs of resources
    private static final int MAX_ID_VALUE = 10; // max ID value

    /**
     * Get the list of all resources.
     * @return {@code List<Resource>}
     */
    @RequestMapping(method = RequestMethod.GET)
    public List<Resource> getResources() {
        return ids.stream()
                .map(this::findResourceById)
                .collect(Collectors.toList());
    }

    /**
     * Create a new resource with the selected data and randomly generated id.
     * @param data the resource data
     */
    @RequestMapping(method = RequestMethod.POST)
    public synchronized void setResource(@RequestParam String data) {
        int id = new Random().nextInt(MAX_ID_VALUE);
        ids.add(id);
        Resource resource = new Resource(id, data);
        resourceDb.add(resource);
        System.out.println(resourceDb);
        System.out.println(ids);
    }

    /**
     * Clear all resources.
     */
    @RequestMapping(method = RequestMethod.DELETE)
    public synchronized void clearResources() {
        List<Integer> idsClone = new ArrayList<>(ids);
        idsClone.forEach(this::deleteResource);
    }

    /**
     * Remove a resource by its id.
     * @param id the id of the resource to remove.
     */
    private void deleteResource(int id) {
        Resource resource = findResourceById(id);
        resourceDb.remove(resource);
        ids.remove(id);
    }

    private Resource findResourceById(int id) {
        for (Resource resource : resourceDb) {
            if (resource.getId()==id) return resource;
        }
        throw new RuntimeException("Resource with id " + id + " not found.");
    }
}

And the Resource class:

public class Resource {

    private int id;
    private String data;

    // constructors, getters, setters, equals, hashCode and toString here
}

So when POST /resources?data=some_data_here is called, the application generates a new integer ID, saves this ID in the ids Set, creates a new Resource with this ID and the data from the request parameters, and saves this Resource in the resourceDb collection.

When DELETE /resources is called, it iterates over all IDs in the ids Set and, for each ID, removes the corresponding Resource from resourceDb.

The memory leak happens when a generated ID already exists in the ids Set: another Resource with the same ID is still saved in the resourceDb collection. The DELETE method removes only one Resource per ID, leaving the others in resourceDb while the ids Set ends up empty. So we are left with objects in resourceDb that we cannot reach through our API but that still consume memory. These resources can pile up without bound, eventually crashing the application with an OutOfMemoryError.
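
The mechanism can be reproduced without Spring. The sketch below is a stripped-down stand-in, not the controller itself: the class names LeakRepro and Res are made up, and the IDs are passed in by hand instead of being random so that the collision always happens.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class LeakRepro {

    /** Simplified stand-in for the Resource class. */
    static final class Res {
        final int id;
        final String data;
        Res(int id, String data) { this.id = id; this.data = data; }
    }

    static final List<Res> resourceDb = new ArrayList<>();
    static final Set<Integer> ids = new HashSet<>();

    /** Same logic as POST /resources, with the id supplied by the caller. */
    static void create(int id, String data) {
        ids.add(id);                       // a duplicate id is silently ignored by the Set
        resourceDb.add(new Res(id, data)); // but the List grows on every call
    }

    /** Same logic as DELETE /resources: removes one Resource per known id. */
    static void clear() {
        for (Integer id : new ArrayList<>(ids)) {
            for (Res r : resourceDb) {
                if (r.id == id) {
                    resourceDb.remove(r);  // first match only
                    break;                 // leave the loop right after modifying the list
                }
            }
            ids.remove(id);
        }
    }

    public static void main(String[] args) {
        create(1, "a0");
        create(7, "a1");
        create(6, "a2");
        create(0, "a3");
        create(7, "a4"); // second Resource with id 7
        clear();
        System.out.println("ids left: " + ids.size());              // prints "ids left: 0"
        System.out.println("resources left: " + resourceDb.size()); // prints "resources left: 1"
    }
}
```

The one Resource that survives is the second one with id 7: nothing in ids points to it anymore, but resourceDb still holds it.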

Dummy application in action

Let’s create some resources:

curl -v -X POST localhost:8080/resources?data=a0
curl -v -X POST localhost:8080/resources?data=a1
curl -v -X POST localhost:8080/resources?data=a2
curl -v -X POST localhost:8080/resources?data=a3
curl -v -X POST localhost:8080/resources?data=a4

It will create the following resources:

resourceDb:  [Resource{id=1, data='a0'}, Resource{id=7, data='a1'}, Resource{id=6, data='a2'}, Resource{id=0, data='a3'}, Resource{id=7, data='a4'}]
ids:         [0, 1, 6, 7]
  • Notice that Resource{id=7, data='a1'} and Resource{id=7, data='a4'} have the same ID, while ids contains 7 only once.

Let’s clear resources:

curl -v -X DELETE http://localhost:8080/resources

It will result in the following:

resourceDb:  [Resource{id=7, data='a4'}]
ids:         []

Reading all resources now returns an empty array:

curl -X GET localhost:8080/resources
[] 

It is clear that Resource{id=7, data='a4'} is an object that leaked from our system. If we scale this to millions of requests, these leaked resources will occupy a significant part of the memory. This can hurt the application very badly.
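
For comparison, here is one possible way to close the leak, offered as a sketch rather than as the fix used by the demo application. It assumes that overwriting an existing resource with the same ID is acceptable, and the class name NoLeakStore is made up. The rest of this walkthrough keeps measuring the leaky version.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class NoLeakStore {

    // One map replaces both resourceDb and ids, so the two can never disagree.
    static final Map<Integer, String> resources = new LinkedHashMap<>();

    /** Stores the data under the id; an existing entry with that id is overwritten. */
    static void create(int id, String data) {
        resources.put(id, data);
    }

    /** Removes everything, so no entry can be left behind. */
    static void clear() {
        resources.clear();
    }

    public static void main(String[] args) {
        create(1, "a0");
        create(7, "a1");
        create(6, "a2");
        create(0, "a3");
        create(7, "a4"); // replaces the entry stored for id 7
        System.out.println("before clear: " + resources.size()); // prints "before clear: 4"
        clear();
        System.out.println("after clear: " + resources.size());  // prints "after clear: 0"
    }
}
```

The design choice here is to have a single source of truth. The leak exists only because two collections describe the same data and are allowed to drift apart.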

Simulating thousands of requests with JMeter

In our case we know that resourceDb is the collection that holds the leaked objects, but usually when a memory leak happens in an application we do not know the cause and have to find the leaking objects ourselves. Since Spring MVC itself also consumes a lot of memory, we will create thousands of objects by simulating thousands of POST requests, so that the leak stands out.

First I will set MAX_ID_VALUE to 10000 so we have a bigger range of ID values. To limit the memory of the running Java application, I will start it with the flag -Xmx256m (the maximum heap size will be 256 MB). This way the JVM will invoke the GC more often.

I prepared a configuration to do this using JMeter, a great tool to stress test any application. The JMeter config file can be found here. It uses 1000 concurrent threads to call POST /resources?data=${random_data} every 100 ms, and keeps doing so until stopped manually.
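
To check that the -Xmx flag was picked up, the JVM can be asked for its own view of the heap. This is a small standalone sketch, not part of the demo application; the class name HeapLimitCheck is made up.

```java
final class HeapLimitCheck {

    private static final long MB = 1024L * 1024L;

    /** Maximum heap size the JVM will try to use, in MB. */
    static long maxHeapMb() {
        return Runtime.getRuntime().maxMemory() / MB;
    }

    /** Heap currently in use (reserved minus free), in MB. */
    static long usedHeapMb() {
        Runtime rt = Runtime.getRuntime();
        return (rt.totalMemory() - rt.freeMemory()) / MB;
    }

    public static void main(String[] args) {
        System.out.println("max heap:  " + maxHeapMb() + " MB");
        System.out.println("used heap: " + usedHeapMb() + " MB");
    }
}
```

When started with java -Xmx256m HeapLimitCheck, the reported maximum should be close to 256 MB. Depending on the garbage collector in use it can be somewhat lower, so treat the number as approximate.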

Inspecting the memory with Java VisualVM

In the JDK installation folder we can find the bin/jvisualvm application. Let’s start it. While the memory leak application is running, we will see it listed in the jvisualvm UI. Let’s open it and go to the Monitor tab. We will investigate the Heap graph.

While JMeter fires thousands of requests, creating thousands of resources, the consumed memory increases gradually from around 50 MB to around 150 MB. See the results in the next image:

[Image: Generate resources]

By firing the DELETE /resources request while JMeter is still running, we notice that the consumed memory drops from around 150 MB to around 75 MB. But it does not return to the initial 50 MB: 25 MB are stuck somewhere. And memory continues to increase while JMeter is running. The results after a DELETE request can be seen in the following image:

[Image: Generate resources]

Stopping JMeter and then calling the DELETE method again stabilizes the consumed memory at around 125 MB. Compared with the initial 50 MB, we have around 75 MB of leaked memory. The results can be seen in the following image:

[Image: Generate resources]

In jvisualvm we can force a GC run by clicking the Perform GC button, and we can see the allocated memory decreasing. Now let’s click Heap Dump. A new tab opens, and in the Classes window we can see all the classes used in the app, with their number of instances and allocated memory. The result can be seen in the following image:

[Image: Generate resources]

We notice that our com.memoryleak.demo.Resource class has 14771 instances. This does not look right, because a GET /resources request returns an empty array. So we clearly have a leak of 14771 Resource instances.
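
The heap dump does not have to be taken from the UI. As a rough sketch, it can also be written from code with HotSpotDiagnosticMXBean, a HotSpot-specific API from the com.sun.management package, so this assumes a HotSpot-based JVM. The class name HeapDumper and the file name demo.hprof are made up for illustration.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.lang.management.ManagementFactory;
import java.nio.file.Files;
import java.nio.file.Path;

final class HeapDumper {

    /**
     * Writes a heap dump to the given file, which must not exist yet.
     * Passing live=true dumps only objects that are still reachable.
     */
    static void dump(Path file) throws IOException {
        HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        bean.dumpHeap(file.toString(), true);
    }

    /** Dumps into a fresh temporary directory and returns the size of the dump in bytes. */
    static long dumpToTempFile() {
        try {
            Path dir = Files.createTempDirectory("heapdump");
            Path file = dir.resolve("demo.hprof");
            dump(file);
            System.out.println("wrote " + file);
            return Files.size(file);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(dumpToTempFile() + " bytes");
    }
}
```

The resulting .hprof file can be loaded into VisualVM, which then shows the same Classes view as a dump taken from the UI.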

Tip

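The Resource class can override the finalize method from the Object class, which the JVM calls before the object is reclaimed by the garbage collector. Note that finalize has been deprecated since Java 9, so use it only as a debugging aid.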

public class Resource {

    private int id;
    private String data;

    // constructors, getters, setters, equals, hashCode and toString here

    @Override
    protected void finalize() {
        System.out.println("GC on resource: " + this.toString());
    }
}

With this method in place, when calling DELETE /resources we can notice that even though we drop the references to our Resource objects, the GC does not run immediately. We have to wait a few seconds or even minutes until the objects are cleared from memory and finalize is called:

...
GC on resource: Resource{id=2953, data='data_6094'}
GC on resource: Resource{id=5108, data='data_5925'}
GC on resource: Resource{id=2295, data='data_659'}
...

Investigating memory leaks is a tedious job on big applications because we need to dig through a lot more instances. Using tools like JMeter we can simulate heavy load so that the memory leak becomes a lot bigger and easier to spot in a pool of millions of objects. There are other tools that help investigate the heap dump in more depth, but I will leave them for another article.

I hope this was educational.

Happy coding!

Written on June 8, 2016