GC overhead in Java

(Disclaimer: I'm not a Java expert).

I've just read this blog post by Julia Evans which is about how the Java GC can die a slow death if there is memory pressure. While implementing an algorithm from my bachelor's thesis in Java, GC bit me as well.

Garbage collection is wonderful if there's no need for close control over object allocation / deallocation. I also think that it's not the GC's fault if the air gets stuffy amongst millions and millions of objects. In this case there's something wrong with the language or the software design.

In my case the underlying problem was Java's fundamentalist OO approach. The only value types (~ "unboxed" types) Java has are the primitives (byte, int, long, ...). There are also arrays thereof (e.g. int[]), which are reference types (objects), but the values contained inside are still (unboxed) primitives. The elements in an array of primitives are all contained in a single contiguous sequence of memory, which is good.

But the problem with Java arrays is that they are about as convenient to use as malloc'ed memory in C. Are dynamically resizing arrays of primitives so much to ask for? Sorry, they don't ship with Java. There are Generics but only reference types are allowed as parameters: There is no ArrayList<int> but only ArrayList<Integer>. Which adds probably like 300% memory overhead, and worse, tremendous GC overhead.

Since we can't rely on ArrayList<Integer> we have to implement our own IntArray class.

class IntArray {
    private int[] buf;
    private int size;
    private int capacity;

    IntArray(...) {...}

    int size() { return size; }
    void add(int x) {...}
    int pop() {...}
    int at(int pos) {...}
    void clear() {...}
    void resize(int newSize) {...}
    void shrink_to_fit() {...}

    /* also implement the Iterable interface here */
}

Much better performance now. It's unfortunate that we can't have nice subscript syntax, but even the built-in Generics don't have it.

But what if an array of small flat structs is needed? Like

class MyStruct {
    int a, b, c;

    MyStruct(...) {...}
}

void myAlgo(...) {
    ArrayList<MyStruct> array = new ArrayList<MyStruct>();
    ...
}

As a user-defined type MyStruct is a reference type (~ "boxed" type), which implies object overhead, and again, GC overhead. (And of course it also imposes the semantic misfit which is object identity for all but the few true OO-style classes).

We're hosed. So let's build another class ArrayOfMyStructs.

class ArrayOfMyStructs {
    private IntArray a;
    private IntArray b;
    private IntArray c;
    private int size;
    
    ArrayOfMyStructs(...) {...}

    void at(int pos, MyStruct out) {
        out.a = a.at(pos);
        out.b = b.at(pos);
        out.c = c.at(pos);
    }

    void add(int x, int y, int z) {
        a.add(x);
        b.add(y);
        c.add(z);
        size++;
    }

    /* and so on. Also implement Iterable! */
}

It seems there is no way around making a SOA (struct of (primitive) arrays) instead of the straightforward AOS (array of structs) if both large amounts of small records and performance are required.

Needless to say, this gets tedious very quickly, and more so if we need data structures that are more complex than simple resizable arrays (which are boilerplate but easy to implement).

In C++ we don't need to go through all that trouble. A std::vector<MyStruct> does its job perfectly. And while C++ is a super complex language it's still the only mainstream one (that I know of) which gets the following things about right:

Which is why C++ might still be our best bet if we're having a somewhat performance-sensitive problem that involves dealing with large amounts of records :(.

References

Other people consider this a problem: State of the Values, coauthored by Guy Steele.


Created: 2016-04-23
Last update: 2016-04-24
Last minor correction: 2017-05-25