Integer collections

Summary

Integer collections are specialized java classes that replace generalist collection classes provided by the Java language. The counterpart of their specialization is better speed and memory footprint.

Autoboxing primitive int type to Integer class and unboxing Integer class to int primitive type is programmaticaly practical, but it leads JVM to create many tiny Integer objets into heap. This consumes memory and gives extra work to garbage collector. Each time you add an Integer into a Set<Integer> you create a new Integer objet into heap. With intcolls classes no Integer or Long objects are created. BinaryIntSet and BinaryLongSet use fixed sized Blocks objets, which contain long arrays. Each long value in a slot can store up to 64 values of the set.

Concurrent version of binary integer sets are not synchronized on themselves : they are not Collections.synchronizedSet(b). Instead, they are synchronized on inner blocks, and size is kept into an AtomicInteger.

Classes of this library are fully integrated with java collections framefork and are generics aware.

Tested with JUnit, 100% test coverage is enforced by Emma.

Concurrent classes are tested with ConTest.

Class Interface Standard implementation
BinaryIntSet NavigableSet<Integer> TreeSet<Integer>
ArrayIntList List<Integer> ArrayList<Integer>
DualArrayIntMap<V> NavigableMap<Integer,V> TreeMap<Integer,V>
BinaryLongSet NavigableSet<Long> TreeSet<Long>
ArrayLongList List<Long> ArrayList<Long>
DualArrayLongMap<V> NavigableMap<Long,V> TreeMap<Long,V>
ConcurrentBinaryIntSet NavigableSet<Integer> ConcurrentSkipListSet<Integer>
ConcurrentBinaryLongSet NavigableSet<Long> ConcurrentSkipListSet<Long>

Code samples

On a first level, you just have to replace your HashSet<Integer> and TreeSet<Integer> instanciations by BinaryIntSet instanciations. For example, with Set<Integer> :

Set<Integer> myset = new HashSet<Integer>(); myset.add(10); myset.add(12); for (int i : myset) { System.out.println("value="+i); }
Set<Integer> myset = new BinaryIntSet(); myset.add(10); myset.add(12); for (int i : myset) { System.out.println("value="+i); }

With sets of Long :

Set<Long> myset = new HashSet<Long>(); myset.add(10L); myset.add(12L); for (long i : myset) { System.out.println("value="+i); }
Set<Long> myset = new BinaryLongSet(); myset.add(10L); myset.add(12L); for (long i : myset) { System.out.println("value="+i); }

You can inprove your code by adding directly primitive int type rather than Integer objects, and transforming foreach loops into explicit specialized iterator :

BinaryIntSet myset = new BinaryIntSet(); myset.addInt(10); myset.addInt(12); for (BinaryIntSetIterator i = myset.iterator(); i.hasNext()) { System.out.println("value="+i.nextInt()); }

Performances

All those 6 classes are tested with JUnit tool

I have tested BinaryIntSet against HashSet with Sieve of Eratostenes (cf Wikipedia)

Implementation

Integer values are stored into Blocks, witch contain slots of 64 bits (long primitive type). Blocks map is a DualArrayMap. Keys are block numbers. The block number is a multiple of block size, witch is always a power of two.

Empty blocks are neved kept into the map : they are removed from the map as soon as the last element is removed from the block.