Over the past months we tested one of our future development hypotheses, on collection overhead. The research was based on the fact that the framework provides default values for things such as the initial size of collections. Our hypothesis was that most people do not bother managing the sizes of their collections themselves, and therefore end up with under-utilised collections that waste memory. If this were so, we could implement a solution that tells people where their half-empty collections are created and what they could do to avoid wasting memory.
To test our hypothesis, we decided to measure the usage of thirteen of the most common java.util collection classes, such as java.util.ArrayList and java.util.HashMap.
For every object deriving from those classes, we measured once every 30 seconds the number of objects contained in the collection and the number of empty slots in the array(s) underlying the collection. The memory consumed by those empty slots is the waste of the collection. This data was collected from 556 different real-world applications using a special version of Plumbr.
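To make the waste measure concrete, here is a minimal sketch of how it could be computed from such measurements. It is not Plumbr's actual implementation: the `Sample` class, the method names and the 4 bytes per empty slot (a 64-bit JVM with compressed references) are all assumptions made for illustration.

```java
import java.util.List;

class WasteEstimate {
    // Assumption: each empty slot is a 4-byte reference
    // (64-bit JVM with compressed object pointers).
    static final int BYTES_PER_SLOT = 4;

    // Hypothetical measurement of one array-backed collection.
    static final class Sample {
        final int size;      // number of objects held by the collection
        final int capacity;  // number of slots in the underlying array

        Sample(int size, int capacity) {
            this.size = size;
            this.capacity = capacity;
        }
    }

    // Waste of a single collection: empty slots times the size of a slot.
    static long wastedBytes(Sample s) {
        return (long) (s.capacity - s.size) * BYTES_PER_SLOT;
    }

    // Total waste over all collections seen in one 30-second measurement.
    static long totalWastedBytes(List<Sample> samples) {
        return samples.stream().mapToLong(WasteEstimate::wastedBytes).sum();
    }

    public static void main(String[] args) {
        List<Sample> samples = List.of(
            new Sample(3, 10),    // e.g. an ArrayList with default capacity 10 holding 3 elements
            new Sample(16, 16));  // a completely full array wastes nothing
        System.out.println(totalWastedBytes(samples) + " bytes wasted");
    }
}
```

Under these assumptions, an ArrayList left at its default capacity of 10 and holding three elements wastes seven slots, or 28 bytes. That is tiny on its own, so the question is how it adds up across all the collections in an application.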
To estimate the extent of the problem, we calculated the total waste over all collections at each 30-second measurement, and took the maximum and the average of these totals for each session. As the following diagram also shows, most of the sessions (397 out of 556) had a maximum total waste of less than a megabyte, and only 32 sessions had a maximum total waste of over 10 MB, with the largest at around 60 MB.
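The per-session aggregation described above can be sketched as follows. The class name, the method name and the sample numbers are made up for illustration and do not come from the study.

```java
import java.util.LongSummaryStatistics;
import java.util.stream.LongStream;

class SessionWaste {
    // totals[i] is the total waste in bytes over all collections
    // at the i-th 30-second measurement of one session.
    // Note: for an empty array, getMax() returns Long.MIN_VALUE and getAverage() returns 0.
    static LongSummaryStatistics summarize(long[] totals) {
        return LongStream.of(totals).summaryStatistics();
    }

    public static void main(String[] args) {
        // Made-up totals for a single session, in bytes.
        long[] totals = {200_000L, 900_000L, 400_000L};
        LongSummaryStatistics stats = summarize(totals);
        System.out.println("maximum total waste: " + stats.getMax() + " bytes");
        System.out.println("average total waste: " + stats.getAverage() + " bytes");
    }
}
```

With these made-up totals, the session's maximum total waste is 900,000 bytes, so it would fall into the "less than a megabyte" group.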
For the full data from the study, see the original post about collection overhead measurement.