1. 10 = Highest ranking? What is that, the Linux kernel? Java's runtime library (it's open source now)? The highest theoretical ranking?
The highest ranking is 10, on a logarithmic scale from 1 to 10. A 10 means that 100% of the open source projects in our database (which currently holds more than 162,000 projects from SourceForge and Google Code) reuse at least one artifact from the project being ranked.
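To make the idea of a logarithmic 1-to-10 scale concrete, here is a minimal sketch. The actual formula behind the ranking is not given above, so the mapping below (the function name `reuse_score` and its log-ratio form) is purely an assumption for illustration: it only reproduces the two stated properties, a scale from 1 to 10 and a score of 10 when every project in the database reuses the ranked project.

```python
import math


def reuse_score(reusing_projects: int, total_projects: int) -> float:
    """Hypothetical log-scale score in [1, 10]; NOT the service's real formula.

    reusing_projects: projects that reuse at least one artifact of the ranked project
    total_projects:   all projects in the database
    """
    if total_projects < 2:
        raise ValueError("need at least two projects to define a scale")
    if not 0 <= reusing_projects <= total_projects:
        raise ValueError("reusing_projects must be between 0 and total_projects")
    if reusing_projects <= 1:
        return 1.0  # bottom of the scale
    # Assumed mapping: position of log(count) between log(1) and log(total).
    return 1.0 + 9.0 * math.log(reusing_projects) / math.log(total_projects)


def ratio_per_point(total_projects: int) -> float:
    """Factor in reusing projects that one score point represents under this mapping."""
    return total_projects ** (1.0 / 9.0)


if __name__ == "__main__":
    total = 162_000
    print(reuse_score(total, total))  # full reuse sits at the top of the scale
    print(ratio_per_point(total))
```

Under this assumed mapping, one point on the scale corresponds to a factor of `total_projects ** (1/9)` in the number of reusing projects, which is roughly 3.8 for 162,000 projects rather than a factor of 10.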
2. The scaling aspect seems weird. I assume that you are analysing a graph of dependencies between OS systems.
Right: something like "reuse of open source components by the open source community".
The degrees of these graphs follow a power-law distribution. Thus, I would expect that a project with a higher score would have an order of magnitude more dependencies. I can assume that junit is, or was, used by the majority of healthy Java open-source programs. I do not see how it is only one order of magnitude more used than ehcache, a niche open-source library.
I agree, we were surprised too. By the way, we compare the ehcache ranking to those of JBoss-cache and memcache. I would just say that "the last steps are always the most difficult to climb".
3. That brings up the issue of sampling. If you consider ehcache very popular, then you are obviously biased towards JEE open source. I doubt many end-user apps (e.g. JavaME programs) use this.
Our sample is a random selection of 162,000 projects from Google Code and SourceForge. About one third are Java-based.
4. You are only analysing Java open-source ("upload your jar...")
No, try it with a DLL, a source file, or a GIF; it works too. Java open source projects are just very popular, and so they provide a lot of relevant examples.
5. Instead of focusing on reuse only, I would also try using graph analysis algorithms like HITS to find the "meta-packages"
Very interesting idea. I just checked and found some references, which I will read. I talked with Roberto Di Cosmo, who is working within the EU-funded Mancoosi project. They have done some great work on dependency graphs within package distributions.
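For readers unfamiliar with HITS, here is a minimal sketch of the textbook algorithm applied to a reuse graph. Everything specific in it is an assumption for illustration: the edge direction (project -> component it reuses), the function name `hits`, and the toy edge list, which is made up and not taken from our database. Under that direction, a high authority score marks a widely reused component, and a high hub score marks a project that pulls in many well-regarded components, which is one possible reading of a "meta-package".

```python
import math


def _normalize(scores):
    """Scale scores to unit Euclidean length so they stay bounded."""
    norm = math.sqrt(sum(v * v for v in scores.values())) or 1.0
    return {node: v / norm for node, v in scores.items()}


def hits(edges, iterations=50):
    """Plain power-iteration HITS on a list of (source, target) edges.

    Returns (hub, authority) dictionaries keyed by node name.
    """
    nodes = {node for edge in edges for node in edge}
    hub = {node: 1.0 for node in nodes}
    authority = {node: 1.0 for node in nodes}
    for _ in range(iterations):
        # Authority update: sum the hub scores of every node pointing at you.
        authority = {node: 0.0 for node in nodes}
        for source, target in edges:
            authority[target] += hub[source]
        authority = _normalize(authority)
        # Hub update: sum the authority scores of every node you point at.
        hub = {node: 0.0 for node in nodes}
        for source, target in edges:
            hub[source] += authority[target]
        hub = _normalize(hub)
    return hub, authority


# Made-up toy graph: an edge (a, b) means "project a reuses component b".
TOY_EDGES = [
    ("app_a", "lib_common"),
    ("app_b", "lib_common"),
    ("app_c", "lib_common"),
    ("app_c", "lib_niche"),
    ("bundle", "lib_common"),
    ("bundle", "lib_niche"),
    ("bundle", "lib_rare"),
]

if __name__ == "__main__":
    hub, authority = hits(TOY_EDGES)
    print("top authority:", max(authority, key=authority.get))
    print("top hub:", max(hub, key=hub.get))
```

On this toy graph the most reused component comes out as the top authority and the project that aggregates the most components comes out as the top hub. A real analysis would run on the full dependency graph and would probably use an existing graph library rather than this hand-rolled loop.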
Many thanks for this feedback.
Good luck on your project,
Stephane