News: Only 3%? Really? I Thought it Would Be Higher

  1. Antelink just launched a service that monitors the rate of reuse of open source libraries within a large set of open source projects. This free service is a byproduct of their R&D activity and is based on a large sample of more than 162,000 projects from SourceForge and Google Code. It covers projects from the very beginning of open source history up to the present, without filtering based on technologies.

    According to their research, 3% of all the open source software ever created uses the Apache Commons libraries.

    Also according to their research, here are the top five most reused components from Apache Commons:

    * Logging: Wrapper around a variety of logging API implementations.

    * Collections: Extends or augments the Java Collections Framework.

    * Lang: Provides extra functionality for classes in java.lang.

    * BeanUtils: Easy-to-use wrappers around the Java reflection and introspection APIs.

    * HttpClient: An HTTP/1.1-compliant HTTP agent implementation based on HttpCore (HttpClient is now an independent project).

    Most Reused Apache Components, According to Antelink

    Edited by: Cameron McKenzie on Apr 20, 2010 9:25 AM
  2. Short answer: Dependencies kill. This is the biggest hurdle with reuse, and it's often ignored by those intent on creating reusable APIs. Adding a costly dependency to a project in order to avoid writing a two-line method (once) is not a good idea.
  3. It is enterprises' internal projects that use these libraries. Let me tell you this: one of the most important factors in my evaluation of any open source library (after the license) is its dependencies.

    So, looking at the results above, I see it as a positive sign. That's very good! Open source libraries should not have any dependencies (if they can help it).
  4. Defining a "metric" is always something tricky.
    For instance, this rate of reuse is computed by counting projects since "the origin" of open source development. Some projects in our database started more than 15 years ago, so there is a dilution effect, but we find it interesting to compare new libs with older ones (instead of benchmarking per year, for instance). Java is a younger technology than C, and is therefore at a disadvantage.
    And after all, Java is only number 2 ;-) (see your last post about TIOBE's April report)

  5. 3% of 11% is a lot

    It looks like 11% of all open source projects are written in Java (http://www.linuxplanet.com/linuxplanet/newss/6831/1/), so roughly 27% (3 / 11) of all Java open source projects use Apache Commons.
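
    The arithmetic behind that estimate can be sketched as follows. This is only a rough illustration: it assumes that every project using Apache Commons is a Java project, and that the 3% and 11% figures were measured over comparable sets of projects, neither of which is confirmed by the sources. The function name `share_within_subset` is made up for this example.

```python
def share_within_subset(overall_pct: float, subset_pct: float) -> float:
    """Return the percentage of a subset that has some property.

    Assumption (not confirmed by the sources): every project with the
    property (uses Apache Commons) belongs to the subset (Java projects).
    """
    if subset_pct <= 0:
        raise ValueError("subset_pct must be positive")
    return overall_pct / subset_pct * 100.0


commons_pct = 3.0   # share of all projects using Apache Commons (Antelink figure)
java_pct = 11.0     # share of all projects written in Java (figure cited in this post)

print(f"{share_within_subset(commons_pct, java_pct):.1f}%")
```

    If some Commons users are not counted as Java projects, or the two percentages come from different project samples, the real share would differ from this figure.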
  6. Licensing

    It might be interesting to get detailed figures by project license.

    AFAIK, you can't mix GPL and Apache code.
  7. We have received feedback from software developers wondering which other open source projects from the Java world are widely reused. We have extracted the "TOP 100" of the most reused open source Java archives. http://bit.ly/bymw2V