News: JBoss Serialization

  1. JBoss Serialization (30 messages)

    Java serialization has traditionally had a negative performance impact. In a blog entry, Clebert Suconic explains the JBoss Serialization project, which attempts to improve the base performance and scalability of this fundamental Java feature by changing how objects are streamed.
    Originally, I started JBoss Serialization as a way to smart clone objects, but then I realized that I could easily save the DataContainer to a regular stream (like saving the actual state of a transformation). I expected this to be as expensive as serialization due to the data transformations, but to my surprise in most cases JBoss Serialization was faster than Java Serialization (about 15% on serialization over the wire, and 70% on call-by-value operations). The result is a really nice project that can be used to copy objects and serialize objects (even non-serializable objects now, as I can ignore the tag interface if so specified).
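
    For readers unfamiliar with the "call by value" copy being benchmarked, here is a minimal sketch of the stock java.io baseline: a deep copy made by writing an object graph to a byte array and reading it back. The class name SerializationCloner is hypothetical, and this uses only standard Java serialization, not the JBoss Serialization API.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

/** Hypothetical helper: deep-copies an object graph through stock Java serialization. */
final class SerializationCloner {
    private SerializationCloner() {}

    @SuppressWarnings("unchecked")
    static <T extends Serializable> T deepCopy(T original) {
        try {
            // Serialize the whole graph into an in-memory buffer.
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
                out.writeObject(original);
            }
            // Deserialize it again; the result shares no mutable state with the original.
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(buffer.toByteArray()))) {
                return (T) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("Deep copy failed", e);
        }
    }
}
```

    Every field in the graph must be Serializable for this to work, which is the restriction the quoted post says JBoss Serialization can optionally lift.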

  2. Interesting Read

    Last year I was trying to think of a way to improve Java serialization, but couldn't think of a good solution. Glad to see someone else has stumbled across a solution that is better than stock Java serialization.

    peter
  3. synchronization bottlenecks

    Clebert forgot to mention that he removed the synchronization bottlenecks as well. This is more important IMO, as my previous measurements on Java serialization showed that performance started to degrade massively once you reached 100-200 concurrent threads.

    What annoys me is that JBoss had to start this project in the first place. Java serialization is such a fundamental feature that has been neglected for so long. I'd like to see Sun put some effort into core Java features instead of BS like a web services implementation that nobody needs. Or, they could just open source the Java source (not the VM itself) so we could fix, distribute, then contribute back to Sun the parts we're interested in improving...

    Bill
  4. synchronization bottlenecks

    since you are implementing your own serialization, wouldn't it be feasible to implement deltas only? input would be a base metadata and object instance and the serialized result would be the delta between the metadata and the object instance, so we could end up only managing or passing differences of known base metadata instances?
  5. Deltas

    since you are implementing your own serialization, wouldn't it be feasible to implement deltas only?

    This begins looking like SDO...
  6. I was thinking the same thing

    since you are implementing your own serialization, wouldn't it be feasible to implement deltas only?

    This begins looking like SDO...

    Although SDO is interesting and useful, I'm not sure it would be good to combine the two. I can see cases where a fast serialization is needed without extra stuff. It does bring up the question, "how easy would it be to enhance JBoss Serialization so that it's pluggable and makes it easy to add SDO on top?"

    peter
  7. Deltas

    since you are implementing your own serialization, wouldn't it be feasible to implement deltas only?
    This begins looking like SDO...

    Yeah, but SDO is a pile of... XML. I would want something low-level and fast with minimal byte output-- not some monster API.
  8. synchronization bottlenecks

    since you are implementing your own serialization, wouldn't it be feasible to implement deltas only? input would be a base metadata and object instance and the serialized result would be the delta between the metadata and the object instance, so we could end up only managing or passing differences of known base metadata instances?

    Could you explain the use case for this?

    Peter, as far as contributing, don't believe everything you read. Although we do have some committer policies, we're a bit more lenient in granting access than ASF or Eclipse. All we look for is initiative. Only downside to contributing is that we may try and recruit you if you're good.

    I suggest contacting Clebert to see what dependencies JBoss Serialization has. AFAIK, it's pretty self-contained.

    Bill
  9. synchronization bottlenecks

    since you are implementing your own serialization, wouldn't it be feasible to implement deltas only? input would be a base metadata and object instance and the serialized result would be the delta between the metadata and the object instance, so we could end up only managing or passing differences of known base metadata instances?
    Could you explain the use case for this?

    It would be great for remote, stateful invocations where the client would only need to preserve their delta, even though they operated over a *much* larger object graph. If we have the eden state of a base object graph, then each thread can freely operate over a duplicate graph without worrying about needing to preserve everything upon completion (I posted on the Seam-Dev this morning with an actual application of this).
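
    To make the delta idea concrete, here is a minimal sketch, assuming a flat object whose fields can be compared with equals(). It diffs an edited instance against a known base and ships only the changed fields; the other side applies them to its own copy of the base. FieldDelta and Account are hypothetical names invented for illustration, and nothing here is part of JBoss Serialization.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Objects;

/** Hypothetical helper: computes and applies shallow, field-level deltas. */
final class FieldDelta {
    private FieldDelta() {}

    /** Returns the declared instance fields of {@code changed} whose values differ from {@code base}. */
    static Map<String, Object> diff(Object base, Object changed) {
        if (base.getClass() != changed.getClass()) {
            throw new IllegalArgumentException("Both objects must have the same class");
        }
        Map<String, Object> delta = new LinkedHashMap<>();
        for (Field field : base.getClass().getDeclaredFields()) {
            if (Modifier.isStatic(field.getModifiers()) || field.isSynthetic()) {
                continue; // only per-instance state belongs in a delta
            }
            field.setAccessible(true);
            try {
                Object after = field.get(changed);
                if (!Objects.equals(field.get(base), after)) {
                    delta.put(field.getName(), after);
                }
            } catch (IllegalAccessException e) {
                throw new IllegalStateException(e);
            }
        }
        return delta;
    }

    /** Writes the changed values onto {@code target}, which should start out equal to the base. */
    static void apply(Object target, Map<String, Object> delta) {
        for (Map.Entry<String, Object> entry : delta.entrySet()) {
            try {
                Field field = target.getClass().getDeclaredField(entry.getKey());
                field.setAccessible(true);
                field.set(target, entry.getValue());
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException(e);
            }
        }
    }
}

/** Hypothetical value type used only for this example. */
final class Account {
    String owner;
    int balance;

    Account(String owner, int balance) {
        this.owner = owner;
        this.balance = balance;
    }
}
```

    A real implementation would also have to walk nested objects and collections and agree on how both sides identify the base state; this sketch ignores both.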
  10. synchronization bottlenecks

    since you are implementing your own serialization, wouldn't it be feasible to implement deltas only? input would be a base metadata and object instance and the serialized result would be the delta between the metadata and the object instance, so we could end up only managing or passing differences of known base metadata instances?

    Could you explain the use case for this?

    It would be great for remote, stateful invocations where the client would only need to preserve their delta, even though they operated over a *much* larger object graph. If we have the eden state of a base object graph, then each thread can freely operate over a duplicate graph without worrying about needing to preserve everything upon completion (I posted on the Seam-Dev this morning with an actual application of this).

    Sounds like you need something that's SDO-like without all the XML and UML stuff associated with SDO. Having read the SDO spec, it feels rather heavyweight compared to simple Java classes.

    peter
  11. synchronization bottlenecks

    This is true. I mean, look at handheld-to-server communication. I have a client with state that has a unique set of hashes or identifiers. The server has its own changed state; all the server has to do is use the same method of hash/id generation to do comparisons and then only return deltas. Having this done at the low level within object serialization itself would be amazing-- think about clustering or MVC applications with shared resources which can now become stateful without worrying about communication overhead. I'm probably saying too much... but it would be really cool if you could implement this.
  12. synchronization bottlenecks

    since you are implementing your own serialization, wouldn't it be feasible to implement deltas only? input would be a base metadata and object instance and the serialized result would be the delta between the metadata and the object instance, so we could end up only managing or passing differences of known base metadata instances?
    Could you explain the use case for this?
    It would be great for remote, stateful invocations where the client would only need to preserve their delta, even though they operated over a *much* larger object graph. If we have the eden state of a base object graph, then each thread can freely operate over a duplicate graph without worrying about needing to preserve everything upon completion (I posted on the Seam-Dev this morning with an actual application of this).

    I guess this would be very useful for maintaining the Model of a GUI widget or something across the wire? I'll check Seam Dev...
  13. synchronization bottlenecks

    JBossCache uses AOP to synchronize objects between servers.

    I guess we could use a similar idea to synchronize objects between client and servers, sending only deltas to the server.

    But I think this should be something on top of serialization. A project combining AOP, Remoting and Serialization to manage Deltas. It's a really nice idea IMO.
  14. synchronization bottlenecks

    A project combining AOP, Remoting and Serialization to manage Deltas. It's a really nice idea IMO.

    Remove the requirement on Serializable (but add requirements for true transparency and preservation of Java's pass-by-reference semantics) and you end up with Terracotta DSO; you can download a fully functional distribution today.

    /Jonas
  15. synchronization bottlenecks

    Or with JBossCacheAop (POJOCache) which has been available for quite some time too... :-)
  16. Can't contribute?

    What annoys me is that JBoss had to start this project in the first place. Java serialization is such a fundamental feature that has been neglected for so long. I'd like to see Sun put some effort into core Java features instead of BS like a web services implementation that nobody needs. Or, they could just open source the Java source (not the VM itself) so we could fix, distribute, then contribute back to Sun the parts we're interested in improving... Bill

    You can! https://mustang.dev.java.net/
  17. Can't contribute?

    Oooh can I fix your bugs for you and then pay you for the priviledge of using my own fixes? Awesome! That would be really great....for you!
  18. Can't contribute?

    Oooh can I fix your bugs for you and then pay you for the priviledge of using my own fixes? Awesome! That would be really great....for you!

    Me? Well, I don't work for Sun. But even if I did, how would you (JBoss guys?) be forced to pay for using a fix in the JDK, which is free? AFAIK, your business is not around the runtime, but the app server, isn't it?

    And I think that it would be great for everyone (including yourself and your colleagues at JBoss Inc., who probably use the Sun JDK on a daily basis, and will use Mustang/Dolphin when it comes out) if good coders like you guys contributed those types of bug fixes.
  19. JRL doesn't allow commercial use

    so one would have to pay Sun Microsystems to be able to use one's code commercially as part of a patched JRE one'd ship to customers until it was bundled in some future release, if at all.

    That makes contributing to Mustang/Dolphin very unattractive if you have a commercial use in mind for your code.

    cheers,
    dalibor topic
  20. JRL relevance?

    so one would have to pay Sun Microsystems to be able to use one's code commercially as part of a patched JRE one'd ship to customers until it was bundled in some future release, if at all. That makes contributing to Mustang/Dolphin very unattractive if you have a commercial use in mind for your code.

    I don't really understand the relevance of the JRL terms at all here. IANAL, but my understanding of IP licensing law is that you can licence your code in whatever ways you see fit. So, you could write code and submit to Sun under the JRL, but also use that same code in other ways, since you're the owner of that code. Cf. Sleepycat, MySQL, Doug Lea's concurrency package, etc. for examples of multi-licence code.

    -Patrick

    --
    Patrick Linskey
    http://bea.com
  21. JRL relevance?

    If your code is a derived work of Sun's JDK code, which an implementation of java.io.Object*Stream that improves the existing JRL'd Sun code in the JDK would naturally be [1], then they are authors of the code, too, legally, and get a say in its use. In particular, the JRL says that derived code must be licensed under JRL (or a similar license) and JRL'd code must not be used commercially.

    You can license your own works as you wish, of course. But if you extend other people's works, then they get a say in how the resulting work is licensed.

    cheers,
    dalibor topic

    [1] Unless you write all of it from scratch, like GNU Classpath does. But then Sun would have little interest in accepting your patches, since full rewrites are a pain to audit.
  22. Can't contribute?

    Oooh can I fix your bugs for you and then pay you for the priviledge of using my own fixes? Awesome! That would be really great....for you!
    Andrew, I hope you realise that this applies to JBoss as well, for those who have "support contracts". We get the priviledge (sic) of fixing your bugs and then pay pretty darn steep support costs for things that you should be doing in the first place. Awesome!

    You oughta check where your salary comes from before you start biting...
  23. synchronization bottlenecks

    Why couldn't you keep the compatibility with the serialization format? Are there more issues besides synchronization?
  24. Contribute back to the JDK

    What prevents you from submitting this implementation back to Sun so it will be bundled standard in Mustang or Dolphin? I assume its API is backwards compatible?
  25. 15% or 200%?

    The link says:
    With JBossSerialization we have done internal benchmarks and we have realized at least 2 times faster serialization with this library. These benchmarks are committed into our CVS repository (as testcases) and they are publicly available.

    Is it 200% or 15% as it says above?

    Guglielmo
  26. 15% or 200%?

    This text was written at the beginning of the project, when most of the benchmarks were at least 2 times faster.

    I have added more benchmarks. I will update the website description.
    I prefer to be conservative and say on average 70%.
  27. 15% or 200%?

    This text was written at the beginning of the project, when most of the benchmarks were at least 2 times faster. I have added more benchmarks. I will update the website description. I prefer to be conservative and say on average 70%.

    Well, 70% still means it was worth doing the work.

    What is the critical reason for the speedup, again (trying hard not to RTA...)?
  28. 15% or 200%?

    This text was written at the beginning of the project, when most of the benchmarks were at least 2 times faster. I have added more benchmarks. I will update the website description. I prefer to be conservative and say on average 70%.

    I did some local benchmarking too. It showed about a 10% improvement in byte array length (smaller), but I didn't see performance improvements in my case. Would you start seeing higher margins of improvement with multithreaded solutions?
  29. JBoss Serialization

    Last week I replaced the java ObjectInput/Output streams with this solution from JBoss. I achieved a performance increase of just about 10% in a single-thread environment for in-memory deep cloning. jboss-serialization.jar depends on logging from JBoss, so I also had to put jboss-common.jar on the classpath. Is it a bug or a feature? ;) Optimizing read/writeStreamHeaders+read/writeClassDescriptor increased throughput by 10-20% for in-memory serialization/deserialization operations.
    Finally, I replaced the Java serialization solution with an alternative based on direct use of reflection. This solution works about 20 times faster (200000 objects/second vs 10000 o/s) than Java serialization for in-memory deep cloning; stream serialization/deserialization is about 3 (serialization) to 5 (deserialization) times faster. Memory consumption is also lower.
    One important note: my reflection-based cloner/serializer/deserializer cannot clone arbitrary objects; only objects annotated similarly to EJB3 entities are supported.
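
    The poster's code is not shown, so the following is only a sketch of what a reflection-based deep cloner restricted to annotated classes might look like. The @DeepCloneable annotation, ReflectiveCloner and Node are all hypothetical names. Values whose class lacks the annotation (strings, boxed primitives, anything else) are shared by reference rather than copied, which is only safe if they are immutable.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Constructor;
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.IdentityHashMap;
import java.util.Map;

/** Hypothetical marker: only classes carrying it are deep-copied. */
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.TYPE)
@interface DeepCloneable {}

/** Hypothetical cloner that copies fields directly, with no byte stream in between. */
final class ReflectiveCloner {
    private ReflectiveCloner() {}

    static <T> T deepCopy(T root) {
        return copy(root, new IdentityHashMap<>());
    }

    @SuppressWarnings("unchecked")
    private static <T> T copy(T source, Map<Object, Object> seen) {
        if (source == null || !source.getClass().isAnnotationPresent(DeepCloneable.class)) {
            return source; // unannotated values are shared, not copied
        }
        Object existing = seen.get(source);
        if (existing != null) {
            return (T) existing; // already copied: keeps cycles and shared references intact
        }
        try {
            // Assumption: every annotated class has a no-argument constructor.
            Constructor<?> ctor = source.getClass().getDeclaredConstructor();
            ctor.setAccessible(true);
            Object target = ctor.newInstance();
            seen.put(source, target);
            for (Class<?> type = source.getClass(); type != Object.class; type = type.getSuperclass()) {
                for (Field field : type.getDeclaredFields()) {
                    if (Modifier.isStatic(field.getModifiers()) || field.isSynthetic()) {
                        continue;
                    }
                    field.setAccessible(true);
                    field.set(target, copy(field.get(source), seen));
                }
            }
            return (T) target;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot clone " + source.getClass(), e);
        }
    }
}

/** Hypothetical annotated type used only for this example. */
@DeepCloneable
final class Node {
    String name;
    Node next;

    Node() {}

    Node(String name) {
        this.name = name;
    }
}
```

    Skipping the stream encoding and class descriptors entirely is one plausible reason such an approach could outrun serialization for in-memory cloning, but the numbers quoted above are the poster's own and are not reproduced by this sketch.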
  30. JBoss Serialization

    You are talking specifically about deep copy, right?

    I would need to know which method you are using (from JBoss Serialization) to give you some hints.


    As I don't want to make TSS a development forum, maybe you could use the jboss-serialization forum, so I could give you some hints.

    http://www.jboss.com/index.html?module=bb&op=viewforum&f=233


    Also, I'm trying to avoid dependencies on jboss-commons. If you get the latest release you won't need jboss-commons.
  31. Externalizable

    I think that serialization was measured against java.io.Serializable. Are there any numbers when compared against java.io.Externalizable? My experience is that implementing Externalizable is much faster than Serializable because the reflection overhead is skipped.
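
    For comparison, a minimal sketch of what implementing java.io.Externalizable involves: the class writes and reads its own fields explicitly, so the stream does not have to discover them reflectively. ExternalPoint and its helper methods are hypothetical names used only for illustration, and no timing claim is made here.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.Externalizable;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectInputStream;
import java.io.ObjectOutput;
import java.io.ObjectOutputStream;

/** Hypothetical value type that controls its own wire format. */
final class ExternalPoint implements Externalizable {
    private int x;
    private int y;

    /** Externalizable requires a public no-argument constructor for deserialization. */
    public ExternalPoint() {}

    ExternalPoint(int x, int y) {
        this.x = x;
        this.y = y;
    }

    int x() { return x; }

    int y() { return y; }

    @Override
    public void writeExternal(ObjectOutput out) throws IOException {
        // Fields are written by hand, in a fixed order.
        out.writeInt(x);
        out.writeInt(y);
    }

    @Override
    public void readExternal(ObjectInput in) throws IOException {
        // Fields must be read back in exactly the order they were written.
        x = in.readInt();
        y = in.readInt();
    }

    static byte[] toBytes(ExternalPoint point) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(point);
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
        return buffer.toByteArray();
    }

    static ExternalPoint fromBytes(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return (ExternalPoint) in.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

    The trade-off is maintenance: every field added to the class has to be added to both methods by hand, which is the work the default Serializable mechanism does automatically.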