Why not build a distributed VM?


  1. Why not build a distributed VM? (3 messages)

    You know, the other day I was thinking about EJBs and it occurred to me that the way we write large distributed applications today is not unlike the way we wrote regular programs 20 years ago. Back then, we didn't have object-oriented programming languages. Instead, we wrote a bunch of stand-alone functions in C, and we used simple 'structs' or 'unions' to build aggregate data types. Every function took a long list of parameters to do its job, and you never had any conversational state between function calls (not unless you used global variables, but then your program became a mess).

    Today we have many new object-oriented techniques for writing programs, but when it comes to distributed applications, we quite often use stateless EJBs and Data Transfer Objects. When you think about it, a stateless EJB isn't much different from a collection of old C functions, and a typical DTO is not much better than a C struct because it just holds some values and doesn't implement business logic. Why are we using 20-year-old design patterns in a modern object-oriented Java environment? Even when we use a stateful EJB, which makes a lot of people cringe, we still tend to compromise important object-oriented features like polymorphism and class hierarchies.
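
    To make the comparison concrete, here is a minimal sketch in plain Java. It does not use the EJB API at all, and every name in it (AccountDto, AccountService, Account, OverdraftAccount) is hypothetical, invented only for illustration. The first half mirrors the stateless-service-plus-DTO style; the second half shows the same logic as an object that owns its state and can be specialized through a class hierarchy.

```java
// Illustration only: all class names here are hypothetical, not EJB API types.

// Style 1: a DTO that only carries values, much like a C struct.
final class AccountDto {
    final long balanceCents;

    AccountDto(long balanceCents) {
        this.balanceCents = balanceCents;
    }
}

// A stateless service: every call receives all of its data as parameters,
// much like a free-standing C function.
final class AccountService {
    static AccountDto withdraw(AccountDto account, long amountCents) {
        if (amountCents > account.balanceCents) {
            throw new IllegalArgumentException("insufficient funds");
        }
        return new AccountDto(account.balanceCents - amountCents);
    }
}

// Style 2: the object holds its own state and business logic.
class Account {
    protected long balanceCents;

    Account(long balanceCents) {
        this.balanceCents = balanceCents;
    }

    long balance() {
        return balanceCents;
    }

    void withdraw(long amountCents) {
        if (amountCents > availableCents()) {
            throw new IllegalArgumentException("insufficient funds");
        }
        balanceCents -= amountCents;
    }

    // Subclasses override this to change the withdrawal rule.
    protected long availableCents() {
        return balanceCents;
    }
}

// Polymorphism: an overdraft account reuses withdraw() but allows a negative balance.
class OverdraftAccount extends Account {
    private final long overdraftLimitCents;

    OverdraftAccount(long balanceCents, long overdraftLimitCents) {
        super(balanceCents);
        this.overdraftLimitCents = overdraftLimitCents;
    }

    @Override
    protected long availableCents() {
        return balanceCents + overdraftLimitCents;
    }
}
```

    In the first style, supporting an overdraft rule means adding another parameter or another service method. In the second, the caller keeps calling withdraw() and the subclass supplies the rule.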

    I'm sure many of the pundits out there will say, we're doing it for efficiency. And maybe it's true that certain applications really need all the speed they can get. But over the years, efficiency has never been the most important thing in enterprise application development. Productivity, reuse, reliability, and flexibility are all much bigger issues. If they weren't, we'd still be writing all our apps in C, or better yet, machine language, instead of Java.

    Now, I've read about EJB 3.0, and I really like the direction in which they're moving the technology. They're letting us go back to writing more object-oriented code, with "plain old Java objects", and making the container smarter about how it injects the remoteness aspect into our work. This is great, but I say, why not take the whole concept a step further: why not make a distributed Java virtual machine?

    I'm not talking about a bunch of separate JVMs like we might have today in a typical J2EE container. I'm talking about an actual clustering JVM, a program that can take all the CPUs and all the memory on multiple machines and make them appear as one giant machine to the programmer. After all, the virtual machine is designed to abstract hardware from a Java program; there's no real reason why we couldn't make one that actually manages all the hardware on multiple machines at the same time.

    Think of the advantages such a VM would provide. No longer would we have to write EJBs, to any specification: we could just make regular Java objects, with no particular annotations, and let the VM decide how to distribute them over the available memory. We could simply start a thread when we needed a thread, and let the VM figure out which machine to run it on. Any object could refer to any other object in the unified memory space, without special semantics, the same way we refer to objects in memory today without worrying about which physical memory chip they happen to reside upon (or even if they're actually on disk because our OS uses virtual memory). Let the VM figure out the messy details and let the programmer worry only about the business logic.
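
    As a rough sketch of what "just start a thread" looks like from the programmer's side, the following uses only standard java.lang.Thread and java.util.concurrent.atomic.AtomicLong. The class and method names (PlainWorkDemo, sumInParallel) are made up for this example. On any JVM that exists today, all of these threads and the shared counter live on one machine; the assumption being illustrated is that a distributed VM could run the same unmodified code and choose the placement itself.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical demo class: plain objects and plain threads, no EJB types,
// no annotations, no remote interfaces.
final class PlainWorkDemo {

    // Each worker adds 1..perWorker to a shared counter.
    static long sumInParallel(int workers, int perWorker) {
        AtomicLong total = new AtomicLong();   // shared object, referenced directly
        Thread[] threads = new Thread[workers];

        for (int i = 0; i < workers; i++) {
            threads[i] = new Thread(() -> {
                for (int j = 1; j <= perWorker; j++) {
                    total.addAndGet(j);
                }
            });
            // A distributed VM, as imagined in the post, would decide here
            // which machine runs the thread. Today it is always the local one.
            threads[i].start();
        }

        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", e);
            }
        }
        return total.get();
    }
}
```

    Nothing in the code says where the counter is stored or where a worker runs, and that is the point: those decisions would be left to the VM.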

    Not only would the distributed JVM make programming easier, it could actually make applications more efficient. Today, the decision to make an object an EJB or not occurs at design time, and it almost never changes over the lifetime of an application. Even if loads change dramatically or vastly new hardware is introduced, our application can't easily reconfigure itself. By contrast, the distributed JVM would constantly tune itself at runtime, always figuring out the best way to arrange objects and threads among physical machines to optimize performance, regardless of what the programmer originally thought would be the best configuration.

    I know the detractors will say it can't be done. They'll claim it would be too slow, or too buggy, or that things like open sockets and file handles would cause too many problems. And I'm sure the early versions would have lots of problems, just like the early JVMs did. But eventually some smart vendors would compete to produce the best JVMs with the smartest distribution algorithms. When automated distribution works better than what the average programmer can do manually, that's when the product will have real business value.

    Let me know what you think.

  2. I remember an old Sun project, something about Intelligent Agents. I think the subject went out of fashion once the hype and buzzword phase had passed, but that's another kind of container too, in which software agents could migrate from server to server.
  3. What you propose sounds a lot like what JINI and JavaSpaces already offer today. Take a look at them:

    I don't know, but even if this whole EJB thing ends up failing, we will already have an option: JINI. Or maybe both concepts and APIs could merge in the future, but that is just dreaming on my part.

    Henrique Steckelberg
  4. Distributed VM

    The first step would be to have all Java applications on the same physical machine share a single JVM instance. Currently each Java application runs in its own JVM instance, even when several run on the same physical machine. This makes many JVM activities redundant. I don't understand the reason for having a JVM instance for every application. Why can't all applications running on the same physical machine share the same runtime instance?