Java's been ready for the many-core era; how about you?

  1. Patrick Logan posted a blog entry entitled "Many-Node has many more implications than Many-Core," saying that the many-core discussions miss a crucial point: many cores imply many nodes. Programming models that rely on cores sharing system resources like RAM or hard drive space are going to have to change.
    The many-core era will also be a many-node era. You will not have a C: drive except for those legacy systems running in their little jars of formaldehyde. You will have a lot of computing power "out there" and several kinds of displays that are not directly attached to "your computer". You probably will not be able to locate an "application" as being installed on your C: drive and running on the CPU on the motherboard that has a ribbon running out to the disk that has C: formatted on it. Hardware will be too cheap to be organized this way. We need to begin reorganizing our software now, or five years from now we'll be really badly off, without any excuses. If your current programming models distinguish between "these threads" and "those other processes over there", then it's almost certainly the wrong model for today, not to mention tomorrow.
    The good news for Java is that the "Pure Java" initiative - apparently downplayed in the last few years - is ideally geared for a "many-node" deployment, with an emphasis on platform neutrality. Java EE takes that one step further, recommending that filesystem access not be used. See section 17.2.3 in the EJB 3 specification:
    The EJB architecture does not define the operating system principal under which enterprise bean methods execute. Therefore, the Bean Provider cannot rely on a specific principal for accessing the underlying OS resources, such as files. (See Subsection 17.6.8 for the reasons behind this rule.) We believe that most enterprise business applications store information in resource managers such as relational databases rather than in resources at the operating system levels. Therefore, this rule should not affect the portability of most enterprise beans.
    The reference to principals obscures the more basic issue: EJBs are remote by nature. A clearer example might be an illustration of a remote EJB deployment. Imagine an enterprise application deployment where web application 'A1' is on machine 'M1', and it looks up EJBs hosted on a cluster of machines 'M2' and 'M3'. On the first invocation ("I1") of A1, it finds an instance of an EJB on M2, which stores information on the local filesystem, in - let's say - a Lucene index. The second invocation ("I2") of A1 finds an instance of the EJB on M3 instead, which then looks for the information to update on the local filesystem. However, the update from I1 isn't on the local filesystem of machine M3; it's on M2. You now have an error. This is why one uses relational databases or remotable resources rather than filesystems. Even JCR uses a client/server model for content repositories to address this issue. No matter how well the platform is prepared for multi-node deployments, though, programmers have to anticipate the requirements. Java's there to help, but you have to know what you're about. How well are you prepared to deploy to a cluster, now that clusters are becoming commodities rather than specialized deployments?
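    To make that failure mode concrete, here is a minimal sketch of the kind of bean that breaks exactly this way once it is clustered. The bean, interface, and path names are all invented for illustration; this is the anti-pattern described above, not a recommendation:

        import java.io.*;
        import javax.ejb.Remote;
        import javax.ejb.Stateless;

        // Hypothetical example: a session bean that keeps state on the
        // node-local filesystem - the anti-pattern described above.
        @Remote
        interface IndexUpdater {
            void record(String entry) throws IOException;
            boolean contains(String entry) throws IOException;
        }

        @Stateless
        public class IndexUpdaterBean implements IndexUpdater {
            // A node-local path: M2 and M3 each have their own copy of it.
            private static final File LOG = new File("/var/data/index/entries.log");

            public void record(String entry) throws IOException {
                // Invocation I1 lands on M2 and appends the entry here...
                FileWriter out = new FileWriter(LOG, true);
                try {
                    out.write(entry + "\n");
                } finally {
                    out.close();
                }
            }

            public boolean contains(String entry) throws IOException {
                // ...invocation I2 lands on M3, whose copy of LOG never saw
                // I1's write, so the data appears to have vanished.
                if (!LOG.exists()) {
                    return false;
                }
                BufferedReader in = new BufferedReader(new FileReader(LOG));
                try {
                    String line;
                    while ((line = in.readLine()) != null) {
                        if (line.equals(entry)) {
                            return true;
                        }
                    }
                    return false;
                } finally {
                    in.close();
                }
            }
        }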
  2. Well, I think that in most cases there are some things that must change at the same time as the new "many-core, many-node" scenario. As you pointed out, non-distributable resources such as the filesystem can't be used, but a central database won't be the best solution either. The necessary step is a move to large, distributed RAM caches. Now let's listen to Nati, Jonas, Cameron... :-)
  3. Well, I think that in most cases there are some things that must change at the same time as the new "many-core, many-node" scenario. As you pointed out, non-distributable resources such as the filesystem can't be used, but a central database won't be the best solution either. The necessary step is a move to large, distributed RAM caches. Now let's listen to Nati, Jonas, Cameron... :-)
    Allow me to disagree. The vast majority of systems will, in the future as well, run against a central RDBMS (in the logical sense - it might very well be implemented using Oracle RAC or similar). There are systems that need distributed (or, more likely, partitioned) caches, but it will take a long time before they become the "mainstream" solution.
  4. There are systems that need distributed (or, more likely, partitioned) caches, but it will take a long time before they become the "mainstream" solution.
    I didn't try to make any prediction; I only said that the centralized database won't be the best solution for those contexts. Actually, I don't think it's even the best solution _now_; it's a legacy. Whenever a new kind of architecture comes along, one only hopes to be able to get rid of legacies in as many cases as possible.
  5. I didn't try to make any prediction; I only said that the centralized database won't be the best solution for those contexts. Actually, I don't think it's even the best solution _now_; it's a legacy.
    A centralized database may not be the best solution, but it's probably the best abstraction. Why not pretend that all the data you need to access is in one place and can be accessed in a uniform manner? I keep hoping that we'll see a resurrection of the S/38 concept of pretending that everything is an object and lives at a fixed address. No more worrying about files or databases or remote systems: you simply instantiate (or get a handle to) the object you need and start using it (see the sketch at the end of this post).
    Whenever a new kind of architecture comes along, one only hopes to be able to get rid of legacies in as many cases as possible.
    Yes, but let's not get rid of them just because they're legacies. Let's keep them around until we have a suitable replacement. And by "suitable" I mean "better", not just "different" or "new".
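    Here is the sketch promised above: a toy, purely hypothetical "ObjectSpace" in the S/38 spirit, where callers resolve a stable address to an object and simply use it, never knowing where it lives. All names here are invented; a real implementation would fault objects in from a database or a remote node rather than a local map:

        import java.util.Map;
        import java.util.concurrent.ConcurrentHashMap;

        // A hypothetical "single-level store" facade: where the object
        // actually lives (RAM, database, remote node) is the facade's
        // problem, not the caller's.
        public final class ObjectSpace {
            private static final Map<String, Object> SPACE =
                    new ConcurrentHashMap<String, Object>();

            // Bind an object at a fixed, location-independent address.
            public static void bind(String address, Object obj) {
                SPACE.put(address, obj);
            }

            // Resolve an address to a live object. A real implementation
            // might pull the object in from anywhere, on demand.
            public static <T> T lookup(String address, Class<T> type) {
                Object obj = SPACE.get(address);
                if (obj == null) {
                    throw new IllegalStateException("nothing bound at " + address);
                }
                return type.cast(obj);
            }

            public static void main(String[] args) {
                bind("/customers/42", new StringBuilder("Acme Corp"));
                // The caller neither knows nor cares where the object is stored.
                StringBuilder customer = lookup("/customers/42", StringBuilder.class);
                System.out.println(customer);
            }
        }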
  6. There are systems that need distributed (or, more likely, partitioned) caches, but it will take a long time before they become the "mainstream" solution.

    I didn't try to make any prediction; I only said that the centralized database won't be the best solution for those contexts. Actually, I don't think it's even the best solution _now_; it's a legacy. Whenever a new kind of architecture comes along, one only hopes to be able to get rid of legacies in as many cases as possible.
    Which contexts? Applications deployed on many cores? I'd say it is pretty damn unlikely that we will see a paradigm shift from centralized databases to distributed shared memory for the vast majority of applications. Complementing an RDBMS with a partitioned cache for read-only/read-mostly data is something quite different from replacing the RDBMS altogether.
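    "Complementing" usually looks something like the following rough sketch: read-mostly data is served from a local cache and faulted in from the database on a miss, while the RDBMS remains the system of record. (All names are invented; loadFromDatabase stands in for a real JDBC/ORM call.)

        import java.util.Map;
        import java.util.concurrent.ConcurrentHashMap;

        // Cache-aside over an RDBMS: the cache accelerates reads,
        // it does not replace the database.
        public class CustomerCache {
            private final Map<Long, String> cache =
                    new ConcurrentHashMap<Long, String>();

            public String findName(long customerId) {
                String name = cache.get(customerId);
                if (name == null) {
                    // Miss: the RDBMS remains the system of record.
                    name = loadFromDatabase(customerId);
                    cache.put(customerId, name);
                }
                return name;
            }

            // Stand-in for a JDBC/ORM lookup against the central database.
            private String loadFromDatabase(long customerId) {
                return "customer-" + customerId;
            }
        }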
  7. Well, I don't think Java is even ready for the "many-core era", as the author puts it. First off, a majority of developers (estimates put it at 95+%) fail miserably at writing code that can take advantage of multiple processors or cores. With imperative programming languages (like Java 5/6, C# 2.0, C++), the developer is forced to deal with low-level constructs like threads, mutexes, locks, synchronization, etc. Functional programming constructs that let developers specify the "what" instead of the "how" are, I believe, the future. For example, this is the direction Microsoft has taken with its Language Integrated Query (LINQ) in the upcoming version of .NET (C# 3.0). Moving the abstraction level another step up and letting developers specify their "intent" leaves the runtime a lot of leeway in optimizing the execution of that "intent". For example, it's hard for a JIT infrastructure to take bytecode that consists of for-loops, jumps, conditional statements, and the creation of temporary data structures (e.g. maps, dictionaries, lists, arrays, etc.) and apply parallelization techniques, because the bytecode no longer captures the developer's intent (for that matter, neither does the source code).
    So until Java moves toward functional programming constructs and integrates data and the functions that operate on that data in a syntactically abstract manner that captures programmer intent, taking advantage of multiple cores or CPUs is going to remain a challenge. Moore's law is not giving us faster processors anymore, but more cores. In a few years' time, it is not going to be uncommon to see 32 or 64 cores on a chip. How are (Java) applications developed today (or in the next few years) going to take advantage of that? Regards, -krish [I work for Microsoft, but views expressed are my own]
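    To make the contrast concrete in Java terms, here is a rough sketch of the same computation written both ways. The stream-style pipeline in the second half is the kind of intent-capturing construct being described; Java itself only gained such an API later (the Java 8 streams), and it is shown here purely to illustrate the idea:

        import java.util.Arrays;
        import java.util.List;

        public class IntentDemo {
            public static void main(String[] args) {
                List<Integer> orders = Arrays.asList(120, 45, 300, 80, 510, 95);

                // Imperative: the "how". The loop, the accumulator, and the
                // branch pin down an execution order; a JIT can't easily
                // parallelize this, because the intent is gone.
                int imperativeTotal = 0;
                for (int amount : orders) {
                    if (amount > 100) {
                        imperativeTotal += amount;
                    }
                }

                // Declarative: the "what". The pipeline states intent, so the
                // runtime is free to fan the work out across cores.
                int declarativeTotal = orders.parallelStream()
                                             .filter(amount -> amount > 100)
                                             .mapToInt(Integer::intValue)
                                             .sum();

                System.out.println(imperativeTotal + " == " + declarativeTotal);
            }
        }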
  8. For example, it's hard for a JIT infrastructure to take bytecode that consists of for-loops, jumps, conditional statements, and the creation of temporary data structures (e.g. maps, dictionaries, lists, arrays, etc.) and apply parallelization techniques, because the bytecode no longer captures the developer's intent (for that matter, neither does the source code)
    Isn't that what PLINQ is supposed to enable? (source-code-level hints to the runtime to parallelize actions, like Parallel.ForEach, etc.)
  9. Yes, PLINQ is one of the projects implementing parallelized execution of "intent". A future version of .NET might just take the lessons learnt from PLINQ and apply them to the core LINQ APIs and extensions (LINQ to Objects, LINQ to XML, etc.). Regards, -krish
  10. For example, this is the direction Microsoft has taken with its Language Integrated Query (LINQ) in the upcoming version of .NET (C# 3.0). Moving the abstraction level another step up and letting developers specify their "intent" leaves the runtime a lot of leeway in optimizing the execution of that "intent".
    I read a research paper from Microsoft demonstrating that implicit parallelism in functional code is a lot lower than is often assumed. Applying implicit parallelism across many processors only provided a speed-up of around 2X in well-known programs. Code must be crafted specifically for concurrency; it's not going to come for free. Erlang, for example, forces you to develop your code in a very specific way in order to take advantage of concurrency. It's not "capturing intent". I'm pretty dubious about "intentional" programming. We already have the word 'abstraction'. We don't need new hype in language design. We need calm, rational thinking.
    Moore's law is not giving us faster processors anymore, but more cores. In a few years' time, it is not going to be uncommon to see 32 or 64 cores on a chip. How are (Java) applications developed today (or in the next few years) going to take advantage of that?
    Many Java applications (e.g. ones that I write) already take advantage of multiple CPUs. This isn't to say that this can't be improved upon, but it's already the case. Honestly, I don't see Java as a likely candidate for the concurrent future that we expect. Languages like Scala, which combine functional programming features with OO and provide powerful concurrency abstractions like Actors, are more promising.
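    As an illustration, here is a minimal sketch of the kind of explicit multi-CPU Java code in question, using only the java.util.concurrent utilities available since Java 5 (the task, a chunked sum, is invented for the example):

        import java.util.ArrayList;
        import java.util.List;
        import java.util.concurrent.Callable;
        import java.util.concurrent.ExecutorService;
        import java.util.concurrent.Executors;
        import java.util.concurrent.Future;

        // Summing a large range with one worker per core.
        public class MultiCoreSum {
            public static void main(String[] args) throws Exception {
                final long n = 100000000L;
                int cores = Runtime.getRuntime().availableProcessors();
                ExecutorService pool = Executors.newFixedThreadPool(cores);

                // Split [1, n] into one contiguous chunk per core.
                List<Future<Long>> results = new ArrayList<Future<Long>>();
                long chunk = n / cores;
                for (int i = 0; i < cores; i++) {
                    final long from = i * chunk + 1;
                    final long to = (i == cores - 1) ? n : (i + 1) * chunk;
                    results.add(pool.submit(new Callable<Long>() {
                        public Long call() {
                            long sum = 0;
                            for (long v = from; v <= to; v++) {
                                sum += v;
                            }
                            return sum;
                        }
                    }));
                }

                long total = 0;
                for (Future<Long> f : results) {
                    total += f.get();   // blocks until that worker finishes
                }
                pool.shutdown();

                System.out.println(total);   // equals n * (n + 1) / 2
            }
        }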
  11. Jim, parallelizing code is always hard to do programmatically and is even harder to get right, given the advances in multi-core technology. For example, assume you apply predicates to a set of [in-memory] data (e.g. a cache, be it distributed or local); there is a lot of boilerplate you'd have to write in order to take advantage of multiple cores/CPUs. For example, you'd have to take into account the cost of parallelism itself. If you operate on a set of a few hundred objects, the cost of parallelizing the execution of a predicate over them will likely underperform a simple solution that uses just a single thread. Now consider the case where you operate over hundreds of thousands of objects. What then is the optimal solution? Do I create as many threads as there are CPUs? Do I take into account how busy the system currently is and therefore create fewer threads? Or do I instead take into account the size of the data set and optimize the number of threads for that? And then of course you have all the work of scheduling threads out of pools, making sure that they exit cleanly, thread synchronization, data structures, etc., etc. And in a few years' time, when the number of cores doubles or triples, is my code still optimal in its execution? So I believe putting the onus of solving the above problem on the application developer is definitely asking for too much. (It is an interesting problem, and developers would love to get their hands dirty on it, but at what cost?)
    I too am skeptical about new hype and buzzwords, so I'll leave "intentional programming" out of it. But Microsoft's move to embrace functional programming constructs in (traditionally) highly imperative languages makes it easier for the runtime to decide on an optimal approach to executing the developer's intent. It is, after all, a logical extension of JIT technology. We do the basics like inlining methods, precalculating or guessing execution paths, eliminating dead code, etc. But there is only so far JIT technology can go, and it can't magically parallelize (application) bytecode that has not been written with parallel processing/execution in mind. With the upcoming .NET version, all the pieces are in place for the runtime to apply parallelization techniques. We're not there today, but that is what projects like PLINQ are investigating. And my point was that Java isn't there (yet?). It is all still overly low-level [for an application developer]. Regards, -krish
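    A rough sketch of the boilerplate in question: parallelizing a simple predicate by hand, with every tuning decision (threshold, thread count, chunking) left to the developer. The Predicate interface is defined inline, since the Java of the day has no standard one, and the threshold value is an invented placeholder:

        import java.util.ArrayList;
        import java.util.List;
        import java.util.concurrent.Callable;
        import java.util.concurrent.ExecutorService;
        import java.util.concurrent.Executors;
        import java.util.concurrent.Future;

        public class ParallelFilter {

            interface Predicate<T> {
                boolean test(T item);   // no standard Predicate before Java 8
            }

            // Below this size, forking threads costs more than it saves.
            // The "right" value depends on hardware and must be hand-tuned.
            private static final int PARALLEL_THRESHOLD = 1000;

            public static <T> List<T> filter(List<T> items, final Predicate<T> p)
                    throws Exception {
                if (items.size() < PARALLEL_THRESHOLD) {
                    // Small data set: a single thread wins.
                    List<T> out = new ArrayList<T>();
                    for (T item : items) {
                        if (p.test(item)) out.add(item);
                    }
                    return out;
                }

                // How many threads? As many as there are CPUs? Fewer if the
                // machine is busy? The developer has to guess.
                int threads = Runtime.getRuntime().availableProcessors();
                ExecutorService pool = Executors.newFixedThreadPool(threads);
                try {
                    int chunk = (items.size() + threads - 1) / threads;
                    List<Future<List<T>>> parts = new ArrayList<Future<List<T>>>();
                    for (int i = 0; i < items.size(); i += chunk) {
                        final List<T> slice =
                                items.subList(i, Math.min(i + chunk, items.size()));
                        parts.add(pool.submit(new Callable<List<T>>() {
                            public List<T> call() {
                                List<T> out = new ArrayList<T>();
                                for (T item : slice) {
                                    if (p.test(item)) out.add(item);
                                }
                                return out;
                            }
                        }));
                    }
                    List<T> result = new ArrayList<T>();
                    for (Future<List<T>> part : parts) {
                        result.addAll(part.get());
                    }
                    return result;
                } finally {
                    pool.shutdown();  // and none of this adapts when core counts double
                }
            }
        }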
  12. Jim,

    Parallelizing code is always hard to do programmatically and is even harder to get right, given the advances in multi-core technology.

    For example, assume you apply predicates to a set of [in-memory] data (e.g. a cache, be it distributed or local); there is a lot of boilerplate you'd have to write in order to take advantage of multiple cores/CPUs. For example, you'd have to take into account the cost of parallelism itself. If you operate on a set of a few hundred objects, the cost of parallelizing the execution of a predicate over them will likely underperform a simple solution that uses just a single thread.
    I am very familiar with the basics of concurrency. I have been writing concurrent code professionally for years. My code is run on servers with multiple processors. Multiple CPUs are not new.
    So I believe putting the onus of solving the above problem on the application developer is definitely asking for too much.
    I agree and I thought I made that clear in my previous response. Perhaps I did not.
    But Microsoft's move to embrace functional programming constructs in (traditionally) highly imperative languages makes it easier for the runtime to decide on an optimal approach to executing the developer's intent. It is, after all, a logical extension of JIT technology. We do the basics like inlining methods, precalculating or guessing execution paths, eliminating dead code, etc. But there is only so far JIT technology can go, and it can't magically parallelize (application) bytecode that has not been written with parallel processing/execution in mind.
    What I wrote above, and will repeat here, is that the amount of free parallelization you get with functional languages tends to be vastly overestimated. All of the tricks that allow state to be used in FP (e.g. monads) without admitting that you are using state create bottlenecks. And from what I understand, these features are fairly generic and don't offer much in the way of locking optimization, like the way you can create a shared, lock-free hashtable in Java using finite-state verification. http://www.detreville.org/papers/Limits.pdf Functional approaches are superior in terms of parallelism, but programs must still be designed with concurrency in mind.
    With the upcoming .NET version, all the pieces are in place for the runtime to apply parallelization techniques.
    And again, the problem is that "real" code doesn't tend to offer a lot of free parallelization. The way that I maximize the ability to parallelize code is to use more objects, not fewer, as I believe you are implying. Each object has everything it needs to do its work. This means each thread can have a separate object that doesn't have to deal with any locks or synchronization issues (see the sketch below). Of course, some communication across boundaries will be required (just like in FP), and where that happens can be managed fairly easily. Hidden dependencies are also a problem, and poorly constructed libraries can be a real headache. But the basic approach provides the same kind of benefits that FP offers, without pretending that there is no state in the program. In short, whether you use FP or imperative approaches, the easiest way to improve parallelism is to limit the communication between threads. Often that means constructing the application in non-obvious ways. Again, I don't think Java is an ideal language for concurrency (neither is its descendant, C#). Multi-paradigm languages like Scala are much more promising.
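    Here is the sketch mentioned above: a toy histogram computation structured as described, with one worker object per thread, each owning all of its state, and the only cross-thread communication at the join/combine boundary. The task and all names are invented for illustration:

        // One worker object per thread: state is confined, not locked.
        public class ConfinedWorkers {

            // Everything this object needs lives inside it; nothing is shared.
            static class Histogram implements Runnable {
                private final int[] data;     // this thread's private slice
                private final int[] counts = new int[10];

                Histogram(int[] data) { this.data = data; }

                public void run() {
                    for (int v : data) {
                        counts[v % 10]++;     // no synchronization anywhere
                    }
                }
            }

            public static void main(String[] args) throws InterruptedException {
                int[][] slices = {
                    {1, 2, 3, 4, 5}, {6, 7, 8, 9, 10}, {11, 12, 13, 14, 15}
                };

                Histogram[] workers = new Histogram[slices.length];
                Thread[] threads = new Thread[slices.length];
                for (int i = 0; i < slices.length; i++) {
                    workers[i] = new Histogram(slices[i]);
                    threads[i] = new Thread(workers[i]);
                    threads[i].start();
                }
                for (Thread t : threads) {
                    t.join();   // the only cross-thread communication point
                }

                // Combine results single-threaded, after all workers are done.
                int[] total = new int[10];
                for (Histogram w : workers) {
                    for (int b = 0; b < 10; b++) {
                        total[b] += w.counts[b];
                    }
                }
                System.out.println(java.util.Arrays.toString(total));
            }
        }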
  13. oh no!

    And my point was that Java isn't there (yet?)...
    Krish, you've gone to the dark side ;-) Peace, Cameron Purdy Oracle Coherence: The Java Data Grid
  14. Re: oh no!

    Hi Cameron, Hehe, I get that a lot these days. But it definitely is a whole lot 'lighter' on this side ;-) Congrats on the Oracle deal, though. Regards, -krish
  15. I disagree completely

    Many-core is the new technology; many-node has been around for 30 or 40 years. Since we have seemingly reached the limit of Moore's law, the trend is to put multiple cores on one chip and share memory. The challenge is to increase computing power when there are 4, 8, or 16 cores on a single chip. The comments about the C: drive leave me stupefied. How does this trash get published on this site? Most of the postings here are proof that the quality and education of software engineers has declined to such an extent that any form of nonsense will be entertained.