Hyperthreading Java, article from Javaworld.com

  1. Randall Scarberry has written "Hyper-threaded Java," posted to Javaworld.com, which discusses converting a clustering algorithm (i.e., one that clusters a large dataset) into something that can benefit from systems with multiple cores, yet still runs on single-core systems. The code uses the java.util.concurrent package to manage threads, so it requires nothing but understanding (and, of course, an algorithm that benefits from the technique, with some willingness to benchmark to prove out the results). It's an interesting concept, especially for Java, Enterprise Edition. Enterprise systems can benefit from multiple cores (arguably more so than desktop or laptop computers), but most enterprise programmers think in terms of single-threaded processes. How would you go about using this technique, and where? Is there a standard API mechanism that would provide the benefits without having to explicitly manage concurrency in client code?
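A minimal sketch of the general technique the post describes, not the article's actual clustering code: the work is split into one chunk per worker and handed to a java.util.concurrent thread pool sized from the number of available processors, so the same code runs with a single worker on a single-core machine. The class name `ChunkedSum`, the method `sumOfSquares`, and the sum-of-squares workload are all hypothetical stand-ins chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical example class; the workload stands in for any CPU-bound loop.
final class ChunkedSum {

    // Sums the squares of data, splitting the array into one chunk per worker.
    static double sumOfSquares(final double[] data, int workers) {
        if (workers < 1) {
            throw new IllegalArgumentException("workers must be >= 1");
        }
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            List<Callable<Double>> tasks = new ArrayList<Callable<Double>>();
            int chunk = (data.length + workers - 1) / workers; // ceiling division
            for (int w = 0; w < workers; w++) {
                final int from = Math.min(data.length, w * chunk);
                final int to = Math.min(data.length, from + chunk);
                tasks.add(new Callable<Double>() {
                    public Double call() {
                        double partial = 0.0;
                        for (int i = from; i < to; i++) {
                            partial += data[i] * data[i];
                        }
                        return partial;
                    }
                });
            }
            double total = 0.0;
            // invokeAll blocks until every chunk has completed.
            for (Future<Double> f : pool.invokeAll(tasks)) {
                total += f.get();
            }
            return total;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt status
            throw new IllegalStateException("interrupted while summing", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a worker task failed", e.getCause());
        } finally {
            pool.shutdown(); // always release the pool's threads
        }
    }

    public static void main(String[] args) {
        double[] data = new double[1000];
        java.util.Arrays.fill(data, 1.0);
        // One worker per available core; this is 1 on a single-core system.
        int cores = Runtime.getRuntime().availableProcessors();
        System.out.println(sumOfSquares(data, cores));
    }
}
```

Whether this pays off depends on the workload, so benchmarking with `workers` set to 1 and to the core count is the only way to know. Note that, as later replies point out, creating a thread pool like this inside an EJB container is discouraged.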
  2. It's an interesting concept, especially for Java, Enterprise Edition. Enterprise systems can benefit from multiple cores (arguably more so than desktop or laptop computers) but most enterprise programmers think in terms of single-threaded processes.
    Any Java EE compliant application server manages threads and makes use of multiple threads/cores/processors in the case of MULTI-USER access. At the same time, Java EE developers don't have to think about concurrency issues (I know it's a bit of an overstatement :-) ) because the server serializes access to a single EJB instance. The article describes a different access pattern: a single TIME-CONSUMING task. In Java EE such situations can be solved using MDBs (the container manages a pool of MDBs and threads). The advantage of this solution is that the workload can be distributed to multiple servers in a cluster. Custom multi-threading is not encouraged by the Java EE specification, as it may interfere with the EJB container's thread-related actions. http://www.enterpriseware.eu
  3. So I see a number of issues with this article and the blurb at the top that says this can have application in the J2EE space. First of all, most J2EE applications are heavily transaction-based applications. At least the ones that I work on in the financial industry are. Because of the J2EE specification, you can't be spawning threads. You also can't have multiple threads accessing the same transaction. It just doesn't work. A way of breaking up large chunks of work into smaller, manageable ones is to use JMS to leverage the container to do the multi-threading for you. But once again, you have to take the transactions into account and keep track of those chunks of work in different transactions. J2EE applications already take advantage of multiple cores and multiple CPUs. Every application server uses multiple threads in some way, whether that is in an execute queue or (gasp) spawning a new thread for each connection. I think the article is interesting, but as far as its application to J2EE, I think it is FUD.
  4. I beg to differ. If only the simple threading models offered by typical containers were adequate for the diversity of applications out there. I added the WorkManager spec to WebSphere and then standardized it using the commonj APIs for exactly this purpose. There are clever people out there who can be trusted to write threaded code, and commonj lets them build J2EE applications using a threading model that's appropriate for the application rather than the boilerplate models provided by the vendors. Yes, JMS can be used as a way to do it also, but it's millions of times slower than native threading. It's not appropriate for everything; it can absolutely be abused and lead to applications that perform poorly. But driving a car can also be dangerous for the unskilled, and we still let people drive cars...
  5. I added the WorkManager spec to WebSphere and then ...
    I'm interested to read more about the problem that the described solution is intended for.
    Yes, JMS can be used as a way to do it also but it's millions of times slower than native threading.
    For time-consuming tasks (e.g. 10 hours using a single thread) it doesn't matter whether spawning e.g. 1000 threads takes 3 seconds or is "millions of times faster" :-) http://www.enterpriseware.eu
  6. commonj WorkManager implementations

    The commonj WorkManager specification is also supported on BEA's WebLogic Application Server platform in both 8.x (a backport project) and 9.x versions. Tangosol (http://www.tangosol.com) has also implemented this specification on top of their grid computing solution, Coherence. JXInsight has implemented a distributed trace and profile extension for this technology with the ability to track resource consumption and metrics (CPU, memory, service time, thread blocking/waiting and GC) across multiple client and server JVMs. The following article highlights some of the features available with this extension. http://www.jinspired.com/products/jxinsight/coherencework.html Regards, William Louth, JXInsight Product Architect, JInspired "Java EE tuning, testing, tracing, and monitoring with JXInsight" http://www.jinspired.com
  7. I think the article is interesting, but as far as its application to J2EE, I think it is FUD.
    Hello, J2EE doesn't mean 'application server'... The limitations concerning creating user threads or having multiple threads accessing the same transaction concern only application servers... -Patrick
  8. What!?!? How can you program without an application server? There's no way to access a database, or send messages, or invoke web services, or serve up web pages without a super-powered, do everything application server! At least that's what our contractors from IBM told us...
  9. What!?!?

    How can you program without an application server? There's no way to access a database, or send messages, or invoke web services, or serve up web pages without a super-powered, do everything application server! At least that's what our contractors from IBM told us...
    You are clearly being ironic here, but I think it's a good point to mention. I haven't used an application server in years and it's been wonderful. Of course, little of what I have been doing would benefit greatly from running on an app server. I remember when everything was EJBs because they 'provided so much benefit'. I have a feeling that many people are using app servers because they've never considered not using one.
  10. So I see a number of issues with this article and the blurb at the top that says this can have application in the J2EE space.

    First of all, most J2EE applications are heavy transaction based applications. At least the ones that I work on in the financial industry. Because of the J2EE specification, you can't be spawning threads. You also can't have multiple threads accessing the same transaction. It just doesn't work.
    Why not? It works the same way as it does with XA. Just make your threads enlist as an XA resource. Ilya
  11. First of all, most J2EE applications are heavy transaction based applications. At least the ones that I work on in the financial industry. Because of the J2EE specification, you can't be spawning threads. You also can't have multiple threads accessing the same transaction. It just doesn't work.

    Why not? It works the same way as it does with XA. Just make your threads enlist as an XA resource.

    Ilya
    Well, in the J2EE world, the application server is responsible for all XA enlistment. The application itself is not supposed to perform XA enlistment, neither in the same thread nor in some other thread, and it doesn't have access to XAResource handles in the first place. Outside of a J2EE environment, you're free to perform custom XA handling, of course, just like you're free to perform custom thread handling. Juergen
  12. Simpler solutions exist.

    For algorithm parallelization, simpler and more elegant solutions exist, such as Javolution's ConcurrentContext. No thread pool, synchronization block or exception propagation to deal with. Just leave it to the framework to do the dirty job for you.
  13. The problem I see with Javolution and similar (see e.g. java.util.concurrent etc.) is that you still incur the full overhead of Java (and underlying OS) threads. With a 100-core CPU, why would one have to create 100 threads to speed up matrix multiplication or another computational task, or just uniform processing of objects in a container? There must be a better solution for that and it must be built into the language. There is the option to try to hide this all beneath the surface of (an optimizing) compiler, but I would like to have some control in the language too. So I pray that somebody at Sun is already looking into this, even though a 100-core CPU may be 10 years away ;)
  14. Any advantage over SEDA?

    Does this approach provide any advantage over using something like ServiceMix which implements SEDA?
  15. Not much hyper

    With all due respect to the author's efforts and achievements, I do not see much "hyper" in this threading. It seems to me to be just a manual split into tasks that are then assigned to threads. What we are going to need over time with massively multi-core CPU chips is something lighter and more automated than threads. Perhaps at the operator/block level rather than the method level. Perhaps some hardware techniques (speculative evaluation?) may make their way into high-level languages. Perhaps we will look back/forward to vectorization. Does anybody feel we will be or are moving this way?
  16. Re: Not much hyper

    With all due respect to author's efforts and achievements, I do not see much "hyper" in this threading.
    Agree but... What I gather from this (I did not dive into the minute details) is a single-threaded algorithm converted to multiple threads using a thread pool. So just a basic multithreading how-to here. However, what does seem to happen is that Java apparently maps its threads onto the CPU's hyperthreading. If Java ran on only one "hardware" thread (time-slicing it), splitting one very CPU-intensive task over multiple Java threads would be fairly useless. But since we have an incremental gain of up to 64% when increasing the number of threads, I must conclude that Java is indeed hyperthreading... So there is hyperthreading going on, but from the Java perspective it's just threading as usual.
  17. Uhh.... "Hyper-Threading"?

    It looks like this really has nothing to do with Intel's trademarked Hyper-Threading, which also has nothing to do with dual-core CPUs. (The author clearly has little grasp of the subject matter and the difference between SMT and SMP.) Basic multi-threading... Woop de doo. Even if it did have something to do with dual-core CPUs, threads are abstracted enough in Java that the idiosyncrasies of dual-core CPUs and SMT barely even matter. They both look like an SMP system. In the SMT case, _maybe_ two threads will run at the same time if there's some fault in the pipeline. Even then, your cache is probably going to get chewed up. So, ignoring the whole misnomer, this is nothing new. Even running threads on a single core works [almost] the same. The developer rarely needs to care. If everything is written to run as a single thread, this article should have been written as "writing decent code with parallelization in mind". Maybe then it would have been an OK article, albeit one that's been written too many times in the past. What's with the bullshit the article opens with? "Until recently, true concurrency has been impossible on most computers marketed to consumers. Most have been one-processor models capable of executing only a single thread in any given time slice." SMP systems only came onto the market this year? HT hasn't been available in consumer level CPUs since 2002? Consumers are "bigtime parallel" Java developers? Consumer level CPUs are good examples of the SMP Sun given as an example in the article? If you do any threaded programming, something that is often treated as having a "magical" quality that only the "enlightened few" get to know about, there are only a few basic problems that you need to avoid (e.g., double-checked locking). Again, nothing new. Also, great article. It fits right on the pile with the other "yawn articles" about threading. Amdahl's law wasn't even mentioned?
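For readers unfamiliar with the law the post above says went unmentioned: Amdahl's law bounds the speedup of a task on n processors by 1 / ((1 - p) + p / n), where p is the fraction of the work that can run in parallel. The following is a small illustrative sketch; the class name `Amdahl` and the 0.9 parallel fraction are assumptions picked for the example, not figures from the article.

```java
// Hypothetical helper illustrating Amdahl's law; not taken from the article.
final class Amdahl {

    // Predicted speedup on n processors when a fraction p (0..1) of the
    // work is parallelizable: 1 / ((1 - p) + p / n).
    static double speedup(double p, int n) {
        if (p < 0.0 || p > 1.0) {
            throw new IllegalArgumentException("p must be in [0, 1]");
        }
        if (n < 1) {
            throw new IllegalArgumentException("n must be >= 1");
        }
        return 1.0 / ((1.0 - p) + p / n);
    }

    public static void main(String[] args) {
        // Assumed parallel fraction of 90%, chosen only for illustration.
        for (int n : new int[] {1, 2, 4, 8, 64}) {
            System.out.printf("n=%d speedup=%.2f%n", n, speedup(0.9, n));
        }
    }
}
```

With a 90% parallel fraction the speedup can never exceed 10 regardless of core count, which is why the serial portion of an algorithm matters as much as the number of threads thrown at it.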
  18. SMP systems only came onto the market this year? HT hasn't been available in consumer level CPUs since 2002?
    Frankly speaking, I don't understand this statement. HT is different from multi-core: in HT just a few parts of the CPU are replicated and there are a lot of bottlenecks, as you say. Multi-core CPUs have fewer bottlenecks. So, basically, a multi-threaded application can gain 40% on a dual core and only around 5% on HT (these figures have been experimentally measured; I'm working on the same subject as Randall, even though I'm focusing on the client side). And, yes, multi-core processors came out this year.
    It fits right on the pile with the other "yawn articles" about threading.
    Well, sure, this is by no means research stuff; it's only that most desktop applications existing today aren't capable of fully exploiting multiple cores. When the first dual-core Mac appeared, it was clear that a lot of existing applications, in their first Intel port, weren't able to exploit both cores. Some people doubt that applications such as iPhoto are able today to exploit, say, 4 or 8 cores, so if Apple is not doing it, I bet most Java desktop developers aren't either. So I think they'd be better off reading these articles instead of yawning. And James Gosling has been making this argument for a while; I don't think he talks about yawn-worthy stuff ;-) Does the desktop need all this computing power? Yes. Today there are application classes, such as digital photo and movie processing, where computing power is never enough. But most of all, if you give people more computing power, somebody soon finds a way to use all of it.
    There must be a better solution for that and it must be built into the language.
    Why? Don't misunderstand me, I'm not against it; on the contrary, I hope and expect that future Java compilers will be capable of exploiting at least part of the parallelism at the code level. But: first, this is a future scenario, and if you need to address the problem now you have to deal with explicit threading; second, code-level parallelism is good if you need fine-grained parallelism, while for medium and coarse grain an architectural approach such as threading can be the best for some classes of applications. Moreover, medium- and coarse-grained parallelism is a must if one of the targets is a networked cluster and you're thinking of code that can exploit parallelism in different contexts. As for purpose-built frameworks, such as Javolution and others, they are a good thing, but - again - I don't think that they are always the best solution. I don't think a one-size-fits-all solution exists, and the architect should use a catalog of "blueprints" where different solutions are used according to the context.