I’m headed off to the GPU computing conference in San Jose, which is featuring Nvidia's Compute Unified Device Architecture (CUDA). CUDA programs can be written for and run on a variety of Nvidia GPUs, including the GeForce, Quadro and Tesla lines (http://www.nvidia.com). The advantage is that GPUs increasingly offer significantly higher raw performance than industry-standard processors. The relatively new (within the last year) Tesla offers the highest performance of these processors, with some pretty incredible floating-point results. Nvidia and others offer Tesla boards or full Tesla computers with up to 960 GPU cores for parallel operations. Nvidia offers such a system with an Intel main processor for just under $10,000.

Developers program using “C for CUDA,” a restricted form of the C language with Nvidia extensions to take advantage of the features of the architecture. In particular, CUDA exposes a very fast shared memory region (16KB in size) that can be shared among the threads of a block, making inter-thread communication quick and easy.

Despite the C focus, bindings are available for Java, too, most commonly JaCuda at http://jacuda.wiki.sourceforge.net, available under the LGPL. Its goal is to provide several CUDA- and C-based functions that can be easily accessed from Java, Groovy and Python. As you might imagine, the Java bindings are pretty limited right now, supporting some mathematical operations and not much else.

Does CUDA make sense for Java? Do we need to do fast math or inter-thread communication with Java? Can we make use of the highly parallel operations afforded by some of the massively parallel GPU CUDA systems? It seems to me like the limitation many developers face with Java is memory space, not computational performance. If you had a system with 960 Tesla GPU cores and the ability to write Java code against it, what would you do with it? Let me know.
- Posted by: Peter Varhol
- Posted on: September 29 2009 08:14 EDT
- Re: Does Java on CUDA Make Sense? by Otengi Miloskov on September 29 2009 11:26 EDT
- maybe but by Matt Giacomini on September 29 2009 18:17 EDT
- Why not? by Karl P on September 29 2009 21:17 EDT
- Re: Does Java on CUDA Make Sense? by Cameron Purdy on September 29 2009 22:58 EDT
- Please, fix the link (remove ',' from the end). by Nikita Ivanov on September 30 2009 01:32 EDT
- Re: Does Java on CUDA Make Sense? by Faizal Abdoelrahman on September 30 2009 01:39 EDT
- We'd add it to GridGain if... by Nikita Ivanov on September 30 2009 02:01 EDT
- Re: Does Java on CUDA Make Sense? by James Watson on September 30 2009 09:44 EDT
- Re: Does Java on CUDA Make Sense? by Cameron Purdy on September 30 2009 11:52 EDT
- Regex on CUDA by Pete Haidinyak on October 06 2009 11:55 EDT
Why so much effort for a proprietary solution like CUDA? It would be better if people began to invest more in OpenCL; it is an open standard and it will be more powerful. The bindings could be very nice, but I will still be coding in C.
"It seems to me like the limitation many developers face with Java is memory space, not computational performance." I agree I think this is the limitation that most developers face, but some of us do deal with computational bottlenecks too. For example I write financial calculations and found many calculations that have to be performed async and cached because they are intensive to run 'on the fly' for large pools of loans. One example is a recursive IRR calculation. I have friends who work in biotechs that do almost nothing but number crunching. I'm sure if you turned over a few rocks you could find shops that could benefit from better computational performance.
CUDA / OpenCL is about the embarrassingly parallel problems. The question shouldn't be whether Java needs the X/Y/Z "best fit" feature or tech solution that GPU processing provides, but at what point the easily parallelizable problems that we're already solving (text parsing, pattern matching, BI rulesets) become more efficient on a CUDA-type platform than with only Java. I'm very confident that CUDA-driven regex and JSON/XML kernels could help accelerate app servers and maybe even Drools.
We've had customers use video (GPUs) and similar (FPGA) parallelized hardware for compute offload. It's more work, but it can drop problems from days to seconds .. I guess that makes sense ;-) Peace, Cameron Purdy | Oracle Coherence http://coherence.oracle.com/
See subj... By the way, the CAPTCHA for this was "Mr. Purdy". Cameron, is this you? :)
There are some neural net implementations that achieve significant speedup through CUDA, to achieve very fast pattern recognition. AI algorithms make small inroads. I am sure they can be put to good use when thought about carefully. CUDA will help to slip them in. Cheers, Faizal Abdoelrahman (who just typed in "open oracles" for captcha, I kid you not)
... quality Java binding would exist. I'm not sure I'd invest our time right now into binding coding - but if it would be available I think GridGain can easily provide native support for that. Idea is certainly viable. Best, Nikita Ivanov. GridGain - Cloud Development Platform
"It seems to me like the limitation many developers face with Java is memory space, not computational performance." This is maybe a little off subject, but while I think this is true, I think it is largely a result of the idioms that have been adopted by Java developers at large. For example, building a large list of items and then serializing the list (a common pattern) is extremely wasteful of memory and also slow. By simply serializing the items as they become available, the memory requirements are vastly reduced. If we eliminate this artificial bottleneck around memory, computational performance will often become more of a limiting factor for the resulting improved code.
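A minimal sketch of the "serialize as they come" idea, assuming items arrive through an `Iterator` and are written one per line to a `Writer`. The class name `StreamingSerializeSketch`, the `writeAll` helper, and the line-per-record format are hypothetical choices for illustration; the thread does not specify any of them.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.util.Iterator;
import java.util.stream.IntStream;

// Hypothetical illustration; names and record format are assumptions.
final class StreamingSerializeSketch {

    // Writes each item as soon as the iterator produces it, so only the
    // current item needs to be held, instead of first collecting a full List.
    static long writeAll(Iterator<String> items, Writer out) {
        long count = 0;
        try {
            while (items.hasNext()) {
                out.write(items.next());
                out.write('\n');
                count++;
            }
            out.flush();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return count;
    }

    public static void main(String[] args) {
        // The records are generated lazily; nothing is buffered in a collection.
        Iterator<String> records =
                IntStream.range(0, 5).mapToObj(i -> "record-" + i).iterator();
        StringWriter sink = new StringWriter(); // stands in for a file or socket
        long written = writeAll(records, sink);
        System.out.println(written + " records written");
    }
}
```

With a real file or socket as the sink, memory use stays roughly constant regardless of how many records pass through, which is the reduction described above.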
"Can we make use of highly parallel operations afforded by some of the massively parallel GPU CUDA systems? It seems to me like the limitation many developers face with Java is memory space, not computational performance." I wrote a bit about this topic on my blog. This challenge is something that we'll likely see more of, not less of. Peace, Cameron Purdy | Oracle Coherence http://coherence.oracle.com/ p.s. My captcha was "$58-million tizzy" .. now that's a tizzy worth having ..
I have a need to parse Syslog messages at wire speed and have been looking at Regex Hardware Boards. If this could be done with CUDA in Java that would be great. -Pete
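For comparison, a CPU-only baseline in plain Java might look like the sketch below. It does not use CUDA or any GPU binding. It assumes each message starts with the standard syslog `<PRI>` field, where severity is the PRI value modulo 8, and it deliberately ignores the rest of the format. The class name `SyslogParseSketch` and its methods are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical illustration; a real parser would handle the full message format.
final class SyslogParseSketch {

    // Simplified pattern: "<PRI>" followed by the rest of the line.
    // Pattern instances are thread-safe; Matcher instances are not, so one is created per call.
    private static final Pattern PRI = Pattern.compile("^<(\\d{1,3})>(.*)$");

    // Returns the severity (PRI modulo 8), or -1 if the line has no PRI field.
    static int severity(String line) {
        Matcher m = PRI.matcher(line);
        if (!m.matches()) {
            return -1;
        }
        return Integer.parseInt(m.group(1)) % 8;
    }

    // Lines are independent, so they can be matched on all available CPU cores.
    static List<Integer> severities(List<String> lines) {
        return lines.parallelStream()
                .map(SyslogParseSketch::severity)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "<34>Oct 11 22:14:15 mymachine su: 'su root' failed",
                "not a syslog line");
        System.out.println(severities(lines));
    }
}
```

Measuring something like this against the incoming message rate would show how far a multi-core CPU is from wire speed before committing to dedicated hardware.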