Polyglot programming on the Java Platform

  1. Polyglot programming on the Java Platform (28 messages)

    Recently, polyglot programming has become a bit of a meme, with many pointing out that different languages bring different strengths. Polyglot, in this context, means that programmers can use scripting for some tasks and Java for others, with other languages like C# providing external services.

    JSR 223 is the JSR providing the Scripting API for Java. The concept behind it is that a programmer gets a ScriptEngine, which refers to a mechanism for executing scripts in a given language, and then passes source code to that ScriptEngine in some fashion: through a Reader (which can wrap an InputStream), as a String, or even as individual lines of code. Thus, the Java platform itself can be used as a launch point for polyglot programming, in a sense, in that you can write modules in various languages (JavaScript, Ruby, Python, Groovy, BSF) and use them from within a Java framework and on the Java platform.

    A test was written using the exact same algorithms in Jython and Java. The code was taken from chapter six of "Programming Collective Intelligence," by Toby Segaran. The test created a naïve Bayesian classifier, trained the classifier with sample ham and spam, and then ran two thousand sets of ten tests against the classifiers. The only timed elements were the loops getting the classifications.

    While the results varied somewhat, a run showing good results for both Java and Jython yielded the following: the Java code ran in 312 ms, and the Jython code ran in 37500 ms, a difference of roughly a factor of 120. The Python code was taken directly from the book and ported almost line for line into Java (with some variations due to the different semantics of the languages). The only optimization in the Java code was the use of Javolution for collections, and even in that case, Javolution's fast iterator mechanisms were not used.

    There are potential ways to improve the script's performance:
    • A Compilable script engine could be used
    • The Python code itself could be optimized somehow
    • A different language with better performance could be used
    With that, though, it's apparent that the effort taken to rewrite the algorithm in Java was worthwhile, yielding a vast increase in performance. While this test only used Jython as an alternate platform, other (simpler) tests using other languages yielded similar results on much smaller scales. If you're using the Java platform as a basis for multiple languages, then, make sure that the savings in maintenance effort are worth the performance cost. At the very least, you should consider running some tests to validate performance - otherwise, the tradeoffs are likely not to be worth the investment.
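
    To make the shape of the test concrete, here is a minimal sketch in Python 3 of a word-count naive Bayes classifier and a timing loop of the kind the post describes. It is not the code from the book or from the benchmark: the class NaiveBayes, the function time_classifications, the add-one smoothing, and the training sentences are all assumptions made for illustration.

```python
import math
import time
from collections import defaultdict


class NaiveBayes:
    """Word-count naive Bayes classifier (illustrative; not the book's code)."""

    def __init__(self):
        # category -> word -> number of training documents containing the word
        self.word_counts = defaultdict(lambda: defaultdict(int))
        # category -> number of training documents
        self.doc_counts = defaultdict(int)

    def train(self, text, category):
        self.doc_counts[category] += 1
        for word in set(text.lower().split()):
            self.word_counts[category][word] += 1

    def classify(self, text):
        words = set(text.lower().split())
        total_docs = sum(self.doc_counts.values())
        best_category, best_score = None, float("-inf")
        for category, n_docs in self.doc_counts.items():
            # log P(category) + sum of log P(word | category), add-one smoothed
            score = math.log(n_docs / total_docs)
            for word in words:
                count = self.word_counts[category].get(word, 0)
                score += math.log((count + 1) / (n_docs + 2))
            if score > best_score:
                best_category, best_score = category, score
        return best_category


def time_classifications(classifier, samples, sets=2000):
    """Return elapsed milliseconds for `sets` passes over `samples`."""
    start = time.perf_counter()
    for _ in range(sets):
        for sample in samples:
            classifier.classify(sample)
    return (time.perf_counter() - start) * 1000.0


# Hypothetical training data, standing in for the sample ham and spam.
classifier = NaiveBayes()
classifier.train("meeting notes for the project", "ham")
classifier.train("the quick brown fox jumps", "ham")
classifier.train("buy cheap pills online now", "spam")
classifier.train("make quick money online casino", "spam")

samples = ["cheap money online", "project meeting notes"] * 5  # ten tests per set
elapsed_ms = time_classifications(classifier, samples)
print("classified %d samples in %.1f ms" % (2000 * len(samples), elapsed_ms))
```

    As in the post, only the classification loop is timed; training happens before the clock starts. Absolute numbers from a sketch like this say nothing about the 312 ms and 37500 ms figures above, which came from different code on different runtimes.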

    Threaded Messages (28)

  2. I don't see where I can find the code for the script or the Java version. It would be nice to see that if only to make the discussion more concrete. One of the nice things I find about using Jython is that it's really very easy to use Java libraries and objects. So where most Python developers probably know the Python XML libraries inside and out, I use the Saxon-B library with some basic Python wrappers to get an XML API with XQuery support. I guess my point is that if the script is running slow, there's probably a bottleneck in the code that could be changed to use either a third-party Java library or some custom Java code. Also, the addition of an invokedynamic bytecode should theoretically make Jython much faster, in addition to making the lives of the Jython development team (you rock!) a lot easier.
  3. I don't see where I can find the code for the script or the Java version. It would be nice to see that if only to make the discussion more concrete.
    I can send it to you if you'd like. The Java code will be in the ci-bayes project on java.net fairly soon (i.e., as soon as I can be bothered creating a decent build.xml for it) and the Python code... is just one file.
  4. I don't see where I can find the code for the script or the Java version. It would be nice to see that if only to make the discussion more concrete.
    I can send it to you if you'd like. The Java code will be in the ci-bayes project on java.net fairly soon (i.e., as soon as I can be bothered creating a decent build.xml for it) and the Python code... is just one file.
    I don't even really care to see the Java. Can you post the Python script? I'm just curious if I can see where the slowdown is coming from and if I'm right in my wild guess that there's a bottleneck that could be resolved by using the Java libraries instead of the Python libraries.
  5. I don't see where I can find the code for the script or the Java version. It would be nice to see that if only to make the discussion more concrete.
    I can send it to you if you'd like. The Java code will be in the ci-bayes project on java.net fairly soon (i.e., as soon as I can be bothered creating a decent build.xml for it) and the Python code... is just one file.


    I don't even really care to see the Java. Can you post the Python script? I'm just curious if I can see where the slowdown is coming from and if I'm right in my wild guess that there's a bottleneck that could be resolved by using the Java libraries instead of the Python libraries.
    As I said, it's posted - and that's an interesting idea. I didn't consider using the Java library from within the Python code, because that would have reduced the portability of Python... I wonder how valid I consider that to be.
  6. As I said, it's posted - and that's an interesting idea. I didn't consider using the Java library from within the Python code, because that would have reduced the portability of Python... I wonder how valid I consider that to be.
    I think it's valid.
  7. As I said, it's posted - and that's an interesting idea. I didn't consider using the Java library from within the Python code, because that would have reduced the portability of Python... I wonder how valid I consider that to be.


    I think it's valid.
    In the context of writing polyglot stuff on the Java platform... I think you're right. I'd be interested in seeing if the Python code could be optimized for Jython.
  8. I didn't consider using the Java library from within the Python code, because that would have reduced the portability of Python... I wonder how valid I consider that to be.
    Well, the entire thing rewritten in Java isn't portable Python either ;-) But one thing you can do to mitigate this is create a module that hides the use of Java libraries, so that if you needed to port it, you could implement it in pure Python or with the corresponding 'scaffold' language in the target environment.
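
    The module idea can be sketched roughly as follows. This is an assumed layout, not code from the thread: the class WordCounts and the name BACKEND are hypothetical. Under Jython, Java classes import like Python modules, so the import succeeds and a java.util.HashMap is used; on any other Python the import fails and a plain dict is used instead. Callers see the same two methods either way.

```python
try:
    # Only importable under Jython, where Java packages appear as modules.
    from java.util import HashMap as _JavaHashMap
    BACKEND = "java"
except ImportError:
    _JavaHashMap = None
    BACKEND = "python"


class WordCounts(object):
    """Counter that hides whether a Java or a Python collection backs it."""

    def __init__(self):
        self._store = _JavaHashMap() if BACKEND == "java" else {}

    def increment(self, word):
        # Both HashMap.get and dict.get return None for a missing key.
        current = self._store.get(word)
        new_value = 1 if current is None else current + 1
        if BACKEND == "java":
            self._store.put(word, new_value)
        else:
            self._store[word] = new_value
        return new_value

    def count(self, word):
        value = self._store.get(word)
        return 0 if value is None else value


counts = WordCounts()
for word in "spam spam ham".split():
    counts.increment(word)
print("%s backend: spam=%d ham=%d" % (BACKEND, counts.count("spam"), counts.count("ham")))
```

    Porting then means touching only this one module, which is the mitigation described above.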
  9. I don't see where I can find the code for the script or the Java version. It would be nice to see that if only to make the discussion more concrete.
    I can send it to you if you'd like. The Java code will be in the ci-bayes project on java.net fairly soon (i.e., as soon as I can be bothered creating a decent build.xml for it) and the Python code... is just one file.
    And the project is now populated in SVN on java.net: https://ci-bayes.dev.java.net . If you need the Python code, it's there in SVN too, under /lib/.
  10. Nothing is popping out at me when looking at the Python script. Maybe if I looked more closely. Probably the Python code itself could be optimized a little.
  11. Simplification

    I think the more 'professional' term used is generally 'Metaprogramming'. There is a new industry niche forming, understandably so, of coders using tools such as GWT which generate code from Java. The advantage? You *don't have* to use JavaScript with GWT, just Java. Now imagine if the same is done with all the other languages a developer needs to use today: SQL for an RDBMS (being obsoleted by Hibernate?), GLSL for hardware shading in the 3D world (why? A Java metaprogramming library would work), and C++ or C for small-footprint applications. Why not just generate the code from Java?

    So the question really isn't about performance - if you need performance, Java is there and proven, especially on the server platform. The question is about Simplification. Why 'polyglot' for AJAX when you can do it all in Java with something like GWT? Extending this further, Java could be generating/interfacing with any other language as well. Soon the 'source' language won't be Java anymore, but likely something more suitable for Metaprogramming. I'll put $$ on this one =)
  12. Re: Simplification

    Soon the 'source' language won't be Java anymore, but likely something more suitable for Metaprogramming. I'll put $$ on this one =)
    What language would you bet on? The thing is, I think static typing is required, but I can't think of any statically typed languages with really good metaprogramming capabilities. C++ has metaprogramming, but it's confusing. I know there's interest in adding metaprogramming to Scala, but I don't think there's anything concrete being done.
  13. Re: Simplification

    Soon the 'source' language won't be Java anymore, but likely something more suitable for Metaprogramming. I'll put $$ on this one =)


    What language would you bet on?

    The thing is I think static typing is required but I can't think of any statically typed languages with really good metaprogramming capabilities. C++ has metaprogramming, but it's confusing.

    I know there's interest in adding metaprogramming to Scala, but I don't think there's anything concrete being done.
    Of course, it's my opinion - albeit a stubborn one - but if you look at all the toolsets (Java to JavaScript via GWT or j2s, ActionScript via j2as, etc.), the common thread is that they all work off of Java's AST (Abstract Syntax Tree). Java (at least pre Java 5) has such a 'pure' syntactic structure that it makes it easy for AST-based metaprogramming tools to be made. And Java has proven itself the industry leader for Enterprise Stuff, and even if you don't agree with that, you have to admit it's successful to some degree. The point being, Java's AST is proven to 'Work'.

    So, if all tools work from an Abstract Syntax of Java, not actually Java itself, then there's no reason for the first step. Why start from Java? Just like Eclipse allows you to edit Java code via a representation of the syntax tree (per JST mechanics), for all the developer knows that AST could be persisted in some other format... XML, for example. You can 'create' a Java AST from UML editors, or from a parser of any other OOP language, really. So the developer isn't limited to creating the Java AST from Java; with the proper toolset, he'll be able to create the AST from a variety of sources... choose your language? Or don't use one at all? UML w/state mechanics works fine =)
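
    The idea of tools working from a syntax tree rather than from source text can be illustrated in miniature. The sketch below uses Python's standard ast module as a stand-in for Java's AST (an assumption made so the example stays short and runnable) and emits a Lisp-like prefix notation from the tree. The names translate and to_prefix are hypothetical, and only variable names, constants, and the four basic arithmetic operators are handled.

```python
import ast

# Map AST operator node types to the symbols used in the output notation.
_OPERATORS = {ast.Add: "+", ast.Sub: "-", ast.Mult: "*", ast.Div: "/"}


def to_prefix(node):
    """Walk an expression tree and emit prefix notation for it."""
    if isinstance(node, ast.Expression):
        return to_prefix(node.body)
    if isinstance(node, ast.BinOp):
        symbol = _OPERATORS[type(node.op)]
        return "(%s %s %s)" % (symbol, to_prefix(node.left), to_prefix(node.right))
    if isinstance(node, ast.Name):
        return node.id
    if isinstance(node, ast.Constant):
        return repr(node.value)
    raise ValueError("unsupported node: %s" % type(node).__name__)


def translate(source):
    """Parse source text into a tree, then generate output from the tree."""
    return to_prefix(ast.parse(source, mode="eval"))


print(translate("a + b * 2"))
```

    Once the tree exists, the generator never looks at the original text again, so the tree could just as well have come from a diagram editor or another parser, which is the point being argued above.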
  14. Re: Simplification

    Or don't use one at all? UML w/state mechanics works fine
    UML is a language. That's what the 'L' stands for: 'language'.
  15. Re: Simplification

    Or don't use one at all? UML w/state mechanics works fine


    UML is a language. That's what the 'L' stands for: 'language'.
    You're right - I should clarify. 'Domain Specific Languages' are distinctly different from 'programming languages' in the traditional sense. UML is a sort of 'Domain Specific Language' catering to modelling objects and their relations, as opposed to a 'typed' language in the traditional sense, which details procedural logic.
  16. Re: Simplification

    If you look at all the toolsets, Java to JavaScript via GWT or j2s, ActionScript via j2as, etc. the common thread is they all work off of Java's AST (Abstract Syntax Tree).
    This is just code generation, not metaprogramming.
  17. Re: Simplification

    If you look at all the toolsets, Java to JavaScript via GWT or j2s, ActionScript via j2as, etc. the common thread is they all work off of Java's AST (Abstract Syntax Tree).


    This is just code generation, not metaprogramming.
    I would argue metaprogramming *is* code generation =) Or more accurately, procedural logic that creates code which embodies other procedural logic. Whether it crosses language boundaries (generating 'other' languages) or stays within the current language (say, reflexive languages), it still classifies as a metaprogram.
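
    Both cases named here, generating code in another language and generating code in the same language, can be shown with a small sketch. The function names and the Point example are hypothetical, and the generators assume at least one field name; this illustrates the general idea and is not how GWT or any other tool mentioned in the thread works.

```python
def generate_python_class(name, fields):
    """Return Python source for a class whose constructor stores `fields`."""
    lines = ["class %s(object):" % name]
    lines.append("    def __init__(self, %s):" % ", ".join(fields))
    for field in fields:
        lines.append("        self.%s = %s" % (field, field))
    return "\n".join(lines) + "\n"


def generate_javascript_constructor(name, fields):
    """Return JavaScript source for the equivalent constructor function."""
    body = "".join("  this.%s = %s;\n" % (field, field) for field in fields)
    return "function %s(%s) {\n%s}\n" % (name, ", ".join(fields), body)


def load_generated_class(name, fields):
    """Stay within the language: execute the generated source and return the class."""
    namespace = {}
    exec(generate_python_class(name, fields), namespace)
    return namespace[name]


Point = load_generated_class("Point", ["x", "y"])
point = Point(1, 2)
print(point.x, point.y)
print(generate_javascript_constructor("Point", ["x", "y"]))
```

    The same list of field names drives both outputs. Whether that counts as metaprogramming or merely as code generation is exactly what the posters disagree about.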
  18. Re: Simplification

    Or more accurately - procedural logic that creates code which embodies other procedural logic. Whether it crosses language boundaries (generate 'other' languages) or stays within the current language (say, reflexive languages) it still classifies as a metaprogram.
    Why procedural logic? That seems rather limiting to me.
  19. Re: Simplification

    Or more accurately - procedural logic that creates code which embodies other procedural logic. Whether it crosses language boundaries (generate 'other' languages) or stays within the current language (say, reflexive languages) it still classifies as a metaprogram.


    Why procedural logic? That seems rather limiting to me.
    As opposed to... fuzzy logic? :)
  20. Re: Simplification

    Or more accurately - procedural logic that creates code which embodies other procedural logic. Whether it crosses language boundaries (generate 'other' languages) or stays within the current language (say, reflexive languages) it still classifies as a metaprogram.


    Why procedural logic? That seems rather limiting to me.


    As opposed to... fuzzy logic? :)
    As opposed to declarative and functional.
  21. Re: Simplification



    Why procedural logic? That seems rather limiting to me.


    As opposed to... fuzzy logic? :)


    As opposed to declarative and functional.
    I was kidding :) Since a microprocessor only understands procedural statements, even today's 'declarative' and 'functional' languages are ultimately just wrappers around procedural machine code. If a declarative language were used to generate code, I would still consider it a metaprogram... I hope that makes sense? Would be interested in hearing opinions :) It's an interesting area that I *do* think all programmers are going to be dwelling in over the next few years...
  22. Re: Simplification

    Since a microprocessor only understands procedural statements,
    Is machine code procedural? It doesn't even really have enough structure to be called that.
    ultimately even today's 'declarative' and 'functional' languages are just wrappers around procedural machine code.
    Assuming for the moment that this is correct, I think you are drawing some pretty shaky conclusions from it. How a high-level language is translated into executable instructions is largely irrelevant when developing in that language. There were a number of machines developed to run LISP natively in the past. If a procedural language were compiled into LISP to run on these machines, does that make the procedural code just a wrapper around functional code?
  23. Re: Simplification



    ultimately even today's 'declarative' and 'functional' languages are just wrappers around procedural machine code.


    Assuming for the moment that this is correct, I think you are drawing some pretty shaky conclusions from it. How a high-level language is translated into executable instructions is largely irrelevant when developing in that language.

    There were a number of machines developed to run LISP natively in the past. If a procedural language were compiled into LISP to run on these machines, does that make the procedural code just a wrapper around functional code?
    We have to be careful about creating a 'false dichotomy': a programmer could be working in a structural, declarative language that produces procedural code, or he could use procedural statements directly. But the basic principle is that he can write code which generates other code; this is metaprogramming. Ultimately I think the language barriers will break down. Since C++ and Java are really only different syntactically, but semantically equivalent, I think programmer toolsets will merge similarly. 'JavaScript' programmers are already less necessary now that GWT is available. All other languages could follow the same fate... And once the 'destination' languages are modularized, I think the 'source' language will be too, so you're not starting from just Java anymore, but maybe just UML, or some state diagrams - as has already been proven a very effective way to create software... Would be interested to hear more opinions =)
  24. C++ and Java

    Ultimately I think the language barriers will break down, since C++ and Java are really only different syntactically, but semantically equivalent, programmer toolsets I think will merge similarly.
    I think C++ templates make it significantly more expressive than Java, assuming you can make sense of them. I would not call the two languages semantically equivalent beyond the fact that they are both Turing complete.
  25. Re: C++ and Java



    I think C++ templates make it significantly more expressive than Java, assuming you can make sense of them. I would not call the two languages semantically equivalent beyond the fact that they are both Turing complete.
    If you examine how templates work behind the scenes - they're really just a glorified 'copy/paste' of the code, once for each use of a different template type. The C++ compiler basically duplicates code, replacing templatized types with the requested type in the duplicated version(s). So in that sense it is still just a syntactic difference, a shortcut that shortens what 'can' be done in Java. Also keep in mind that Java's 'Object'-centric collections API and flexible type casting eliminate the need for templatizing...

    But I would much, much rather *not* move into a language argument, as that is actually contrary to my point, which is that:
    • Java is a kind of subset of C++: it is an object-oriented language, compiles into bytecode, and runs under a VM
    • C++ is a 'pre-compiled' language: it is object-oriented and supports the same things Java does in terms of inheritance and polymorphism
    • Many other languages are similar 'enough': Python is basically Java with whitespace instead of brackets, less explicit typing, and lots of syntactic differences, but the bottom line is that it's an object-oriented language capable of the same patterns Java and C++ are

    So if it *is* possible to merge all languages at an abstract level (say, an Abstract Syntax Tree is used on the backend that is similar to Java's), it is theoretically possible to input and output any other language from that syntax tree. Not all features of all languages are supported, obviously, as the programmer is working with a 'least common denominator', but he gains the advantage that his 'Object Oriented' code patterns are adaptable to any environment - a huge plus for sacrificing something small... say, typing brackets instead of braces...

    I think developer tools are going to go this route. Already it has started, with Java to JavaScript via GWT, Java to ActionScript via j2as, and modelling tools that give UML to Java... Of course, just my opinion still =)
  26. Re: C++ and Java

    So if it *is* possible to merge all languages at an abstract level (say, an Abstract Syntax Tree is used on the backend that is similar to Java's), it is theoretically possible to input and output any other language from that syntax tree.
    Hmmm... yes and no. It would probably be possible to create such an AST, but converting many languages to Java's AST would result in a loss of semantic information. For example, consider a tail-recursive method in Scala. There are essentially two potential translations to Java:
    (1) A tail-recursive Java method
    (2) A Java method with a "for" or "while" loop

    #1 would result in a Java method that, in source, looks a lot like the Scala method. Semantic information such as variable mutability (val/var in Scala, final/not final in Java) would be preserved. Mostly the changes would be minor shifts in syntax. However, the Java method would compile into a recursive method at the bytecode level, while the Scala method would compile into a loop due to optimization of tail-recursion. So in the translation process from Scala to the Java AST, you either lose some of the semantic information that enables you to reason about the code, or you lose semantic information about how the code will be executed by the machine. Both are very important.
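
    The two translations can be sketched side by side. The example below is in Python rather than Scala or Java (an assumption made to keep it short); CPython, like the Java behaviour described above, does not turn tail calls into loops, so the difference in how the two forms execute is directly observable. The function names are hypothetical.

```python
def sum_to_recursive(n, acc=0):
    """Translation (1): tail-recursive form; the recursive call is the last action."""
    if n == 0:
        return acc
    return sum_to_recursive(n - 1, acc + n)


def sum_to_loop(n):
    """Translation (2): the loop a tail-call-optimizing compiler would effectively emit."""
    acc = 0
    while n != 0:
        acc += n
        n -= 1
    return acc


def survives(func, n):
    """Report whether func(n) completes without exhausting the call stack."""
    try:
        func(n)
        return True
    except RecursionError:
        return False


print(sum_to_recursive(100), sum_to_loop(100))
print(survives(sum_to_loop, 100000))
```

    Both forms compute the same value for small inputs, but only the loop is safe for large ones: with CPython's default recursion limit, survives(sum_to_recursive, 100000) reports False. The recursive form keeps the shape of the original source; the loop form keeps its runtime behaviour, and a translation has to pick one.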
  27. The JVM was written for Java. That's the language it's most suited to run. Other languages generally jump through hoops in order to run on it - with the exception of Groovy of course, which also targets the JVM.

    It's the same thing in .Net: it's made to run C# code, and other languages like managed C++ and Visual Basic .Net end up sufficiently different from their non-CLR-targeted variants to confuse programmers - at least in the beginning. Microsoft even wanted to make VB.Net more different from VB 6 but was deluged by protests from VB users. So, Microsoft can beat its chest about 50 languages (Eiffel, Haskell etc.) or whatever number they use, but in reality there is ONE: C#. (And whatever is used for inline code - VBScript.Net? - in ASP.Net.)

    As for cross-language scripting engines: Microsoft also has something like that in Windows, where you can register e.g. ActiveState's Python and Perl ports as "Windows Scripting Hosts" and use Python code in web pages rendered in Internet Explorer. However, this is hardly ever used, since it would require extra effort from the user. And further: of the languages Microsoft provides, just JavaScript is generally used, simply because other browsers are more cross-platform and do not use WSH. Therefore they cannot run VBScript code. The same will probably be the case with the Java SE 6 scripting engine: usage will be reduced to just the Sun-provided language(s).
  28. not academic: pnuts

    The JVM was written for Java. That's the language it's most suited to run. Other languages generally jump through hoops in order to run on it - with the exception of Groovy of course, which also targets the JVM.
    The same holds for Pnuts (https://pnuts.dev.java.net/), which is the fastest scripting language on the JVM by a *huge* margin (and we tried them all). Just a tad unacquainted... Cheers, Dirk
  29. The JVM was written for Java. That's the language it's most suited to run. Other languages generally jump through hoops in order to run on it - with the exception of Groovy of course, which also targets the JVM.
    This is true, but as I understand it, the invokedynamic bytecode instruction is not useful to Java itself and will remove a lot of the hoops, in addition to making scripting languages run faster. I think it's pretty much a done deal that it will be added in 1.7.