  1. Article: The Pragmatic Code Generator Programmer (30 messages)

    According to Sven Efftinge, while powerful frameworks such as openArchitectureWare provide advanced features and a wide range of tools for developing code generators, the exercise can also be performed through a "text manipulation language" or DSL such as Perl. In this article, Efftinge reimplements an exercise taken from the book "The Pragmatic Programmer" in which a text manipulation language is used to develop a code generator. What do you think of this exercise?

    Threaded Messages (30)

  2. Much ado about nothing

    I dunno. I haven't looked at this framework. But for the trivial language presented, I've done as much with AWK.

    For most code generator purposes, you typically don't need a particularly sophisticated tool, especially for internal use. If you're making a product, then perhaps. But for internal use you can make an awful lot of assumptions about the problem domain, your input language, your output language, etc.

    Combined with something like Velocity or Perl as a "templating" language to handle the bulk of the boilerplate, it's really all straightforward.

    If you want something more sophisticated, most folks may well just opt for XML and abuse that.

    And if you need to go beyond that, then you may as well start beating on a BNF grammar and lock and load ANTLR or one of its ilk, and effectively write a compiler.

    The problem is as hard as you want to make it, and programmers being programmers, we tend to not want to make problems particularly hard. That's why we get terse interfaces, strange format files, etc.

    I appreciate that the article only uses "20%" of the framework, but when I look at that sample language and see that "pluggable compiler module in case you need to convert to another language", I just shake my head. Most folks simply don't need that much capability. Most folks don't need a variety of target languages.
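    A minimal sketch of the kind of line-oriented generator the post has in mind, written in Python rather than AWK. The input format is an assumption modeled on the "F id int" line quoted near the end of this thread ("M" opens a message, "F" declares a field, "E" closes it), and the function name and the C output are purely illustrative.

```python
# Hypothetical input format: "M <name>" opens a message, "F <name> <type>"
# adds a field, "E" closes the message, and "#" starts a comment line.
SPEC = """\
# Add a product
M AddProduct
F id int
F name char[30]
E
"""


def generate_c_structs(spec):
    """Translate the little language into C struct declarations."""
    out = []
    name = None  # name of the message currently being emitted
    for raw in spec.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        tag, *args = line.split()
        if tag == "M":
            name = args[0]
            out.append("typedef struct {")
        elif tag == "F":
            field, ftype = args
            # C puts array bounds after the field name: char name[30];
            base, bracket, dim = ftype.partition("[")
            out.append(f"    {base} {field}{bracket}{dim};")
        elif tag == "E":
            out.append(f"}} {name};")
            name = None
        else:
            raise ValueError(f"unknown line: {raw!r}")
    return "\n".join(out)


print(generate_c_structs(SPEC))
```

    Everything the generator knows about the input lives in one loop, which is the point of the post: for an internal tool with a fixed input and output language, very little machinery is needed.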
  3. Re: Much ado about nothing

    But for the trivial language presented, I've done as much with AWK.
    You are right that the example is trivial. We chose it 1) because it is simple (and we don't need to explain the target platform too much), 2) because it's from a popular book, and 3) because that book is liked by code-centric and agile people (and we are code-centric and agile, too ;-)).
    For most code generator purposes, you typically don't need a particularly sophisticated tool, especially for internal use. If you're making a product, then perhaps. But for internal use you can make an awful lot of assumptions about the problem domain, your input language, your output language, etc.
    The complexity of a code generator depends on the number of concepts and on how far it abstracts from the underlying platform. Additionally, code generators grow as the target frameworks grow. So most of the time you come to the point where you need good mechanisms to structure and refactor the generator, just to keep it alive. If you come to this point, be sure to have a look at openArchitectureWare.
    Combined with something like Velocity or Perl for a "templating" language to handle a bulk of the boiler plate, it's really all straightforward.

    If you want something more sophisticated, most folks may well just opt for XML and abuse that.

    And if you need to go beyond that, then you may as well start beating on a BNF grammar and lock and load ANTLR or one of its ilk, and effectively write a compiler.

    The problem is as hard as you want to make it, and programmers being programmers, we tend to not want to make problems particularly hard. That's why we get terse interfaces, strange format files, etc.
    So, you think that Velocity or Perl do the job but have their limitations, where XML can do much more for you? And if that is not enough (or you don't like coding XML for some reason) you can use ANTLR or the like? So why shouldn't we use a tool which is more powerful than XML and ANTLR (for the special purpose of creating DSLs, of course) but is much simpler than all the technologies mentioned above? OK, one point would be the learning effort. And if you really just want to have some lines of boilerplate code generated automatically, I agree that it's not worth learning a new tool. But IMHO this is not the case most of the time.
    I appreciate that the article only uses "20%" of the framework, but when I look at that sample language and see that "pluggable compiler module in case you need to convert to another language", I just shake my head. Most folks simply don't need that much capability. Most folks don't need a variety of target languages.
    I 100% agree. I'm not sure where the quote comes from, but I think it's from the Pragmatic book, isn't it?
  4. In other news....

    Compilers compile code. Really, what is the point of this article? To show that you can write code generators?
  5. Re: In other news....

    DSLs with code generation are much better at abstracting the real world than manual coding using popular frameworks (although popular frameworks can be used with the first approach too). Just as high-level programming languages raise abstraction above assembly, DSLs raise abstraction another step higher. The problem is, however, that DSLs have to be designed and code generators written, which is not an easy task, or at least not many people are experienced in this.
  6. Re: In other news....

    The problem is, however, that DSLs have to be designed and code generators written, which is not an easy task, or at least not many people are experienced in this.
    True, except that most DSLs aren't actually "designed", they're "grown". And it shows. A question that should be answered by anybody thinking of creating a Domain Specific Language is "why is a Domain Specific Language more appropriate than a Domain Specific Library in a standard language?". The answer should discuss the ability for enhancement/bugfixes by someone other than the original author, plus cost of training and documentation.
  7. DSL -> High cost?

    I don't think it would be all that hard to design a simple DSL and use some tools to implement it. I would expect the following problems with this:

    maintenance: now we need to maintain this language. Perhaps the author has a strong grasp on it, but what about everyone else?

    training: everyone who uses this has to learn what it can and cannot do.

    support: when users have trouble, who do they go to?

    bugs: there are surely going to be problems with the language design, and as it evolves there will always be more.

    I'm not saying that it should never be undertaken, but I don't think it should be taken lightly or done because it was the cool thing to do. There need to be some clear benefits that will really be realized before this is undertaken.

    Does anyone with experience creating DSLs have any anecdotes on this? Am I all wet or does what I'm saying here make sense?
  8. Re: DSL -> High cost?

    All of these points are very valid. DSLs are much more complicated for general use than a library. The primary reason is that the DSL has different semantics than the "normal" language. If it didn't, then there would not be a lot of point in creating the DSL in the first place.

    Simple DSLs are little more than high-level templating languages, so they don't have to be that complicated. They tend to be pretty deterministic in what they generate, and the end result is usually some blob of code that someone wants, and a coder can validate that easily. They key up their template, run the code generator, and can look at both and see how X in the template becomes Y in the generated code.

    But if you're doing a higher-level scripting system, you need to carefully communicate the object model that the language is representing. In a scripting language, you tend to not have a direct result for someone to look at. They key in their code and "execute" it, without any real transparency of the underlying runtime. You tend to need fairly detailed knowledge of that opaque runtime to code in the scripting language productively, particularly when problems occur.

    With an object library, this is less of an issue -- particularly if you have source code available. At worst, when something weird happens, you can step through it with a debugger to get the nuances of a library's operation.

    The other mistake some folks make is they think that with a DSL that "non-coders" can actually use the DSL. But since a DSL is designed for automation of an underlying object model, writing in a DSL is just like writing in any language -- whether it's DOS .BAT files or Java -- and it takes a coding mindset and an understanding of the underlying domain as represented by the computer in order to make effective use of it. And there's the rub. Most knowledge workers understand their domain in terms of the real world, not necessarily how the computer represents it -- and no doubt the two will be different, with the computer having a limited subset, and being picky, stubborn and terse in its enforcement of that subset.

    Now some DSLs are simply data, and not code. Simply sophisticated markup. That's much easier for most knowledge workers to deal with -- it's simply a formatting problem of taking stuff they know and putting it into a format the computer understands through the DSL. But in this case, these folks aren't "coding", they're simply doing data entry. Different problem entirely.

    Finally, as you mentioned, maintenance is a real issue. Generic object libraries tend to be easier for folks to pick up and maintain than a compiler, and not many folks have real compiler experience.

    So, basically, use DSLs with care. They take time to create and maintain. If you commit to the DSL and can leverage it extensively internally, then by all means work it. But with modern languages being more and more expressive, and with the simplicity and richness of XML (even with all of its flaws), you need to think long and hard before undertaking the task. Also, with the plethora of OSS scripting languages available that can be extended, it's much harder to justify creating one from scratch.
  9. Re: DSL -> High cost?

    The other mistake some folks make is they think that with a DSL that "non-coders" can actually use the DSL.
    Actually that's one of the better uses for a DS Language, unfortunately. Usually, it still fails to meet that objective. A good standard scripting language which can access only a limited API exposed by a DS Library is a better way to achieve the objective.
  10. Re: DSL -> High cost?

    A good standard scripting language which can access only a limited API exposed by a DS Library is a better way to achieve the objective.
    I agree. Any advice/references on how to accomplish this effectively and safely? For some reason I'm thinking of JavaScript, but any ideas are welcome.
  11. Re: DSL -> High cost?

    If you're happy with the expressiveness of JavaScript, it's a really good solution. Rhino works really well, is easy to get started with, and is sophisticated enough to do pretty much what you want with it. We use it for our home-built forms system, validation, and whatnot. I don't particularly care for it from an interactive point of view. JS is too wordy and syntax-heavy for interactive use. It's also a full-blown programming language, so it's not particularly "easy to use" for a domain; you can't make your own syntax and such, for example. But if you want to be able to store "code" in a database, say, then JS is wonderful as it's pretty Java-like and interfaces really easily with Java code. I haven't tried it yet, but if you simply need an Expression Language, then you may want to consider embedding the Java EL. That's nice for things like simple validation expressions or calculated values, but not so much for actual logic.
  12. Re: DSL -> High cost?

    ...but if you simply need an Expression Language, then you may want to consider embedding the Java EL...
    OGNL or any scripting language (Jython, BeanShell, etc.) is way better than EL, in my opinion.
  13. Re: DSL -> High cost?

    ...but if you simply need an Expression Language, then you may want to consider embedding the Java EL...

    OGNL or any scripting language (Jython, BeanShell, etc.) is way better than EL, in my opinion.
    What I am thinking of is Jython because (to me) it's very easy to read and understand for most people, at least for imperative code. It looks like pseudocode. What I would like to have is a restricted version. Of course, now that I start thinking about what I want to restrict, I see how. It's easy to get rid of the Python standard libs in Jython: just delete them. I imagine I would only want bodies of methods to be allowed... that's easy. Check for and disallow the global keyword. Anything else that might be dangerous?
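    One way to sketch the "check for and disallow" idea is to walk the script's syntax tree before running it. The example below uses the `ast` module of CPython 3; whether the same code carries over to a given Jython release is an assumption, and the list of forbidden constructs and the names `check_script` and `run_script` are illustrative only. Filtering like this reduces accidents but should not be mistaken for a security sandbox.

```python
import ast

# Constructs a user script may not contain. This list is an illustrative
# assumption, not a complete security policy.
FORBIDDEN = (ast.Global, ast.Nonlocal, ast.Import, ast.ImportFrom,
             ast.ClassDef, ast.FunctionDef, ast.Lambda)


def check_script(source):
    """Return a list of policy violations found in the script source."""
    problems = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, FORBIDDEN):
            kind = type(node).__name__
            problems.append(f"line {node.lineno}: {kind} is not allowed")
        # Double-underscore names and attributes reach interpreter internals.
        elif isinstance(node, ast.Name) and node.id.startswith("__"):
            problems.append(f"line {node.lineno}: name {node.id!r} is not allowed")
        elif isinstance(node, ast.Attribute) and node.attr.startswith("__"):
            problems.append(f"line {node.lineno}: attribute {node.attr!r} is not allowed")
    return problems


def run_script(source, api):
    """Run a checked script that can see only the names passed in `api`."""
    problems = check_script(source)
    if problems:
        raise ValueError("; ".join(problems))
    scope = {"__builtins__": {}, **api}  # no built-in functions, only the API
    exec(compile(source, "<script>", "exec"), scope)
    return scope


result = run_script("total = price * qty", {"price": 3, "qty": 4})
print(result["total"])
```

    The script only sees what the host passes in through `api`, which is the "limited API exposed by a DS Library" idea from earlier in the thread.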
  14. Re: DSL -> High cost?

    The other mistake some folks make is they think that with a DSL that "non-coders" can actually use the DSL.

    Actually that's one of the better uses for a DS Language, unfortunately. Usually, it still fails to meet that objective.
    It's a fine goal, but in the end, a DSL is still a computer language. While syntax is a barrier to many folks, syntax can easily be learned by rote and example if nothing else. But beneath it all is the computer, the most aggravating, nit-picky, detail-oriented, OCD beast on the planet. That's the barrier that has to be overcome, and it's easier said than done.

    For simple automation, no big deal, particularly if they do the tasks by hand anyway. Simple DOS .BAT files are an easy example. Do this, this, and then this. The object model there is pretty much the file system, and it's a pretty simple one. Folks need to master the file system to do anything anyway, and then combine that with arbitrary "commands" that, essentially, make new files or delete files.

    But add in a richer data model, and try to keep track of the interrelationships of the model, and things get complicated and snowball very quickly. So, we end up with menu systems, F keys, and whatnot. That becomes our "DSL". Users master these relatively quickly because the model is so transparent. Consider the old Lotus 1-2-3 ring menu system. Very few "hidden" commands there. Your DSL becomes a string of one-letter commands.

    It turns out that for many "advanced" processes, the domains are simple enough to take a very structured and limited set of data for inputs, and then rather than creating a DSL, we simply create an input screen and let users key the data in. In time we find that while business processes are routine, we automate only so much, the rest needing manual intervention, and that becomes our user interface.

    So, most tasks simply don't require the expressiveness of what most would consider a DSL (since I don't think anyone considers an entry screen one).
  15. Re: DSL -> High cost?

    I don't think it would be all that hard to design a simple DSL and use some tools to implement it. I would expect the following problems with this:

    maintenance: now we need to maintain this language. Perhaps the author has a strong grasp on it, but what about everyone else?

    training: everyone who uses this has to learn what it can and cannot do.

    support: when users have trouble, who do they go to?

    bugs: there are surely going to be problems with the language design, and as it evolves there will always be more.

    I'm not saying that it should never be undertaken but I don't think it should be taken lightly or done because it was the cool thing to do. There need to be some clear benefits that will really be realized before this is undertaken.

    Does anyone with experience creating DSLs have any anecdotes on this? Am I all wet or does what I'm saying here make sense?
    I don't think you're all wet. I have had bad experiences a couple of times, with a home-grown database access query language and a 'scripting language' which tried to make everything look like SQL (2 different companies). All the drawbacks you mention are very real, but apparently at the time these were developed it sounded like a good idea. The other thing was that in each case it was sold as a relatively easy and quick thing to do. I'd have to agree with you that there would have to be some clear benefits shown before something like this was started. I would argue for a hard look into the future as well - both maintenance and hiring become more difficult if things like this become a significant part of your environment.
  16. DSLs are not that hard to implement

    I don't think it's so hard to implement a custom DSL. The article at http://rrusin.blogspot.com/2010/01/xquery4j-in-action.html shows a simple way to do it in XQuery and Java. If you select XML as your AST, then you don't need any compiler-compiler tool for parsing, and you can interpret it easily in XQuery. You can also easily bind Java methods to it and do whatever you want with your constructs. So it's a complete tool for compact DSL implementations.
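    The idea that an XML document can serve directly as the AST can be sketched without XQuery. The example below uses Python's standard `xml.etree.ElementTree` in place of the XQuery/Java setup from the linked article; the element names (`add`, `mul`, `num`, `var`) are invented for illustration and are not taken from that article.

```python
import xml.etree.ElementTree as ET

# A tiny expression "program" whose XML structure is already the AST.
# The element vocabulary is hypothetical.
PROGRAM = """<add>
  <num value="2"/>
  <mul>
    <num value="3"/>
    <var name="x"/>
  </mul>
</add>"""


def evaluate(node, env):
    """Interpret one XML element as an arithmetic expression."""
    if node.tag == "num":
        return float(node.get("value"))
    if node.tag == "var":
        return env[node.get("name")]
    # Every other element is an operator applied to its child elements.
    args = [evaluate(child, env) for child in node]
    if node.tag == "add":
        return sum(args)
    if node.tag == "mul":
        result = 1.0
        for arg in args:
            result *= arg
        return result
    raise ValueError(f"unknown element <{node.tag}>")


root = ET.fromstring(PROGRAM)     # the XML parser does all the parsing work
print(evaluate(root, {"x": 4}))   # evaluates 2 + 3 * x
```

    No grammar or parser generator appears anywhere, which is the trade the post describes: you accept XML's verbosity in exchange for free parsing.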
  17. Re: In other news....

    Compilers compile code.

    Really, what is the point of this article? To show that you can write code generators?
    Read the article, and you will know that it is not about 'that' but *how* you can write code generators.
  18. There are also additional simple options for building code generators. For example:

    - use XML for textual representation of DSL,
    - define DSL with XML schema,
    - use XML schema-aware XML editor for editing your DSL. This way you get DSL editor with basic support for code completion for free,
    - use XSLT to present your DSL to business people,
    - use XML parser to parse XML textual representation of your DSL into AST object model,
    - use MVC pattern for code generator,
    - have two-layered Model. Extend AST object model with additional methods/attributes that are not part of DSL but are required to support code generation, instead of putting code generation logic in templates,
    - use Velocity or StringTemplate for View templates,
    - your code generator core is the Controller.
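    The recipe above can be sketched end to end in a few dozen lines. This is only an illustration under stated assumptions: Python's `string.Template` stands in for Velocity or StringTemplate, the XML vocabulary (`entity`, `field`) is invented, and all class and function names are hypothetical.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from string import Template

# Hypothetical DSL document; element and attribute names are invented here.
DSL = """<entity name="customer">
  <field name="first_name" type="string"/>
  <field name="age" type="int"/>
</entity>"""


# Model, layer 1: plain AST classes that mirror the DSL and nothing else.
@dataclass
class Field:
    name: str
    type: str


@dataclass
class Entity:
    name: str
    fields: list


def parse(xml_text):
    root = ET.fromstring(xml_text)
    fields = [Field(f.get("name"), f.get("type")) for f in root.findall("field")]
    return Entity(root.get("name"), fields)


JAVA_TYPES = {"string": "String", "int": "int"}


def camel(name, upper_first=False):
    head, *tail = name.split("_")
    first = head.capitalize() if upper_first else head
    return first + "".join(part.capitalize() for part in tail)


# Model, layer 2: generator-specific attributes, kept out of the templates.
class JavaField:
    def __init__(self, field):
        self.java_type = JAVA_TYPES[field.type]
        self.java_name = camel(field.name)


class JavaClass:
    def __init__(self, entity):
        self.class_name = camel(entity.name, upper_first=True)
        self.fields = [JavaField(f) for f in entity.fields]


# View: templates contain placeholders only, no logic.
FIELD_VIEW = Template("    private $java_type $java_name;")
CLASS_VIEW = Template("public class $class_name {\n$members\n}")


# Controller: parse the DSL, build the model, render the views.
def generate(xml_text):
    model = JavaClass(parse(xml_text))
    members = "\n".join(FIELD_VIEW.substitute(vars(f)) for f in model.fields)
    return CLASS_VIEW.substitute(class_name=model.class_name, members=members)


print(generate(DSL))
```

    Name mangling and type mapping live in the second model layer, so the templates stay trivial and a second target language would only need its own wrapper classes and templates.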
  19. - have two-layered Model. Extend AST object model with additional methods/attributes that are not part of DSL but are required to support code generation, instead of putting code generation logic in templates
    - your code generator core is the Controller
    Very valuable advice. About the first point, I came to the same conclusion some time ago when designing my little code generator, but never stated it to myself as clearly as you did here. I realized code generation is just another aspect of my object model. By the way, I started with an XML schema, but after "eating my own food" for some time I decided to go with a fluent interface (http://martinfowler.com/bliki/FluentInterface.html) instead. I am using http://airspeed.sanityinc.com and Jython to generate code and config artifacts, and I am already speculating about generating behaviour "on the fly" with Jython, like Rails does.
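    For readers unfamiliar with the term, a fluent interface lets the model be written as chained method calls in the host language. The sketch below is a generic illustration of the pattern, not the poster's actual generator; `entity`, `field`, and `build` are hypothetical names.

```python
class EntityBuilder:
    """Collects an entity description through chained method calls."""

    def __init__(self, name):
        self._name = name
        self._fields = []

    def field(self, name, type_):
        self._fields.append((name, type_))
        return self  # returning self is what makes the calls chainable

    def build(self):
        return {"name": self._name, "fields": list(self._fields)}


def entity(name):
    """Entry point of the DSL, so a model reads as one sentence."""
    return EntityBuilder(name)


model = entity("Customer").field("id", "int").field("name", "string").build()
print(model)
```

    Because the DSL is ordinary host-language code, an IDE can complete `field` and `build` without any schema, at the cost of a notation that is harder to show to non-programmers.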
  20. Hi all, there is a lot of existing work on writing code generators in Model Driven Architecture. The Eclipse Modeling Framework provides good tools for doing things like this.
  21. The Eclipse Modeling Framework provides good tools for doing things like this.
    openArchitectureWare is a part of the Eclipse Modeling Project. Under the hood the Eclipse Modeling Framework is used. http://www.eclipse.org/gmt/oaw
  22. By the way, I started with an XML schema, but after "eating my own food" for some time I decided to go with a fluent interface (http://martinfowler.com/bliki/FluentInterface.html) instead.
    This way you have free code completion for your DSL too! But then your DSL is harder to present to business people, meaning that you need to write a 'projecting' viewer for your DSL.
  23. .... FluentInterface.html ...
    Yep, DSL-like text can be written in a general-purpose language.

    This way you have free code completion for your DSL too!
    But then your DSL is harder to present to business people, meaning that you need to write a 'projecting' viewer for your DSL.
    Not all kinds of completion can be achieved in an XML- and schema-aware editor. For example, it cannot suggest suitable values based on already defined nodes. And besides all that, XML looks horrible even if it is understandable. Personally I think a DSL accompanied by something like MPS is much more promising. http://www.martinfowler.com/articles/mpsAgree.html PS: DSL smells of telecom and wires; I think DSLang (dee-slang) could be a better acronym that reflects the essence of this beast: domain specific slang, not a full-blown language.
  24. There are also additional simple options for building code generators. For example:

    - use XML for textual representation of DSL,
    - define DSL with XML schema,
    - use XML schema-aware XML editor for editing your DSL. This way you get DSL editor with basic support for code completion for free,
    - use XSLT to present your DSL to business people,
    - use XML parser to parse XML textual representation of your DSL into AST object model,
    - use MVC pattern for code generator,
    - have two-layered Model. Extend AST object model with additional methods/attributes that are not part of DSL but are required to support code generation, instead of putting code generation logic in templates,
    - use Velocity or StringTemplate for View templates,
    - your code generator core is the Controller.
    All that XML. Sounds like you really want Lisp.
  25. There are also additional simple options for building code generators. For example:

    - use XML for textual representation of DSL,
    - define DSL with XML schema,
    - use XML schema-aware XML editor for editing your DSL. This way you get DSL editor with basic support for code completion for free,
    - use XSLT to present your DSL to business people,
    - use XML parser to parse XML textual representation of your DSL into AST object model,
    - use MVC pattern for code generator,
    - have two-layered Model. Extend AST object model with additional methods/attributes that are not part of DSL but are required to support code generation, instead of putting code generation logic in templates,
    - use Velocity or StringTemplate for View templates,
    - your code generator core is the Controller.


    All that XML. Sounds like you really want Lisp.
    XML is here just because there are currently no simpler and better tools available (at least I am not aware of any). XML, XSD, and XSLT are widespread and the tools are mature. Yes, the textual representation of a DSL in XML can be ugly, but it's still relatively readable, and you have XSLT to make it look pretty. XML does not play the central role in this recommendation (although it is mentioned the most); the AST object model and MVC for the code generator (with a two-layered model and a templating engine) do.
  26. - use XML for textual representation of DSL,
    - define DSL with XML schema,
    - use XML schema-aware XML editor for editing your DSL. This way you get DSL editor with basic support for code completion for free,
    Of course you can use XML with the openArchitectureWare languages, too, but with Xtext you can also define the concrete syntax of your DSL. Just write a grammar (which is much simpler and more abstract than ANTLR or JavaCC grammars, and less verbose than XML Schema) and you get:
    - a textual DSL
    - an EMF-based metamodel (representing the abstract syntax)
    - a parser creating instances of that metamodel
    - an editor providing
      - syntax highlighting
      - syntax checking
      - code completion
      - an outline view
      - semantic checking based on simple declarative constraints (you don't have those in a generic XML editor)
    So you get a bunch of really useful tools for a few lines of grammar code. Anyway, if there are circumstances where you want to use XML, just do so.
    - have two-layered Model. Extend AST object model with additional methods/attributes that are not part of DSL but are required to support code generation, instead of putting code generation logic in templates,
    openArchitectureWare has support for so-called extensions (like the extension methods introduced with C# 3.0). With extensions you can add operations to your AST types (we say metamodel) in a noninvasive manner. So you don't need to pollute your AST model with target-platform-dependent code.
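    The noninvasive idea can be approximated in a general-purpose language. The sketch below uses Python's `functools.singledispatch` to attach a generator-specific operation to model classes from the outside; it is an analogy only and does not show how openArchitectureWare implements extensions. The model classes and `java_declaration` are hypothetical names.

```python
from dataclasses import dataclass
from functools import singledispatch


# Platform-neutral metamodel: knows nothing about Java.
@dataclass
class Attribute:
    name: str
    type: str


@dataclass
class Entity:
    name: str
    attributes: list


# The "extension": an operation defined outside the model classes and
# selected by the runtime type of its first argument.
@singledispatch
def java_declaration(element):
    raise TypeError(f"no Java mapping for {type(element).__name__}")


@java_declaration.register(Entity)
def _entity_declaration(entity):
    return "public class " + entity.name[0].upper() + entity.name[1:]


@java_declaration.register(Attribute)
def _attribute_declaration(attribute):
    return f"private {attribute.type} {attribute.name};"


customer = Entity("customer", [Attribute("id", "int")])
print(java_declaration(customer))
for attribute in customer.attributes:
    print(java_declaration(attribute))
```

    The model classes stay free of target-platform code, so a generator for another platform could register its own function family against the same classes.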
    - use Velocity or StringTemplate for View templates,
    We know most of the alternative template languages, but we need the following three things:
    1) static typing. We never needed to do metaprogramming within a generator, but we always have custom metamodels at hand. So it's very nice to have a template editor which knows the metamodel and provides completion proposals and static type checking.
    2) higher-order functions. Within code generators you typically query information from an AST (or a model), which is a tree (or a graph). I need to write something like 'myClass.allFunctions.select(f|f.isAbstract)'.
    3) support for an extension mechanism (see above).
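    To make the higher-order-function point concrete, here is a rough Python counterpart of the quoted query 'myClass.allFunctions.select(f|f.isAbstract)'. The model classes are hypothetical, and the reading of `allFunctions` as "own plus inherited functions" is an assumption, not something the post states.

```python
from dataclasses import dataclass, field


@dataclass
class Function:
    name: str
    is_abstract: bool = False


@dataclass
class Class:
    name: str
    functions: list = field(default_factory=list)
    superclass: "Class" = None

    def all_functions(self):
        """Own functions plus inherited ones (assumed meaning of allFunctions)."""
        inherited = self.superclass.all_functions() if self.superclass else []
        return inherited + self.functions


def select(items, predicate):
    """Keep the items for which the predicate returns True."""
    return [item for item in items if predicate(item)]


shape = Class("Shape", [Function("area", is_abstract=True), Function("describe")])
circle = Class("Circle", [Function("radius")], superclass=shape)

abstract = select(circle.all_functions(), lambda f: f.is_abstract)
print([f.name for f in abstract])
```

    The predicate is passed in as a function, so the same `select` serves any query over the model; that is what a template language without higher-order functions makes awkward.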
  27. Did you take a look at Acceleo? http://www.acceleo.org/pages/demo-flash/en
  28. Did you take a look at Acceleo?
    Yes. There are screencasts for openArchitectureWare, too: http://www.openarchitectureware.org/screencasts/externaldsl_part1.htm http://www.openarchitectureware.org/screencasts/easygendev_part1.htm
  29. DSL such as Perl

    Excuse me. When did Perl become a DSL?
  30. Interesting MDA framework

    I recently evaluated openArchitectureWare by writing a code generator based on UML. It seems to be very good and flexible. However, it would be better if the development team wrote some example templates as starting points for users (not only those presented in the tutorials).
  31. Technical Remark

    To parse a line such as
        F id int
    the rule should be
        Field : "F" name=ID type=Type;
    instead of (see the screenshot in the "Generating a parser" paragraph)
        Field : "F" type=Type name=ID;
    Right? I found the approach very interesting and useful, but IMHO it is not a trivial task for people who are not 'core' Java people to use it.