P~ 0.9 released, new Java-friendly scripting with novel regex


  1. P~ (pronounced "ptilde") is a new Java-friendly scripting language. The principal reason for creating it was to offer a new and more powerful approach to writing regular expressions. Unlike all other regex engines, P~ does not use the Perl-compatible metacharacter syntax; it uses an algebraic syntax for regex composition instead. This decision opens the door to side effects more powerful than those possible even in Perl, while preserving the readability and maintainability of P~ regexes. In other regex engines, regular expressions become harder to read as the difficulty of the problem increases. Not so in P~.
    While P~ makes it easy to grapple with matching and transformation problems that are hard even for Perl programmers, its basic grammar is Java-like, more so than even Groovy's. This means that Java programmers can quickly learn the basic grammar forms. P~ is also Java-friendly because you can import Java classes within your scripts and use their public APIs just as you would in your Java code. All you have to do is include the appropriate Java libraries (jar files) in the classpath when you launch the Ptilde scripting application.
    Finally, P~ is Java-friendly because its engine is a Java library. If you are a Java programmer with a tough matching or transformation problem, you can solve it first with a P~ script, using the standalone application shell and the novel P~ regex grammars; then make this script available to your Java application as either a file or a resource, and invoke it from your Java class. You can pass arguments to a scriptlet and return a result from it.
    If this sounds interesting, take a look at the home page for the documentation at http://ptilde.pbwiki.com. Start with the Tutorial, which guides you first through the basic grammar of Ptilde and then through the regex grammar forms.
    P~ is licensed under terms similar to Sun's JDK5 license: "you are free to use the product for any deployment purpose, including enterprise deployment, without paying a fee. You may also use the available Java front-end source code, including your own modifications, though not advised to do so... The one significant restriction is that you may not re-license the product on any terms. If you are building a development kit of some kind, and need to re-license this product, please contact us."
  2. This seems interesting, but I kept finding pages about how great this is instead of any useful information. I did find it eventually, but by that time I was a little annoyed and distracted. It's fine to point out your achievements, but put that in one place. It's not necessary to preface every section of the site with why this is a good tool. The proof of the pudding is in the eating.
  3. Thanks for the feedback. You've made a good point. We'll fix this.
  4. Maybe yes, maybe not. This case may be special: besides Java developers who use the native regex features, they are trying to convince even Perl users (or Java developers with some Perl skills, anyway) to use P~. To accomplish that, nothing is more natural than claiming its advantages everywhere in the docs. (By the way, Perl is really great and simply much better than Java for some problems; it took me almost two years to realize it.) Never mind that I had no idea what "automata" meant until today ;)
  5. Again, I think they should include their claims and put them up front. But once the reader has read the claims and (naturally) wants to investigate, he or she should be able to do so without skimming through the same claims again and again. I feel that the repetition might defeat the purpose. Consider the saying "the lady doth protest too much." If you toot your horn too often, it can make your claims seem dubious.
    Moving along, I find this tool very interesting because one of the challenges I face is executing multi-line regexes against very large files. I have a custom Java app that basically acts as a command-line wrapper around the Java regex libraries. It works great until the file gets very large, mainly because I am naively loading the entire stream into memory. I've considered adding a sliding-window feature to this tool, but the forward-only aspect of P~ makes me wonder whether it might address my needs.
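The sliding-window idea mentioned above can be sketched with the standard java.util.regex classes. This is only a rough illustration: it is not P~ code and not the poster's actual tool. The class name SlidingWindowMatcher and the parameters chunkSize and maxMatchLen are hypothetical. The sketch assumes that no single match is longer than maxMatchLen characters and that the pattern does not depend on text before a match (no lookbehind, no ^ anchors); under those assumptions it should report the same matches as running the pattern over the whole input in memory.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: matches a multi-line pattern against a stream
// without loading the whole stream into memory.
final class SlidingWindowMatcher {

    static List<String> findAll(Reader reader, Pattern pattern,
                                int chunkSize, int maxMatchLen) {
        List<String> matches = new ArrayList<>();
        StringBuilder window = new StringBuilder();
        char[] chunk = new char[chunkSize];
        boolean eof = false;

        while (!eof) {
            int n;
            try {
                n = reader.read(chunk);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
            if (n < 0) {
                eof = true;
            } else {
                window.append(chunk, 0, n);
            }

            // A match that starts before safeLimit cannot reach past the
            // end of the window (assumption: length <= maxMatchLen), so it
            // is safe to report now. Later matches wait for more input.
            int safeLimit = eof ? window.length()
                                : Math.max(0, window.length() - maxMatchLen);
            Matcher m = pattern.matcher(window);
            int consumed = 0;
            while (m.find() && (eof || m.start() < safeLimit)) {
                matches.add(m.group());
                consumed = m.end();
            }
            // Slide the window: drop reported matches and text that can no
            // longer be the start of an unreported match.
            window.delete(0, Math.max(consumed, safeLimit));
        }
        return matches;
    }

    public static void main(String[] args) {
        String input = "noise BEGIN a\nb END noise BEGIN c END tail";
        Pattern p = Pattern.compile("BEGIN.*?END", Pattern.DOTALL);
        // Tiny sizes force matches to straddle chunk boundaries.
        System.out.println(findAll(new StringReader(input), p, 8, 16));
    }
}
```

The window never holds more than about chunkSize + maxMatchLen characters. A greedy pattern such as BEGIN.*END with DOTALL can match arbitrarily long spans and would break the length assumption, which is the main limitation of this approach.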
  6. Is there any benchmark comparing P~ with Java regular expressions?
  7. The focus in developing the P~ regex grammar has been to enhance the Java programmer's ability to solve hard problems accurately, with readable, reusable regexes. In the simpler cases, where a document-level solution with Java regex is viable, we've often seen a negligible speed difference. We will post benchmarks for such problems. But as the problem gets more complex and the programmer wishes to handle all of the nuances of the document classification, typical use of Java regular expressions boils down to fine-grained regexes applied iteratively in a parser or state machine, whereas P~ can properly solve many such problems with one suitably composed document-level regex.
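For readers unfamiliar with the conventional style described above, here is a small sketch of fine-grained regexes applied line by line inside a state machine, using plain java.util.regex. The INI-like input format, the class name LineStateMachine, and the pattern names are assumptions chosen purely for illustration. The sketch does not show P~ syntax, which is covered in the project documentation.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical example of the "many small regexes plus a state machine"
// style: each pattern recognizes one kind of line, and the parser state
// (the current section) decides what a matched line means.
final class LineStateMachine {
    private static final Pattern BLANK_OR_COMMENT = Pattern.compile("\\s*(#.*)?");
    private static final Pattern SECTION = Pattern.compile("\\[(\\w+)\\]\\s*");
    private static final Pattern ENTRY = Pattern.compile("(\\w+)\\s*=\\s*(.*?)\\s*");

    static Map<String, Map<String, String>> parse(String document) {
        Map<String, Map<String, String>> sections = new LinkedHashMap<>();
        Map<String, String> current = null; // state: section being filled

        for (String line : document.split("\\R")) {
            if (BLANK_OR_COMMENT.matcher(line).matches()) {
                continue;
            }
            Matcher section = SECTION.matcher(line);
            Matcher entry = ENTRY.matcher(line);
            if (section.matches()) {
                // Transition: start (or reopen) a section.
                current = sections.computeIfAbsent(section.group(1),
                                                   k -> new LinkedHashMap<>());
            } else if (current != null && entry.matches()) {
                // key = value is only valid once a section is open.
                current.put(entry.group(1), entry.group(2));
            } else {
                throw new IllegalArgumentException("Unexpected line: " + line);
            }
        }
        return sections;
    }

    public static void main(String[] args) {
        String doc = "# demo\n[server]\nhost = example.org\nport=8080\n\n[client]\nretries = 3\n";
        System.out.println(parse(doc));
    }
}
```

Even for this simple format, the matching logic is spread across three patterns and the control flow around them, which is the kind of iterative structure the post contrasts with a single document-level regex.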