Discussions

News: Davisor Announces MS Word to PDF and HTML API

  1. Davisor has announced Davisor Publishor, a 100% Java API that supports MS Word DOC to PDF and HTML conversions. This allows a server-side programmer to design documents in MS Word and write custom data into them from Java, on any operating system and in various output formats. Publishor is available under three commercial licenses, depending on need. Publishor also provides a mail merge feature in which a Word document is used as a template. A DOC template with mail merge fields provides dynamic data positions that can be filled from different data sources to produce the final PDF or HTML document.
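To make the mail merge idea concrete, here is a minimal plain-Java sketch of filling named fields in a template from a data map. It is not Publishor's API, which the announcement does not show. The class name `MergeFieldSketch`, the method `fill`, and the «FieldName» placeholder syntax are all assumptions chosen for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only: this is NOT the Davisor Publishor API.
class MergeFieldSketch {
    // Assumed placeholder syntax: a field name between guillemets.
    private static final Pattern FIELD = Pattern.compile("\u00AB(\\w+)\u00BB");

    // Replaces every known field with its value from the data map.
    static String fill(String template, Map<String, String> data) {
        Matcher m = FIELD.matcher(template);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String value = data.get(m.group(1));
            // Unknown fields are left untouched so missing data is easy to spot.
            String replacement = (value != null) ? value : m.group();
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, String> data = new HashMap<>();
        data.put("FirstName", "Ada");
        System.out.println(fill("Dear \u00ABFirstName\u00BB, your order has shipped.", data));
    }
}
```

A real product works on the Word document structure rather than on a flat string, so treat this only as a picture of "template plus data source gives final document".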
  2. Interesting product(s). I don't know if there are equivalent open source tools for this type of function.
  3. Apache POI...

    Interesting product(s).

    I don't know if there are equivalent open source tools for this type of function.
    There's http://jakarta.apache.org/poi/, but it's not complete yet. Snowbound has a commercial product, but its Word conversion isn't complete yet either. I already grabbed a demo copy, so I guess I'll give it a whirl and see how it does.
  4. Interesting product(s).

    I don't know if there are equivalent open source tools for this type of function.
    It is possible to emulate this using free tools, though it is not easy. MS provides a free XSLT transformer to convert from Word 2003 XML format to XSL-FO. By doctoring the XSL they provide, you can get it to produce not XSL-FO but a secondary XSLT that, when run against your data, will embed that data in the required XSL-FO document. Then run the Apache FOP tool to generate the PDF. Effectively, MS Word becomes your report design tool. A commercial offering that makes this easy is probably worth it if you have Word->PDF requirements.

    Also see Windward Reports.

    IMO, with standard MS XML doc formats coming along, I can see more OSS tools in this area in the future, so commercial products offering only basic functionality may have a limited lifetime.
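The "XSLT that produces a secondary XSLT" step can be sketched with the JAXP API that ships with the JDK. This is only an illustration under stated assumptions: the generator stylesheet, the `field` placeholder element, the single `fo:block` fragment standing in for a full XSL-FO document, and the class name `TwoStageXslt` are all made up for the example. It does not use Microsoft's actual transforms, and the final FOP step is left out.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class TwoStageXslt {
    // Stage 1 stylesheet (hypothetical): turns a template document into a stylesheet.
    // xsl:namespace-alias lets "out:" elements be written to the result as XSLT elements.
    static final String GENERATOR =
          "<xsl:stylesheet version='1.0'"
        + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'"
        + " xmlns:out='urn:example:generated-xslt'>"
        + "<xsl:namespace-alias stylesheet-prefix='out' result-prefix='xsl'/>"
        + "<xsl:template match='/'>"
        + "<out:stylesheet version='1.0'>"
        + "<out:template match='/'><xsl:apply-templates/></out:template>"
        + "</out:stylesheet>"
        + "</xsl:template>"
        // Each placeholder becomes an xsl:value-of in the generated stylesheet.
        + "<xsl:template match='field'>"
        + "<out:value-of select='{@select}'/>"
        + "</xsl:template>"
        // Everything else is copied through unchanged.
        + "<xsl:template match='@*|node()'>"
        + "<xsl:copy><xsl:apply-templates select='@*|node()'/></xsl:copy>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    // Stand-in for the layout designed in Word: one XSL-FO block with a placeholder.
    static final String TEMPLATE =
          "<fo:block xmlns:fo='http://www.w3.org/1999/XSL/Format'>"
        + "Dear <field select='/customer/name'/>,"
        + "</fo:block>";

    // Stand-in for the application data.
    static final String DATA = "<customer><name>Ada</name></customer>";

    // Applies a stylesheet to an XML document, both given as strings.
    static String transform(String xslt, String xml) {
        try {
            Transformer t = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(xslt)));
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("XSLT step failed", e);
        }
    }

    // Stage 1: template -> generated stylesheet. Stage 2: data -> filled fragment.
    static String render() {
        String generatedStylesheet = transform(GENERATOR, TEMPLATE);
        return transform(generatedStylesheet, DATA);
    }

    public static void main(String[] args) {
        System.out.println(render());
    }
}
```

In a real pipeline the result of the second stage would be a complete XSL-FO document handed to Apache FOP. The generated stylesheet can be cached and reused for every data set, which is the main appeal of splitting the work into two stages.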
  5. Interesting product(s).

    I don't know if there are equivalent open source tools for this type of function.

    It is possible to emulate this using free tools, though it is not easy. MS provides a free XSLT transformer to convert from Word 2003 XML format to XSL-FO.
    One problem with this is if you're trying to do conversions on a non-Windows box. If I can use a Windows environment for doc conversion, then I'm just going to do it in VB/C# with the Windows APIs and be done with it.
  6. Interesting product(s).

    I don't know if there are equivalent open source tools for this type of function.

    It is possible to emulate this using free tools, though it is not easy. MS provides a free XSLT transformer to convert from Word 2003 XML format to XSL-FO.

    One problem with this is if you're trying to do conversions on a non-Windows box. If I can use a Windows environment for doc conversion, then I'm just going to do it in VB/C# with the Windows APIs and be done with it.
    All that's provided in the download is a set of XSLT transforms and a .cmd file to run them. There's nothing to stop you from running the transforms using, e.g., Xalan on your platform of choice.
  7. All that's provided in the download is a set of XSLT transforms and a .cmd file to run them. There's nothing to stop you from running the transforms using, e.g., Xalan on your platform of choice.

    Cool, got it now. But now we would be doing an xml->xml->pdf transformation vs. a doc format -> pdf transformation? But as Office moves to a standard XML format, we won't have this problem, hopefully.
  8. All that's provided in the download is a set of XSLT transforms and a .cmd file to run them. There's nothing to stop you from running the transforms using, e.g., Xalan on your platform of choice.
    Cool, got it now. But now we would be doing an xml->xml->pdf transformation vs. a doc format -> pdf transformation?
    That's right. As I said, it's not easy, and a tool that simplifies it is worth it!
    But as Office moves to a standard XML format, we won't have this problem, hopefully.
    Here's hoping...
  9. Also see Windward Reports

    IMO, with standard MS XML doc formats coming along, I can see more OSS tools in this area in the future so commercial products only offering basic functionality may have a limited lifetime.
    ACK! Has anyone here actually USED the MS Office schemas? I've implemented an application which generates MS Visio XML documents and I can tell you it ain't pretty. Most of the time, Xalan can't cut it because of all the static application content it has to keep in memory, which MS puts into every single MS Visio document! I've ended up using other large text-based file processing technologies to append that crap to the resulting dynamic MS Visio XML content to make it work.

    Anyway, the MS XML schemas are a direct map of the old, bloated, patched MS binary document formats. They have not been redesigned or optimized for space, and they make any resulting document look obfuscated to the human eye, which defeats the whole raison d'être of XML!!!

    Just my 2 cents.

    No, MS XML is not the answer to all our woes... But it is better than the binary format we used to have.
  10. Also see Windward Reports

    IMO, with standard MS XML doc formats coming along, I can see more OSS tools in this area in the future so commercial products only offering basic functionality may have a limited lifetime.


    ACK! Has anyone here actually USED the MS Office schemas? I've implemented an application which generates MS Visio XML documents and I can tell you it ain't pretty and, most of the time, Xalan can't cut it because of all of the static application content it has to keep in memory: which MS puts into every single MS Visio document! I've ended up using other large text-based file processing technologies to append that crap to the resulting dynamic MS Visio XML content to make it work.

    Anyway, the MS XML schemas are a direct map of the old, bloated, patched MS binary document formats. They have not been redesigned or optimized for space, and they make any resulting document look obfuscated to the human eye, which defeats the whole raison d'être of XML!!!

    Just my 2 cents.

    No, MS XML is not the answer to all our woes... But it is better than the binary format we used to have.
    Are you referring to the Office 2003 XML format or the newer OpenXML format? The two are different. My impression was that the Word 2003 XML format seemed to be based on RTF, which was itself a nightmare. So we swap one nightmare for another. I am hoping the OpenXML format is a bit better thought out, though I haven't looked at it.
  11. One could use the Java API that comes with OpenOffice to do the same thing: MS Word to PDF. One can even programmatically get data from a backend DB, etc., to fill in the blanks in the document if there is a need. Why buy when you can do it in an "Open" manner :)
  12. Interesting product(s).

    I don't know if there are equivalent open source tools for this type of function.
    Start OpenOffice as a server process, and then use Java wrappers to talk to it. OO can then convert from Word to PDF, HTML, or OO. The HTML is very clean, which is useful if you want to parse it (which I do, for example). Works quite well.
  13. Links to how-tos?
  14. The "First Steps" section of the Developer's Guide provides good information on how to set this up: http://api.openoffice.org/docs/DevelopersGuide/FirstSteps/FirstSteps.xhtml
    I've been using the API to convert Word documents to XML, and to generate custom Word templates from XML.
  15. Open Office..

    Interesting product(s).

    I don't know if there are equivalent open source tools for this type of function.

    Start OpenOffice as a server process, and then use Java wrappers to talk to it. OO can then convert from Word to PDF, HTML, or OO. The HTML is very clean, which is useful if you want to parse it (which I do, for example). Works quite well.
    This might be the best free solution. I was poking at some of the OO APIs last year, and it looked like it could be done with some work. When you say 'Java wrappers', do you mean using JNI, or making command line calls from Java (which I hate doing...)?
  16. with OO

    import officetools.OfficeFile; // available at www.dancrintea.ro/doc-to-pdf/
    ...
    FileInputStream fis = new FileInputStream(new File("test.doc")); // works with xls also
    FileOutputStream fos = new FileOutputStream(new File("test.pdf"));
    OfficeFile f = new OfficeFile(fis,"localhost","8100", true);
    f.convert(fos,"pdf");

    All possible conversions:
    doc --> pdf, html, txt, rtf
    xls --> pdf, html, csv
    ppt --> pdf, swf
    html --> pdf
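The `OfficeFile` call above points at "localhost" and "8100", which suggests an OpenOffice instance already listening on that port. A minimal sketch of building the command line for such a server process follows. The class `OfficeServerCommand`, its `build` method, and the bare `soffice` executable name are assumptions for illustration. The `-headless` and `-accept` switches are the single-dash forms used by OpenOffice.org; LibreOffice uses a double-dash prefix instead.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: only assembles the command, it does not start OpenOffice.
class OfficeServerCommand {
    // Builds the argument list for an OpenOffice process that accepts
    // UNO connections on the given host and port.
    static List<String> build(String sofficePath, String host, int port) {
        List<String> command = new ArrayList<>();
        command.add(sofficePath);   // assumed to be on the PATH or an absolute path
        command.add("-headless");   // run without a user interface
        command.add("-accept=socket,host=" + host + ",port=" + port + ";urp;");
        return command;
    }

    public static void main(String[] args) {
        List<String> command = build("soffice", "localhost", 8100);
        System.out.println(String.join(" ", command));
        // To actually launch it: new ProcessBuilder(command).start();
    }
}
```

Passing the arguments as a list to `ProcessBuilder` avoids shell quoting problems with the semicolons in the `-accept` string. The Java client then talks to the running process over the socket, so neither JNI nor a command line call per conversion is involved.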
  17. Comparison with WindWard

    Currently we are using the Windward (http://www.windwardreports.com/) solution to generate documents (xls, pdf, doc, rtf, html, etc.) from a template (in RTF) and data. Previously we used a customization of Apache FOP and JFOR (http://sourceforge.net/projects/jfor), but it was outdated and performance was a real problem. Creating the template was really complicated, and end users were not even able to update it. That is why we now use Windward. Has anyone tried both solutions?

    Thanks,
    Michal