
HTML to PDF conversion



kbaum
Nov 11th, 2004, 07:48 AM
My front end designers created a great looking html front end. This front end is pretty complex containing lots of images and CSS. One of my project requirements is to be able to export this front end to PDF. The PDF has to be pretty much an exact copy of the HTML version. I was looking at the AbstractPdfView examples and they seem to be pretty manual. I doubt I could manually recreate this great looking html front end using itext, although it's a great tool. If I already have the finished HTML, is there an easy way to create a PDF version? Thx.

-karl

Rod Johnson
Nov 11th, 2004, 07:53 AM
Good question. Spring's PDF support isn't designed to do this. But there must be a library out there somewhere...

snpe
Nov 11th, 2004, 09:28 AM
There is an htmltopdf library, but it doesn't work with complex HTML.

regards

cmavilio
Dec 20th, 2004, 03:01 AM
Hello kbaum,
I'd like to know where you found the AbstractPdfView example, because so far I've only found the API docs at this link: http://www.springframework.org/docs/api/org/springframework/web/servlet/view/document/AbstractPdfView.html.
Thanks in advance,
Claudia

... I was looking at the AbstractPdfView examples and they seem to be pretty manual. ...
-karl

gmatthews
Dec 20th, 2004, 04:09 PM
Can't think of a magic answer to this, but if you used the Apache FOP toolkit, you might do the following:

1. Translate your CSS into an importable XML document containing xsl:attribute-set elements. This would then mirror your existing style.css.

e.g.

myStyle { font-weight: bold; }

vs

<xsl:attribute-set name="myStyle">
<xsl:attribute name="font-weight">bold</xsl:attribute>
</xsl:attribute-set>


2. Maybe use some front end technology that allows you to get at the rendered HTML, i.e. JSP is likely out, but Velocity might work since you could programmatically render the HTML into some buffer so you could subsequently process it into an XSL-FO document, and then into a PDF. You may even have to clean up the rendered HTML before trying to convert it into an XSL-FO document, i.e. make it well-formed and handle special characters.

If you can't find a magic answer, then depending on how much XSLT/PDF building you've done, I'd also suggest you use the Apache FOP toolkit to build a single PDF document first, so you fully understand the gap between laying out an HTML page and laying out a PDF document. That way you can be sure it isn't simply easier to build each PDF separately/manually from its HTML version.
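
Steps 1 and 2 above can be roughly sketched with the XSLT engine that ships with the JDK. This is only an illustration under assumptions, not a complete converter: the class name, the tiny stylesheet, and the sample markup are made up, the input is assumed to be well-formed and to carry no XHTML namespace, and the resulting XSL-FO would still have to be handed to FOP to get a PDF.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class XhtmlToFoSketch {

    // Hypothetical stylesheet. The attribute-set mirrors the CSS rule
    // "myStyle { font-weight: bold; }" and is applied to <p class="myStyle">.
    static final String XSLT =
        "<xsl:stylesheet version='1.0'"
      + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'"
      + " xmlns:fo='http://www.w3.org/1999/XSL/Format'>"
      + "<xsl:output method='xml' omit-xml-declaration='yes'/>"
      + "<xsl:attribute-set name='myStyle'>"
      + "<xsl:attribute name='font-weight'>bold</xsl:attribute>"
      + "</xsl:attribute-set>"
      + "<xsl:template match='/'>"
      + "<fo:root><fo:layout-master-set>"
      + "<fo:simple-page-master master-name='page'><fo:region-body/>"
      + "</fo:simple-page-master></fo:layout-master-set>"
      + "<fo:page-sequence master-reference='page'>"
      + "<fo:flow flow-name='xsl-region-body'>"
      + "<xsl:apply-templates select='html/body/p'/>"
      + "</fo:flow></fo:page-sequence></fo:root>"
      + "</xsl:template>"
      // Styled paragraphs pick up every attribute in the attribute-set.
      + "<xsl:template match='p[@class=\"myStyle\"]'>"
      + "<fo:block xsl:use-attribute-sets='myStyle'>"
      + "<xsl:value-of select='.'/></fo:block>"
      + "</xsl:template>"
      // All other paragraphs become plain blocks.
      + "<xsl:template match='p'>"
      + "<fo:block><xsl:value-of select='.'/></fo:block>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Transforms well-formed (namespace-free) XHTML into an XSL-FO string.
    static String toFo(String xhtml) {
        try {
            Transformer t = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(XSLT)));
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(xhtml)),
                        new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("XSLT transform failed", e);
        }
    }

    public static void main(String[] args) {
        // In a real app this string would be the buffer rendered by e.g. Velocity.
        String html = "<html><body><p class='myStyle'>Total</p>"
                    + "<p>plain</p></body></html>";
        System.out.println(toFo(html));
    }
}
```

Keeping the attribute-sets in their own importable stylesheet, as suggested in step 1, would let the FO styling evolve alongside style.css instead of being scattered through the templates.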

joe
Aug 15th, 2006, 09:33 AM
Hey karl!

not sure if you are still looking, but:


I can't vouch for any of these, but check out the links on:

http://www.w3.org/Style/CSS/


There are a bunch (including a servlet) that should work (one even passes the Acid test).

farrellr
Aug 15th, 2006, 09:41 AM
Perhaps you are only looking for free software and/or software integrated with your app, but I thought I'd mention that the Adobe Acrobat product really is great for this kind of thing. I've used it before to create a PDF copy of a site I wanted to preserve: you basically point it to a starting page, tell it how many levels to retrieve, and it creates a linked PDF. Of course, you need more than the Reader to do this.
Anyway, just thought I'd mention it.

good123
Mar 5th, 2007, 02:32 AM
PDF to HTML (http://www.tomdownload.com/business_finance/word_processing/pdf2html.htm) - Converts PDF files to HTML files while seeking to preserve the original page layout (as best as technically possible).

Nik
Mar 6th, 2007, 12:32 AM
If you are at the XHTML level (e.g. JSF/Facelets), then the integration from there on could be done as described here:

http://today.java.net/pub/a/today/2006/10/31/combine-facelets-and-flying-saucer-renderer.html

It is probably simple to integrate in other frameworks, too. You should be aware, though, that there is no 1:1 mapping between rendered HTML and e.g. PDF.

tatvamasi
Mar 6th, 2007, 08:42 AM
Given the choice five million times, I would use iText over anything else on any given day.

winnovative
May 22nd, 2007, 03:09 AM
Hi,

You can try the HTML to PDF converter from Winnovative Software at http://www.winnovative-software.com .

The HTML to PDF Converter Library for .NET can be used as a general purpose component for converting web pages and HTML code to PDF documents and images in any format or it can be used to easily create PDF reports directly from ASP.NET pages to benefit from the great power of standard ASP.NET databound server controls.

The converter can convert the web pages to PDF documents with selectable texts and objects or to PDF documents with an embedded image.

You can add footers and headers to the rendered document, set the page size, page orientation and PDF compression level. The library can be used from any .NET 2.0 application (windows forms, ASP.NET web sites or command line tools). When using the converter from ASP.NET web sites you can opt for the HTML to PDF Converter ASP.NET server control to benefit from the Visual Studio 2005 design time capabilities.

The .NET 2.0 libraries, ASP.NET server controls and the provided samples are royalty free. Version 2.3 adds the possibility to define custom page breaks in the HTML code. You simply put the :pb tag where you want to split the page.

Bharani
May 23rd, 2007, 02:07 AM
You can convert your HTML to PostScript using html2ps and then use Ghostscript to convert the PostScript to PDF.
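
A minimal sketch of that two-step pipeline driven from Java, assuming html2ps and Ghostscript's ps2pdf wrapper are installed and on the PATH. The class and file names are made up for illustration, and html2ps may need its own configuration for images and CSS, so treat this as a starting point only.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;

public class Html2PsPipelineSketch {

    // html2ps writes to stdout by default; -o sends the PostScript to a file.
    static List<String> html2psCommand(Path html, Path ps) {
        return List.of("html2ps", "-o", ps.toString(), html.toString());
    }

    // ps2pdf is the Ghostscript wrapper script: ps2pdf input.ps output.pdf
    static List<String> ps2pdfCommand(Path ps, Path pdf) {
        return List.of("ps2pdf", ps.toString(), pdf.toString());
    }

    // Runs one external command and fails loudly on a non-zero exit code.
    static void run(List<String> command) throws IOException, InterruptedException {
        Process process = new ProcessBuilder(command).inheritIO().start();
        int exit = process.waitFor();
        if (exit != 0) {
            throw new IOException(command.get(0) + " exited with code " + exit);
        }
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 2) {
            System.out.println("usage: Html2PsPipelineSketch <input.html> <output.pdf>");
            return;
        }
        Path html = Paths.get(args[0]);
        Path pdf = Paths.get(args[1]);
        Path ps = Paths.get(args[1] + ".ps"); // intermediate PostScript file

        run(html2psCommand(html, ps));
        run(ps2pdfCommand(ps, pdf));
    }
}
```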

lloyd.mcclendon
May 23rd, 2007, 11:10 PM
everyone hates pdfs. look into a tool to convert the page to an image. we use htmlsnapshot, works very well. much better than any pdf converter i found, pdf is such a crap format and they're always making it worse. adobe sucks minus photoshop.

Nik
May 24th, 2007, 12:22 AM
everyone hates pdfs. look into a tool to convert the page to an image. we use htmlsnapshot, works very well. much better than any pdf converter i found, pdf is such a crap format and they're always making it worse. adobe sucks minus photoshop.

Well... going OT, but PDF is not that bad. Sure, it has many features that most users won't need (could be called bloat), but the last time I checked, doing text search on an image wasn't that straightforward. Nor is print formatting an image of height 120000px for any number of printers.

HTML to PDF conversion is difficult because there is no true 1:1 mapping (most browsers don't even render the same HTML the same way), but that isn't the problem of the PDF format. There will be open (and openish) XML document formats gaining market share, but they too will have strengths and weaknesses.

leetgeezer
Jun 2nd, 2007, 02:12 PM
Recently I had the same problem. I haven't found a 100% satisfactory solution, but you can use Apache FOP, which gets the basic work done.
FOP needs a stylesheet for xhtml->fo conversion, grab it from here:
http://www.antennahouse.com/XSLsample/XSLsample.htm.

iText is great but it won't convert your html into pdf.
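
One practical note on the xhtml->fo route: an XSLT stylesheet can only process input that parses as XML, so it may help to check the rendered page first. A small sketch using only the JDK parser; the class name is made up, and it assumes the markup has no DOCTYPE pointing at an external DTD (otherwise the parser may try to fetch it).

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class WellFormedCheckSketch {

    // Returns true if the markup parses as XML, i.e. it is safe to hand
    // to an XSLT stylesheet; false on unclosed tags, undeclared entities, etc.
    static boolean isWellFormed(String markup) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            DocumentBuilder builder = factory.newDocumentBuilder();
            // DefaultHandler keeps parse errors off stderr; fatal errors still throw.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(markup)));
            return true;
        } catch (SAXException e) {
            return false; // not well-formed
        } catch (Exception e) {
            throw new IllegalStateException("could not set up or run the parser", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isWellFormed("<p>a<br/>b</p>"));   // XHTML-style empty tag
        System.out.println(isWellFormed("<p>a<br>b</p>"));    // HTML-style unclosed tag
        System.out.println(isWellFormed("<p>a&nbsp;b</p>"));  // entity undeclared without a DTD
    }
}
```

Pages that fail this check would need a cleanup pass (closing tags, replacing named entities such as &nbsp; with numeric ones) before the stylesheet can be applied.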

newyorkhcm
Sep 5th, 2009, 04:52 PM
Just get this program : PDF Creating v2.0

http://pdfcreating.com/images/pdf_03_03.jpg
PDF Creator (http://pdfcreating.com): create PDF (http://pdfcreating.com) documents easily

Download:
http://www.mediafire.com/download.php?dtnyt5zgez1

illbzo1
Jan 7th, 2013, 04:15 PM
If you're looking for high-quality HTML to PDF conversion, I'd recommend DocRaptor. DocRaptor is an HTML to PDF API created by my company, Expected Behavior. We use Prince XML for PDF generation, and provide much better CSS support than similar APIs.

HTML to PDF conversion (http://docraptor.com/html-to-pdf) with DocRaptor.

And a link to our API documentation:

DocRaptor API Documentation (http://docraptor.com/documentation/)

merrychen
Apr 8th, 2013, 03:38 AM
You can convert your HTML to PostScript using html2ps and then use Ghostscript to convert the PostScript to PDF. Use xflip (http://www.xflip.com/) to convert PDF to HTML or Flash SWF files easily.

well, all in all, you could read pdf files regardless of the computer system.

merrychen
Apr 8th, 2013, 09:47 PM
My front end designers created a great looking html front end. This front end is pretty complex containing lots of images and CSS. One of my project requirements is to be able to export this front end to PDF. The PDF has to be pretty much an exact copy of the HTML version. I was looking at the AbstractPdfView examples and they seem to be pretty manual. I doubt I could manually recreate this great looking html front end using itext, although it's a great tool. If I already have the finished HTML, is there an easy way to create a PDF version? Thx. well, use xflip (http://www.xflip.com/) to convert pdf to html easily

-karl

Well, try doing some research on Google; there are many tools available!

synodbio
Apr 9th, 2013, 12:50 AM
There are many tools for this, e.g. HTML to PDF converter, PDFonFly, PDFcrowd, etc.

CamilleM
Apr 19th, 2013, 01:00 AM
I use pdf sdk (http://www.rasteredge.com/how-to/csharp-imaging/pdf-converting/) in .NET to achieve conversion.

synodbio
May 8th, 2013, 01:32 AM
My front end designers created a great looking html front end. This front end is pretty complex containing lots of images and CSS. One of my project requirements is to be able to export this front end to PDF. The PDF has to be pretty much an exact copy of the HTML version. I was looking at the AbstractPdfView examples and they seem to be pretty manual. I doubt I could manually recreate this great looking html front end using itext, although it's a great tool. If I already have the finished HTML, is there an easy way to create a PDF version? Thx.

-karl

Total HTML Converter converts all types of HTML files. Even if a file contains invalid markup (for example, some tags are missing), the program processes it and fixes the mistakes automatically. The output PDF will be of the highest quality.