A PDF can be generated in the following ways:
· Apache FOP (Formatting Objects Processor)
It is a print formatter driven by XSL formatting objects (XSL-FO).
It reads XML and XSL-FO and renders the result to PDF.
· iText
It is a library that allows you to generate PDF files.
Comparison between Apache FOP and iText
Apache FOP:
FOP is based on the MVC pattern and uses the XSL-FO specification.
FOP's main objective is to convert XML to PDF using XSL-FO. iText, by comparison, has XML2PDF functionality as one of its features.
Apache FOP is known to be slow, which matters if you have to generate more than 1000 PDFs in a very short time.
You can keep the style sheet (XSL-FO or XSLT) outside your classes or packages and tell FOP to use it when rendering the PDF.
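The idea of keeping the style sheet outside your classes can be sketched with plain JAXP, without FOP on the classpath. This is only a minimal illustration under stated assumptions: the class name ExternalStylesheetSketch, its helper methods, and the tiny style sheet are hypothetical and not part of FOP. The style sheet is written to a temporary file purely to keep the sketch self-contained, and it emits a single fo:block fragment rather than a complete XSL-FO document. Compiling the file once into a Templates object and reusing it is one way to reduce per-document cost when many documents share a style sheet.

```java
import java.io.File;
import java.io.StringReader;
import java.io.StringWriter;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import javax.xml.transform.Templates;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class ExternalStylesheetSketch {

    // Hypothetical style sheet; in a real project this file would live outside the code.
    static File writeSampleStylesheet() throws Exception {
        String xsl =
            "<xsl:stylesheet version='1.0'"
          + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'"
          + " xmlns:fo='http://www.w3.org/1999/XSL/Format'>"
          + "<xsl:template match='/course'>"
          + "<fo:block><xsl:value-of select='title'/></fo:block>"
          + "</xsl:template>"
          + "</xsl:stylesheet>";
        File file = File.createTempFile("courseware", ".xsl");
        Files.write(file.toPath(), xsl.getBytes(StandardCharsets.UTF_8));
        return file;
    }

    // Compile the external style sheet once; the Templates object can be reused.
    static Templates load(File xslFile) throws Exception {
        return TransformerFactory.newInstance().newTemplates(new StreamSource(xslFile));
    }

    // Apply the compiled style sheet to one XML document and return the result as text.
    static String render(Templates templates, String xml) throws Exception {
        StringWriter out = new StringWriter();
        templates.newTransformer()
                 .transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        File xsl = writeSampleStylesheet();
        Templates templates = load(xsl);
        for (String title : new String[] {"Java", "XML"}) {
            System.out.println(render(templates, "<course><title>" + title + "</title></course>"));
        }
        xsl.delete();
    }
}
```

Changing the layout then means editing the .xsl file only; the Java code stays untouched.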
iText:
o iText is very reliable from a processing perspective and is fast at generating PDFs.
o It can post-process and manipulate existing PDF documents.
o It can encrypt your PDF files.
Why Apache FOP?
Use Apache FOP if you:
· have XML as data input.
· want a separate style sheet file that maintains the settings for the PDF.
· are not going to generate a very large number of PDFs in a very short time.
· want to show PDFs in a web browser using the Apache Forrest project, which uses FOP.
· do not want to encrypt PDF files.
· want an open source tool.
How does Apache FOP work?
XML + XSL --> XSL-FO --> PDF
XML --> Extensible Markup Language. XML was designed to transport and store data.
XSL --> Extensible Stylesheet Language. It is an XML-based style sheet language.
XSL-FO
· XSL-FO is a language for formatting XML data
· XSL-FO stands for Extensible Stylesheet Language Formatting Objects
· XSL-FO is a W3C Recommendation
· XSL-FO is now formally named XSL.
FOP takes XML and XSL as input. The XSLT step converts them to XSL-FO, and FOP generates the PDF from that XSL-FO, which is piped in as SAX events. A SAX parser is used internally for XML parsing.
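The first arrow of the pipeline (XML + XSL --> XSL-FO) can be tried with the JDK alone, before FOP is involved. The sketch below is an illustration under stated assumptions, not the author's code: the class name XmlToFoSketch, the sample XML, and the style sheet are hypothetical. It writes the generated XSL-FO to a string only so that it can be inspected; in production the FO should be piped straight into FOP instead of being buffered.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class XmlToFoSketch {

    // Hypothetical sample data, not taken from the article.
    static final String SAMPLE_XML =
        "<course><title>XSL-FO basics</title></course>";

    // Hypothetical style sheet: wraps the course title in a minimal XSL-FO page.
    static final String SAMPLE_XSL =
        "<xsl:stylesheet version='1.0'"
      + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'"
      + " xmlns:fo='http://www.w3.org/1999/XSL/Format'>"
      + "<xsl:template match='/course'>"
      + "<fo:root>"
      + "<fo:layout-master-set>"
      + "<fo:simple-page-master master-name='page'"
      + " page-height='297mm' page-width='210mm'>"
      + "<fo:region-body margin='20mm'/>"
      + "</fo:simple-page-master>"
      + "</fo:layout-master-set>"
      + "<fo:page-sequence master-reference='page'>"
      + "<fo:flow flow-name='xsl-region-body'>"
      + "<fo:block font-size='18pt'><xsl:value-of select='title'/></fo:block>"
      + "</fo:flow>"
      + "</fo:page-sequence>"
      + "</fo:root>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Runs the XSLT step only: XML + XSL in, XSL-FO text out.
    static String toFo(String xml, String xsl) throws Exception {
        Transformer transformer = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(xsl)));
        StringWriter fo = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(fo));
        return fo.toString();
    }

    public static void main(String[] args) throws Exception {
        // Prints the intermediate XSL-FO document that FOP would render to PDF.
        System.out.println(toFo(SAMPLE_XML, SAMPLE_XSL));
    }
}
```

Printing the intermediate FO like this is mainly useful for debugging a style sheet when the PDF does not look as expected.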
Requirements for code implementation:
Jars:
· fop.jar (available at http://xmlgraphics.apache.org/fop/download.html; Apache FOP is an open source project of the Apache Software Foundation)
· commons-logging 1.0.4 jar (Jakarta Commons Logging)
Stepwise Sample Code implementation snippet
import java.io.BufferedOutputStream;
import java.io.File;
import java.io.FileOutputStream;
import java.io.OutputStream;
import javax.xml.transform.Result;
import javax.xml.transform.Source;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXResult;
import javax.xml.transform.stream.StreamSource;
import org.apache.fop.apps.Fop;
import org.apache.fop.apps.FopFactory;
import org.apache.fop.apps.MimeConstants;

// Step 1: Create the input files (XML, XSL) and the output PDF file.
// baseDir and outDir are directories (java.io.File) defined elsewhere.
File xmlfile = new File(baseDir, "xml/xml/courseware.xml");
File xsltfile = new File(baseDir, "xml/xslt/coursewarexsl.xsl");
File pdffile = new File(outDir, "courseware.pdf");
// Step 2: Construct a FopFactory
// (reuse if you plan to render multiple documents!)
// Note: FOP 2.x needs a base URI here, e.g. FopFactory.newInstance(new File(".").toURI()).
FopFactory fopFactory = FopFactory.newInstance();
// Step 3: Set up output stream.
// Note: Using BufferedOutputStream for performance reasons (helpful with FileOutputStreams).
OutputStream out = new BufferedOutputStream(new FileOutputStream(pdffile));
try {
    // Step 4: Construct fop with desired output format
    Fop fop = fopFactory.newFop(MimeConstants.MIME_PDF, out);
    // Step 5: Set up JAXP with a transformer built from the XSLT style sheet
    TransformerFactory factory = TransformerFactory.newInstance();
    Transformer transformer = factory.newTransformer(new StreamSource(xsltfile));
    // Step 6: Set up input and output for the XSLT transformation
    Source src = new StreamSource(xmlfile);
    // Resulting SAX events (the generated FO) must be piped through to FOP
    Result res = new SAXResult(fop.getDefaultHandler());
    // Step 7: Start XSLT transformation and FOP processing
    transformer.transform(src, res);
} finally {
    // Clean-up
    out.close();
}
Step 1: You create the input XML and XSL file objects and the output PDF file object.
Step 2: You create a new FopFactory instance. The FopFactory instance holds references to configuration information and cached data. It's important to reuse this instance if you plan to render multiple documents during a JVM's lifetime.
Step 3: You set up an OutputStream that the generated document will be written to. It's a good idea to buffer the OutputStream as demonstrated to improve performance.
Step 4: You create a new Fop instance through one of the factory methods on the FopFactory. You tell the FopFactory what your desired output format is. This is done by using the MIME type of the desired output format (e.g. "application/pdf"). You can use one of the MimeConstants.* constants. The second parameter is the OutputStream you set up in step 3.
Step 5: We recommend that you use JAXP Transformers even if you don't do XSLT transformations to generate the XSL-FO file. This way you can always use the same basic pattern. The example here creates the Transformer from the XSLT style sheet, so it turns the input XML into XSL-FO. If you call newTransformer() without a style sheet, you get an "identity transformer" which just passes the input (Source) unchanged to the output (Result); that is the variant to use when you already have an FO file. You don't have to work with a SAXParser if you don't do any XSLT transformations.
Step 6: Here you set up the input and output for the XSLT transformation. The Source object is set up to load the XML file. The Result is set up so the output of the XSLT transformation is sent to FOP. The generated FO is sent to FOP in the form of SAX events, which is the most efficient way. Avoid saving intermediate results to a file or a memory buffer, because that affects performance negatively.
Step 7: Finally, we start the XSLT transformation by starting the JAXP Transformer. As soon as the JAXP Transformer starts to send its output to FOP, FOP itself starts its processing in the background. When the transform() method returns, FOP will also have finished converting the generated FO to a PDF file and you can close the OutputStream.
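The SAX piping described in steps 5 to 7 can be observed without FOP by swapping in your own handler. The following is a minimal sketch under stated assumptions: SaxPipeSketch and RecordingHandler are hypothetical names, and RecordingHandler merely stands in for the ContentHandler that fop.getDefaultHandler() returns. It uses an identity transformer (newTransformer() with no style sheet), so the FO input reaches the handler unchanged as SAX events, with no intermediate file or buffer.

```java
import java.io.StringReader;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXResult;
import javax.xml.transform.stream.StreamSource;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

public class SaxPipeSketch {

    // Stand-in for FOP's handler: records the name of every element it receives.
    static class RecordingHandler extends DefaultHandler {
        final StringBuilder seen = new StringBuilder();

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            // Fall back to the local name if the parser reports no qualified name.
            String name = (qName == null || qName.isEmpty()) ? localName : qName;
            seen.append(name).append(' ');
        }
    }

    // Pipes an XSL-FO document through an identity transformer into the handler.
    static String pipe(String fo) throws Exception {
        // No style sheet argument: JAXP returns an identity transformer.
        Transformer identity = TransformerFactory.newInstance().newTransformer();
        RecordingHandler handler = new RecordingHandler();
        identity.transform(new StreamSource(new StringReader(fo)), new SAXResult(handler));
        return handler.seen.toString().trim();
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical FO fragment, just enough to produce a few SAX events.
        String fo = "<fo:root xmlns:fo='http://www.w3.org/1999/XSL/Format'>"
                  + "<fo:block>Hello</fo:block></fo:root>";
        System.out.println(pipe(fo));
    }
}
```

With FOP on the classpath, replacing the RecordingHandler with fop.getDefaultHandler() gives the pattern used in the sample code above.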
----- Nilesh Salpe