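I recently needed to convert PDF files into high-resolution images, using the pdfbox and fontbox libraries. The renderImageWithDPI method lets you set the rendering resolution. The higher the DPI, the sharper the image, but the conversion takes longer and the output files get larger.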
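Note: Adobe software keeps getting more capable and keeps adding supported formats, and Java libraries do not always keep up. PDFs that use newer formats may therefore fail to convert.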
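To get a feel for the DPI trade-off described above, here is a small sketch that estimates the pixel size and uncompressed memory footprint of a rendered page. It relies on the fact that PDF user space defaults to 72 units per inch, so rendering at a given DPI scales the page by dpi / 72. The class name DpiEstimate, the A4-like page size of 595 x 842 points, and the 4 bytes per pixel figure are assumptions for illustration, and the rounding is only an approximation of what the renderer does.

```java
class DpiEstimate {
    /** PDF user space defaults to 72 units per inch. */
    static final double POINTS_PER_INCH = 72.0;

    /** Approximate pixel length of a page side given in points, rendered at the given DPI. */
    static int pixels(double lengthInPoints, double dpi) {
        return (int) Math.max(Math.floor(lengthInPoints * dpi / POINTS_PER_INCH), 1);
    }

    /** Approximate uncompressed size in bytes of the rendered page. */
    static long rawBytes(double widthPt, double heightPt, double dpi, int bytesPerPixel) {
        return (long) pixels(widthPt, dpi) * pixels(heightPt, dpi) * bytesPerPixel;
    }

    public static void main(String[] args) {
        double widthPt = 595;  // assumed page width, roughly A4
        double heightPt = 842; // assumed page height, roughly A4
        for (double dpi : new double[] {72, 90, 200, 300}) {
            double megabytes = rawBytes(widthPt, heightPt, dpi, 4) / (1024.0 * 1024.0);
            System.out.printf("%.0f DPI: %d x %d px, about %.1f MB uncompressed%n",
                    dpi, pixels(widthPt, dpi), pixels(heightPt, dpi), megabytes);
        }
    }
}
```

Under these assumptions an A4-sized page at 200 DPI comes out at about 1652 x 2338 pixels. Pixel count grows with the square of the DPI, which is why both conversion time and file size climb quickly.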
1 Add the dependencies
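<dependency>
    <groupId>org.apache.pdfbox</groupId>
    <artifactId>pdfbox</artifactId>
    <version>2.0.16</version>
</dependency>
<dependency>
    <groupId>org.apache.pdfbox</groupId>
    <artifactId>fontbox</artifactId>
    <version>2.0.16</version>
</dependency>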
2 The code
import java.awt.image.BufferedImage;
import java.io.BufferedOutputStream;
import java.io.Closeable;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import javax.imageio.ImageIO;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.rendering.ImageType;
import org.apache.pdfbox.rendering.PDFRenderer;
// log, FileUtil and ArrayUtils come from the surrounding project and are not shown here.

public static void convertPdf2Image(String pdfPath, String imageDirPath) {
    log.info("start convert pdf file:[{}] to image path:[{}]", pdfPath, imageDirPath);
    if (!new File(pdfPath).exists()) {
        log.info("pdfFilename:[{}] not exist", pdfPath);
        return;
    }
    if (!new File(imageDirPath).exists()) {
        log.info("imageDir:[{}] not exist", imageDirPath);
        return;
    }
    byte[] pdfContent = FileUtil.getFileContentByte(pdfPath);
    String filename = FileUtil.getFilename(pdfPath);
    float dpi = 200;
    convertPdf2Image(pdfContent, filename, imageDirPath, dpi);
    log.info("convert pdf file:[{}] to image success", filename);
}

private static void convertPdf2Image(byte[] pdfContent, String pdfFilename, String imageDirPath, float dpi) {
    log.info("convert pdfFilename:[{}] to imageDir:[{}] with dpi:[{}]", pdfFilename, imageDirPath, dpi);
    if (ArrayUtils.isEmpty(pdfContent)) {
        return;
    }
    // To keep the output legible, use at least 90 DPI
    if (dpi < 90) {
        dpi = 90;
    }
    // Prefix for every page image: <imageDir>/<pdfFilename>_
    String baseSir = imageDirPath;
    if (baseSir.endsWith("/") || baseSir.endsWith("\\")) {
        baseSir += pdfFilename + "_";
    } else {
        baseSir += File.separator + pdfFilename + "_";
    }
    PDDocument document = null;
    BufferedOutputStream outputStream = null;
    try {
        document = PDDocument.load(pdfContent);
        int pageCount = document.getNumberOfPages();
        PDFRenderer pdfRenderer = new PDFRenderer(document);
        String imgPath;
        // Render each page (0-based index) to its own PNG file
        for (int i = 0; i < pageCount; i++) {
            imgPath = baseSir + i + ".png";
            outputStream = new BufferedOutputStream(new FileOutputStream(imgPath));
            BufferedImage image = pdfRenderer.renderImageWithDPI(i, dpi, ImageType.RGB);
            ImageIO.write(image, "png", outputStream);
            outputStream.close();
            log.info("convert to png, total[{}], now[{}], ori:[{}], des[{}]", pageCount, i + 1, pdfFilename, imgPath);
        }
    } catch (IOException e) {
        log.error("convert pdf to image error, pdfFilename:" + pdfFilename, e);
    } finally {
        IOUtil.closeSilently(outputStream);
        IOUtil.closeSilently(document);
    }
}

// IOUtil.closeSilently
public static void closeSilently(Closeable io) {
    if (io != null) {
        try {
            io.close();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}
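As a possible simplification of the file-writing step above, here is a minimal sketch that builds the target path with java.nio.file.Path instead of checking for trailing separators by hand, and passes a File to ImageIO.write so that ImageIO opens and closes the output stream itself. The class PageImageWriter and its method names are hypothetical, and the blank BufferedImage in main stands in for the image that renderImageWithDPI would return.

```java
import java.awt.image.BufferedImage;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.imageio.ImageIO;

class PageImageWriter {
    /** Builds <imageDir>/<pdfFilename>_<pageIndex>.png without manual separator handling. */
    static Path targetPath(Path imageDir, String pdfFilename, int pageIndex) {
        return imageDir.resolve(pdfFilename + "_" + pageIndex + ".png");
    }

    /** Writes one page image as PNG and returns the path it was written to. */
    static Path writePage(BufferedImage image, Path imageDir, String pdfFilename, int pageIndex) {
        Path target = targetPath(imageDir, pdfFilename, pageIndex);
        try {
            // ImageIO.write returns false when no writer exists for the format.
            if (!ImageIO.write(image, "png", target.toFile())) {
                throw new IOException("no PNG writer available for " + target);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return target;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the image returned by renderImageWithDPI.
        BufferedImage page = new BufferedImage(20, 10, BufferedImage.TYPE_INT_RGB);
        Path dir = Files.createTempDirectory("pdf2image");
        System.out.println("wrote " + writePage(page, dir, "sample", 0));
    }
}
```

With this shape there is no stream left to close in a finally block. The PDDocument still has to be closed, for example with try-with-resources.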
Problems encountered in practice
1) ERROR o.a.p.contentstream.PDFStreamEngine 911 - Cannot read JBIG2 image: jbig2-imageio is not installed
2) Cannot read JPEG2000 image: Java Advanced Imaging (JAI) Image I/O Tools are not installed
3) java.lang.IllegalArgumentException: Numbers of source Raster bands and source color space components do not match at java.awt.image.ColorConvertOp.filter
The first two problems need the JAI and JBIG2 plugins, which are pulled in by adding jai-imageio-core, jai-imageio-jpeg2000 and jbig2-imageio. The dependency list below also includes TwelveMonkeys imageio-jpeg, a replacement JPEG reader that is commonly recommended for the third error.
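<dependency>
    <groupId>com.twelvemonkeys.imageio</groupId>
    <artifactId>imageio-jpeg</artifactId>
    <version>3.4.2</version>
</dependency>
<dependency>
    <groupId>com.github.jai-imageio</groupId>
    <artifactId>jai-imageio-core</artifactId>
    <version>1.4.0</version>
</dependency>
<dependency>
    <groupId>com.github.jai-imageio</groupId>
    <artifactId>jai-imageio-jpeg2000</artifactId>
    <version>1.3.0</version>
</dependency>
<dependency>
    <groupId>org.apache.pdfbox</groupId>
    <artifactId>jbig2-imageio</artifactId>
    <version>3.0.2</version>
</dependency>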
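One way to check whether the plugins are actually picked up at runtime is to ask ImageIO which readers it has registered. This is only a sketch: the class ImageIoPluginCheck is hypothetical, and the format names "jbig2" and "jpeg2000" are the names I assume the plugins register under, so adjust them if your versions differ.

```java
import javax.imageio.ImageIO;

class ImageIoPluginCheck {
    /** Returns true if ImageIO has at least one reader registered for the format name. */
    static boolean hasReader(String formatName) {
        return ImageIO.getImageReadersByFormatName(formatName).hasNext();
    }

    public static void main(String[] args) {
        // Re-scan the classpath in case plugins were added after ImageIO was first used.
        ImageIO.scanForPlugins();
        // "jbig2" and "jpeg2000" are assumed format names for the extra plugins.
        for (String format : new String[] {"png", "jpeg", "jbig2", "jpeg2000"}) {
            System.out.println(format + " reader available: " + hasReader(format));
        }
    }
}
```

If the last two lines print false after the dependencies have been added, the jars are probably not on the runtime classpath of the process doing the conversion.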
Sample problem files
/crazyCodeLove/studentservice/blob/master/sys/src/main/resources/pdffile/000208-p1.pdf
/crazyCodeLove/studentservice/blob/master/sys/src/main/resources/pdffile/001659-p14.pdf
/crazyCodeLove/studentservice/blob/master/sys/src/main/resources/pdffile/main%20doc.pdf
/crazyCodeLove/studentservice/blob/master/sys/src/main/resources/pdffile/573636.pdf
References
/questions/42169154/pdfbox1-8-12-convert-pdf-to-white-page-image
/questions/20424796/pdf-box-generating-blank-images-due-to-jbig2-images-in-it
/qq_15801963/article/details/80746830
/u/2345654/blog/1058192
/questions/18351583/illegalargumentexception-numbers-of-source-raster-bands-and-source-color-space
/questions/10416378/imageio-read-illegal-argument-exception-raster-bands-colour-space-components