Java Common APIs and Programming Tools 003: Implementing Online PDF Reading with pdf.js

Last Update:2018-07-28 Source: Internet

Author: User

Tags base64 locale pdf parser


pdf.js Introduction

pdf.js is an open-source product built on open HTML5 and JavaScript technology. In short, it is a PDF parser and reader. Because it uses only HTML5 and JavaScript (that is, only safe web languages, with no native code blocks for attackers to exploit), it loads and renders PDF files directly in standard HTML pages. This also improves security: no third-party plug-in needs to be installed, and the security measures already built into the browser provide a safe environment for pdf.js to run in. Its browser requirements are IE 9+ and Firefox 19+.

Online examples: http://jsbin.com/pdfjs-helloworld-v2/1/edit, http://jsbin.com/pdfjs-prevnext-v2/1/edit

Source: https://github.com/mozilla/pdf.js

Official website: http://mozilla.github.io/pdf.js/

pdf.js vs. traditional PDF reading in the browser

Traditionally, a browser displays PDF files through a plug-in, usually Adobe's own PDF reader or a rendering tool from another vendor. These plug-ins often do not fully support the features of the PDF format, and because they contain a large amount of trusted code, Google Chrome has to run them in a sandbox to guard against a PDF plug-in that has been infected by an unknown virus.

With Adobe's reader, the software must be installed locally before it can be used. pdf.js does not depend on the local environment, renders quickly (in my tests it really is fast), and offers high security.

Rendering PDF files with pdf.js

The pdf.js rendering process: fetch PDF (URL/buffer) --> canvas --> rendering

If you want to dig deeper into PDF rendering, you need to study the pdf.js source code. pdf.js can obtain a PDF either from the address of the PDF file or from a PDF data stream. The implementation calls the interface function PDFJS.getDocument(url/buffer) to load the PDF into the HTML page, processes it through a canvas, and then renders the PDF file. The examples found online obtain the PDF by URL. In my project, the backend (Python) sends the PDF to the front end as a data stream; the front end receives the PDF buffer and then renders it with pdf.js. My first attempts with a buffer ran into many problems, which can be summarized as follows:
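The fetch, canvas, render flow described above can be sketched roughly as follows. This is a minimal sketch, not the article's own code: the function name renderFirstPage and its parameters are hypothetical, the pdf.js object is passed in as `pdfjs` (the global PDFJS in the 1.x builds used here), and it assumes a release in which the loading task and the render task expose `.promise` and `getViewport` takes a numeric scale. Newer releases expect `getViewport({ scale: scale })` instead.

```javascript
// Sketch only: `pdfjs` is whatever object exposes getDocument()
// (assumed to be the global PDFJS in the 1.x builds this article uses).
function renderFirstPage(pdfjs, source, canvas, scale) {
  // 1. Fetch: `source` may be a URL string or a Uint8Array of PDF bytes.
  return pdfjs.getDocument(source).promise
    .then(function (pdf) {
      return pdf.getPage(1); // pages are numbered from 1
    })
    .then(function (page) {
      // 2. Canvas: size the canvas to the page at the requested scale.
      var viewport = page.getViewport(scale); // assumption: numeric-scale signature
      canvas.width = viewport.width;
      canvas.height = viewport.height;
      // 3. Render: draw the page into the canvas's 2D context.
      return page.render({
        canvasContext: canvas.getContext('2d'),
        viewport: viewport
      }).promise;
    });
}
```

In a page it might be called as renderFirstPage(PDFJS, pdfAsArray, document.getElementById('the-canvas'), 1.5), where 'the-canvas' is a hypothetical element id. When viewer.js is used, this step is handled by the viewer itself.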

1. How to receive the buffer data sent by the backend on the front end through $.ajax;

2. How to pass the buffer to pdf.js for processing (I use viewer.js here, so the real question is how to pass the buffer to viewer.js);

3. How to convert the received data into a buffer format that pdf.js can accept;

(See the code comments below for how each problem was resolved.)
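For question 3, the decoding step can be made a little more forgiving. The sketch below is an illustration rather than the article's code: the names base64PayloadToBytes and looksLikePdf are hypothetical. It accepts either a full data URI or a bare Base64 string (String.prototype.indexOf returns -1 when the marker is absent, which would otherwise yield a wrong offset), and it checks for the "%PDF-" signature that a PDF file normally begins with before the bytes are handed to pdf.js.

```javascript
var BASE64_MARKER = ';base64,';

// Accepts either a full data URI ("data:application/pdf;base64,....")
// or a bare Base64 string, and returns the decoded bytes.
function base64PayloadToBytes(input) {
  var markerIndex = input.indexOf(BASE64_MARKER);
  // indexOf returns -1 when the marker is missing; treat that as bare Base64.
  var base64 = markerIndex === -1
    ? input
    : input.substring(markerIndex + BASE64_MARKER.length);
  var raw = atob(base64.trim()); // binary string: one character per byte
  var bytes = new Uint8Array(raw.length);
  for (var i = 0; i < raw.length; i++) {
    bytes[i] = raw.charCodeAt(i);
  }
  return bytes;
}

// Checks for the ASCII signature "%PDF-" at the start of the data.
function looksLikePdf(bytes) {
  var signature = [0x25, 0x50, 0x44, 0x46, 0x2d];
  if (bytes.length < signature.length) return false;
  for (var i = 0; i < signature.length; i++) {
    if (bytes[i] !== signature[i]) return false;
  }
  return true;
}

// "JVBERi0xLjQK" is the Base64 form of "%PDF-1.4\n".
var sample = base64PayloadToBytes('data:application/pdf;base64,JVBERi0xLjQK');
console.log(looksLikePdf(sample)); // true
```

A failed signature check is a cheap hint that the response was garbled or was not Base64 at all, which is easier to diagnose than a parse error inside pdf.js.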

Note: viewer.js is an extension of pdf.js that already implements printing, paging, zooming, and other functions, and its interface looks good. In other words, if you include viewer.js, both the PDF rendering and the viewer's functional interface are already done for you, and you do not have to write the interface yourself.

First download the code from the official website, http://mozilla.github.io/pdf.js/, and then start from the file viewer.html. My HTML is a modified version of viewer.html. Below is my buffer example:

<!DOCTYPE html>
<meta charset="utf-8">
<meta name="viewport" content="width=device-width, initial-scale=1, maximum-scale=1">
<meta name="google" content="notranslate">
<title>Online Preview</title>
{% load static %} {% get_static_prefix as static_url %}
<link href="{{static_url}}css/preview.css" rel="stylesheet" type="text/css"/>
<link rel="stylesheet" href="{{static_url}}pdfjs/web/viewer.css"/>
<script type="text/javascript" src="{{static_url}}pdfjs/web/compatibility.js"></script>
<link rel="resource" type="application/l10n" href="{{static_url}}pdfjs/web/locale/locale.properties"/>
<script type="text/javascript" src="{{static_url}}pdfjs/web/l10n.js"></script>
<script type="text/javascript" src="{{static_url}}pdfjs/build/pdf.js"></script>
<script type="text/javascript" src="{{static_url}}pdfjs/web/debugger.js"></script>
<script src="{{static_url}}js/jquery-1.8.3.js" type="text/javascript"></script>
<script type="text/javascript">

  // convertDataURIToBinary()
  // If the backend sends the raw PDF data stream straight to the front end, the
  // data arrives garbled and converting it to a Uint8Array never succeeds
  // (I do not know why). So the backend Base64-encodes the data stream before
  // sending it; once decoded on the front end, the data is no longer garbled.
  var BASE64_MARKER = ';base64,';

  var preFileId = {{ mark }};

  // Global variable used by viewer.js; the buffer is passed in through it.
  // Answers question 2.
  var DEFAULT_URL;

  $(document).ready(function () {
    $.ajax({
      type: "POST",
      async: false,
      // The AJAX call receives the PDF data. Check that dataType is set
      // correctly: if it is left unspecified, jQuery decides from the MIME
      // information in the HTTP response whether to return responseXML or
      // responseText. Answers question 1.
      contentType: "application/pdf;charset=utf-8",
      url: '{% url netpan.file.views.browserfuf %}',
      data: {
        id: preFileId
      },
      success: function (data) {
        // Assumed to be a Base64 data URI containing the ";base64," marker.
        var pdfAsDataUri = data;

        // Case 1: viewer.js is included.
        var pdfAsArray = convertDataURIToBinary(pdfAsDataUri);
        DEFAULT_URL = pdfAsArray;

        // Case 2: only pdf.js is included, without viewer.js.
        // var pdfAsArray = convertDataURIToBinary(pdfAsDataUri);
        // PDFJS.getDocument(pdfAsArray).then(function (pdf) {
        //   // write your own PDF handling function here
        // });
      }
    });
  });

  function convertDataURIToBinary(dataURI) { // encoding conversion; answers question 3
    var base64Index = dataURI.indexOf(BASE64_MARKER) + BASE64_MARKER.length;
    var base64 = dataURI.substring(base64Index);
    var raw = window.atob(base64);
    var rawLength = raw.length;

    // Convert to a Uint8Array, which pdf.js can parse directly; see pdf.js-4068
    var array = new Uint8Array(new ArrayBuffer(rawLength));

    for (var i = 0; i < rawLength; i++) {
      array[i] = raw.charCodeAt(i);
    }

    return array;
  }
</script>

<script type="text/javascript" src="{{static_url}}pdfjs/web/viewer.js"></script>

<body>

<!-- content omitted -->

</body>
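The code comments above note that the raw PDF stream arrived garbled while the Base64 version did not. A likely explanation, offered here as an assumption rather than something the original confirms, is that the response was read as text: decoding arbitrary binary bytes as UTF-8 replaces invalid sequences with U+FFFD, so the original bytes cannot be recovered, whereas Base64 uses only ASCII characters and survives text handling. The sketch below illustrates the difference with a few made-up bytes; the helper sameBytes is hypothetical.

```javascript
// A few bytes as they might appear in a PDF: the "%PDF" signature followed by
// two bytes above 0x7F, which are not valid UTF-8 on their own.
var original = new Uint8Array([0x25, 0x50, 0x44, 0x46, 0xe2, 0xe3]);

function sameBytes(a, b) {
  if (a.length !== b.length) return false;
  for (var i = 0; i < a.length; i++) {
    if (a[i] !== b[i]) return false;
  }
  return true;
}

// Path 1: treat the bytes as UTF-8 text, then turn the text back into bytes.
// Invalid sequences become U+FFFD, so the original bytes are lost.
var asText = new TextDecoder('utf-8').decode(original);
var viaText = new TextEncoder().encode(asText);

// Path 2: Base64-encode (as the backend would), then decode on the front end.
var base64 = btoa(String.fromCharCode.apply(null, original));
var raw = atob(base64);
var viaBase64 = new Uint8Array(raw.length);
for (var i = 0; i < raw.length; i++) {
  viaBase64[i] = raw.charCodeAt(i);
}

console.log(sameBytes(original, viaText));   // false
console.log(sameBytes(original, viaBase64)); // true
```

Base64 adds roughly a third to the payload size, so for large files it may be worth requesting the response as binary instead; that route is not covered by the original article.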
