Chapter 6: Epub File Processing - Parsing the Container File and the .OPF File

Source: Internet
Author: User

In this chapter, we pick up from the openbookinternal method of the fbreaderapp class, introduced at the end of Chapter 3, and begin parsing the container file and the .OPF file.

This chapter builds on content introduced in Chapters 2, 4, and 5. Refer back to them for a better understanding.

First, let's review Chapter 4, "Epub File Processing: Internal Composition of Epub Files", which listed the files an Epub contains. These files are all compressed XML files, and each XML file contains different tags that carry different information. To process the tags of each XML file in a uniform way, the fbreader program defines a corresponding class for each kind of file. Each class handles the tags in its own XML file.


The containerfilereader class corresponds to the container.xml file, the oebbookreader class corresponds to the .OPF file, the ncxreader class corresponds to the .ncx file, and the xhtmlreader class corresponds to the .xhtml files.

PS: These classes are all subclasses of the zlxmlreaderadapter abstract class. This chapter covers the containerfilereader class and the oebbookreader class.

In Chapter 4, we also introduced the roles of the container file and the .OPF file.

The role of the container file is to "indicate the location of the .OPF file". The .OPF file describes the metadata of the Epub book, the location of the file for each chapter inside the Epub, and the order in which the chapters appear.

Parsing these two files, as described in this chapter, is how the program obtains that information.
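To make the role of the container file concrete, here is a minimal Java sketch that pulls the .OPF location out of a container.xml string. It relies on the JDK's SAX parser rather than fbreader's own zlxmlparser, so it does not show how containerfilereader works internally. The class name ContainerSketch, the method findOpfPath, and the sample path OEBPS/content.opf are assumptions made for this illustration; the rootfile element and its full-path attribute come from the Epub container format.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper, not part of fbreader: finds the .OPF path in container.xml.
class ContainerSketch {
    static String findOpfPath(String containerXml) {
        final String[] result = new String[1];
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes attrs) {
                // Keep the full-path attribute of the first <rootfile> element.
                if ("rootfile".equals(qName) && result[0] == null) {
                    result[0] = attrs.getValue("full-path");
                }
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(containerXml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("could not parse container.xml", e);
        }
        return result[0]; // null when no <rootfile> element was found
    }

    public static void main(String[] args) {
        // Sample document; the path inside full-path is an assumed example value.
        String xml =
            "<?xml version=\"1.0\"?>"
            + "<container version=\"1.0\" xmlns=\"urn:oasis:names:tc:opendocument:xmlns:container\">"
            + "<rootfiles>"
            + "<rootfile full-path=\"OEBPS/content.opf\" media-type=\"application/oebps-package+xml\"/>"
            + "</rootfiles></container>";
        System.out.println(findOpfPath(xml));
    }
}
```

The handler only reacts to the start of the rootfile element, which matches the idea that each reader class cares about a small set of tags in its own file.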



Before introducing the parsing process, we need to review the three core classes for parsing XML files. This part was introduced in Chapter 2, "Parsing resource files". At that time, we wrote the following:

Next come the core classes for parsing XML files: zlxmlprocessor, zlxmlparser, and zlxmlreader.

The calling sequence of these three core classes is generally as follows:

1. The read method of the resourcetreereader class (a subclass of the zlxmlreaderadapter abstract class) calls the read method of the zlxmlprocessor class.

2. The read method of the zlxmlprocessor class obtains a byte stream class (the assetinputstream class) for the resource file through the getinputstream method of the androidassetsfile class (a subclass of the zlresourcefile class). This byte stream class is passed as the parameter used to initialize a character stream class for the resource file. Then the doit method of the zlxmlparser class is called.

3. The doit method of the zlxmlparser class uses the character stream class to convert the file into a char array. While a for loop iterates over the array, the doit method calls the startelementhandler and endelementhandler methods of the zlxmlreader interface implementation class (the resourcetreereader class) to process the nodes represented by the elements in the array.

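Note that two parts of the sequence above do not apply when parsing the files inside an Epub: the resourcetreereader class and the byte stream class for resource files. For Epub internal files, the sequence is as follows: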

1. The read method of a subclass of the zlxmlreaderadapter abstract class (the containerfilereader class, oebbookreader class, ncxreader class, or xhtmlreader class) calls the read method of the zlxmlprocessor class.

2. The read method of the zlxmlprocessor class calls the getinputstream method of the zlzipentryfile class, which uses the localfileheader class corresponding to the zlzipentryfile class to obtain a byte stream class (the zipinputstream class) for the XML file inside the Epub. This byte stream class is passed as the parameter used to initialize a character stream class. Then the doit method of the zlxmlparser class is called.

3. The doit method of the zlxmlparser class uses the character stream class to convert the file into a char array (Chapter 5 spends an entire chapter on this conversion). While a for loop iterates over the array, the doit method calls the startelementhandler and endelementhandler methods of the zlxmlreader interface implementation class (here, one of the four reader classes listed in step 1) to process the nodes represented by the elements in the array.

First, the "subclass of the zlxmlreaderadapter abstract class" is no longer the resourcetreereader class. It is a class that corresponds specifically to one of the XML files inside the Epub (the containerfilereader class, oebbookreader class, ncxreader class, or xhtmlreader class).

PS: The resourcetreereader class is the class the program creates to process the tags in a resource file.

Second, when parsing XML files inside an Epub, the program does not obtain the "byte stream class for resource files". Instead, it calls the getinputstream method of the zlzipentryfile class to obtain the byte stream class (the zipinputstream class) for the XML file inside the Epub. As before, this byte stream class is then used to obtain the corresponding character stream class.

PS: The getinputstream method of the zlzipentryfile class was covered in a whole chapter: Chapter 5, "XML File Processing -- Extract".
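The three-step sequence for Epub internal files can be sketched in plain Java as follows. This is a simplified illustration, not fbreader's code: it uses the JDK's java.util.zip classes in place of fbreader's own zip classes, assumes UTF-8 text, and scans tags naively. The names TagReaderSketch, ParseSketch, toChars, doIt, zipOf, and events are hypothetical, and the entry name META-INF/container.xml is the location the Epub container format uses for the container file.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical stand-in for the reader interface with its two handler methods.
interface TagReaderSketch {
    void startElementHandler(String tag);
    void endElementHandler(String tag);
}

class ParseSketch {
    // Wrap the byte stream in a character stream and drain it into a char array.
    static char[] toChars(InputStream bytes) throws IOException {
        Reader chars = new InputStreamReader(bytes, StandardCharsets.UTF_8); // UTF-8 is an assumption
        StringBuilder text = new StringBuilder();
        char[] buffer = new char[4096];
        int n;
        while ((n = chars.read(buffer)) != -1) {
            text.append(buffer, 0, n);
        }
        return text.toString().toCharArray();
    }

    // Loop over the char array and report every tag to the reader.
    // Deliberately naive: it ignores attributes and text, and it would be
    // confused by a '>' inside an attribute value, a comment, or CDATA.
    static void doIt(char[] data, TagReaderSketch reader) {
        for (int i = 0; i < data.length; i++) {
            if (data[i] != '<') continue;
            int end = i + 1;
            while (end < data.length && data[end] != '>') end++;
            if (end == data.length) break; // unterminated tag: stop
            String body = new String(data, i + 1, end - i - 1).trim();
            i = end;
            if (body.isEmpty() || body.startsWith("?") || body.startsWith("!")) continue;
            if (body.startsWith("/")) {
                reader.endElementHandler(body.substring(1).trim());
                continue;
            }
            boolean selfClosing = body.endsWith("/");
            if (selfClosing) body = body.substring(0, body.length() - 1).trim();
            String name = body.split("\\s+")[0];
            reader.startElementHandler(name);
            if (selfClosing) reader.endElementHandler(name);
        }
    }

    // Builds a one-entry zip in memory so the example needs no file on disk.
    static byte[] zipOf(String entryName, String content) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(out)) {
            zip.putNextEntry(new ZipEntry(entryName));
            zip.write(content.getBytes(StandardCharsets.UTF_8));
            zip.closeEntry();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    // Finds the entry, converts its bytes to chars, and records the handler calls.
    static List<String> events(byte[] zipBytes, String entryName) {
        final List<String> log = new ArrayList<>();
        TagReaderSketch reader = new TagReaderSketch() {
            public void startElementHandler(String tag) { log.add("start:" + tag); }
            public void endElementHandler(String tag) { log.add("end:" + tag); }
        };
        try (ZipInputStream zip = new ZipInputStream(new ByteArrayInputStream(zipBytes))) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                if (entry.getName().equals(entryName)) {
                    doIt(toChars(zip), reader); // reads until the end of this entry
                    break;
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return log;
    }

    public static void main(String[] args) {
        byte[] zip = zipOf("META-INF/container.xml", "<?xml version=\"1.0\"?><a><b x=\"1\"/></a>");
        System.out.println(events(zip, "META-INF/container.xml"));
    }
}
```

Only the source of the byte stream differs between the resource-file case and the Epub case; the conversion to characters and the handler calls stay the same, which is why a single parser can serve all the reader classes.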



Now let's start the parsing process.

The createmodel method of the bookmodel class is called in the openbookinternal method of the fbreaderapp class.



The createmodel method calls the getplugin method of the plugincollection class.



The getplugin method calls the acceptsfile method of every subclass of the formatplugin abstract class. This method takes the myextension attribute of the file class pointed to by the file attribute of the book class (the assignment of this attribute is described in Chapter 3, "Getting book information") and compares it with the extension built into the plugin's code. If the two are the same, the getplugin method returns the current subclass of the formatplugin abstract class. When processing an Epub file, the code returns the oebplugin class.
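The selection logic described above can be sketched as a loop over candidate plugins. This is a minimal sketch under stated assumptions, not fbreader's implementation: every class name ending in Sketch is hypothetical, the second plugin exists only to give the loop something to reject, and the hard-coded extensions "epub" and "fb2" are assumed values.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical, simplified stand-in for the abstract plugin class.
abstract class FormatPluginSketch {
    abstract boolean acceptsFile(String extension);
}

class OebPluginSketch extends FormatPluginSketch {
    // Assumption: the extension built into this plugin is "epub".
    @Override
    boolean acceptsFile(String extension) {
        return "epub".equals(extension);
    }
}

class Fb2PluginSketch extends FormatPluginSketch {
    // Assumption: a second plugin with a different built-in extension.
    @Override
    boolean acceptsFile(String extension) {
        return "fb2".equals(extension);
    }
}

class PluginCollectionSketch {
    private final List<FormatPluginSketch> plugins =
        Arrays.asList(new Fb2PluginSketch(), new OebPluginSketch());

    // Returns the first plugin whose acceptsFile check passes, or null if none does.
    FormatPluginSketch getPlugin(String fileName) {
        String extension = extensionOf(fileName);
        for (FormatPluginSketch plugin : plugins) {
            if (plugin.acceptsFile(extension)) {
                return plugin;
            }
        }
        return null;
    }

    // Lower-cased text after the last dot, or "" when the name has no dot.
    static String extensionOf(String fileName) {
        int dot = fileName.lastIndexOf('.');
        return dot < 0 ? "" : fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        PluginCollectionSketch collection = new PluginCollectionSketch();
        System.out.println(collection.getPlugin("book.epub").getClass().getSimpleName());
    }
}
```

Keeping the extension check inside each plugin means a new format can be supported by adding a subclass, without changing the lookup loop.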




The readmodel method of the oebplugin class

After obtaining the oebplugin class, the code will call the readmodel method in the class.



The readmodel method calls two methods: the getopffile method of the oebplugin class and the readbook method of the oebbookreader class.



The getopffile method of the oebplugin class:
The getopffile method consists of three steps:

Step 1: Call the createfile method of the zlfile class

The oebfile parameter of this method is the file attribute of the book class (the assignment of this attribute is described in Chapter 3, "Getting book information"). In the end, this method returns a zlzipentryfile class for the container.xml file.

As we noted in Chapter 2, "Parsing resource files": "the zlzipentryfile class is used to process XML files inside Epub files". The container.xml file was also introduced in Chapter 4, "Epub File Processing: Internal Composition of Epub Files". The purpose of this file is to "indicate the location of the .OPF file".

We can see from the code that the zlzipentryfile representing the container.xml file contains two attributes (row 85): a zlphysicalfile class that represents the Epub file, and a string variable that holds the name of the XML file inside the Epub file (here ).
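The two-attribute structure can be pictured with a small Java sketch: an object that remembers the archive on disk and the name of one entry inside it, and opens the entry only when asked. This is an illustration under assumptions, not the zlzipentryfile class itself. The names ZipEntryFileSketch, readText, and writeSampleArchive are hypothetical, java.io.File stands in for zlphysicalfile, and the entry name used in main is only a sample value.

```java
import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import java.util.zip.ZipOutputStream;

// Hypothetical stand-in for a "file inside an archive" object.
class ZipEntryFileSketch {
    private final File parent;      // the archive on disk (the role played by zlphysicalfile)
    private final String entryName; // the name of the file inside the archive

    ZipEntryFileSketch(File parent, String entryName) {
        this.parent = parent;
        this.entryName = entryName;
    }

    // Opens the archive, reads the named entry fully, and decodes it as UTF-8 (an assumption).
    String readText() {
        try (ZipFile zip = new ZipFile(parent)) {
            ZipEntry entry = zip.getEntry(entryName);
            if (entry == null) {
                throw new IllegalArgumentException("no such entry: " + entryName);
            }
            try (InputStream in = zip.getInputStream(entry)) {
                ByteArrayOutputStream out = new ByteArrayOutputStream();
                byte[] buffer = new byte[4096];
                int n;
                while ((n = in.read(buffer)) != -1) {
                    out.write(buffer, 0, n);
                }
                return new String(out.toByteArray(), StandardCharsets.UTF_8);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Test helper: writes a one-entry archive to a temporary file.
    static File writeSampleArchive(String entryName, String content) {
        try {
            File file = File.createTempFile("sketch", ".epub");
            try (ZipOutputStream zip = new ZipOutputStream(new FileOutputStream(file))) {
                zip.putNextEntry(new ZipEntry(entryName));
                zip.write(content.getBytes(StandardCharsets.UTF_8));
                zip.closeEntry();
            }
            return file;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        File archive = writeSampleArchive("META-INF/container.xml", "<container/>");
        System.out.println(new ZipEntryFileSketch(archive, "META-INF/container.xml").readText());
        archive.delete();
    }
}
```

Holding only the parent archive and the entry name keeps the object cheap to create; nothing is decompressed until the content is actually requested.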



