Description of PostScript (PS/EPS format), by bobob

I. PostScript overview

PostScript is both a page description language and a high-level interpreted scripting language. Because it is device-independent, it can faithfully reproduce the original appearance of a page on any platform, which is why it is widely used in the printing and publishing industry. Because it is an interpreted language, it can also be used to solve problems like a general-purpose programming language. Compared with the more familiar PDF format, there are several obvious differences:

1. PDF has a strict file structure (header, body of objects, cross-reference table, and trailer; linearized PDF also has a fixed layout) and a strict document structure (a logical structure rooted at the catalog). PS has neither.
2. PS has more than a dozen data types, while PDF has only 8.
3. PS has the control structures of a general-purpose programming language, such as if, ifelse, for, forall, loop, and procedures. PDF does not.
4. PDF is meant for display. PS is not limited to that purpose; it can also be used as a script to perform non-display tasks.
5. The PS language has more than 400 standard operators, some of which have several forms (different operand types and counts). PS has no reserved words, so a PS script can completely redefine any of these standard operators.
6. PS files are generally not compressed, so they tend to be large; PDF files are much smaller.

Generally, a PostScript reader should include the following parts: scanner, interpreter, operand stack, execution stack, dictionary stack, graphics state stack, virtual memory, font processing, color processing, and final output. These are described in detail below.

II. PostScript
1. Basic Data Structure
Simple objects | Composite objects
Boolean | Array
FontID | Dictionary
Integer | File
Mark | Gstate (LanguageLevel 2)
Name | Packedarray (LanguageLevel 2)
Null | Save
Operator | String
Real |
Array: can hold objects of different types. Elements are accessed by index. Every access must be bounds-checked against the array length, and the maximum array length must be enforced. Nesting must be supported. Value storage follows the rules for composite objects.

String: each element must be an integer between 0 and 255. The maximum length is limited by the implementation. The escape character '\' must be processed. Value storage follows the rules for composite objects.

Dictionary: stores key-value pairs. You can insert an entry into a dictionary, and look up a key to obtain its associated value. The maximum number of entries must be specified when the dictionary is created. When an insertion would exceed that maximum, PS Level 1 raises a dictfull error, while Level 2 and above expand the dictionary automatically. The overall maximum is limited by the implementation. The dictionary-related operators must be supported. Value storage follows the rules for composite objects.

File: a readable or writable stream used for communication between the interpreter and its environment. Both permanently stored files, such as disk files, and dynamically generated streams are supported. A file object must be created and opened before other operators can read from or write to it. Operations such as read, readline, write, and writestring are supported.

Save: the save operator takes a snapshot of the state of local virtual memory and returns a save object describing that snapshot. The restore operator returns local virtual memory to the state captured by save.
Restore must do the following: discard all objects created in local virtual memory since the corresponding save and reclaim the space they occupied; reset the values of all composite objects in local virtual memory (except strings) to their state at the time of the save; implicitly call the grestoreall operator to return the graphics state to its state at the time of the save; and close all files opened since the save (those opened while local virtual memory was in effect). Restore does not affect the operand stack, dictionary stack, execution stack, or global virtual memory. Save and restore can be nested.

Gstate: a set of graphics control parameters, divided into two categories. Device-independent parameters: CTM, position, path, clipping path, clipping path stack, color space, color, font, line width, line cap, line join, miter limit, dash pattern, and stroke adjustment. Device-dependent parameters: color rendering, overprint, black generation, undercolor removal, transfer, halftone, flatness, smoothness, and device. All of these parameters must be accessible.
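The dictionary capacity rule described earlier in this section (a dictfull error in Level 1, automatic expansion in Level 2 and above) can be sketched in Python roughly as follows. This is only an illustration: the names PSDict and DictFullError are hypothetical, and doubling the capacity is an assumed growth policy that the text does not specify.

```python
class DictFullError(Exception):
    """Raised when a Level 1 dictionary would exceed its declared capacity."""


class PSDict:
    """Hypothetical dictionary object with a declared maximum entry count."""

    def __init__(self, capacity, language_level=2):
        self.capacity = capacity
        self.language_level = language_level
        self.entries = {}

    def put(self, key, value):
        is_new_key = key not in self.entries
        if is_new_key and len(self.entries) >= self.capacity:
            if self.language_level == 1:
                raise DictFullError(key)
            # Level 2 and above: grow instead of failing (assumed policy).
            self.capacity = max(1, self.capacity * 2)
        self.entries[key] = value

    def get(self, key):
        return self.entries[key]
```

Overwriting an existing key never triggers the capacity check, since it does not add an entry.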
2. Scanner

The scanner is the foundation of the reader. It parses the input character stream into objects, one at a time, according to PostScript syntax. There are three encodings: ASCII, binary token, and binary object sequence. ASCII is the most common, so only this encoding is considered here. The scanner must be able to recognize the following PS elements:

A. White space. Outside comments and strings, white space separates objects, and a run of consecutive white-space characters is treated as a single one. CR, LF, and CR+LF are all treated as line breaks.
B. Comments. Outside a string, a comment consists of all characters from % to the end of the line.
C. Numbers. There are three kinds: signed integers (such as 0, +5, -3), real numbers (such as -.002, 34.5, -3.62, 123.6e10, 1.0e-5, 1e6, -1., 0.0), and radix numbers (such as 8#1777, 16#fff, 2#100).
D. Strings. There are three forms: ( ) encloses literal text; < > encloses hexadecimal-encoded text; <~ ~> encloses ASCII base-85 encoded data. Text inside ( ) is handled as follows: balanced pairs of parentheses inside the string need no special treatment; an unbalanced ( or ) must be escaped with \; the other escape sequences must also be processed.
E. Names. A token that consists of regular characters and cannot be interpreted as a number is treated as a name. All characters except white space and delimiters can appear in a name. A leading / introduces a literal name but is not part of the name itself.
F. Arrays. [ and ] delimit an array: [ starts collecting elements and ] constructs an array object containing them. Both [ and ] are operators.
G. Procedures. { and } delimit an executable array.
H. Dictionaries. << and >> construct a dictionary. The process is almost the same as for [ ], except that the enclosed objects are pairs of the form key1 value1 key2 value2 ... keyn valuen. The interpreter pushes the constructed dictionary onto the operand stack.
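The scanning rules above can be sketched as a small Python tokenizer. This is a simplified illustration, not a complete scanner: the function names and the (kind, value) token shape are assumptions, the delimiters [ ] { } << >> are all emitted as plain "delim" tokens, and ASCII base-85 strings, octal escapes, and line continuations inside strings are left out.

```python
import re

WHITESPACE = " \t\r\n\f\0"
NON_REGULAR = WHITESPACE + "()<>[]{}/%"
INT_RE = re.compile(r"[+-]?\d+")
REAL_RE = re.compile(r"[+-]?(\d+\.\d*|\.\d+|\d+)([eE][+-]?\d+)?")
RADIX_RE = re.compile(r"(\d+)#([0-9A-Za-z]+)")
ESCAPES = {"n": "\n", "r": "\r", "t": "\t", "b": "\b", "f": "\f"}


def classify(word):
    """Turn a run of regular characters into a number or an executable name."""
    if INT_RE.fullmatch(word):
        return ("int", int(word))
    if REAL_RE.fullmatch(word):
        return ("real", float(word))
    m = RADIX_RE.fullmatch(word)
    if m and 2 <= int(m.group(1)) <= 36:
        try:
            return ("int", int(m.group(2), int(m.group(1))))
        except ValueError:
            pass  # digits not valid in this base: treat the token as a name
    return ("name", word)


def read_string(text, i):
    """Read a ( ) string; i points just past the opening parenthesis."""
    out, depth = [], 1
    while i < len(text):
        c = text[i]
        if c == "\\" and i + 1 < len(text):
            nxt = text[i + 1]
            out.append(ESCAPES.get(nxt, nxt))  # \( \) \\ map to themselves
            i += 2
            continue
        if c == "(":
            depth += 1
        elif c == ")":
            depth -= 1
            if depth == 0:
                return "".join(out), i + 1
        out.append(c)
        i += 1
    raise SyntaxError("unterminated string")


def scan(text):
    """Return a list of (kind, value) tokens for ASCII-encoded input."""
    tokens, i, n = [], 0, len(text)
    while i < n:
        c = text[i]
        if c in WHITESPACE:
            i += 1
        elif c == "%":  # a comment runs to the end of the line
            while i < n and text[i] not in "\r\n":
                i += 1
        elif c == "(":
            value, i = read_string(text, i + 1)
            tokens.append(("string", value))
        elif text.startswith("<<", i) or text.startswith(">>", i):
            tokens.append(("delim", text[i:i + 2]))
            i += 2
        elif c == "<":  # hexadecimal string; white space inside is ignored
            end = text.index(">", i)
            digits = "".join(text[i + 1:end].split())
            if len(digits) % 2:
                digits += "0"  # an odd final digit is padded with 0
            tokens.append(("string", bytes.fromhex(digits).decode("latin-1")))
            i = end + 1
        elif c in "[]{}":
            tokens.append(("delim", c))
            i += 1
        else:
            start = i + 1 if c == "/" else i
            j = start
            while j < n and text[j] not in NON_REGULAR:
                j += 1
            if c == "/":
                tokens.append(("literal_name", text[start:j]))
            elif j == i:
                raise SyntaxError("unexpected character " + repr(c))
            else:
                tokens.append(classify(text[i:j]))
            i = j
    return tokens
```

For example, scan("/average {add 2 div} def") yields a literal name, a "{" delimiter, the tokens add, 2, and div, a "}" delimiter, and the executable name def.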
3. Operand Stack, Execution Stack, Dictionary Stack, Graphics State Stack, Clipping Path Stack
These five stacks, together with the virtual memory and gstate, are the main parts of the PostScript execution environment.

A. Operand stack. It can hold any PostScript object. Because PostScript uses postfix notation, in which operands come before the operator, the interpreter first pushes the operands onto the operand stack; when it then encounters the operator, the operator takes its operands from the top of the stack. Intermediate results are also stored on the operand stack. The operand stack is directly controlled by the PS interpreter, and most PS operators push to it and pop from it directly.

B. Execution stack. It holds executable objects. When the interpreter suspends the current executable object in order to execute another one, it pushes the new object onto the execution stack and executes it; when that object finishes, it is popped and the suspended object resumes. The execution stack is therefore equivalent to the call stack of the PostScript program. It is controlled by the PS interpreter; a PS program can read it but cannot write to it.

C. Dictionary stack. It holds only dictionary objects. From the bottom up, the first three dictionaries are systemdict, globaldict, and userdict. Systemdict is a read-only dictionary that defines all the standard operators (more than 400). Globaldict and userdict hold the variables and procedures that the user defines in virtual memory, and work together with the save operator. The dictionary stack is directly controlled by the PS interpreter, but it can hold only dictionary objects, and the bottom three dictionaries (systemdict, globaldict, userdict) cannot be popped. Only begin, end, and cleardictstack can change the dictionary stack.

Typical example:

/average {add 2 div} def
40 60 average

1. Push the name average and the procedure {add 2 div} onto the operand stack.
2. When def is encountered, look it up in the dictionary stack. If this operator has not been redefined, it pops the two elements at the top of the operand stack (average and {add 2 div}) and adds an entry to the current dictionary whose key is average and whose value is {add 2 div}.
3. Push 40 and 60 onto the operand stack in sequence.
4. When average is encountered, look it up in the dictionary stack and execute the associated value, here {add 2 div}. This step first executes add, which pops two numbers (40 and 60) from the top of the operand stack, adds them, and pushes the result 100 onto the operand stack. Then 2 is pushed onto the operand stack. When div is encountered, it is looked up in the dictionary stack; if it has not been redefined, it pops the two numbers at the top of the operand stack, divides them, and pushes the result 50 onto the operand stack.

D. Graphics state stack. It maintains a set of parameters that control how text and images are rendered.

Device-independent parameters:
1. CTM. The current transformation matrix, which maps user space coordinates to device space coordinates.
2. Position. The coordinates of the current point in user space.
3. Path. Built up by the path construction operators and used implicitly as an operand by operators such as fill, clip, and stroke.
4. Clipping path. Defines the current clipping region.
5. Clipping path stack. Holds the clipping paths that have been saved by clipsave and not yet released by cliprestore.
6. Color space. Determines how color values are interpreted, for example DeviceGray.
7. Color. The color used by the painting operators. It depends on the color space and generally has 1 to 4 components.
8. Font. A dictionary that represents the set of glyph shapes of the current font.
9. Line width, line cap, line join, miter limit. Line-related properties.
10. Dash pattern. Describes the dash pattern used when a line is stroked.

Device-dependent parameters:
1. Color rendering. A set of parameters describing how CIE-based colors are converted to device colors.
2. Overprint. Whether painting in one set of colorants erases the other colorants in the same area or leaves them unchanged.
3. Black generation. Computes the black component when RGB is converted to CMYK.
4. Undercolor removal. Computes how much to reduce the cyan, magenta, and yellow components to compensate for the added black.

E. Clipping path stack. A stack controlled by clipsave and cliprestore that stores clipping paths. Since this stack is part of the graphics state, grestore and setgstate replace the entire clipping path stack.
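The /average example from the dictionary stack discussion can be traced with a minimal Python sketch of an operand stack and a dictionary stack. All class and method names here are hypothetical, only add, div, and def are implemented, the program is given as already-scanned objects, and Python recursion stands in for the execution stack.

```python
class Literal:
    """A literal name such as /average (hypothetical representation)."""

    def __init__(self, name):
        self.name = name


class Proc(list):
    """An executable array such as {add 2 div}."""


class MiniInterpreter:
    def __init__(self):
        self.operand_stack = []
        systemdict = {"add": self.op_add, "div": self.op_div, "def": self.op_def}
        globaldict, userdict = {}, {}
        # Bottom of the dictionary stack first; lookups go from the top down.
        self.dict_stack = [systemdict, globaldict, userdict]

    def lookup(self, name):
        for d in reversed(self.dict_stack):
            if name in d:
                return d[name]
        raise NameError("undefined: " + name)

    def op_add(self):
        b = self.operand_stack.pop()
        a = self.operand_stack.pop()
        self.operand_stack.append(a + b)

    def op_div(self):
        b = self.operand_stack.pop()
        a = self.operand_stack.pop()
        self.operand_stack.append(a / b)

    def op_def(self):
        value = self.operand_stack.pop()
        key = self.operand_stack.pop()
        self.dict_stack[-1][key.name] = value  # define in the current dictionary

    def execute(self, obj):
        if not isinstance(obj, str):
            # Numbers, literal names, and procedures are pushed, not executed.
            self.operand_stack.append(obj)
            return
        value = self.lookup(obj)  # executable name: search the dictionary stack
        if isinstance(value, Proc):
            for element in value:
                self.execute(element)
        elif callable(value):
            value()  # built-in operator
        else:
            self.operand_stack.append(value)

    def run(self, program):
        for obj in program:
            self.execute(obj)


interp = MiniInterpreter()
interp.run([Literal("average"), Proc(["add", 2, "div"]), "def", 40, 60, "average"])
print(interp.operand_stack)  # [50.0]
```

Because lookup searches from the top of the dictionary stack, an entry named add placed in userdict would shadow the one in systemdict, which is how a script can redefine a standard operator.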
4. Interpreter

Once the scanner has produced an object together with its type and attributes, the interpreter is responsible for interpreting it. This mainly involves the following:

A. Comments. When % appears outside a string, all characters from % to the line break (or end of file) are discarded and treated as a single white-space character.
B. Numbers. Every number that is scanned is pushed onto the operand stack.
C. Names. There are three kinds of names: literal names with a / prefix, executable names with no prefix, and immediately evaluated names with a // prefix. A literal name is pushed onto the operand stack. An executable name is looked up in the dictionary stack and its associated value is executed. For an immediately evaluated name, the dictionary stack is searched at once and the value that is found replaces the name.
D. Dictionaries. When << is executed, a mark object is pushed onto the operand stack. When >> is executed, the dictionary is created from the objects above the mark and pushed onto the operand stack.
E. Arrays and packed arrays. When the interpreter executes an executable array, it executes the elements of the array in sequence.
F. Procedures. When the interpreter encounters a procedure object directly, it does not execute it immediately. Instead, the procedure is pushed onto the operand stack and is not executed until it is explicitly invoked.
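The mark-based dictionary construction in item D can be sketched in Python as below. The function names are hypothetical, and the sketch assumes (following the PostScript Language Reference Manual) that the finished dictionary is left on the operand stack, that a missing mark is reported as unmatchedmark, and that an odd number of objects above the mark is reported as rangecheck.

```python
class Mark:
    """Marker object pushed by << to delimit the key/value pairs."""


MARK = Mark()


def begin_dict(operand_stack):
    """Behaviour of <<: push a mark onto the operand stack."""
    operand_stack.append(MARK)


def end_dict(operand_stack):
    """Behaviour of >>: pop down to the mark and build a dictionary."""
    items = []
    while operand_stack:
        obj = operand_stack.pop()
        if obj is MARK:
            break
        items.append(obj)
    else:
        # The stack ran out before a mark was found.
        raise ValueError("unmatchedmark")
    items.reverse()  # restore the order key1 value1 key2 value2 ...
    if len(items) % 2:
        raise ValueError("rangecheck")
    operand_stack.append(dict(zip(items[0::2], items[1::2])))


stack = [1]
begin_dict(stack)
stack.extend(["key1", 10, "key2", 20])
end_dict(stack)
print(stack)  # [1, {'key1': 10, 'key2': 20}]
```

Objects that were on the operand stack below the mark are left untouched.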
5. Virtual Memory

Virtual memory (VM) is a concept specific to PostScript. No physical implementation is prescribed, but any implementation must follow the principles below. Virtual memory is described in four parts:

A. Local virtual memory. Local virtual memory is a stack-like storage pool. Memory allocated in it and variables modified in it are subject to save and restore. In general, local virtual memory mainly holds parameters and variables used by the current page, and their scope is the current page (between save and restore).
B. Global virtual memory. Global virtual memory is a storage pool that is not managed in this stack-like way. Memory allocated in it and variables modified in it are not affected by save and restore. It usually holds document-level parameters, so that every page starts from the same initial state.
C. Interaction between local and global. This covers switching between the local and global states, and data operations that cross between local and global storage.
D. Save and restore behavior. Only local virtual memory is affected by the save/restore mechanism. When a restore occurs, all memory allocated in local virtual memory since the last save is released, all files opened since then are closed, and the gstate is restored.
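The difference between local and global virtual memory under save and restore can be sketched in Python as follows. This is a deliberately naive model: the VirtualMemory class is hypothetical, the "save object" is just an index into a list of deep-copied snapshots, and the exemption for strings, the closing of files, and the implicit grestoreall are all left out.

```python
import copy


class VirtualMemory:
    """Toy model: local VM is snapshotted by save, global VM is not."""

    def __init__(self):
        self.local = {}      # subject to save/restore
        self.global_ = {}    # never touched by save/restore
        self._snapshots = []

    def save(self):
        """Snapshot local VM and return a save object (here, an index)."""
        self._snapshots.append(copy.deepcopy(self.local))
        return len(self._snapshots) - 1

    def restore(self, save_obj):
        """Return local VM to the snapshot; newer nested saves become invalid."""
        if not 0 <= save_obj < len(self._snapshots):
            raise ValueError("invalidrestore")
        self.local = self._snapshots[save_obj]
        del self._snapshots[save_obj:]


vm = VirtualMemory()
vm.local["page_items"] = [1, 2]
vm.global_["doc_setting"] = "a"
page = vm.save()
vm.local["page_items"].append(3)   # undone by restore
vm.local["scratch"] = 5            # discarded by restore
vm.global_["doc_setting"] = "b"    # survives restore
vm.restore(page)
print(vm.local)    # {'page_items': [1, 2]}
print(vm.global_)  # {'doc_setting': 'b'}
```

A real implementation would track allocations rather than copy the whole pool, but the observable behavior for the page-level use case described above is the same.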