Section 1 Image File Format
To process digital images on a computer, we first need a clear understanding of image file formats. As mentioned earlier, natural images exist as analog signals, so they must be digitized before a computer can process them. For example, the signal from a CCD camera generally goes through analog-to-digital conversion before it reaches the computer. This task is usually performed by an image acquisition card, whose output is generally a raw image; to produce an image file, that raw data must then be written out according to a file format. With the development of technology, digital cameras and digital camcorders have entered ordinary homes, and we can use these devices as input devices for an image processing system, providing the source material for subsequent processing. Whatever the device, it delivers its data in some image file format; common formats include BMP, JPEG, and GIF. Only with a clear understanding of these formats can we go on to develop software that processes the images.
Before describing image file formats, let us make a simple classification of images. All images have color; the monochrome image is simply the most basic case. It generally consists of black areas and white areas, so one bit is enough to represent a pixel: "1" for black and "0" for white, although the reverse convention can also be used. Such an image is called a binary image. We can also use 8 bits (one byte) per pixel, which divides the range from black to white into 256 levels. The value of the byte is the gray level, or brightness, of the corresponding pixel: "0" is black, "255" is white, and the closer the value is to "0" the darker the pixel, while the closer it is to "255" the lighter the pixel. This type of image is called a grayscale image. Binary images and grayscale images are collectively referred to as black-and-white images, in contrast to color images. Color images are more complex to represent. Common color models include RGB, CMYK, and HSI; here we generally use only RGB, where R stands for red, G for green, and B for blue, collectively called the three primary colors. Different combinations of the three produce the various colors found in reality. Each pixel of a color image therefore requires a group of three samples, each representing one primary component of the pixel.
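The relationship between a grayscale image and a binary image can be illustrated with a small sketch. It thresholds 8-bit gray values into one bit per pixel, using the convention from the text ("1" is black). The function name gray_to_binary, the threshold parameter, and the choice to pack the leftmost pixel into the most significant bit are assumptions made for illustration.

```c
#include <stdint.h>
#include <stddef.h>
#include <string.h>

/* Threshold 8-bit grayscale pixels into a packed binary image.
 * Convention from the text: bit 1 = black, bit 0 = white.
 * Pixels are packed 8 per byte, leftmost pixel in the most significant bit
 * (an assumption for this sketch).
 * `out` must hold at least (n + 7) / 8 bytes. */
void gray_to_binary(const uint8_t *gray, size_t n,
                    uint8_t threshold, uint8_t *out)
{
    memset(out, 0, (n + 7) / 8);
    for (size_t i = 0; i < n; i++) {
        if (gray[i] < threshold)          /* dark pixel -> black -> bit 1 */
            out[i / 8] |= (uint8_t)(0x80u >> (i % 8));
    }
}
```

With a threshold of 128, eight pixels alternating between 0 and 255 pack into the single byte 0xAA.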
Of all the existing image file formats, we introduce the BMP format here, and specifically BMP files whose image data is not compressed. Digital image processing mainly operates on each pixel of the image, and the pixel values in an uncompressed BMP file correspond exactly to the digital image to be processed, so this format is the most convenient one to work with. Remember that compressed images, such as JPEG and GIF files, cannot be processed directly: the file must first be decompressed, which involves some fairly complicated algorithms. In later sections we will discuss how to convert these other file formats to BMP, after which the uncompressed BMP data can be used for subsequent processing. The compression algorithms behind JPEG, GIF, and similar formats require some knowledge of information theory, and a full treatment would fill a book. Due to space limitations we give only a general explanation here; interested readers can refer to the relevant books and materials.
I. BMP File Structure
1. BMP File Composition
A BMP file consists of four parts: the file header, the bitmap information header, the color information, and the image data. The file header contains the file type, the file size, and the offset of the image data from the start of the file. The bitmap information header contains the dimensions of the image, the number of bits used to represent a pixel, whether the image is compressed, and the number of colors the image uses. The color information is the color table used by the image; when displaying the image, you use this table to build a palette. If the image is true color, with each pixel represented by 24 bits, this block is absent from the file and no palette handling is needed. The data block holds the pixel values of the image. Note the storage order: pixels within a row are stored from left to right, and rows are stored from bottom to top. In other words, the last row of the image is stored first and the first row is stored last. One more detail requires attention: if the number of bytes occupied by a row's pixel values is a multiple of 4, the row is stored as is; otherwise it is padded with zero bytes at the end to bring it up to a multiple of 4.
2. BMP File Header
The BMP file header holds the type and size of the file and the starting position of the bitmap data. It is defined as follows:
typedef struct tagBITMAPFILEHEADER
{
    WORD bfType; // type of the bitmap file; must be "BM"
    DWORD bfSize; // size of the bitmap file, in bytes
    WORD bfReserved1; // reserved; must be 0
    WORD bfReserved2; // reserved; must be 0
    DWORD bfOffBits; // starting position of the bitmap data, as a byte offset from the start of the file header
} BITMAPFILEHEADER;
The structure occupies 14 bytes.
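As a minimal sketch of how the 14-byte file header might be read, the code below decodes it from a raw byte buffer field by field. Reading fields individually, rather than casting the buffer to the structure, avoids depending on the compiler's structure padding. The names BmpFileHeader, rd16, rd32, and parse_file_header are hypothetical and are not part of the Windows API.

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical mirror of BITMAPFILEHEADER using fixed-width types. */
typedef struct {
    uint16_t bfType;
    uint32_t bfSize;
    uint16_t bfReserved1;
    uint16_t bfReserved2;
    uint32_t bfOffBits;
} BmpFileHeader;

/* BMP fields are little-endian; assemble them byte by byte. */
uint16_t rd16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

uint32_t rd32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns 1 on success, 0 if the buffer is shorter than 14 bytes
 * or does not start with the "BM" signature. */
int parse_file_header(const uint8_t *buf, size_t len, BmpFileHeader *out)
{
    if (len < 14)
        return 0;
    out->bfType      = rd16(buf);        /* 'B' 'M' reads as 0x4D42 */
    out->bfSize      = rd32(buf + 2);
    out->bfReserved1 = rd16(buf + 6);
    out->bfReserved2 = rd16(buf + 8);
    out->bfOffBits   = rd32(buf + 10);
    return out->bfType == 0x4D42;
}
```

A caller would typically read the first 14 bytes of the file into a buffer, call parse_file_header, and then seek to bfOffBits to reach the pixel data.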
3. Bitmap Information Header
The BMP bitmap information header describes the dimensions of the bitmap and related properties. It is defined as follows:
typedef struct tagBITMAPINFOHEADER {
    DWORD biSize; // number of bytes occupied by this structure
    LONG biWidth; // width of the bitmap, in pixels
    LONG biHeight; // height of the bitmap, in pixels
    WORD biPlanes; // number of planes for the target device; must be 1
    WORD biBitCount; // number of bits per pixel; must be 1 (two-color), 4 (16 colors), 8 (256 colors), or 24 (true color)
    DWORD biCompression; // compression type; must be 0 (uncompressed), 1 (BI_RLE8), or 2 (BI_RLE4)
    DWORD biSizeImage; // size of the bitmap data, in bytes
    LONG biXPelsPerMeter; // horizontal resolution, in pixels per meter
    LONG biYPelsPerMeter; // vertical resolution, in pixels per meter
    DWORD biClrUsed; // number of colors in the color table actually used by the bitmap
    DWORD biClrImportant; // number of colors that are important for displaying the bitmap
} BITMAPINFOHEADER;
The structure occupies 40 bytes.
Note: In the BMP format, the data of monochrome and true color images is never compressed, however large the image is. When a bitmap is compressed, a 16-color image generally uses the RLE4 algorithm and a 256-color image uses the RLE8 algorithm.
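The constraints stated in the field comments and in the note above can be collected into a small consistency check. This is only a sketch restricted to the cases this section describes (bit counts of 1, 4, 8, and 24, and compression types 0, 1, and 2). The type BmpInfoHeader and the function info_header_is_valid are hypothetical names, and the structure mirrors only the fields the check needs.

```c
#include <stdint.h>

/* Hypothetical fixed-width mirror of the BITMAPINFOHEADER fields used here. */
typedef struct {
    uint32_t biSize;
    int32_t  biWidth;
    int32_t  biHeight;
    uint16_t biPlanes;
    uint16_t biBitCount;
    uint32_t biCompression;   /* 0 = none, 1 = RLE8, 2 = RLE4 */
} BmpInfoHeader;

/* Returns 1 if the fields agree with the rules stated in the text:
 * a 40-byte header, one plane, and compression only where allowed. */
int info_header_is_valid(const BmpInfoHeader *h)
{
    if (h->biSize != 40 || h->biPlanes != 1)
        return 0;
    if (h->biWidth <= 0 || h->biHeight == 0)
        return 0;
    switch (h->biBitCount) {
    case 1:
    case 24:                  /* monochrome and true color: never compressed */
        return h->biCompression == 0;
    case 4:                   /* 16 colors: uncompressed or RLE4 */
        return h->biCompression == 0 || h->biCompression == 2;
    case 8:                   /* 256 colors: uncompressed or RLE8 */
        return h->biCompression == 0 || h->biCompression == 1;
    default:
        return 0;
    }
}
```

Rejecting a file early with a check like this keeps later pixel-reading code from having to handle combinations it was never written for.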
4. Color Table
The color table describes the colors used in a bitmap. It has a number of entries, each of which is an RGBQUAD structure that defines one color. The RGBQUAD structure is defined as follows:
typedef struct tagRGBQUAD {
    BYTE rgbBlue; // intensity of blue (value range: 0-255)
    BYTE rgbGreen; // intensity of green (value range: 0-255)
    BYTE rgbRed; // intensity of red (value range: 0-255)
    BYTE rgbReserved; // reserved; must be 0
} RGBQUAD;
The number of RGBQUAD entries in the color table is determined by the biBitCount field of BITMAPINFOHEADER. When biBitCount is 1, 4, or 8, the table has 2, 16, or 256 entries respectively. When biBitCount is 24, the image is true color: the color of each pixel is stored directly in three bytes, holding the blue, green, and red values in that order, and the file has no color table. The bitmap information header and the color table together form the bitmap information. The BITMAPINFO structure is defined as follows:
typedef struct tagBITMAPINFO {
    BITMAPINFOHEADER bmiHeader; // bitmap information header
    RGBQUAD bmiColors[1]; // color table
} BITMAPINFO;
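The rule relating the bit count to the size of the color table can be written as a short helper, along with the pixel-data offset that follows from it when nothing else sits between the headers and the pixels. The function names are hypothetical. One assumption goes beyond the text: a nonzero biClrUsed is taken to override the default entry count, which is how that field is documented to behave.

```c
#include <stdint.h>

/* Number of RGBQUAD entries implied by the header fields.
 * Returns 0 for true color images, which carry no color table here.
 * Assumption: a nonzero biClrUsed overrides the default count. */
uint32_t color_table_entries(uint16_t biBitCount, uint32_t biClrUsed)
{
    if (biBitCount > 8)
        return 0;                         /* true color: no color table */
    if (biClrUsed != 0)
        return biClrUsed;
    return 1u << biBitCount;              /* 2, 16, or 256 entries */
}

/* Expected bfOffBits: 14-byte file header + 40-byte information header
 * + 4 bytes per color table entry. */
uint32_t expected_pixel_offset(uint16_t biBitCount, uint32_t biClrUsed)
{
    return 14u + 40u + 4u * color_table_entries(biBitCount, biClrUsed);
}
```

For a 256-color image the pixel data would therefore start at byte 1078, and for a 24-bit image at byte 54.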
Note: The RGBQUAD structure includes a reserved field, rgbReserved, which does not represent any color and must be set to 0. The color components in an RGBQUAD are stored in the opposite order from the usual red, green, blue sequence: blue comes first and red comes last. If the color of a pixel in a bitmap is described by the bytes "00, 00, FF, 00", the pixel is red rather than blue.
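The byte order of a color table entry is easy to get wrong, so here is a small sketch that decodes the four bytes of one entry and repacks them with red first. The type Quad and both function names are hypothetical; the field order follows the RGBQUAD definition above.

```c
#include <stdint.h>

/* Hypothetical fixed-width mirror of RGBQUAD; field order matches the file. */
typedef struct {
    uint8_t rgbBlue;
    uint8_t rgbGreen;
    uint8_t rgbRed;
    uint8_t rgbReserved;
} Quad;

/* Build a Quad from four consecutive bytes of a color table entry. */
Quad quad_from_bytes(const uint8_t b[4])
{
    Quad q = { b[0], b[1], b[2], b[3] };
    return q;
}

/* Pack as 0xRRGGBB for code that expects red in the high byte. */
uint32_t quad_to_rgb(Quad q)
{
    return ((uint32_t)q.rgbRed << 16) |
           ((uint32_t)q.rgbGreen << 8) |
            (uint32_t)q.rgbBlue;
}
```

Feeding in the bytes 00 00 FF 00 yields 0xFF0000, that is, pure red.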
5. Bitmap Data
The bitmap data records either the value of each pixel or, for images with a color table, the index of each pixel's color in that table. Pixels are recorded from left to right within a scan line, and scan lines are recorded from bottom to top; this layout is called a bottom-up bitmap. There is also a top-down bitmap, which records scan lines from top to bottom; bitmaps of this kind cannot be compressed. The storage taken by a pixel depends on biBitCount: when biBitCount = 1, 8 pixels share 1 byte; when biBitCount = 4, 2 pixels share 1 byte; when biBitCount = 8, 1 pixel occupies 1 byte; when biBitCount = 24, 1 pixel occupies 3 bytes and the image is true color. When the image is not true color, the file contains a color table and the bitmap data holds each pixel's index into that table. When the image is true color, each pixel uses three bytes, one each for the blue, green, and red components (stored in that order), and the file has no color table. As mentioned above, Windows requires the number of bytes occupied by a scan line to be a multiple of 4 (that is, a whole number of 32-bit units), with any shortfall filled with zeros. The number of bytes occupied by a scan line is calculated as follows:
DataSizePerLine = (biWidth * biBitCount + 31) / 32 * 4; // bytes per scan line, rounded up to a multiple of 4 (integer division)
The size of the bitmap data (without compression) is then:
DataSize = DataSizePerLine * biHeight;
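The padding rule (each scan line rounded up to a multiple of 4 bytes) and the bottom-up row order can be combined into a few helpers. This is a sketch for uncompressed bottom-up bitmaps only; the function names are hypothetical, and the height argument is assumed to be the positive number of rows.

```c
#include <stdint.h>

/* Bytes per stored scan line, rounded up to a multiple of 4. */
uint32_t row_stride(uint32_t biWidth, uint16_t biBitCount)
{
    return ((biWidth * biBitCount + 31u) / 32u) * 4u;
}

/* Size in bytes of the uncompressed pixel data. */
uint32_t image_data_size(uint32_t biWidth, uint32_t height,
                         uint16_t biBitCount)
{
    return row_stride(biWidth, biBitCount) * height;
}

/* Byte offset, within the pixel data, of image row y, where y = 0 is
 * the top row. In a bottom-up bitmap the bottom row is stored first. */
uint32_t row_offset_bottom_up(uint32_t y, uint32_t biWidth,
                              uint32_t height, uint16_t biBitCount)
{
    return (height - 1u - y) * row_stride(biWidth, biBitCount);
}
```

For example, a 24-bit image 3 pixels wide needs 9 bytes of pixel values per row, so each stored row takes 12 bytes; with 2 rows the pixel data is 24 bytes, and the top row begins at offset 12.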
That completes the description of the BMP file format. With these structures understood, you can read and write BMP image files correctly.
II. GIF Image File Format
GIF stands for Graphics Interchange Format. As the name suggests, the format was designed for transmitting images over a network. GIF files do not support 24-bit true color; they can store only images with up to 256 colors or gray levels, and they cannot store image data in the CMY or HSI models. In addition, the data blocks of a GIF file generally have no fixed length and no fixed storage order, so the first byte of each block serves as a flag that identifies the block and makes it easy to find. Finally, note that the image data in a GIF file can be arranged in one of two orders: sequential or interlaced. The interlaced arrangement suits network transmission, because it lets the user see the outline of the image before all of the image data has arrived.
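The non-sequential row arrangement can be made concrete with a short sketch. The four-pass scheme used here (every 8th row from row 0, every 8th row from row 4, every 4th row from row 2, then every 2nd row from row 1) comes from the GIF specification rather than from this text, and the function name gif_interlace_order is hypothetical.

```c
#include <stddef.h>

/* Fill order[i] with the image row held by the i-th stored row of an
 * interlaced GIF image. `order` must have room for `height` entries. */
void gif_interlace_order(size_t height, size_t *order)
{
    static const size_t start[4] = {0, 4, 2, 1};   /* first row of each pass */
    static const size_t step[4]  = {8, 8, 4, 2};   /* row spacing of each pass */
    size_t i = 0;

    for (int pass = 0; pass < 4; pass++) {
        for (size_t row = start[pass]; row < height; row += step[pass])
            order[i++] = row;
    }
}
```

For an image 8 rows high, the stored rows correspond to image rows 0, 4, 2, 6, 1, 3, 5, 7, which is why a coarse version of the whole picture appears after only a fraction of the data has been received.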
The GIF format exists in two versions, 87a and 89a. A version 87a file consists of five parts, which appear in this order: the file header block, the logical screen descriptor block, the optional global color palette block, the image data blocks, and a final block that marks the end of the file and is always set to 3Bh. The first two parts are described by the GIF file header structure:
Gifheader: {
    DB signature[6]; // six bytes identifying the file as GIF: the first three characters must be "GIF" and the last three give the version, "87a" or "89a"
    DW screenwidth; // two bytes: width of the image, in pixels
    DW screendepth; // two bytes: height of the image, in pixels
    DB globalflagbyte; // the bits of this byte describe the global color palette
    DB backgroundcolor; // palette index of the image's background color
    DB aspectratio; // aspect ratio of the image
}
Here DB denotes a one-byte field and DW a two-byte field.
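A minimal sketch of decoding these 13 header bytes follows. The type GifInfo and the function parse_gif_header are hypothetical names. The interpretation of the flag byte (bit 7 signals a global color palette, and the low three bits n give a palette of 2^(n+1) entries) is taken from the GIF specification, not from the text above.

```c
#include <stdint.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical container for the decoded header fields. */
typedef struct {
    char     version[4];            /* "87a" or "89a", NUL-terminated */
    uint16_t width;
    uint16_t height;
    int      has_global_table;
    unsigned global_table_entries;
    uint8_t  background_index;
    uint8_t  aspect_ratio;
} GifInfo;

/* Returns 1 on success, 0 if the buffer is shorter than 13 bytes or the
 * signature is not "GIF87a" or "GIF89a". Multi-byte fields are little-endian. */
int parse_gif_header(const uint8_t *buf, size_t len, GifInfo *out)
{
    if (len < 13 || memcmp(buf, "GIF", 3) != 0)
        return 0;
    if (memcmp(buf + 3, "87a", 3) != 0 && memcmp(buf + 3, "89a", 3) != 0)
        return 0;

    memcpy(out->version, buf + 3, 3);
    out->version[3] = '\0';
    out->width  = (uint16_t)(buf[6] | (buf[7] << 8));
    out->height = (uint16_t)(buf[8] | (buf[9] << 8));
    out->has_global_table = (buf[10] & 0x80) != 0;
    out->global_table_entries =
        out->has_global_table ? (2u << (buf[10] & 0x07)) : 0u;
    out->background_index = buf[11];
    out->aspect_ratio     = buf[12];
    return 1;
}
```

Because every field is read from an explicit byte position, the sketch does not rely on how a compiler would lay out an equivalent C structure.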
A GIF file can have two kinds of color palette, a global color palette and a local color palette. Because the format allows a single file to store multiple images, the global color palette applies to every image in the file, while a local color palette applies only to the one image it accompanies. The data area for each image is generally divided into four parts: the image identifier, the local color palette data, the image data produced by the compression algorithm, and the terminator.
A version 89a file contains seven parts: the file header, the global color palette data, the image data area, and four extension data areas, which mainly give the decoding program hints on how to handle the images.
III. JPEG Image Files
JPEG is short for Joint Photographic Experts Group. As a technology, JPEG is mainly used for the standardized coding of digital images, and it relies chiefly on lossy compression. It is far more complex than the GIF and BMP formats and cannot be explained clearly in just a few pages. Fortunately, a JPEG file can be converted to BMP by other means. Readers should know that encoding a JPEG file generally involves four steps: color conversion, DCT transformation, quantization, and encoding.
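Of the four steps, color conversion is the simplest to illustrate. The sketch below converts one RGB pixel to the luminance and chrominance representation (Y, Cb, Cr) using the coefficients commonly given for JFIF files; the text does not state which conversion is meant, so treat the constants and the function names clamp_u8 and rgb_to_ycbcr as assumptions. The remaining steps (DCT, quantization, entropy coding) are not shown.

```c
#include <stdint.h>

/* Round to the nearest integer and clamp to the 0..255 range. */
uint8_t clamp_u8(double v)
{
    if (v < 0.0)
        return 0;
    if (v > 255.0)
        return 255;
    return (uint8_t)(v + 0.5);
}

/* Convert one RGB pixel to Y (luminance) and Cb, Cr (chrominance),
 * using the JFIF-style full-range coefficients (an assumption here). */
void rgb_to_ycbcr(uint8_t r, uint8_t g, uint8_t b,
                  uint8_t *y, uint8_t *cb, uint8_t *cr)
{
    *y  = clamp_u8(  0.0 + 0.299    * r + 0.587    * g + 0.114    * b);
    *cb = clamp_u8(128.0 - 0.168736 * r - 0.331264 * g + 0.5      * b);
    *cr = clamp_u8(128.0 + 0.5      * r - 0.418688 * g - 0.081312 * b);
}
```

White (255, 255, 255) maps to Y = 255 with both chrominance values at the neutral 128, which shows how the conversion separates brightness from color so that the later steps can compress the color information more aggressively.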