Using the Standard Template Library (STL) for File Comparison (repost)

Source: Internet
Author: User

Author: winter

Introduction
This article discusses how to use the Standard Template Library (STL), class templates, function templates, and other programming techniques to solve a practical problem. It covers STL sets and vectors, function templates, class templates, const correctness, error handling, and STL file I/O.

To follow this article, you should be familiar with C++, class templates, and function templates. The article brings together a good deal of related material and walks through it step by step.

The article takes you through the problem, the design, and the solution. I hope you enjoy it.

Problem:

Two files each contain many lines of text. We need to build a program that identifies the differences between the two and displays the lines that differ. The program must be a reusable component, that is, other programs can use it without modification.

Design:

Assume that the two files are large (each contains thousands of lines). The design is as follows:
Read each file into a block of memory,
Compare the file contents held in memory,
Put the differences into a new, third block of memory.

The design also allows for the same element appearing at a different position in each file, that is, the same text need not be on the same line in both files. This means each item must be searched for in the other file's memory block, and items that are not found are stored in the third block.

Considering the reusability of the program, we use the generic programming technology to design it so that the solution can adapt to the changes in storage media.

When the files are large (each file contains thousands of lines), it may be impractical to hold each file entirely in memory, which also makes the implementation harder.

Implementation details:

The design could be implemented with hand-built containers, such as arrays or queues of character arrays. However, this reduces the readability of the program and the reusability of the component.

The solution in this article uses Standard Template Library (STL) containers to manage the memory blocks, and STL stream classes to read the files into them. This keeps the program readable at the level of template containers.

To make the component reusable, C++ class templates and function templates are required. If you are not familiar with templates or want to review them, refer to the link at the end of the article.

Solutions and guidelines

The programs you write are for both end users and other programmers. They are written for programmers because someone may need to change your program, and that person will have to spend time understanding it. You may also need to modify the program yourself later, so improve its readability without reducing its efficiency, or at least add comments.

For example, consider the main() function:

#include <algorithm>
#include <iostream>
#include <string>
#include <vector>
using namespace std;

// populate_set_from_file(), container_differences() and printelement()
// are defined elsewhere in the program.

int main(int argc, char* argv[])
{
    // Confirm the correct number of parameters
    if (argc != 3)
    {
        cout << "comparefiles-copyright (c) Essam Ahmed 2000" << endl;
        cout << "this program compares the contents of two files and prints" << endl
             << "the differences between the files to the screen" << endl;
        cout << "Usage: comparefiles <file_name_1> <file_name_2>" << endl;
        return 1;
    }
    // Declare the containers to be used
    typedef vector<string> stringset;
    stringset s1, s2, s3;

    // Read the first file into the set
    populate_set_from_file(s1, argv[1]);
    cout << "Contents of Set 1" << endl;
    for_each(s1.begin(), s1.end(), printelement);

    // Read the second file into the set
    populate_set_from_file(s2, argv[2]);
    cout << endl << "Contents of Set 2" << endl;
    for_each(s2.begin(), s2.end(), printelement);

    // Compare the sets and store the differences in s3
    container_differences<stringset, string>(s1, s2, s3);

    // Display the result
    cout << endl << "difference is:" << endl;
    for_each(s3.begin(), s3.end(), printelement);

    return 0;
}

Here we will not discuss how the files are read and compared; those tasks are encapsulated in other functions. What matters here is the role each function plays. In this example, main() acts as the coordinator, and the other functions do the real work.

As you can see, functions such as populate_set_from_file() and container_differences() carry out most of the core tasks. for_each() is an STL algorithm that applies a function to every element in a range.

The key line in main() is:
typedef vector<string> stringset;

It defines a vector container type used to store string objects. If you are not familiar with vector, refer to the guide on vector at the end of the article. stringset names an STL container type that holds strings. The typedef makes it a reusable type name and keeps the code readable.

stringset s1, s2, s3;
This declares three containers of strings. The first two hold the contents of the two input files, and the third stores the strings that differ. In real code, the variable names should of course be more descriptive.

The populate_set_from_file() function reads the contents of a file into a container. It is a function template, so it works with different container types. It is defined as follows:
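The article does not show printelement(), so the following is only a sketch of how for_each() might be paired with such a function. The names ElementPrinter and render are hypothetical and exist only for this illustration; the function object writes to any output stream so the result is easy to inspect.

```cpp
#include <algorithm>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for the article's printelement(): a function object
// that writes one element per line to the stream it was constructed with.
struct ElementPrinter {
    std::ostream& out;
    explicit ElementPrinter(std::ostream& o) : out(o) {}
    void operator()(const std::string& element) const
    {
        out << element << '\n';
    }
};

// Hypothetical helper: applies the printer to every element with for_each
// and returns the text that was produced.
std::string render(const std::vector<std::string>& lines)
{
    std::ostringstream buffer;
    std::for_each(lines.begin(), lines.end(), ElementPrinter(buffer));
    return buffer.str();
}
```

Passing ElementPrinter(std::cout) to for_each() instead would print to the screen, which is presumably close to what printelement() does in the original program.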

// Requires <fstream>, <iostream>, <new> and <string>, plus using namespace std;
template <class T>
bool populate_set_from_file(T& s1, const char* file_name)
{
    ifstream file_in;
    string line_from_file;

    file_in.open(file_name);
    if (file_in.fail()) {
        cout << "error opening file ["
             << file_name << "]-Please check file name" << endl;
        return false;
    }
    try {
        // Note: a final line with no trailing newline sets eofbit,
        // so good() is false and that line is not added.
        getline(file_in, line_from_file);
        while (file_in.good())
        {
            addelementtoset(s1, line_from_file);
            getline(file_in, line_from_file);
        }
    }
    catch (bad_alloc& e)
    {
        cout << "error-caught exception:" << e.what() << endl;
        throw;  // rethrow the original exception to the caller
    }

    file_in.close();
    return true;
}

This function template reads a file line by line into whatever container type it is instantiated with. It opens the specified file, reads it one line at a time (each line ends at the newline character), and adds each line to the container (which can be any type the template supports). Each line is added with the addelementtoset function, which is also a function template.

Files are read with the standard file stream class ifstream, which supports basic file I/O and error handling. When a file operation fails, its fail() member function returns true. While reading proceeds normally, with no error and the end of the file not yet reached, the member function good() returns true.

getline() is a standard library function that reads one line of the file, stopping at the line terminator (the terminator itself is not stored in the string). Its parameters are the source stream and a string object. Note that it does not strip leading or trailing whitespace from the line it reads.

The rest is error handling, which is not ideal but is sufficient for this example. If the string held in line_from_file grows so long that memory for it cannot be allocated, a bad_alloc exception is thrown.

The file_name parameter is declared const, which marks it as read-only: the function cannot modify the characters it points to, and the compiler rejects any attempt to do so. Using const parameters may also give the compiler more freedom to optimize.

addelementtoset is also a function template. Working with containers is sometimes complicated, because some containers add members with insert() while others use push_back(). [Note: there are many kinds of containers. std::set, for example, uses insert(), while std::vector uses push_back().] map is more complex still: it adds members with insert(), but the argument is a pair<>. Although the function could be overloaded for each container, I chose to use a template. This is more flexible and can even be used with new or unknown containers.
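As a small illustration of these stream state functions, the sketch below checks whether a file can be opened and summarises the state flags of any input stream. The names can_open and stream_state are hypothetical and are not part of the article's program.

```cpp
#include <fstream>
#include <istream>
#include <sstream>   // only needed for the usage example below
#include <string>

// Hypothetical helper: true if the file could be opened for reading.
bool can_open(const char* file_name)
{
    std::ifstream file_in(file_name);
    return !file_in.fail();   // fail() is true when the open did not succeed
}

// Hypothetical helper: names the current state of an input stream.
std::string stream_state(const std::istream& in)
{
    if (in.good()) return "good";   // no error flag set, end of input not reached
    if (in.bad())  return "bad";    // unrecoverable I/O error
    if (in.fail()) return "fail";   // the last operation failed
    return "eof";                   // only the end-of-input flag is set
}
```

For example, after reading the single word from std::istringstream in("x"), the stream has reached the end of its input without failing, so stream_state(in) reports "eof"; one further read attempt changes it to "fail".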
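The behaviour of getline() can be seen in a short sketch. It uses the common idiom of testing the stream returned by getline() directly, which is a different loop shape from the one in the article; read_lines is a hypothetical name used only here.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: reads every line from a stream into a vector.
std::vector<std::string> read_lines(std::istream& in)
{
    std::vector<std::string> lines;
    std::string line;
    // getline() returns the stream, which tests as false once no further
    // line could be read. A last line without a newline is still returned.
    while (std::getline(in, line)) {
        lines.push_back(line);   // whitespace inside the line is kept as-is
    }
    return lines;
}
```

Given the input "  first  \n\nlast", this yields three strings: the first keeps its surrounding spaces, the second is empty, and the third is "last" even though no newline follows it.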
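The difference between the container interfaces can be sketched as follows. The function names make_set, make_vector and make_map are hypothetical; the member functions they call are the standard ones.

```cpp
#include <map>
#include <set>
#include <string>
#include <utility>
#include <vector>

// std::set adds members with insert(); a duplicate value is ignored.
std::set<std::string> make_set()
{
    std::set<std::string> s;
    s.insert("apple");
    s.insert("apple");
    return s;
}

// std::vector adds members with push_back(); duplicates are kept.
std::vector<std::string> make_vector()
{
    std::vector<std::string> v;
    v.push_back("apple");
    v.push_back("apple");
    return v;
}

// std::map also uses insert(), but the argument is a key/value pair.
// Inserting an existing key leaves the stored value unchanged.
std::map<std::string, int> make_map()
{
    std::map<std::string, int> m;
    m.insert(std::make_pair(std::string("apple"), 1));
    m.insert(std::make_pair(std::string("apple"), 2));
    return m;
}
```

A single generic "add" function has to hide exactly these differences, which is the job addelementtoset takes on.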

The code of the addelementtoset function is as follows:

template <class C, class V>
void addelementtoset(C& c, const V& v)
{
    c.insert(v);
}

The template's container type is C and the type of the value passed in is V (v is declared as a const reference and is read-only; remember that a copy of v is added to c). insert() adds v to c. This works well for containers that support this form of insert(), but it is a problem for some other containers.

For containers such as vector, which add members with push_back(), the template must be specialized. C++ templates are generic, but a specialization lets you supply an implementation tailored to one specific type. Template specialization is similar to function overloading.

The following code specializes addelementtoset for vector<string>:
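For comparison, here is a sketch of the overloading alternative that specialization resembles. This is not the approach the article takes; the name add_element is hypothetical, chosen so it does not clash with addelementtoset.

```cpp
#include <set>
#include <string>
#include <vector>

// Generic version: for containers that support insert(value), such as std::set.
template <class C, class V>
void add_element(C& c, const V& v)
{
    c.insert(v);
}

// Overload for any std::vector<T>. Overload resolution prefers it because it
// is the more specialized template. The value must have exactly type T, so
// pass std::string("x") rather than a string literal to a vector<string>.
template <class T>
void add_element(std::vector<T>& c, const T& v)
{
    c.push_back(v);
}
```

Unlike a full specialization for vector<string>, this one overload covers every element type, at the cost of stricter matching on the value argument.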

template <>
void addelementtoset<vector<string>, string>(vector<string>& c, const string& v)
{
    c.push_back(v);
}

Note the empty pair of angle brackets after the template keyword, which declares an explicit specialization. Multiple specializations can be declared.

The container_differences function template

After the files have been read into containers, the container_differences function compares them. It is also written as a template and can be used in other applications. It calls the addelementtoset function template to add the differing strings to the result container. The function returns no value; it reports its result by changing the contents of the result container. If the result container ends up with no members, the files being compared are the same. The following code shows the container_differences function:

template <class container_type, class value_type>
void container_differences(const container_type& container1,
                           const container_type& container2,
                           container_type& result_grp)
{
    container_type temp;
    typename container_type::const_iterator iter_pos_grp, iter_found_at;

    if (&container1 != &container2)
    {
        // Collect the elements of container1 that are missing from container2
        iter_pos_grp = container1.begin();
        while (iter_pos_grp != container1.end())
        {
            iter_found_at = find(container2.begin(), container2.end(), (*iter_pos_grp));
            if (iter_found_at == container2.end())
                addelementtoset(temp, static_cast<value_type>(*iter_pos_grp));
            ++iter_pos_grp;
        }
        // Collect the elements of container2 that are missing from container1
        iter_pos_grp = container2.begin();
        while (iter_pos_grp != container2.end())
        {
            iter_found_at = find(container1.begin(), container1.end(), (*iter_pos_grp));
            if (iter_found_at == container1.end())
                addelementtoset(temp, static_cast<value_type>(*iter_pos_grp));
            ++iter_pos_grp;
        }
    }
    temp.swap(result_grp);
}

As you can see, the comparison itself is quite simple, which was the starting point of the design: each function does only one thing, and does it well.

begin() and end() are called repeatedly in the search loop for each source container. end() returns an iterator one past the last element, marking the end of the container much as the null character marks the end of a C string. The STL find() function searches for a matching string. If none is found, find() returns end(), which means the string appears in only one of the files, and it is added to the result container. The last line of the function uses swap() to exchange the contents of the temporary container with the result container passed by reference; the temporary container is released when the function returns.

Looking more closely, the dereferenced iterator is converted to the value type with static_cast<>, because the compiler sometimes cannot work out the type that addelementtoset requires. Using static_cast also makes the code clearer.

Of the function parameters, the first two are const references and the last is a non-const reference used to write the result. Passing by reference avoids copying the containers, so the program uses less memory.

Container types supported by the template

The code above can support these container types:
list
set
vector

The container type is easy to change, and only main() needs to be edited. For example, to change from the set type to the vector type, replace:

typedef set<string> stringset;

with:

typedef vector<string> stringset;
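To illustrate the effect of switching container types, the following self-contained sketch restates the two templates in compact form and runs the same comparison with set and with vector. It is a condensed rewrite for illustration, not the author's exact code, and count_differences is a hypothetical helper.

```cpp
#include <algorithm>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

template <class C, class V>
void addelementtoset(C& c, const V& v) { c.insert(v); }

// Explicit specialization for vector<string>, which has no insert(value).
template <>
void addelementtoset<std::vector<std::string>, std::string>(
    std::vector<std::string>& c, const std::string& v) { c.push_back(v); }

template <class container_type, class value_type>
void container_differences(const container_type& a, const container_type& b,
                           container_type& result)
{
    container_type temp;
    typename container_type::const_iterator it;
    // find() returning end() means the element is absent from the other container.
    for (it = a.begin(); it != a.end(); ++it)
        if (std::find(b.begin(), b.end(), *it) == b.end())
            addelementtoset(temp, static_cast<value_type>(*it));
    for (it = b.begin(); it != b.end(); ++it)
        if (std::find(a.begin(), a.end(), *it) == a.end())
            addelementtoset(temp, static_cast<value_type>(*it));
    temp.swap(result);
}

// Hypothetical helper: compares {alpha, beta} with {beta, gamma} using the
// container type given as the template argument, and counts the differences.
template <class stringset>
std::size_t count_differences()
{
    stringset s1, s2, s3;
    addelementtoset(s1, std::string("alpha"));
    addelementtoset(s1, std::string("beta"));
    addelementtoset(s2, std::string("beta"));
    addelementtoset(s2, std::string("gamma"));
    container_differences<stringset, std::string>(s1, s2, s3);
    return s3.size();
}
```

Both count_differences<std::set<std::string> >() and count_differences<std::vector<std::string> >() report two differing strings, "alpha" and "gamma"; only the template argument changes between the two calls.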

That one-line change is all that is needed. Of course, you need to recompile (make sure the headers for the container types are included). The addelementtoset() function template can also be specialized to support other kinds of containers, such as map. As long as the container supports iteration, this code can be used. If you want to use the container_differences function in your own application with such a container, you must first specialize addelementtoset() for it.

Conclusion

This article has covered a lot of ground. The most important point is to understand how to use C++ templates together with the components of the STL. We also showed how to split an application into several specialized functions, each of which does only one thing and does it well. This keeps the whole program simple, easy to understand, and easy to maintain. All the work done here aims at a flexible application that takes full advantage of existing components to reduce design, development, and testing time. We recommend further reading on the STL, C++ templates, and C++ language features to make your applications more flexible.

The downloadable source code for this article is written for VC++ 6.0. It includes an executable that can be run directly, and I have also included two files of random sentences for comparison.
Erik sans Mann (Erik sans Mann's website)
 
