Introduction
Every so often, a client calls to complain: your program is as slow as a snail. You start checking the usual suspects: file I/O, database access speed, even the web services it calls. But all of these turn out to be fine; none of them is the problem.
You run your favorite profiler and find that the bottleneck is one small function that writes a long list of strings to a file.
You optimize this function by concatenating all the small strings into one long string and writing it to the file in a single operation, avoiding thousands of small writes.
That optimization only gets you halfway.
You first time the write of the large string to the file, and it is lightning fast. Then you time the concatenation of all the strings.
Ages.
What's going on? How do you get around it?
As you may know, .NET programmers use StringBuilder to solve this problem. That is the starting point of this article.
Background
If you Google "C++ StringBuilder", you will get quite a few answers. Some of them suggest using std::accumulate, which does almost everything you need:
#include <iostream> // for std::cout, std::endl
#include <string>   // for std::string
#include <vector>   // for std::vector
#include <numeric>  // for std::accumulate
int main()
{
    using namespace std;
    vector<string> vec = { "Hello", " ", "World" };
    // Start from an empty string; passing 's' to its own initializer would read an uninitialized object.
    string s = accumulate(vec.begin(), vec.end(), string());
    cout << s << endl; // Prints 'Hello World' to standard output.
    return 0;
}
So far, so good. The problem starts when you have more than a few strings to concatenate, and memory reallocations begin to pile up.
std::string provides the foundation for a solution in its reserve() member function. It is meant for exactly this purpose: allocate once, concatenate at will.
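As a minimal sketch of "allocate once, concatenate at will", the function below measures the total length first, calls reserve() once, and only then appends. The name concat_reserved is hypothetical and used only for illustration; it is not part of the article's code.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not from the article): concatenates all parts
// after a single up-front reserve().
std::string concat_reserved(const std::vector<std::string>& parts) {
    std::size_t total = 0;
    for (const std::string& p : parts) {
        total += p.size();   // first pass: measure the final length
    }
    std::string result;
    result.reserve(total);   // request the full capacity once
    for (const std::string& p : parts) {
        result += p;         // second pass: append within the reserved capacity
    }
    return result;
}
```

Called with { "Hello", " ", "World" }, it returns the same string as the std::accumulate example, but the appends never need to grow the buffer.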
String concatenation can hit performance hard, like a heavy, blunt instrument. Still carrying scars from the last time this particular beast bit me, I fired up Indigo (I wanted to try some of the shiny new features in C++11) and wrote this partial implementation of a StringBuilder class:
#include <list>    // for std::list
#include <numeric> // for std::accumulate
#include <string>  // for std::basic_string

// Subset of http://msdn.microsoft.com/en-us/library/system.text.stringbuilder.aspx
template <typename chr>
class StringBuilder {
    typedef std::basic_string<chr> string_t;
    typedef std::list<string_t> container_t;        // Reasons not to use vector below.
    typedef typename string_t::size_type size_type; // Reuse the size type in the string.
    container_t m_Data;
    size_type m_totalSize;
    void append(const string_t &src) {
        m_Data.push_back(src);
        m_totalSize += src.size();
    }
    // No copy constructor, no assignment.
    StringBuilder(const StringBuilder &);
    StringBuilder & operator = (const StringBuilder &);
public:
    StringBuilder(const string_t &src) {
        if (!src.empty()) {
            m_Data.push_back(src);
        }
        m_totalSize = src.size();
    }
    StringBuilder() {
        m_totalSize = 0;
    }
    // TODO: Constructor that takes an array of strings.

    StringBuilder & Append(const string_t &src) {
        append(src);
        return *this; // Allow chaining.
    }
    // This one lets you add any STL container to the string builder.
    template<class inputIterator>
    StringBuilder & Add(const inputIterator &first, const inputIterator &afterLast) {
        // std::for_each and a lambda look like overkill here.
        // NOT using std::copy, since we want to update m_totalSize too.
        for (inputIterator f = first; f != afterLast; ++f) {
            append(*f);
        }
        return *this; // Allow chaining.
    }
    StringBuilder & AppendLine(const string_t &src) {
        static const chr lineFeed[] { 10, 0 }; // C++11. Feel the love!
        m_Data.push_back(src + lineFeed);
        m_totalSize += 1 + src.size();
        return *this; // Allow chaining.
    }
    StringBuilder & AppendLine() {
        static const chr lineFeed[] { 10, 0 };
        m_Data.push_back(lineFeed);
        ++m_totalSize;
        return *this; // Allow chaining.
    }

    // TODO: AppendFormat implementation. Not relevant for the article.

    // Like C# StringBuilder.ToString()
    // Note the use of reserve() to avoid reallocations.
    string_t ToString() const {
        string_t result;
        // The whole point of the exercise!
        // If the container has a lot of strings, reallocation (each time the result grows)
        // would take a serious toll, both in performance and chances of failure.
        // I measured (in code I cannot publish) fractions of a second using 'reserve', and almost two using +=.
        result.reserve(m_totalSize + 1);
        // result = std::accumulate(m_Data.begin(), m_Data.end(), result); // This would lose the advantage of 'reserve'.
        for (auto iter = m_Data.begin(); iter != m_Data.end(); ++iter) {
            result += *iter;
        }
        return result;
    }

    // Like JavaScript Array.join()
    string_t Join(const string_t &delim) const {
        if (delim.empty()) {
            return ToString();
        }
        string_t result;
        if (m_Data.empty()) {
            return result;
        }
        // Hope we don't overflow the size type.
        size_type st = (delim.size() * (m_Data.size() - 1)) + m_totalSize + 1;
        result.reserve(st);
        // If you need reasons to love C++11, this is one.
        struct adder {
            string_t m_Joiner;
            adder(const string_t &s): m_Joiner(s) {
                // This constructor is NOT empty.
            }
            // This functor runs under accumulate() without reallocations, if 'l' has reserved enough memory.
            string_t operator()(string_t &l, const string_t &r) {
                l += m_Joiner;
                l += r;
                return l;
            }
        } adr(delim);
        auto iter = m_Data.begin();
        // Skip the delimiter before the first element in the container.
        result = *iter;
        return std::accumulate(++iter, m_Data.end(), result, adr);
    }
}; // class StringBuilder
The function ToString() uses std::string::reserve() to minimize reallocations. Below you can see the results of a performance test.
The function Join() uses std::accumulate() with a custom functor that relies on its first operand having already reserved enough memory.
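The size computation behind Join() can be shown in isolation. The sketch below is a hypothetical free function, join_reserved, over a std::vector (neither is in the article's class); it reserves delim.size() * (n - 1) plus the total length of the parts, then appends in a plain loop. A plain loop sidesteps any question of whether the accumulator's capacity survives each std::accumulate() step.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical free function (not from the article): joins parts with a
// delimiter after reserving the exact final length.
std::string join_reserved(const std::vector<std::string>& parts,
                          const std::string& delim) {
    std::string result;
    if (parts.empty()) {
        return result;
    }
    // n parts need n - 1 delimiters between them.
    std::size_t total = delim.size() * (parts.size() - 1);
    for (const std::string& p : parts) {
        total += p.size();
    }
    result.reserve(total);
    result += parts.front();   // no delimiter before the first element
    for (std::size_t i = 1; i < parts.size(); ++i) {
        result += delim;
        result += parts[i];
    }
    return result;
}
```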
You might ask: why does StringBuilder::m_Data use std::list instead of std::vector? After all, unless you have a good reason to use another container, std::vector is the usual choice.
Well, I had two reasons:
1. Strings are always appended at the end of the container. std::list allows this without any reallocation; a vector is implemented as one contiguous block of memory, so appending to it can trigger reallocations.
2. std::list is reasonably good for sequential access, and the only access ever performed on m_Data is sequential.
A reasonable suggestion is to measure the performance and memory footprint of both implementations and then pick one.
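One way to get a feel for the first reason is to count how often a vector's capacity changes while strings are appended. The helper below, count_reallocations, is a hypothetical name introduced for illustration; it treats every capacity change as one reallocation. The exact count without reserve() depends on the standard library's growth policy, so no specific number is claimed here.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not from the article): appends n short strings to a
// std::vector and counts how many times the capacity changed on the way.
std::size_t count_reallocations(std::size_t n, bool reserve_first) {
    std::vector<std::string> v;
    if (reserve_first) {
        v.reserve(n);   // one allocation up front
    }
    std::size_t reallocations = 0;
    std::size_t last_capacity = v.capacity();
    for (std::size_t i = 0; i < n; ++i) {
        v.push_back("x");
        if (v.capacity() != last_capacity) {
            ++reallocations;   // the vector moved to a larger block
            last_capacity = v.capacity();
        }
    }
    return reallocations;
}
```

With reserve_first set to true the count is zero, which suggests that a vector with a known or estimated element count may be a viable alternative to the list; only measurement can tell which is better for a given workload.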
Performance evaluation
To test performance, I took a page from Wikipedia and put a portion of its text into a vector of strings.
I then wrote two test functions. The first uses the standard function clock() to time two loops, one calling std::accumulate() and the other StringBuilder::ToString(), and then prints the results.
#include <ctime>    // for clock(), CLOCKS_PER_SEC
#include <iostream> // for std::cout, std::endl
#include <vector>   // for std::vector

void TestPerformance(const StringBuilder<wchar_t> &tested, const std::vector<std::wstring> &tested2) {
    const int loops = 500;
    clock_t start = clock(); // Give up some accuracy in exchange for platform independence.
    for (int i = 0; i < loops; ++i) {
        std::wstring accumulator;
        std::accumulate(tested2.begin(), tested2.end(), accumulator);
    }
    double secsAccumulate = (double) (clock() - start) / CLOCKS_PER_SEC;

    start = clock();
    for (int i = 0; i < loops; ++i) {
        std::wstring result2 = tested.ToString();
    }
    double secsBuilder = (double) (clock() - start) / CLOCKS_PER_SEC;
    using std::cout;
    using std::endl;
    cout << "Accumulate took " << secsAccumulate << " seconds, and ToString() took " << secsBuilder << " seconds."
         << " The relative speed improvement was " << ((secsAccumulate / secsBuilder) - 1) * 100 << "%"
         << endl;
}
The second uses the more precise POSIX function clock_gettime(), and also tests StringBuilder::Join().
#ifdef __USE_POSIX199309

// Thanks to Guy Rutenberg: http://www.guyrutenberg.com/2007/09/22/profiling-code-using-clock_gettime/
timespec diff(timespec start, timespec end)
{
    timespec temp;
    if ((end.tv_nsec - start.tv_nsec) < 0) {
        temp.tv_sec = end.tv_sec - start.tv_sec - 1;
        temp.tv_nsec = 1000000000 + end.tv_nsec - start.tv_nsec;
    } else {
        temp.tv_sec = end.tv_sec - start.tv_sec;
        temp.tv_nsec = end.tv_nsec - start.tv_nsec;
    }
    return temp;
}

void AccurateTestPerformance(const StringBuilder<wchar_t> &tested, const std::vector<std::wstring> &tested2) {
    const int loops = 500;
    timespec time1, time2;
    // Don't forget to add -lrt to the g++ linker command line.
    ////////////////
    // Test std::accumulate()
    ////////////////
    clock_gettime(CLOCK_THREAD_CPUTIME_ID, &time1);
    for (int i = 0; i < loops; ++i) {
        std::wstring accumulator;
        std::accumulate(tested2.begin(), tested2.end(), accumulator);
    }
    clock_gettime(CLOCK_THREAD_CPUTIME_ID, &time2);
    using std::cout;
    using std::endl;
    timespec tsAccumulate = diff(time1, time2);
    cout << tsAccumulate.tv_sec << ":" << tsAccumulate.tv_nsec << endl;
    ////////////////
    // Test ToString()
    ////////////////
    clock_gettime(CLOCK_THREAD_CPUTIME_ID, &time1);
    for (int i = 0; i < loops; ++i) {
        std::wstring result2 = tested.ToString();
    }
    clock_gettime(CLOCK_THREAD_CPUTIME_ID, &time2);
    timespec tsToString = diff(time1, time2);
    cout << tsToString.tv_sec << ":" << tsToString.tv_nsec << endl;
    ////////////////
    // Test Join()
    ////////////////
    clock_gettime(CLOCK_THREAD_CPUTIME_ID, &time1);
    for (int i = 0; i < loops; ++i) {
        std::wstring result3 = tested.Join(L",");
    }
    clock_gettime(CLOCK_THREAD_CPUTIME_ID, &time2);
    timespec tsJoin = diff(time1, time2);
    cout << tsJoin.tv_sec << ":" << tsJoin.tv_nsec << endl;

    ////////////////
    // Show results
    ////////////////
    double secsAccumulate = tsAccumulate.tv_sec + tsAccumulate.tv_nsec / 1000000000.0;
    double secsBuilder = tsToString.tv_sec + tsToString.tv_nsec / 1000000000.0;
    double secsJoin = tsJoin.tv_sec + tsJoin.tv_nsec / 1000000000.0;
    cout << "Accurate performance test:" << endl
         << "    Accumulate took " << secsAccumulate << " seconds, and ToString() took " << secsBuilder << " seconds." << endl
         << "    The relative speed improvement was " << ((secsAccumulate / secsBuilder) - 1) * 100 << "%" << endl
         << "    Join took " << secsJoin << " seconds."
         << endl;
}
#endif // def __USE_POSIX199309
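The two timers above trade portability against precision. As an aside that is not part of the original measurements, a C++11 compiler also offers std::chrono::steady_clock, which is portable and monotonic. The sketch below uses a hypothetical helper name, time_loops, and assumes the work to be timed is passed in as a callable. Note that it measures wall-clock time, not per-thread CPU time as CLOCK_THREAD_CPUTIME_ID does, so its numbers are not directly comparable.

```cpp
#include <chrono>

// Hypothetical helper (not from the article): runs `work` `loops` times and
// returns the elapsed wall-clock time in seconds.
template <typename Work>
double time_loops(int loops, Work work) {
    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < loops; ++i) {
        work();
    }
    const std::chrono::duration<double> elapsed =
        std::chrono::steady_clock::now() - start;
    return elapsed.count();
}
```

A call such as time_loops(500, [&] { std::wstring r = tested.ToString(); }) would mirror the ToString() loop in the tests above.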
Finally, a main function calls the two functions implemented above and displays the results in the console. I ran the performance test twice: once for the debug configuration,
and once for the release version.
See that percentage? Even spam doesn't promise that much!
Using the code
Before using this code, consider using std::ostringstream. As you can see from Jeff's comment below, it is faster than the code in this article.
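For comparison, here is a minimal sketch of the std::ostringstream approach. The function name concat_with_stream is hypothetical, and no performance claim is made for this particular snippet; it only shows the shape of the alternative.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper (not from the article): concatenates all parts by
// streaming them into a std::ostringstream.
std::string concat_with_stream(const std::vector<std::string>& parts) {
    std::ostringstream out;
    for (const std::string& p : parts) {
        out << p;       // the stream manages its own growing buffer
    }
    return out.str();   // copy the accumulated characters out as a string
}
```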
You might still want to use this code if:
You are writing code that will be maintained by programmers with C# experience, and you want to give them an interface they are familiar with.
You are writing code that will later be ported to .NET, and you want to mark a possible path.
For some reason, you don't want to include <sstream>. Years ago, some stream I/O implementations were heavyweight, and your current code base may not have fully shaken off that legacy.
To use this code, just follow what the main function does: create an instance of StringBuilder, fill it using Append(), AppendLine(), and Add(), and then call ToString() to retrieve the result.
Like this:
int main() {
    ////////////////////////////////////
    // 8-bit characters (ANSI)
    ////////////////////////////////////
    StringBuilder<char> ansi;
    ansi.Append("Hello").Append(" ").AppendLine("World");
    std::cout << ansi.ToString();

    ////////////////////////////////////
    // Wide characters (Unicode)
    ////////////////////////////////////
    // http://en.wikipedia.org/wiki/Cargo_cult
    std::vector<std::wstring> cargoCult
    {
        L"A", L"cargo", L"cult", L"is", L"a", L"kind", L"of", L"Melanesian", L"millenarian", L"movement",
        // many more lines ...
        L"applied", L"retroactively", L"to", L"movements", L"in", L"a", L"much", L"earlier", L"era.\n"
    };
    StringBuilder<wchar_t> wide;
    wide.Add(cargoCult.begin(), cargoCult.end()).AppendLine();
    // Use ToString(), just like .NET.
    std::wcout << wide.ToString() << std::endl;
    // JavaScript-like join.
    std::wcout << wide.Join(L"_\n") << std::endl;

    ////////////////////////////////////
    // Performance tests
    ////////////////////////////////////
    TestPerformance(wide, cargoCult);
#ifdef __USE_POSIX199309
    AccurateTestPerformance(wide, cargoCult);
#endif // def __USE_POSIX199309
    return 0;
}
In any case, beware of std::accumulate when concatenating more than a few strings.
Now wait a minute!
You might ask: are you trying to talk us into premature optimization?
No. I agree that premature optimization is bad. This optimization is not premature: it is timely. It is an optimization based on experience: I have found myself fighting this particular beast before. Optimization based on experience (not tripping twice over the same stone) is not premature optimization.