Introduction
Clients keep calling to complain that your program is as slow as a snail. You check the usual suspects: file I/O, database access, even calls to web services. All of them look normal, and none of them turns out to be the problem.
You run your favorite profiler and find that the bottleneck lies in one small function, which writes a long linked list of strings to a file.
You optimize this function as follows: concatenate all the small strings into one long string and perform a single file write, avoiding thousands of small write operations.
This optimization does only half of the job.
First, you test the speed of writing the large string to the file, and it is as fast as lightning. Then you time the concatenation of all the strings.
It takes ages.
What's going on? How can you overcome this problem?
You may know that .NET programmers use StringBuilder to solve this problem. That is also the starting point of this article.
Background
If you google "C++ StringBuilder", you will get quite a few answers. Some of them suggest using std::accumulate, which does almost everything you need:
- #include <iostream> // for std::cout, std::endl
- #include <string> // for std::string
- #include <vector> // for std::vector
- #include <numeric> // for std::accumulate
- int main()
- {
- using namespace std;
- vector<string> vec = { "hello", " ", "world" };
- string s = accumulate(vec.begin(), vec.end(), string()); // start from an empty string; 's' cannot be its own initial value
- cout << s << endl; // prints 'hello world' to standard output.
- return 0;
- }
So far, so good. The problem arises when you have more than a few strings to concatenate: memory reallocations start to pile up.
std::string provides the basis for a solution in its reserve() member function. That is exactly what we intend to do: allocate once, then concatenate at will.
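The effect of reserving up front can be observed with a small sketch. The helper below, concat(), is hypothetical and not part of the article's code; it appends every string with += and counts how often the capacity of the result changes, with or without a single reserve() call beforehand. The exact count without reserve() depends on the standard library's growth policy, so no specific number is assumed.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: concatenates 'parts' and reports, through 'growths',
// how many times the capacity of the result changed while appending.
// With use_reserve == true, the total size is computed first and reserved once.
std::string concat(const std::vector<std::string> &parts,
                   bool use_reserve,
                   std::size_t &growths) {
    std::string result;
    if (use_reserve) {
        std::size_t total = 0;
        for (const std::string &p : parts) {
            total += p.size();
        }
        result.reserve(total); // the single up-front allocation
    }
    growths = 0;
    std::size_t last_capacity = result.capacity();
    for (const std::string &p : parts) {
        result += p;
        if (result.capacity() != last_capacity) { // the buffer had to grow
            ++growths;
            last_capacity = result.capacity();
        }
    }
    // Once enough memory is reserved, appending should never grow the buffer.
    assert(!use_reserve || growths == 0);
    return result;
}
```

Calling concat() on a thousand ten-character strings should report zero capacity changes with use_reserve set, and at least one without it.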
String concatenation can hit performance hard. Having been bitten by this particular beast before, I fired up Indigo (I wanted to try some of the shiny new features in C++11) and wrote a partial implementation of a StringBuilder class:
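- #include <list> // std::list
- #include <numeric> // std::accumulate
- #include <string> // std::basic_string
- // Subset of http://msdn.microsoft.com/en-us/library/system.text.stringbuilder.aspx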
- template <typename chr>
- class StringBuilder {
- typedef std::basic_string<chr> string_t;
- typedef std::list<string_t> container_t; // Reasons not to use vector below.
- typedef typename string_t::size_type size_type; // Reuse the size type in the string.
- container_t m_Data;
- size_type m_totalSize;
- void append(const string_t &src) {
- m_Data.push_back(src);
- m_totalSize += src.size();
- }
- // No copy constructor, no assignment.
- StringBuilder(const StringBuilder &);
- StringBuilder & operator = (const StringBuilder &);
- public:
- StringBuilder(const string_t &src) {
- if (!src.empty()) {
- m_Data.push_back(src);
- }
- m_totalSize = src.size();
- }
- StringBuilder() {
- m_totalSize = 0;
- }
- // TODO: Constructor that takes an array of strings.
-
-
- StringBuilder & Append(const string_t &src) {
- append(src);
- return *this; // allow chaining.
- }
- // This one lets you add any STL container to the string builder.
- template<class inputIterator>
- StringBuilder & Add(const inputIterator &first, const inputIterator &afterLast) {
- // std::for_each and a lambda look like overkill here.
- // Not using std::copy, since we want to update m_totalSize too.
- for (inputIterator f = first; f != afterLast; ++f) {
- append(*f);
- }
- return *this; // allow chaining.
- }
- StringBuilder & AppendLine(const string_t &src) {
- static chr lineFeed[] { 10, 0 }; // C++ 11. Feel the love!
- m_Data.push_back(src + lineFeed);
- m_totalSize += 1 + src.size();
- return *this; // allow chaining.
- }
- StringBuilder & AppendLine() {
- static chr lineFeed[] { 10, 0 };
- m_Data.push_back(lineFeed);
- ++m_totalSize;
- return *this; // allow chaining.
- }
-
- // TODO: AppendFormat implementation. Not relevant for the article.
-
- // Like C# StringBuilder.ToString()
- // Note the use of reserve() to avoid reallocations.
- string_t ToString() const {
- string_t result;
- // The whole point of the exercise!
- // If the container has a lot of strings, reallocation (each time the result grows) will take a serious toll,
- // both in performance and chances of failure.
- // I measured (in code I cannot publish) fractions of a second using 'reserve', and almost two minutes using +=.
- result.reserve(m_totalSize + 1);
- // result = std::accumulate(m_Data.begin(), m_Data.end(), result); // This would lose the advantage of 'reserve'
- for (auto iter = m_Data.begin(); iter != m_Data.end(); ++iter) {
- result += *iter;
- }
- return result;
- }
-
- // like javascript Array.join()
- string_t Join(const string_t &delim) const {
- if (delim.empty()) {
- return ToString();
- }
- string_t result;
- if (m_Data.empty()) {
- return result;
- }
- // Hope we don't overflow the size type.
- size_type st = (delim.size() * (m_Data.size() - 1)) + m_totalSize + 1;
- result.reserve(st);
- // std::accumulate is not used here, for the same reason as in ToString():
- // it assigns the functor's return value back to the accumulator at every step,
- // and that copy can lose the advantage of 'reserve'.
- auto iter = m_Data.begin();
- // Skip the delimiter before the first element in the container.
- result += *iter;
- for (++iter; iter != m_Data.end(); ++iter) {
- result += delim;
- result += *iter;
- }
- return result;
- }
-
- }; // class StringBuilder
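A short usage sketch may help show how the chained calls are meant to read. To keep it compilable on its own, it uses MiniStringBuilder, a condensed, narrow-string stand-in for the class above; that name and the build_report() function are hypothetical and exist only for this illustration.

```cpp
#include <list>
#include <string>
#include <vector>

// Condensed stand-in for the StringBuilder above (narrow strings only),
// repeated here only so that this example compiles on its own.
class MiniStringBuilder {
    std::list<std::string> m_Data;
    std::string::size_type m_totalSize = 0;
public:
    MiniStringBuilder &Append(const std::string &src) {
        m_Data.push_back(src);
        m_totalSize += src.size();
        return *this; // allow chaining
    }
    MiniStringBuilder &AppendLine(const std::string &src) {
        return Append(src + '\n');
    }
    // Accepts a range from any container of strings.
    template <class InputIterator>
    MiniStringBuilder &Add(InputIterator first, InputIterator afterLast) {
        for (; first != afterLast; ++first) {
            Append(*first);
        }
        return *this;
    }
    std::string ToString() const {
        std::string result;
        result.reserve(m_totalSize + 1); // one allocation for the whole result
        for (const std::string &s : m_Data) {
            result += s;
        }
        return result;
    }
};

// Hypothetical usage: build a short report with chained calls.
std::string build_report() {
    const std::vector<std::string> words = { "hello", " ", "world" };
    MiniStringBuilder sb;
    sb.AppendLine("Report").Add(words.begin(), words.end()).Append("!");
    return sb.ToString(); // "Report\nhello world!"
}
```

Each call only stores a piece and updates the running total; the single large allocation happens in ToString(), when the final size is already known.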