Reversing the words in a string (reverse words)

Source: Internet
Author: User
In the previous article, we started from the algorithm for swapping two adjacent memory blocks and derived an algorithm for swapping two non-adjacent memory blocks of different lengths. In this article, I will use that algorithm to solve a more complex problem: reversing the words in a string.

Introduction
There is an algorithm question:
Problem: reverse words
Comment: Write a function that reverses the order of the words in a string. For instance, your function should transform the string "do or do not, there is no try." to "try. no is there not, do or do". Assume that all words are space delimited and treat punctuation the same as letters.

This question appears frequently on IT forums at home and abroad, and many well-known IT companies use it to examine the basic skills of candidates. Judging by its content, the question is not very difficult, but once time complexity and space complexity are constrained, designing an algorithm that satisfies the interviewer in a short time is not a simple task. Next, I will briefly introduce the most common algorithm. It is straightforward, but it struggles to meet the requirements on time and space complexity. Then I will introduce the algorithm I adopted. It extends the algorithm introduced in the previous article and applies an important problem-solving technique: "divide-and-conquer".

Analysis
When we first meet this question, most of us probably already have a complete solution in mind:
1. Use a separate memory block to store the reversed result. This block must be at least as large as the source string.
2. Because we want to reverse the string, we naturally think of traversing it from the end toward the beginning.
3. During the traversal, whenever we find a word, we copy that word to the target memory block.
4. The traversal ends when all characters in the string have been visited. Because we traverse from back to front, there is no terminator to stop at, so we record the length of the string first and stop when the remaining length reaches 0.
Although this solution is simple in concept, it is awkward to implement. Since we traverse from back to front, we meet the characters of each word in reverse order, but we must copy the word to the target memory block in forward order. We also need to record the first and last addresses of the word to be copied, and guard against "off-by-one" errors.
Apart from the implementation difficulties, the biggest problem of this algorithm is the target memory block it allocates, which leads to unsatisfactory space complexity. Since we cannot know the length of the source string in advance, we cannot assume a maximum length for it (remember "buffer overflow"?). Therefore, we must allocate a memory block at least as long as the source string, so the space complexity of the algorithm is O(n). In addition, the algorithm first traverses the string to obtain its length and then traverses it again to find the words, so it makes two passes over the string, with a time complexity of O(n).
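The straightforward approach described above can be sketched as follows. This is a minimal illustration, not the author's code: the function name `reverseWordsWithBuffer` is hypothetical, `std::string` stands in for the raw memory blocks, and words are assumed to be separated by single spaces.

```cpp
#include <cstddef>
#include <string>

// Naive approach: scan the source from back to front and copy each word,
// in forward order, into a second buffer of the same size.
std::string reverseWordsWithBuffer(const std::string& src) {
    std::string dst;
    dst.reserve(src.size());        // the extra O(n) target buffer
    std::size_t end = src.size();   // one past the last character of the current word
    for (std::size_t i = src.size(); i-- > 0;) {
        if (src[i] == ' ') {
            // The word occupies [i + 1, end); copy it front to back.
            dst.append(src, i + 1, end - (i + 1));
            dst.push_back(' ');
            end = i;
        }
    }
    dst.append(src, 0, end);        // the first word of the source comes last
    return dst;
}
```

The bookkeeping around `i + 1` and `end` is exactly where the off-by-one errors mentioned above tend to appear, and the second buffer is what pushes the space cost to O(n).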

Is there still room for optimization? The answer is yes. If we step outside the familiar line of thinking and analyze the problem at a higher level of abstraction, we obtain a solution completely different from the traditional algorithm.
In general, we need to process a string consisting of multiple words separated by spaces. The basic elements of this string are words and spaces. First, let us analyze a few simple cases and see what we can learn from them.
1> The string contains no space. The whole string can then be treated as a single word, and no operation is required.
2> The string contains one space (for now, we do not consider a space at the beginning or end of the string). This is an operation that swaps two words, which is equivalent to swapping two non-adjacent memory blocks of different lengths. This operation was implemented in the previous article.
3> The string contains two spaces, for example "Tom Jerry Mike". How should we handle this string? Based on the analysis of the first two cases, I immediately thought of a very common technique: "divide-and-conquer". In general, we can divide a string that contains more than one space into three parts:

| Substring on the left of the space | space | Substring on the right of the space |

We then need to complete the following processing:
Step 1. Perform the "reverse words" operation on the substring to the left of the space;
Step 2. Perform the "reverse words" operation on the substring to the right of the space;
Step 3. Swap the left and right substrings within the entire string.
If the left or right substring contains more than one space, we split it again and repeat this processing until the number of spaces in each substring is no greater than 1.
Do you notice that a well-known algorithm is surprisingly similar to this one? Yes, it is the "merge sort" algorithm. The only difference is that "merge sort" sorts, whereas here we reverse. Borrowing the well-known name of "merge sort", I call this algorithm the "merge reverse" algorithm.
One question remains. When the string contains several spaces, which one should we split at? If you think of "binary search", the answer is clear: we use the space in the middle as the split point.
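The three steps and the choice of the middle space can be sketched compactly in C++. This is only an illustration under stated assumptions, not the implementation given in this article: the names `reverseWords` and `mergeReverse` are hypothetical, the space positions are collected once up front, and two calls to `std::rotate` stand in for the block swap from the previous article.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Reverse the words of s[first, last). seps[lo, hi) holds the positions of
// the spaces inside that range, in increasing order.
void mergeReverse(std::string& s, std::size_t first, std::size_t last,
                  const std::vector<std::size_t>& seps,
                  std::size_t lo, std::size_t hi) {
    if (lo == hi) return;                        // no space: a single word
    const std::size_t mid = lo + (hi - lo) / 2;  // index of the middle space
    const std::size_t sep = seps[mid];
    mergeReverse(s, first, sep, seps, lo, mid);         // Step 1: left part
    mergeReverse(s, sep + 1, last, seps, mid + 1, hi);  // Step 2: right part

    // Step 3: swap the two parts around the space, "L R" -> "R L".
    auto at = [&s](std::size_t i) {
        return s.begin() + static_cast<std::ptrdiff_t>(i);
    };
    const std::size_t rightLen = last - sep - 1;
    std::rotate(at(first), at(sep), at(last));                    // " RL"
    std::rotate(at(first), at(first + 1), at(first + 1 + rightLen)); // "R L"
}

std::string reverseWords(std::string s) {
    std::vector<std::size_t> seps;               // positions of all spaces
    for (std::size_t i = 0; i < s.size(); ++i)
        if (s[i] == ' ') seps.push_back(i);
    mergeReverse(s, 0, s.size(), seps, 0, seps.size());
    return s;
}
```

Each recursive call reads its split position before anything inside its range has been moved, which is why the positions can be collected once. For "Tom Jerry Mike" the split point is the second space: the left part becomes "Jerry Tom", and the final swap yields "Mike Jerry Tom".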

Implementation
With the above analysis, combined with the algorithms implemented in the previous article, the algorithm implementation in this article is very simple:

#include <cstdlib>  // malloc, free

// SPACESEPARATOR, SEPARATORLENGTH and SwapNonAdjacentMemory come from the
// code of the previous article.
void *ReverseStringByWord(void *pMemory, size_t memTotalSize)
{
    if (NULL == pMemory) return pMemory;
    if (memTotalSize < 2) return pMemory;

    unsigned char *pByteMemory = reinterpret_cast<unsigned char *>(pMemory);

    // Record the index of every separator in the buffer.
    int iTotalSeparator = 0;
    size_t *pSeparatorIndexArray = reinterpret_cast<size_t *>(malloc(sizeof(size_t) * memTotalSize));
    if (NULL == pSeparatorIndexArray) return pMemory;
    for (size_t i = 0; i < memTotalSize; i++) {
        if (*(pByteMemory + i) == SPACESEPARATOR) {
            *(pSeparatorIndexArray + iTotalSeparator) = i;
            iTotalSeparator++;
        }
    }

    if (iTotalSeparator == 0) {
        // Do nothing
    } else if (iTotalSeparator == 1) {
        // Two words: swap them around the single separator.
        size_t iMiddleSeparatorIndex = pSeparatorIndexArray[0];
        size_t iHeadBlockSize = iMiddleSeparatorIndex;
        size_t iEndBlockSize = memTotalSize - iHeadBlockSize - SEPARATORLENGTH;
        SwapNonAdjacentMemory(pByteMemory, memTotalSize, iHeadBlockSize, iEndBlockSize);
    } else {
        // Split at the middle separator, reverse each half, then swap the halves.
        size_t iMiddleSeparatorIndex = pSeparatorIndexArray[iTotalSeparator / 2];
        size_t iHeadBlockSize = iMiddleSeparatorIndex;
        size_t iEndBlockSize = memTotalSize - iHeadBlockSize - SEPARATORLENGTH;

        ReverseStringByWord(pByteMemory, iHeadBlockSize);
        ReverseStringByWord(pByteMemory + iHeadBlockSize + SEPARATORLENGTH, iEndBlockSize);
        SwapNonAdjacentMemory(pByteMemory, memTotalSize, iHeadBlockSize, iEndBlockSize);
    }

    free(pSeparatorIndexArray);
    return pMemory;
}

Test
I wrote a test case to test the algorithm implemented in this article:

#include <cstdio>   // printf
#include <cstring>  // strlen, memset, memcpy

void Test_ReverseStringByWord()
{
    // Table-driven test case
    static const char *testString[] = {
        "",
        "AB ",
        "",
        "",
        "A B ",
        "AB CD EF ",
        "AB CD EF ",
        "AB CD ed ",
        "Aaa bbb ccc"
    };

    void *pMemory = malloc(MAXMEMBUFFERSIZE);
    if (NULL == pMemory) return;
    for (size_t i = 0; i < sizeof(testString) / sizeof(const char *); i++) {
        printf("|%s| ==> ", testString[i]);

        // Copy the case into a zeroed, writable buffer and reverse it in place.
        size_t iStringLength = strlen(testString[i]);
        memset(pMemory, 0, MAXMEMBUFFERSIZE);
        memcpy(pMemory, testString[i], iStringLength);
        ReverseStringByWord(pMemory, iStringLength);

        printf("|%s|\n", reinterpret_cast<char *>(pMemory));
    }
    free(pMemory);
}

It is worth noting that the behavior of this algorithm when spaces appear at the beginning (or end) of the string still needs further discussion and analysis.

Postscript
Through the algorithm analysis in this article, we can find and summarize some algorithm design principles:
Principle 1: The power of primitive
A complex problem can often be abstracted into a basic problem. Although the basic problem is simple, it often captures the essence of the original one. Whether we grasp that essence is often the key to a correct solution. In the algorithm introduced in this article, we found that the essential operation in reversing the words of a string is swapping two non-adjacent memory blocks of unequal length. With this understanding of the nature of the problem, we can reuse an existing algorithm to solve it easily.

Principle 2: divide-and-conquer
This idea is indeed the "best practice" for solving many problems. If we count how many classic algorithms in the textbooks apply it, we will be surprised to find how shallow our understanding of it still is. The core of the idea, and the key to its success, is that it divides a larger problem into one or more smaller problems, and that the smaller problems are easier to solve than the larger one. Only then does "divide-and-conquer" show its magic. In this algorithm, reversing the words in a string is ultimately reduced to swapping two non-adjacent memory blocks.

History
12/16/2006 V1.0
First version of the original article
12/17/2006 V1.1
The Postscript section is added to briefly summarize the two algorithm design principles used in this article.

References
1. Programming Pearls, Second Edition
"The power of primitive" is an algorithm analysis principle summarized in chapter 2 of the book. Once again, this book is strongly recommended.
2. Foundations of Algorithms Using C++ Pseudocode, Third Edition
The second chapter of this book provides an in-depth analysis of "divide-and-conquer". After reading it, you will have a deeper understanding of the technique, which helps in solving many problems.
