itoa: a function that converts a number into a string, implemented in assembly language

Source: Internet
Author: User

Anyone familiar with C will know the itoa function. itoa is a widely used, non-standard extension to the C library that converts an integer into a string.
To make this clearer, this article shows how to implement the function in assembly language. We first implement an itoa-style function in C and then explain the assembly version, because the idea and method are the same in both languages; only the language used to express them differs. Since most of us are more familiar with C than with assembly, the C version serves as an analogy that makes the assembly implementation easier to follow.
NOTE: In this article, "xxx" denotes the string corresponding to a number, 'x' denotes the character corresponding to a single digit, and x denotes the number itself.
I. Why do we need to convert numbers into strings?
In assembly language, a number cannot be output directly. To output a number we must first convert it into a string (that is, into characters) and then output the string character by character. Outputting numbers was the original reason for implementing this function in assembly, although converting numbers into strings is useful in other situations as well.
II. How to convert a number into a string
The algorithm for converting a number into a string should be familiar from C. Repeatedly divide the number by 10 until the quotient is 0; after each division, add the remainder to the ASCII code of '0' to get the ASCII code of the character for that digit. For example, converting the number 123 generates the characters in the order '3', '2', '1': the last step divides 1 by 10, giving quotient 0 and remainder 1.
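The division step described above can be sketched in C as follows. This is only an illustration: the helper name digits_in_generation_order and the use of uint32_t are assumptions made for this sketch and do not come from the article's code. It stores the characters in the order they are generated, least significant digit first.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not from the article): writes the digit characters
   of num into out in the order they are generated, least significant
   digit first, and returns how many were written.
   out must have room for at least 10 characters. */
int digits_in_generation_order(uint32_t num, char *out)
{
    int count = 0;

    assert(out != NULL);
    do {
        out[count++] = (char)('0' + num % 10); /* remainder -> character */
        num /= 10;                             /* quotient for the next round */
    } while (num != 0);                        /* stop when the quotient is 0 */

    return count;
}
```

Called with 123, the helper returns 3 and leaves '3', '2', '1' in the first three cells of out, which is the reverse of the string we actually want.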
As you can see, the characters are generated in the reverse of the order in which they appear in the final string, which should be "123". Because a 32-bit number has at most 10 decimal digits, an array of 11 elements is always enough to hold the converted string and its terminator. While converting, however, we do not yet know how long the resulting string will be, so we do not know where in the array the first generated character belongs. For example, for the number 123 the '3' belongs at index 2, whereas for 1243 the '3' belongs at index 3.
Both problems (the characters are generated in reverse order, and the position of the first generated character cannot be determined in advance) point to the stack data structure. The function therefore uses a stack: push each character as it is generated, and once generation is complete, pop the characters one by one and store them in the character array in order. Because a stack is last in, first out, the '3' that was pushed first is popped last, so the string comes out in the expected order.
The string produced by itoa is a C-style string, that is, it ends with '\0' as its last character. We can therefore push '\0' onto the stack first; it is then the last value popped and lands at the end of the character array. It also serves as a marker: when '\0' is popped, all generated characters have been stored in the array and the stack is empty.
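To make the positioning problem concrete, the sketch below counts the decimal digits first and derives the index of the last digit from that count. It is not the approach the article takes; it only illustrates why the position depends on the length. The names decimal_length and last_digit_index are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: number of decimal digits in a 32-bit unsigned value. */
int decimal_length(uint32_t num)
{
    int len = 1;            /* even 0 has one digit */

    while (num >= 10) {
        num /= 10;
        len++;
    }
    return len;
}

/* Hypothetical helper: array index at which the first generated character
   (the last digit of the number) would have to be stored. */
int last_digit_index(uint32_t num)
{
    int len = decimal_length(num);

    assert(len >= 1 && len <= 10);  /* 11 cells cover 10 digits plus '\0' */
    return len - 1;
}
```

For 123 the index is 2 and for 1243 it is 3, matching the example in the text. Knowing the index in advance costs an extra pass over the number, which is one reason a stack is attractive here.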
Let's look at the implementation. Note: to distinguish it from itoa, the function is named dtoc here, short for DoubleWordToChar. The str parameter gives the address where the string is to be stored.

void dtoc(int num, char *str)
{
    int rem = 0;            // the remainder
    char c = '\0';          // the character corresponding to a digit
    Stack s;                // define a stack
    Init(&s);

    Push(&s, '\0');         // push '\0' onto the stack first

    // As written, num == 0 skips this loop and yields an empty string,
    // and negative values are not handled.
    while (num != 0)        // loop until the quotient is 0
    {
        rem = num % 10;     // remainder of the division by 10
        c = rem + '0';      // convert the remainder into the corresponding character
        Push(&s, c);        // push the character onto the stack
        num /= 10;          // quotient of num divided by 10
    }

    do
    {
        c = Pop(&s);        // pop a character from the stack
        *str = c;           // store it in the character array in order
        ++str;
    } while (c != '\0');    // '\0' popped: all characters have been copied

    Destory(&s);
}
This code follows the approach described above exactly, so it is not explained again. Stack is a self-defined stack data structure. To match the operations available in assembly, it offers only two operations for accessing data, Push and Pop, and nothing else.
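The article does not show the definition of Stack. A minimal array-based sketch that would satisfy the calls used above might look like this; the fixed capacity, the field names and the bodies of Init, Push, Pop and Destory are all assumptions made for illustration, and the spelling Destory is kept only because the code above uses it.

```c
#include <assert.h>
#include <stddef.h>

#define STACK_CAPACITY 11   /* assumption: 10 digits plus the '\0' marker */

typedef struct {
    char data[STACK_CAPACITY];
    int top;                /* number of elements currently stored */
} Stack;

/* Make the stack empty. */
void Init(Stack *s)
{
    assert(s != NULL);
    s->top = 0;
}

/* Put c on top of the stack; the stack must not be full. */
void Push(Stack *s, char c)
{
    assert(s != NULL && s->top < STACK_CAPACITY);
    s->data[s->top++] = c;
}

/* Remove and return the top element; the stack must not be empty. */
char Pop(Stack *s)
{
    assert(s != NULL && s->top > 0);
    return s->data[--s->top];
}

/* Nothing was allocated dynamically, so cleanup only empties the stack. */
void Destory(Stack *s)
{
    Init(s);
}
```

Pushing '\0', '3', '2', '1' in that order and then popping four times returns '1', '2', '3', '\0', which is the reversal the conversion relies on.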
III. The assembly language implementation is as follows:
; Subroutine name: dtoc
; Function: convert dword data to a decimal string; the string ends with 0
; Parameters: (ax) = low 16 bits of the dword data
;             (dx) = high 16 bits of the dword data
;             ds:si points to the start of the string
; Returns: nothing
dtoc:
    push si
    push cx

    mov cx, 0           ; push 0 onto the bottom of the stack
    push cx

rem:                    ; compute the remainder and convert it to its ASCII code
    mov cx, 10          ; set the divisor
    call divdw          ; perform the overflow-safe division
    add cx, 30H         ; convert the remainder to its ASCII code
    push cx             ; push the ASCII code onto the stack
    mov cx, ax          ; test whether the quotient is 0; the test is done in cx
    or cx, dx           ; so that ax and dx still hold the quotient for the next division
    jcxz copy           ; quotient is 0: all digits have been generated
    jmp rem             ; otherwise keep dividing

copy:                   ; copy the data on the stack into the string
    pop cx              ; pop an ASCII code from the stack
    mov [si], cl        ; store the character in the string
    jcxz dtoc_return    ; 0 was popped: copying is finished
    inc si              ; point to the next write location
    jmp copy            ; 0 was not popped: keep copying from the stack

dtoc_return:            ; restore the registers and leave the subroutine
    pop cx
    pop si
    ret

Notes on the assembly subroutine:
1. The assembly program is not much more complex than the C program above, which is good news. In both programs, empty lines split the code into five parts, and parts 2, 3 and 4 of the assembly subroutine correspond to parts 2, 3 and 4 of the C program. The subroutine uses the system stack rather than a self-defined one: at most 10 digit characters plus the terminating 0 are pushed, so using the system stack is safe.
2. To keep the operations uniform and to prevent division overflow, division is done with the divdw subroutine from the previous blog post instead of the div instruction directly. If the number to be converted has only 16 bits, it can be treated as a 32-bit dword whose high 16 bits are 0. divdw performs overflow-free division of a 32-bit dividend by a 16-bit divisor. It is described in detail in the previous post, so only its interface is repeated here:
Subroutine name: divdw
Function: division without overflow; the dividend is a dword and the result is a dword
Parameters: (ax) = low 16 bits of the dword data, (dx) = high 16 bits of the dword data, (cx) = divisor
Returns: (dx) = high 16 bits of the result, (ax) = low 16 bits of the result, (cx) = remainder
3. The code from the label rem to the instruction jmp rem corresponds to the first loop in the C implementation, the while loop; the code from the label copy to the instruction jmp copy corresponds to the second loop, the do while loop. Note that the assembly loop tests the quotient only after dividing, so it always produces at least one digit.
4. add cx, 30H converts the remainder (a number) in cx into a character, because the ASCII code of '0' is 30H.
5. mov cx, 0 followed by push cx pushes 0 onto the bottom of the stack first.
This pushes the character '\0' onto the stack at the very beginning. It ends up at the end of the string and also acts as the marker that all characters have been popped from the stack.
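The notes above can be tied together with a C model of the assembly subroutine. This is a sketch under stated assumptions, not the author's code: divdw_model follows the usual two-step scheme for dividing a 32-bit value by a 16-bit value without overflow (the article's own divdw lives in an earlier post and is not shown here), and dtoc_model uses a small local array in place of the system stack. Both names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Models the divdw interface: 32-bit dividend, 16-bit divisor, 32-bit
   quotient, 16-bit remainder. ASSUMPTION: the division is split into two
   steps whose quotients each fit in 16 bits. */
uint32_t divdw_model(uint32_t dividend, uint16_t divisor, uint16_t *remainder)
{
    assert(divisor != 0 && remainder != NULL);

    uint16_t high   = (uint16_t)(dividend >> 16);
    uint16_t low    = (uint16_t)(dividend & 0xFFFFu);
    uint16_t q_high = (uint16_t)(high / divisor);           /* first division */
    uint32_t rest   = ((uint32_t)(high % divisor) << 16) | low;
    uint16_t q_low  = (uint16_t)(rest / divisor);           /* fits in 16 bits */

    *remainder = (uint16_t)(rest % divisor);
    return ((uint32_t)q_high << 16) | q_low;
}

/* Models dtoc; str must have room for 11 characters. */
void dtoc_model(uint32_t num, char *str)
{
    uint16_t stack[11];     /* stands in for the system stack */
    int sp = 0;
    uint16_t rem;
    uint16_t word;

    assert(str != NULL);
    stack[sp++] = 0;                            /* mov cx, 0 / push cx */

    do {                                        /* rem: ... jmp rem */
        num = divdw_model(num, 10, &rem);       /* call divdw */
        stack[sp++] = (uint16_t)(rem + 0x30);   /* add cx, 30H / push cx */
    } while (num != 0);                         /* jcxz copy */

    for (;;) {                                  /* copy: ... jmp copy */
        word = stack[--sp];                     /* pop cx */
        *str = (char)word;                      /* mov [si], cl */
        if (word == 0)                          /* jcxz dtoc_return */
            break;
        str++;                                  /* inc si */
    }
}
```

Because the quotient is tested after the division, dtoc_model(0, buf) stores "0" rather than an empty string, which mirrors the control flow of the assembly subroutine.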

