Anyone with a little computer knowledge knows that a computer only recognizes 0 and 1. In the very beginning, you had to write programs in 0s and 1s! The worship of programmers probably dates from that time. Later, people found that writing programs in 0s and 1s was too painful, and reading them back afterwards was even harder. For these reasons, assembly language was born.
Assembly language uses mnemonics to replace the various combinations of 0s and 1s, that is, the various instructions. This made things much more convenient (an old hand: far too convenient) (a newbie: not convenient at all, I can't understand any of it). But assembly is still inconvenient: it is uncomfortable to write and awkward to maintain, and people needed to write more and more programs. So high-level languages were invented, such as BASIC, Pascal, C, and the C++ we use today. They greatly reduced the difficulty of program development (an old hand: too much, I could write a program with my knees) (a newbie: it's not that easy). A program that used to take a long time to write can now be developed quickly and easily. In particular, with the spread of visual programming in recent years, the programmer's mystique suddenly collapsed, and the word "coder" is now everywhere. Worst off is assembly: overnight it became a "low-level language", lumped together with foul language, the migrant worker who eats garlic and doesn't brush his teeth, the driver who fills up and doesn't pay, the guy who spits on the bus, and so on.
(Assembly: sob...).
However, assembly has inherent advantages. Because it corresponds directly to the instructions of the CPU, some special tasks have to be done in it, for example accessing hardware ports, writing viruses....
In addition, the executables it produces are very efficient and very small, so it is great for writing small programs. It is also very easy to write keygens in assembly, since you don't have to bother restoring the algorithm in another language you are familiar with. Having said so much, let's get to the point (a few audience members faint):
Since the computer only recognizes 0 and 1, all files stored on the computer are stored in binary format, including executable files.
So you only need a hex editor, such as UltraEdit, to open an executable file and look inside. What you see is a screen full of hexadecimal values (every 4 binary digits convert to one hexadecimal digit). This is the actual content of the executable file, which of course includes its code. (An old ox: how kind of you) (a newbie: shut up, you stupid ox, my eyes have already glazed over).
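To make the "screen full of hexadecimal values" concrete, here is a minimal Python sketch of what a hex editor displays. The function name hex_dump and the sample bytes are only illustrative assumptions; a real editor such as UltraEdit does far more.

```python
def hex_dump(data: bytes, width: int = 16) -> str:
    """Format raw bytes the way a hex editor shows them (hypothetical helper)."""
    lines = []
    pad = width * 3 - 1  # width of the hex column, so the text column lines up
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hex_part = " ".join(f"{b:02X}" for b in chunk)
        # Printable ASCII is shown as-is, everything else as a dot.
        text_part = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        lines.append(f"{offset:08X}  {hex_part.ljust(pad)}  {text_part}")
    return "\n".join(lines)


# Sample bytes only; to look at a real file, pass open(path, "rb").read() instead.
print(hex_dump(b"MZ\x90\x00\x03\x00\x00\x00"))
```

Each byte is printed as two hexadecimal digits, which is exactly the "4 binary digits per hexadecimal digit" rule mentioned above.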
Well, do these things look a bit daunting?
They read like a book of gibberish, and nobody can analyze software from them directly. With the right software, though, we can convert these hexadecimal values into the corresponding assembly code. That lets us analyze other people's software. This is called reverse analysis.
Haha, you must be thinking: if I find the part of the software that computes the registration code, analyze it, and understand the algorithm, then I don't need to spend $ to register the software? Of course, you can also re-implement that computation in any programming language you are familiar with. The resulting program is called a keygen (registration machine); its function is to compute the registration code for one specific piece of software. (Do you often see statements like this in software? "Producing or providing keygens or cracks for this software is prohibited; reverse engineering this software by disassembly, decompilation, or similar means is prohibited")
We can understand that sentiment. After all, people put a great deal of effort into their software. So I hope you are not learning to crack just because you cannot afford a registration fee.
Of course, the introduction above is too idealized. The method described is called static analysis; common tools for it include W32Dasm, IDA, and HIEW. Static analysis, as the name implies, means analyzing software only by reading its disassembly. Generally, if you only want to crack the software, static analysis is enough. But to really understand a registration algorithm, we usually need dynamic analysis, that is, running the program under a debugger and analyzing it as it executes. I will cover the details in "How to Crack" and "Getting Started with the Debugger".
I have said all this just to tell you how important assembly is. I don't need you to be proficient, but you must at least be able to read it. Otherwise, what analysis is there to talk about? True, some friends who don't know assembly have still cracked a few programs. But isn't that rather limiting? Do you really want to crack software blindly your whole life?
In fact, there is no need to be afraid of assembly. It only looks weird and scary. Really it is no different from the properties and methods of controls that you usually memorize. How many assembly instructions are there compared with MFC? Besides, assembly is useful in many places, not just in cracking. So I think you are duty-bound to master assembly:
Just trust me on this.
(Added in the second revision:)
First, let's talk about the composition of the CPU:
The CPU's task is to execute the instruction sequence stored in memory. So besides performing arithmetic and logical operations, it also has to handle data transfer between the CPU, memory, and I/O. Early CPU chips contained only two major parts, the arithmetic unit and the controller. In recent years, to better match memory speed with the speed of the arithmetic unit, a high-speed cache has been built into the chip (do you know why a P4 costs so much more than a P4 Celeron?). (Bang! Something hard flies past. Voice: we are not designing a CPU, why are you telling us this?)
What's the hurry? Assembly is relatively "low-level", so it operates on the hardware directly. Do you think this is VB, where you can use a variable whenever you feel like it? If you don't know how work is divided up inside the CPU, how can you read assembly code? (Bang! Again. Voice: then get to the important part)
Apart from the cache, the CPU can be divided into three parts:
1. The arithmetic logic unit (ALU), which performs arithmetic and logical operations. This part has little to do with us, so we don't need to worry about it.
2. The control logic. It also has little to do with us.
3. The working registers. This is the most important part. Registers play a key role in the computer. Each register is like a storage unit in memory, but much faster to access. They hold the information needed or produced during a computation, including operand addresses, operands, and intermediate results. We will introduce these registers in detail below.
Before introducing them, some basics are needed. You know what 32-bit means, right? It means the registers are 32 bits wide. Dizzy~~ that explains nothing, I know. In the CPU, one binary digit is one bit, and eight bits are one byte. Memory stores information in bytes, and each byte is assigned a unique memory address, called its physical address, which is used to access that memory unit. What can eight bits express? All the ASCII codes. That is, one memory unit can store one English letter or digit, while a Chinese character is represented by a Unicode code, so it takes two memory units. Sixteen bits, then, are two bytes, called a word. And naturally after sixteen bits come thirty-two bits, sixty-four bits, and so on. Thirty-two bits are called a doubleword and sixty-four bits a quadword. The CPU you use today is surely 32-bit, unless you are on a 286 or earlier. Naturally, its registers are 32 bits wide; that is, one register can hold 32 zeros or ones (segment registers excepted).
In general, there are sixteen registers you need to master. I will introduce them to you one by one:
First, let me introduce Xiao Cui'er (Bang! I hit myself. I have been watching too much Stephen Chow lately). Let me start over: first, the general-purpose registers.
There are eight: EAX, EBX, ECX, EDX, ESP, EBP, EDI, and ESI.
The four registers EAX through EDX can be called data registers. Besides accessing them whole, you can also access just their low 16 bits (didn't I say they were 32-bit?). The low 16 bits are named by dropping the leading E, so the low 16 bits of EAX are AX. AX can in turn be split into AH (the high 8 bits) and AL (the low 8 bits). The high 16 bits have no name of their own. The other three registers split the same way. This lets you handle every case: to operate on 8-bit data, use MOV AL, (8-bit data) or MOV AH, (8-bit data); for 16-bit data, use MOV AX, (16-bit data); for 32-bit data, use MOV EAX, (32-bit data). Still confused? No problem. Here is a picture, though not a pretty one:
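 31               16 15       8 7        0
┌───────────────────┬──────────┬──────────┐
│   high 16 bits    │    AH    │    AL    │
└───────────────────┴──────────┴──────────┘
                    │<-------- AX ------->│
│<---------------- EAX ------------------>│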
(Why can I never get this picture to display properly? I have redrawn it three times)
Do you get it? It doesn't matter if you don't. Understand as much as you can.
These four registers are used to temporarily store the operands, results, or other information used in the calculation process.
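The way EAX, AX, AH, and AL overlap can be sketched with plain bit operations. This is only a Python model of the layout, not real register access; split_eax and write_al are hypothetical helper names.

```python
def split_eax(eax: int) -> dict:
    """Return the named views of a 32-bit EAX value."""
    eax &= 0xFFFFFFFF            # keep only 32 bits
    ax = eax & 0xFFFF            # low 16 bits
    return {
        "EAX": eax,
        "high16": eax >> 16,     # the upper half has no register name of its own
        "AX": ax,
        "AH": ax >> 8,           # high byte of AX
        "AL": ax & 0xFF,         # low byte of AX
    }


def write_al(eax: int, value: int) -> int:
    """Model MOV AL, value: only the lowest 8 bits of EAX change."""
    return (eax & 0xFFFFFF00) | (value & 0xFF)


views = split_eax(0x12345678)
print({name: hex(v) for name, v in views.items()})
print(hex(write_al(0x12345678, 0xFF)))
```

With EAX = 12345678H, AX is 5678H, AH is 56H, and AL is 78H; writing FFH into AL gives 123456FFH, leaving the other 24 bits untouched.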
ESP, EBP, EDI, and ESI can only be accessed as words or doublewords, not as bytes. Their main purpose is to provide an offset when addressing memory, so they are called pointer or index registers. From the 386 on, any general-purpose register can hold a memory address. (A small tip: have you seen the form [EBX] while cracking? It means EBX holds a memory address, and what is actually accessed is the value stored in that memory unit.)
Of these registers, ESP is called the stack pointer register. The stack is a very important concept. It is a storage area that works "last in, first out". It must live in the stack segment, so its segment address is stored in the SS register. It has only one entrance, so there is only one stack pointer register. ESP always points to the current top of the stack.
If that is still hard to follow, here is an example. You know how migrant workers build houses. Suppose there are two of them: worker A lays bricks, and worker B hands him bricks. A squats on the ground next to a pile of bricks and picks them up as he needs them. B carries bricks over from a distance and puts each new load on top of the same pile. So A takes from the top while B adds to the top, which means the brick that went on last comes off first. Picture it in your mind: worker A always takes the brick on top.
The stack is just like this. Its base is at a high address, and every time data is pushed, the stack grows toward lower addresses. The instruction for pushing is PUSH. Whenever data is pushed, ESP changes accordingly; in short, it always points to the last item pushed. To get pushed data back, use the pop instruction, POP. After POP executes, ESP increases by the size of the data popped.
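The behaviour of PUSH, POP, and ESP can be sketched in a few lines of Python. This is a simplified model under stated assumptions: every item is 4 bytes, the class name Stack and the starting address 1000H are made up for the illustration, and memory is just a dictionary.

```python
class Stack:
    """Toy model of a 32-bit stack that grows toward lower addresses."""

    def __init__(self, top: int = 0x1000):
        self.esp = top       # ESP starts at the (assumed) base of the stack
        self.memory = {}     # address -> 32-bit value

    def push(self, value: int) -> None:
        self.esp -= 4                              # the stack grows downward
        self.memory[self.esp] = value & 0xFFFFFFFF

    def pop(self) -> int:
        value = self.memory[self.esp]              # read the item on top
        self.esp += 4                              # ESP moves back up by its size
        return value


stack = Stack()
stack.push(0x11111111)
stack.push(0x22222222)
print(hex(stack.esp))     # two pushes moved ESP down by 8
print(hex(stack.pop()))   # the last value pushed comes out first
```

Like worker A's pile of bricks, the value pushed last (22222222H) is the first one popped.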
In the Win32 system especially, the role of the stack cannot be ignored. The data an API uses is passed through the stack: the arguments are pushed first, and then the API function is called. Inside the function body, the function reads the corresponding data back from the stack and then operates on it. You will see how important this is later. Many programs that use plaintext comparison push the real and the entered registration codes onto the stack before the key CALL, then take them off the stack and compare them inside the call. So once you find the key CALL, you can use the d command at the push instructions to view the real registration code. The details will come later; this chapter will not discuss them for now.
EBP, the base pointer register, can be used together with the stack segment register SS to locate a storage unit in the stack. ESP gives the offset of the top of the stack, while EBP serves as a base address within the stack area for accessing data in the stack. ESI (source index register) and EDI (destination index register) are generally used with the data segment register DS to locate a storage unit in the data segment. These two index registers can increment or decrement automatically, which makes stepping through addresses easy. In string instructions, ESI and EDI serve as the implicit source and destination index registers: ESI pairs with DS and EDI pairs with the extra segment ES, addressing the data segment and the extra segment respectively. It doesn't matter if you don't follow this yet.
Next, let's talk about Ru Hua (Bang, bang, I hit myself again). I mean, next let's look at the special registers. Scared by the name? It sounds weird and very professional.
There are two special registers: EIP and FLAGS.
Let's start with EIP. You could say EIP is the most important register of all. It is the instruction pointer register, which holds an offset into the code segment. While a program runs, it always points to the first byte of the next instruction. Together with the segment register CS, it determines the physical address of the next instruction. When that address is sent to memory, the controller fetches the next instruction to execute, and as soon as the instruction is fetched the controller updates EIP so that it again points to the first byte of the following instruction. So the computer uses EIP to control the order in which the instruction sequence executes.
Jump instructions work by modifying the value of EIP.
Next, FLAGS, the flags register, also called the PSW (program status word). It is a register that holds condition flags, control flags, and system flags.
In fact, we don't need to know it in much depth. For now you only need to know what it is for. Here is an example:
CMP EAX, EBX    ; compute EAX - EBX, set the flags, discard the result
JNZ 00470395    ; jump to 00470395 if the two are not equal
These two instructions are simple: subtract the number in EBX from the number in EAX to test whether the two are equal. After CMP executes, the outcome is recorded in ZF (the zero flag) of FLAGS. If the result is 0, that is, if the two are equal, ZF is set to 1; otherwise it is cleared to 0. The other flags include OF (overflow flag), SF (sign flag), CF (carry flag), AF (auxiliary carry flag), PF (parity flag), and so on.
You don't need to know these in detail for now. Knowing how to read the corresponding jump instructions is enough.
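How CMP feeds a conditional jump can be modelled roughly as follows. This Python sketch computes only three of the flags (ZF, CF, SF) for a 32-bit compare; cmp_flags and jnz_taken are hypothetical names, and the real FLAGS register holds more than this.

```python
def cmp_flags(a: int, b: int, bits: int = 32) -> dict:
    """Model CMP a, b: subtract, keep only the flags, discard the result."""
    mask = (1 << bits) - 1
    a &= mask
    b &= mask
    result = (a - b) & mask
    return {
        "ZF": int(result == 0),        # 1 when the operands are equal
        "CF": int(a < b),              # 1 when an unsigned borrow occurred
        "SF": result >> (bits - 1),    # top bit of the result
    }


def jnz_taken(flags: dict) -> bool:
    """JNZ jumps when ZF is 0, i.e. when the compared values differ."""
    return flags["ZF"] == 0


print(cmp_flags(5, 5), jnz_taken(cmp_flags(5, 5)))   # equal: ZF = 1, no jump
print(cmp_flags(3, 5), jnz_taken(cmp_flags(3, 5)))   # different: ZF = 0, jump
```

The jump instruction never looks at EAX or EBX itself; it only tests the flag that CMP left behind.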
The last thing to talk about is the segment registers (Huh? That one wasn't me)
There are six in all: CS (code segment), DS (data segment), ES (extra segment), SS (stack segment), and FS and GS (two more extra segments).
In fact, in the Win32 environment, segment registers are no longer as important as they were in the DOS era.
So knowing that they exist is enough.
I trust you now have a rough understanding of the CPU. What? You still understand nothing? Well, don't be discouraged. Believe me, that is my fault for not explaining it clearly. You can refer to some books. I have always felt that you really should buy an assembly book. I recommend the Tsinghua edition of 80x86 Assembly Language Programming, edited by Shen Meiming, 46 yuan.
Next, some common assembly instructions. (Since this is a forum post, I have picked out only the most frequently used and essential ones. For more, see a book.)
CMP a, b    Compare a with b. a and b may be a register and a memory address, or two registers, but they cannot both be memory addresses. You will see this instruction all the time; many programs that use plaintext comparison rely on it.
MOV a, b    Copy the value of b into a. a and b may be a register and a memory address, or two registers, but they cannot both be memory addresses.
XOR a, a    Exclusive OR of a with itself, mainly used to clear a to zero.
LEA         Load effective address. For example, LEA DX, string loads the address of string into the DX register.
PUSH        Push onto the stack.
POP         Pop off the stack.
ADD         Addition. Format: ADD dst, src. Operation: (dst) <- (src) + (dst)
SUB         Subtraction. Format: SUB dst, src. Operation: (dst) <- (dst) - (src)
MUL         Unsigned multiplication. Format: MUL src. Operation:
            byte operation: (AX) <- (AL) * (src)
            word operation: (DX, AX) <- (AX) * (src)
            doubleword operation: (EDX, EAX) <- (EAX) * (src)
DIV         Unsigned division. Format: DIV src. Operation:
            byte operation: the 16-bit dividend is in AX and the 8-bit divisor is the source operand; the 8-bit quotient goes to AL and the 8-bit remainder to AH. That is:
            (AL) <- quotient of (AX) / (src), (AH) <- remainder of (AX) / (src)
            word operation: the 32-bit dividend is in DX and AX, with DX holding the high word, and the 16-bit divisor is the source operand; the 16-bit quotient goes to AX and the 16-bit remainder to DX. That is:
            (AX) <- quotient of (DX, AX) / (src), (DX) <- remainder of (DX, AX) / (src)
            doubleword operation: the 64-bit dividend is in EDX and EAX, with EDX holding the high doubleword, and the 32-bit divisor is the source operand; the 32-bit quotient goes to EAX and the 32-bit remainder to EDX. That is:
            (EAX) <- quotient of (EDX, EAX) / (src), (EDX) <- remainder of (EDX, EAX) / (src)
NOP         No operation. It does nothing and can be used to blank out an instruction...
CALL        Call a subroutine. You can think of it as a procedure in a high-level language.
Control transfer instructions:
JE or JZ    jump if equal
JNE or JNZ  jump if not equal
JMP         unconditional jump
JB          jump if below (unsigned less than)
JA          jump if above (unsigned greater than)
JG          jump if greater (signed)
JGE         jump if greater than or equal (signed)
JL          jump if less (signed)
JLE         jump if less than or equal (signed)
The instructions above are the common ones and need to be mastered; the more you know, the better. I hope you will study the other instructions on your own. You can refer to the relevant tutorials.
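The doubleword forms of MUL and DIV listed above can be modelled in Python to show how EDX:EAX is used as a register pair. This is a sketch of the arithmetic only; mul32 and div32 are hypothetical names, and the Python exceptions stand in for the divide error a real CPU raises.

```python
MASK32 = 0xFFFFFFFF


def mul32(eax: int, src: int) -> tuple:
    """Model MUL src (doubleword): returns (EDX, EAX) holding the 64-bit product."""
    product = (eax & MASK32) * (src & MASK32)
    return product >> 32, product & MASK32


def div32(edx: int, eax: int, src: int) -> tuple:
    """Model DIV src (doubleword): returns (EAX, EDX) = (quotient, remainder)."""
    dividend = ((edx & MASK32) << 32) | (eax & MASK32)
    src &= MASK32
    if src == 0:
        raise ZeroDivisionError("division by zero")
    quotient, remainder = divmod(dividend, src)
    if quotient > MASK32:
        # The quotient must fit in EAX; otherwise the CPU reports an error too.
        raise OverflowError("quotient does not fit in 32 bits")
    return quotient, remainder


print(mul32(0xFFFFFFFF, 2))   # high half lands in EDX, low half in EAX
print(div32(0, 100, 7))       # quotient in EAX, remainder in EDX
```

For example, 100 divided by 7 leaves 14 in EAX and 2 in EDX, and FFFFFFFFH times 2 needs both halves: EDX = 1, EAX = FFFFFFFEH.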
I almost forgot. Here is the material on number conversion:
First, converting binary to decimal:
Multiply each binary digit by the weight of its position and add up the products. The sum is the decimal number corresponding to the binary number. For example:
10100B = 2^4 + 2^2 = 16 + 4, that is, the decimal number 20.
11000B = 2^4 + 2^3 = 16 + 8, that is, the decimal number 24.
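The weighted sum above can be written out as a short Python sketch. The function name binary_to_decimal is made up for the illustration; Python's built-in int(text, 2) does the same job.

```python
def binary_to_decimal(bits: str) -> int:
    """Add up digit * 2**position, counting positions from the right."""
    total = 0
    for position, digit in enumerate(reversed(bits)):
        if digit not in ("0", "1"):
            raise ValueError(f"not a binary digit: {digit!r}")
        total += int(digit) * 2 ** position
    return total


print(binary_to_decimal("10100"))   # 2**4 + 2**2
print(binary_to_decimal("11000"))   # 2**4 + 2**3
```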
Next, converting a decimal number to binary:
There are many ways to do this. I will only explain the simplest one, repeated division:
Divide the integer part of the decimal number by 2 again and again, writing down the remainder each time, until the quotient is 0. The remainders, read from last to first, are the binary digits.
For example, N = 34D (Note: you may have seen a letter after some numbers. The letter indicates the number system: D for decimal, B for binary, O for octal, H for hexadecimal)
34/2 = 17 (a0 = 0)
17/2 = 8 (a1 = 1)
8/2 = 4 (a2 = 0)
4/2 = 2 (a3 = 0)
2/2 = 1 (a4 = 0)
1/2 = 0 (a5 = 1)
So N = 34D = 100010B.
For the fractional part of a decimal number, multiply by 2 again and again, writing down the integer part each time, until the fractional part of the result is 0.
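Both procedures, dividing the integer part by 2 and multiplying the fractional part by 2, can be sketched in Python. The function names are hypothetical. As an assumption made to avoid floating-point rounding, the fraction is passed as numerator/denominator (0.625 is 5/8), and max_digits caps fractions that never reach 0.

```python
def int_to_binary(n: int) -> str:
    """Repeated division by 2; the remainders read backwards are the digits."""
    if n == 0:
        return "0"
    remainders = []
    while n > 0:
        n, remainder = divmod(n, 2)
        remainders.append(str(remainder))
    return "".join(reversed(remainders))


def fraction_to_binary(numerator: int, denominator: int, max_digits: int = 16) -> str:
    """Repeated multiplication by 2 for a fraction in [0, 1); returns the digits after the point."""
    digits = []
    while numerator and len(digits) < max_digits:
        numerator *= 2
        digits.append(str(numerator // denominator))   # the integer part is the next digit
        numerator %= denominator                        # keep only the fractional part
    return "".join(digits)


print(int_to_binary(34))                         # 100010
print("0." + fraction_to_binary(5, 8))           # 0.625 decimal is 0.101 binary
print("0." + fraction_to_binary(1, 10, 8))       # 0.1 never terminates, so it is cut off
```

Note that a fraction such as 0.1 has no finite binary form, which is why the digit limit is needed.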
Conversion between hexadecimal, binary, and decimal numbers:
Converting between hexadecimal and binary is very simple: you only need to substitute the corresponding values digit by digit.
The base of hexadecimal is 16, and there are 16 digits in all: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, A, B, C, D, E, F. A stands for decimal 10, and the rest follow in the same way. Their correspondence with decimal and binary is:
0H = 0D = 0000B, 1H = 1D = 0001B, 2H = 2D = 0010B, 3H = 3D = 0011B,
4H = 4D = 0100B, 5H = 5D = 0101B, 6H = 6D = 0110B, 7H = 7D = 0111B,
8H = 8D = 1000B, 9H = 9D = 1001B, AH = 10D = 1010B, BH = 11D = 1011B,
CH = 12D = 1100B, DH = 13D = 1101B, EH = 14D = 1110B, FH = 15D = 1111B
So to convert binary to hexadecimal, just group the bits in fours from low to high; each group of four bits maps directly to one hexadecimal digit:
Example: 1000 1010 0011 0101
           8    A    3    5
To convert hexadecimal to binary, just write each hexadecimal digit as four binary digits:
Example:   A    B    1    0
         1010 1011 0001 0000
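The grouping-by-four rule can be sketched in Python as follows. The function names are made up for the illustration, and the left-padding with zeros for bit strings whose length is not a multiple of four is an assumption the text does not spell out.

```python
HEX_DIGITS = "0123456789ABCDEF"


def binary_to_hex(bits: str) -> str:
    """Group bits in fours from the low end; each group becomes one hex digit."""
    bits = bits.replace(" ", "")
    bits = bits.zfill((len(bits) + 3) // 4 * 4)   # pad on the left to a multiple of 4
    groups = [bits[i:i + 4] for i in range(0, len(bits), 4)]
    return "".join(HEX_DIGITS[int(group, 2)] for group in groups)


def hex_to_binary(hex_digits: str) -> str:
    """Write each hex digit as exactly four binary digits."""
    return " ".join(format(int(digit, 16), "04b") for digit in hex_digits)


print(binary_to_hex("1000 1010 0011 0101"))   # 8A35
print(hex_to_binary("AB10"))                  # 1010 1011 0001 0000
```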
Finally, conversion between hexadecimal and decimal numbers.
Hexadecimal to decimal:
Multiply each hexadecimal digit by the weight of its position and add up the products. The sum is the decimal number corresponding to the hexadecimal number.
Example: N = BF3CH
= 11*16^3 + 15*16^2 + 3*16^1 + 12*16^0
= 11*4096 + 15*256 + 3*16 + 12*1
= 48956D
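The same weighted sum, now with weights that are powers of 16, looks like this as a Python sketch. hex_to_decimal is a hypothetical name; the built-in int(text, 16) gives the same answer.

```python
def hex_to_decimal(hex_digits: str) -> int:
    """Add up digit_value * 16**position, counting positions from the right."""
    total = 0
    for position, digit in enumerate(reversed(hex_digits.upper())):
        value = "0123456789ABCDEF".index(digit)   # A is 10, B is 11, and so on
        total += value * 16 ** position
    return total


print(hex_to_decimal("BF3C"))   # 11*4096 + 15*256 + 3*16 + 12
```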
Decimal to hexadecimal:
Again I will only cover the simplest method, repeated division:
Divide the integer part of the decimal number by 16 again and again, writing down the remainder each time, until the quotient is 0. The remainders, read from last to first, are the hexadecimal digits.
Example: N = 48956D
48956/16 = 3059 (a0 = 12)
3059/16 = 191 (a1 = 3)
191/16 = 11 (a2 = 15)
11/16 = 0 (a3 = 11)
So N = 48956D = BF3CH.
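Repeated division by 16 can be sketched the same way. In this Python illustration, decimal_to_hex is a hypothetical name, and the list of remainders it collects corresponds to a0, a1, a2, a3 in the example above.

```python
def decimal_to_hex(n: int) -> str:
    """Divide by 16 until the quotient is 0; the remainders read backwards are the digits."""
    if n == 0:
        return "0"
    digits = []
    while n > 0:
        n, remainder = divmod(n, 16)
        digits.append("0123456789ABCDEF"[remainder])   # remainder 12 becomes C, and so on
    return "".join(reversed(digits))


print(decimal_to_hex(48956))   # remainders 12, 3, 15, 11 read backwards give BF3C
```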
After all this, I don't know whether you have understood or not. If you understood some of it, please get a book and read carefully about what I have not covered and what I only touched on. If you understood nothing at all, then you need the book even more. Do not lose confidence. Read the CPU introduction earlier in this post carefully until the concept of registers is clear, then memorize the assembly instructions that follow, and you are on your way. If you study carefully, you will find that it is not as hard as you imagined. Within a week you can be reading assembly code. If you really want to learn it well, keep reading afterwards and write some small programs for practice. Of course, becoming proficient in assembly takes more than a day or two, or even a month or two. But with perseverance, what can't you do? CPUs are made by people, and the instruction set is only one part of them. If people can build a CPU, are you afraid you can't even learn to use one?
After-class FAQ
Q: I have learned 8086/8088 before, and I have also written programs under DOS. Will that do?
A: Absolutely. Compared with the 8086/8088, today's CPUs have not added many new basic instructions. You only need to learn how the registers have changed and something about Windows programs. Besides, since you have written assembly programs under DOS, you must be very familiar with DEBUG and other debuggers, so you have a built-in advantage.
Q: Assembly is no problem for me. Why do I still never get anywhere with cracking?
A: Well, there are quite a few veterans like this. They are very skilled in assembly, but they get nowhere because they lack experience. Doesn't this happen to a lot of people? At the very least, I used to step into every CALL I saw. Haha, I ended up stepping into a lot of APIs. So for these experts, all you need is more practice and a few analysis techniques.
Q: I have never learned programming. Can I learn assembly?
A: In general, yes. Whether learning assembly first will make you lose confidence in learning other high-level languages, though, I cannot say. :)
Q: Can I use the registers however I like? Are there any restrictions? When I write a program, can my variables be placed in any register?
A: Well, let me answer the question from the friend upstairs.
Registers have their own usage conventions, and each has a clear division of labor.
Take Xiao Cui, I mean the data registers (EAX through EDX): they are general-purpose registers, and a program can store any data in them. However, each also has its own special uses.
For example:
EAX can serve as the accumulator, so it is the main register for arithmetic. Multiplication and division instructions use it implicitly for an operand. For example, in multiplication, AL, AX, or EAX holds the multiplicand, and AX, DX:AX, or EDX:EAX holds the final product.
EBX is generally used as the base register when computing memory addresses.
ECX is often used to hold a count: in shift instructions it holds the shift amount, and in loop and string instructions it serves as the implicit counter.
Finally, of the Four Heavenly Kings only Leon Lai is left; he has been keeping a low profile lately... (Don't hit me, I'll go bang my own head on the wall.) I mean, finally EDX is left. In double-word-length operations, DX and AX are usually combined to hold one double-word-length number (Remember what double-word length is? It is 32 bits. For example, to store the binary number 01101000110101000100100111010001, you can put 0110100011010100 (the high 16 bits) in DX and 0100100111010001 (the low 16 bits) in AX; the number is then written DX:AX). Of course, a single EDX can also hold this number. Likewise, EDX:EAX can hold one 64-bit number, as you can infer.
ESP, EBP, EDI, and ESI were introduced above, so I will not repeat them here.
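The DX:AX pairing can be checked with the very number used in the answer above. This is a small Python sketch with hypothetical helper names; it only models how a 32-bit value is split across two 16-bit registers and joined back.

```python
def split_dx_ax(value: int) -> tuple:
    """Split a 32-bit number into (DX, AX): high 16 bits and low 16 bits."""
    return (value >> 16) & 0xFFFF, value & 0xFFFF


def join_dx_ax(dx: int, ax: int) -> int:
    """Rebuild the 32-bit number that the pair DX:AX represents."""
    return ((dx & 0xFFFF) << 16) | (ax & 0xFFFF)


number = 0b01101000110101000100100111010001
dx, ax = split_dx_ax(number)
print(format(dx, "016b"), format(ax, "016b"))
print(join_dx_ax(dx, ax) == number)
```

EDX:EAX works the same way, with 32-bit halves and a shift of 32 instead of 16.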
Of course, there are other restrictions too. But since we only need to read a program's assembly code (its author got it right, so it certainly won't contain mistakes) rather than write it, you don't have to master them. If you are interested, read the relevant books.
As for your last question, "When I write a program, can my variables be placed in any register?", I am not sure what you are asking. I think you may have mixed a few things up. Variables are something you use in high-level languages. If you write programs in a high-level language, you don't need to know about registers at all; they have nothing to do with the high-level language itself. In the end, though, the high-level language still turns the program you write into operations on registers and memory.