I have been working in the security field for four years now. I have dabbled in areas large and small without mastering any of them, and I increasingly feel how shallow my skills are. So starting today I have decided to study the most central knowledge and skills of hacking: vulnerability discovery and malicious code analysis. Because that is the main focus, I will not discuss loosely related topics such as web security and scripting. Everything must be traced back to its source, since only deep roots grow lush leaves; I hope that within about two years I will have made some modest progress.
Today's topic is the programming foundation for hacking. Everyone has an opinion on what that foundation is; the mainstream recommendations include C/C++, Java, Perl, Python, VB and so on, followed by socket programming and system programming. I was confused for a while myself: I studied C at university, taught myself C++ as a postgraduate, and learned Python on the job. My feeling is that for projects you still use C++ or Java, but for developing security tools Python is the better choice: it is easy to learn, easy to use, and powerful, which makes it well suited to writing small personal tools. In my opinion, the programming foundation for hacking should cover the following:
1. C programming language;
2. Computer memory knowledge;
3. Basic knowledge of Intel processors;
4. Assembly Language Foundation;
5. Program debugging with GDB;
6. Python programming skills;
A brief description is given below.
I. C programming language
C matters because UNIX and Windows are written mainly in C, so their core low-level mechanisms are C. Although Windows uses more and more C++ in its architecture, the root causes of vulnerabilities there are similar to those in C, and the underlying problem has not been solved. Knowing C helps us understand how vulnerable programs are implemented. This part covers only the basics: main(), variables, function calls, basic input and output, string manipulation, conditional and loop structures, and so on.
Another thing to learn is basic compilation skills. GCC is recommended, because it offers options to produce only an object file (gcc -c), to produce an assembly file (gcc -S), to disable stack protection (gcc -fno-stack-protector), and more, which makes it very powerful.
For specific points about C, see my collection: http://blog.chinaunix.net/special/show/sid/1129.html
II. Computer memory
Computer memory is basic read-write storage; the kind most relevant to our programs is RAM. The following points need to be mastered:
1. Byte order: different vendors store multi-byte values in different orders. Some, such as Intel, store the least significant byte at the lowest memory address, which is called little-endian; others, such as Motorola, store the most significant byte at the lowest address, which is called big-endian. We will run into both orders when we discuss shellcode later.
2. Program layout in memory: every process has its own memory space. A process is really a container for the resources a program needs in order to run, while a thread is a concrete running instance. Here we focus on six main memory sections:
2.1 .text section: corresponds to the .text section of the binary executable and mainly contains machine instructions. The section is read-only; writing to it causes a segmentation fault.
2.2 .data section: mainly stores initialized global variables, such as int a = 0. The size of this section is fixed at run time.
2.3 .bss section: located below the stack section, it stores uninitialized global variables, such as int b. Its size is also fixed at run time.
2.4 Heap section: holds variables that are dynamically allocated while the program runs. Allocated space grows from low addresses toward high addresses.
2.5 Stack section: mainly handles the data of function calls, including the local variables of functions. On most systems the stack grows from high addresses toward low addresses, and this direction of growth is what makes buffer overflows on the stack possible.
2.6 Environment/arguments section: holds a copy of the system-level variables that a process may need at run time, such as the path, the shell name, and the host name made available to the running process.
3. Buffers, strings, and pointers: these are C fundamentals, so there is no need to say more.
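The byte-order point in the list above can be checked in a few lines of Python. This is only a minimal sketch: the value 0x12345678 and the helper names to_bytes_le and to_bytes_be are illustrative choices, not part of the original article.

```python
import struct
import sys

VALUE = 0x12345678  # arbitrary 32-bit example value (an assumption for this demo)


def to_bytes_le(value: int) -> bytes:
    """Pack a 32-bit unsigned integer in little-endian order (as on Intel x86)."""
    return struct.pack("<I", value)


def to_bytes_be(value: int) -> bytes:
    """Pack a 32-bit unsigned integer in big-endian order."""
    return struct.pack(">I", value)


def as_hex(data: bytes) -> str:
    """Render bytes as space-separated hex, lowest address first."""
    return " ".join(f"{b:02x}" for b in data)


if __name__ == "__main__":
    print("host byte order:", sys.byteorder)
    print("little-endian  :", as_hex(to_bytes_le(VALUE)))  # 78 56 34 12
    print("big-endian     :", as_hex(to_bytes_be(VALUE)))  # 12 34 56 78
```

The same four bytes appear in both cases; only the order in which they are laid out from the lowest address upward changes.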
III. Intel processors
For the processor, the main thing is to know the common registers: general-purpose registers such as EAX/EBX/ECX/EDX, and segment registers such as CS/SS/DS/ES/FS/GS. The more important ones here are ESP (extended stack pointer), which we often use to locate the top of the stack, and EIP, which holds the address of the next instruction the CPU will execute.
For more information, refer to: http://blog.chinaunix.net/uid-26275986-id-4334522.html
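To make the connection between the stack and EIP more concrete, here is a toy Python model of a single stack frame. It is a sketch under stated assumptions and does not touch real process memory: the frame layout (an 8-byte buffer, then a saved EBP, then a saved return address), the addresses 0x08048400 and 0xBFFFF7A8, and all function names are hypothetical and chosen only for illustration. Index 0 of the bytearray stands for the lowest address of the frame.

```python
import struct

BUF_SIZE = 8                 # size of the local buffer in this toy frame
FRAME_SIZE = BUF_SIZE + 8    # buffer + saved EBP + saved return address


def make_frame(return_addr: int, saved_ebp: int = 0xBFFFF7A8) -> bytearray:
    """Build a toy frame: buffer at the low addresses, return address above it."""
    frame = bytearray(FRAME_SIZE)
    struct.pack_into("<I", frame, BUF_SIZE, saved_ebp)        # little-endian
    struct.pack_into("<I", frame, BUF_SIZE + 4, return_addr)
    return frame


def saved_return_address(frame: bytearray) -> int:
    """Read the value that a ret instruction would load into EIP."""
    return struct.unpack_from("<I", frame, BUF_SIZE + 4)[0]


def unchecked_copy(frame: bytearray, data: bytes) -> None:
    """Copy without a bounds check on the buffer (in the spirit of strcpy)."""
    for i, byte in enumerate(data[:FRAME_SIZE]):  # stay inside the toy frame
        frame[i] = byte


def checked_copy(frame: bytearray, data: bytes) -> None:
    """Copy at most BUF_SIZE bytes, so nothing above the buffer is touched."""
    for i, byte in enumerate(data[:BUF_SIZE]):
        frame[i] = byte


if __name__ == "__main__":
    frame = make_frame(0x08048400)
    checked_copy(frame, b"A" * 16)
    print(hex(saved_return_address(frame)))   # 0x8048400, still intact
    unchecked_copy(frame, b"A" * 16)
    print(hex(saved_return_address(frame)))   # 0x41414141, overwritten
```

Because the copy runs from low addresses to high addresses while the saved return address sits above the buffer, an oversized input reaches the value that will later be loaded into EIP.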
IV. Basics of assembly language
You cannot do security work without understanding assembly. You do not necessarily have to program in assembly, but being able to read it is a basic requirement; people who cannot read assembly will find it hard to get to the essence of security problems. Assembly language comes in two main syntaxes, AT&T and NASM. The machine instructions they produce are exactly the same, but the notation differs. For example, the operand order in AT&T syntax is the opposite of NASM:
Write 0x10 to the EAX register:
AT&T: movl $0x10, %eax
NASM: mov eax, 0x10
As you can see, constants in AT&T syntax need the $ prefix, registers need the % prefix, and the operands appear in the opposite order (source first, destination last). Assembly language is not easy to learn. Fortunately we are not assembly programmers; what we need is to be familiar with some common instructions and to be able to read and analyze code when necessary:
1. mov: copies data from the source to the destination; the source data is not removed after the copy;
2. add/sub: add adds the source to the destination and stores the result in the destination; sub subtracts the source from the destination and stores the result in the destination;
3. push/pop: push writes a value onto the top of the stack; pop removes the top element from the stack and saves it in the operand;
4. xor: bitwise exclusive OR, which in effect tests whether two bits are the same: the result is 1 if they differ and 0 if they are equal, like addition modulo 2;
5. jne/jnz, je/jz, jmp: jne and jnz are the same instruction and jump when the zero flag ZF=0; je and jz jump when ZF=1; jmp always jumps;
6. call/ret: used to call a procedure and to return from it;
7. inc/dec: increments or decrements the destination operand;
8. lea: loads the effective address of the source operand into the destination operand, for example lea eax, [esi+4];
9. int: raises a software interrupt on the CPU; the common one is 0x80, which is used to make system calls to the kernel.
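The effect of several of the instructions listed above can be traced with a very small interpreter. The following Python sketch is a toy model, not an emulator: it accepts NASM-style operand order (destination first), handles only registers and immediate values, runs straight-line code without jumps, and tracks only the zero flag. The function name run, the register set, and the initial stack top 0x1000 are assumptions made for this example.

```python
MASK = 0xFFFFFFFF  # keep every result within 32 bits


def run(program, stack_top=0x1000):
    """Execute a list of NASM-style instruction strings on a toy machine."""
    regs = {"eax": 0, "ebx": 0, "ecx": 0, "edx": 0, "esp": stack_top}
    flags = {"zf": 0}
    stack = {}  # maps an address to the 32-bit value stored there

    def value(operand):
        # An operand is either a register name or an immediate such as 0x10.
        return regs[operand] if operand in regs else int(operand, 0) & MASK

    def store(dst, result):
        # Arithmetic and logic instructions update the zero flag.
        regs[dst] = result & MASK
        flags["zf"] = 1 if regs[dst] == 0 else 0

    for line in program:
        op, _, rest = line.strip().partition(" ")
        args = [a.strip() for a in rest.split(",")] if rest.strip() else []
        if op == "mov":
            regs[args[0]] = value(args[1])   # mov does not change the flags
        elif op == "add":
            store(args[0], regs[args[0]] + value(args[1]))
        elif op == "sub":
            store(args[0], regs[args[0]] - value(args[1]))
        elif op == "xor":
            store(args[0], regs[args[0]] ^ value(args[1]))
        elif op == "inc":
            store(args[0], regs[args[0]] + 1)
        elif op == "dec":
            store(args[0], regs[args[0]] - 1)
        elif op == "push":
            regs["esp"] -= 4                 # the stack grows toward low addresses
            stack[regs["esp"]] = value(args[0])
        elif op == "pop":
            regs[args[0]] = stack.pop(regs["esp"])
            regs["esp"] += 4
        else:
            raise ValueError(f"unsupported instruction: {line!r}")
    return regs, flags


DEMO = [
    "mov eax, 0x10",   # eax = 0x10
    "add eax, 0x20",   # eax = 0x30
    "push eax",        # esp = 0xffc, 0x30 saved on the stack
    "sub eax, 0x30",   # eax = 0, ZF = 1
    "inc eax",         # eax = 1, ZF = 0
    "xor eax, eax",    # eax = 0, ZF = 1
    "pop ebx",         # ebx = 0x30, esp = 0x1000
]

if __name__ == "__main__":
    registers, status = run(DEMO)
    print({name: hex(val) for name, val in registers.items()}, status)
```

After the last instruction ZF is 1, so in a real program a following je/jz would jump and a jne/jnz would fall through.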
Besides the basic instructions, it is also important to understand the addressing modes of assembly, mainly the various forms of indirect and relative addressing; fortunately they are not difficult. Another skill is debugging programs with GDB, for example setting breakpoints and tracing execution.
V. Python programming skills
This part is mainly about being able to develop the tools you need in Python. For basic syntax and concepts, refer to any mainstream textbook, or to my collection: http://blog.chinaunix.net/special/show/sid/1235.html
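As one example of the kind of small personal tool meant here, the sketch below prints a hex dump of a byte string, which is handy when inspecting shellcode or file headers. It is only an illustration: the function name hexdump, the 16-byte default width, and the sample bytes are assumptions, not something taken from the original article.

```python
def hexdump(data: bytes, width: int = 16) -> str:
    """Format bytes as offset, hex column, and printable ASCII column."""
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hex_part = " ".join(f"{b:02x}" for b in chunk)
        # Replace non-printable bytes with a dot in the text column.
        text_part = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        # Pad the hex column so the text column lines up on a short last row.
        lines.append(f"{offset:08x}  {hex_part.ljust(width * 3 - 1)}  {text_part}")
    return "\n".join(lines)


if __name__ == "__main__":
    sample = b"\x7fELF\x01\x01\x01\x00" + b"hello, world\n"  # made-up sample bytes
    print(hexdump(sample))
```

Returning a string instead of printing directly keeps the function easy to reuse inside other scripts.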
This article is from the "Run Yang Hang" blog; please keep this source attribution: http://windhawk.blog.51cto.com/729863/1639259
"Safe Hiking" (1): Hacker programming skills