Web Security Practice (6): Information Extraction in Web Application Analysis
The Web Security Practice series focuses on practical study, and some programming implementations, of the content of Hacking Exposed Web Applications: Web Application Security Secrets and Solutions (2nd edition). If you already understand that book thoroughly, you can skip this article.
Body
Automated tools help us build a complete map of the target site. If we download the entire site, we can also use the tool's search function to dig out further details. Crawling tools have their shortcomings, however, and much of the work still has to be done by hand. We can examine the details from the following aspects.
6.1 Dynamic pages and static pages
Static pages give testing tools nothing to submit requests to, but we should still pay attention to comments and other information in them, which may lead to unexpected discoveries. A dynamic page is a page that interacts with the server, and it is therefore our channel for intruding into the server. Dividing all pages into these two types is easy: in most cases the file extension is enough to tell them apart.
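As a rough illustration of sorting pages by extension, the following Python sketch labels a URL as static or dynamic. The two extension sets are assumptions chosen for this example, not a complete list, and a URL that carries a query string is treated as dynamic regardless of its extension.

```python
import posixpath
from urllib.parse import urlparse

# Assumed extension lists for illustration; extend them for the target site.
STATIC_EXTENSIONS = {".html", ".htm", ".txt", ".css", ".js", ".jpg", ".png", ".gif"}
DYNAMIC_EXTENSIONS = {".asp", ".aspx", ".php", ".jsp", ".do", ".cfm",
                      ".cgi", ".pl", ".py", ".nsf", ".shtml"}


def classify_page(url):
    """Return 'static', 'dynamic', or 'unknown' for a URL."""
    # Without a scheme, urlparse would treat the host name as part of the path.
    if "://" not in url:
        url = "http://" + url
    parsed = urlparse(url)
    extension = posixpath.splitext(parsed.path)[1].lower()
    # A query string means the server receives parameters, so call it dynamic.
    if parsed.query or extension in DYNAMIC_EXTENSIONS:
        return "dynamic"
    if extension in STATIC_EXTENSIONS:
        return "static"
    return "unknown"
```

Pages that come back as unknown (for example, URLs with no extension at all) still need to be checked by hand.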
6.2 Directory structure, directory names, and file names
Website architecture tends to follow conventions, so we can use the directory structure and the directory names to guess the function of each directory and file.
Privileged directories, such as /admin/ and /adm/
Backup or log file directories, such as /back/ and /log/
File inclusion directories, such as /inc/, /include/, /js/, /global/, and /local/
International directories, such as /en/ and /eng/
We can also guess at hidden directories, send requests to them, and judge from the response whether they exist.
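The guessing step can be sketched in Python as follows. This is only a minimal sketch: it builds the candidate URLs from the directory names listed above and gives a rough reading of the HTTP status code that comes back. The function names are made up for this example, sending the request itself is left to whatever HTTP client you prefer, and the status interpretation is an assumption (some servers answer 200 or a redirect for every path).

```python
from typing import List, Tuple
from urllib.parse import urljoin

# Directory names from the list above, grouped by their guessed purpose.
CANDIDATE_DIRECTORIES = {
    "privileged": ["admin", "adm"],
    "backup or log": ["back", "log"],
    "include": ["inc", "include", "js", "global", "local"],
    "international": ["en", "eng"],
}


def candidate_urls(base_url: str) -> List[Tuple[str, str]]:
    """Return (purpose, url) pairs for every directory worth requesting."""
    if not base_url.endswith("/"):
        base_url += "/"
    return [
        (purpose, urljoin(base_url, name + "/"))
        for purpose, names in CANDIDATE_DIRECTORIES.items()
        for name in names
    ]


def interpret_status(status_code: int) -> str:
    """Rough reading of the status code returned for a guessed directory."""
    if status_code in (200, 301, 302):
        return "probably exists"
    if status_code in (401, 403):
        return "exists but access is restricted"
    if status_code == 404:
        return "probably does not exist"
    return "unclear, inspect manually"
```

A 403 is often as interesting as a 200 here, because it confirms that the directory exists even though its contents are protected.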
6.3 File extensions
The purpose of sorting files by extension is to infer the technology behind each extension and its implementation details, and then to use search engines to find the latest vulnerabilities and attack methods for that type of file.
Here are some common file extensions, with examples I found online.
.cfm files (ColdFusion), for example: http://www.joespub.com/web_joes/index.cfm
.aspx files (ASP.NET), for example: http://www.neworiental.org/Portal0/Default.aspx
.nsf files (Lotus Domino), for example: http://166.111.4.136:8080/yjsy/main.nsf/SecondClassParaShow?Openform&ClassCode=C04
.asp files (ASP), for example: www.w3schools.com/asp/default.asp
.do files (BroadVision), for example: http://login.xiaonei.com/Login.do
.pl files (Perl), for example: www.chinaembassy.org.pl (note that .pl here is the Polish country-code domain rather than a script extension, so look at the path, not the host name)
.cgi files (CGI), for example: www.bioinfo.tsinghua.edu.cn/~Zhengjsh/cgi-bin/getCode.cgi
.py files (Python), for example: www.orcaware.com/svn/wiki/Svnmerge.py
.php files (PHP), for example: www.paper.edu.cn/index.php
.shtml files (SSI), for example: http://finance.cctv.com/index.shtml
.jsp files (JSP), for example: www.tsinghua.edu.cn/qhdwzy/zsxx.jsp
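The list above can be turned into a small lookup table. The Python sketch below is a starting point under that assumption: the table contains only the extensions named in this article, and the function name guess_technology is made up for this example. It inspects every path segment, because in a Domino-style URL the extension sits in the middle of the path.

```python
import posixpath
from urllib.parse import urlparse

# Mapping taken from the list in this article; it is not a complete table.
EXTENSION_TECHNOLOGY = {
    ".cfm": "ColdFusion",
    ".aspx": "ASP.NET",
    ".nsf": "Lotus Domino",
    ".asp": "ASP",
    ".do": "BroadVision",
    ".pl": "Perl",
    ".cgi": "CGI",
    ".py": "Python",
    ".php": "PHP",
    ".shtml": "SSI",
    ".jsp": "JSP",
}


def guess_technology(url):
    """Guess the server-side technology from the extensions in the URL path."""
    # Add a placeholder scheme so that only the path is inspected and a host
    # name ending in .pl is not mistaken for a Perl script.
    if "://" not in url:
        url = "http://" + url
    path = urlparse(url).path
    # Check segments from last to first: the extension may appear mid-path.
    for segment in reversed(path.split("/")):
        extension = posixpath.splitext(segment)[1].lower()
        if extension in EXTENSION_TECHNOLOGY:
            return EXTENSION_TECHNOLOGY[extension]
    return "unknown"
```

Once the technology is known, its name plus the extension make good search-engine keywords for finding published vulnerabilities.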
6.4 Forms
Forms are the backbone of web applications. We need to find the forms on every page as thoroughly as possible, especially hidden form fields. We can use the search function of an automated tool or look for them manually.
Submission method. Check whether the form uses GET or POST. GET is easier to manipulate in the browser, but do not assume that POST is safer than GET.
Action. The script that the form submits to, and the language it is written in.
Maximum length. Check whether the length of each input field is limited. If it is, consider ways to bypass the limit.
Hidden fields. Pay special attention to how hidden fields are used, as in code like this:
<input type="hidden" name=username><input type="hidden" name=password><input type="hidden" name=shkOvertime value=720>
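To collect these details automatically, a page can be run through a small parser. The Python sketch below uses the standard library's html.parser to record each form's method, action, hidden fields, and maxlength limits. The class and function names are made up for this example, and inputs that appear outside any form tag are ignored, which is a simplification.

```python
from html.parser import HTMLParser


class FormExtractor(HTMLParser):
    """Collect method, action, hidden fields and maxlength limits per form."""

    def __init__(self):
        super().__init__()
        self.forms = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "form":
            self.forms.append({
                # HTML forms default to GET when no method is given.
                "method": (attributes.get("method") or "get").lower(),
                "action": attributes.get("action") or "",
                "hidden": {},
                "maxlength": {},
            })
        elif tag == "input" and self.forms:
            form = self.forms[-1]
            name = attributes.get("name") or ""
            if (attributes.get("type") or "").lower() == "hidden":
                form["hidden"][name] = attributes.get("value") or ""
            if attributes.get("maxlength"):
                form["maxlength"][name] = attributes["maxlength"]


def extract_forms(html):
    """Return a list with one dictionary per form found in the HTML text."""
    parser = FormExtractor()
    parser.feed(html)
    return parser.forms
```

Fed the three hidden inputs shown above wrapped in a hypothetical form tag, the hidden dictionary would contain username, password, and shkOvertime, the last with the value 720. Values like that are worth tampering with, because the server may trust them.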
6.5 Query strings and parameters
The query string usually follows the question mark, for example www.smg.cn/Index_Columns/Index_Channels.aspx?Id=25.
Analyzing query strings and parameters is quite complicated. We need to consider:
What each parameter means.
Which page or program receives the parameters.
Whether the parameters are strictly processed and validated.
Whether they include sensitive information, such as database details.
User identifiers, such as www.tudou.com/home/user_programs.php?UserID=4030105
Session IDs, such as www.avssymposium.org/Session.asp?SessionID=143
Database queries, such as http://flash.tom.com/user_msg.php?Username=itscartoon
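A first pass over the parameters can be automated. The Python sketch below splits a URL's query string with urllib.parse and attaches a guess to each parameter name. The keyword table is hypothetical and deliberately tiny; real applications use many other names, so anything reported as unknown still needs manual analysis.

```python
from urllib.parse import parse_qsl, urlsplit

# Hypothetical keyword hints; real applications use many other names.
PARAMETER_HINTS = {
    "user identifier": ("userid", "uid"),
    "session identifier": ("sessionid", "sid", "session"),
}


def analyze_query(url):
    """Return (name, value, guess) triples for each parameter in the URL."""
    # Add a placeholder scheme so that urlsplit separates host and path.
    if "://" not in url:
        url = "http://" + url
    results = []
    # keep_blank_values keeps parameters that carry no value at all.
    for name, value in parse_qsl(urlsplit(url).query, keep_blank_values=True):
        guess = "unknown"
        for label, keywords in PARAMETER_HINTS.items():
            if name.lower() in keywords:
                guess = label
                break
        results.append((name, value, guess))
    return results
```

The guess only classifies the name. Whether the server validates the value strictly can only be found out by changing it and watching the response.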
6.6 Common cookies
Many applications use cookies to pass information and to track state. For example:
Referer: http://www.xiaonei.com/
Cookie: _utma=region; _utmz=region=(referral)|utmcsr=blog.xiaonei.com|utmcct=/GetEntry.do|utmcmd=referral;
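To inspect a captured Cookie header, it helps to split it into individual cookies first. The following Python sketch does this by hand rather than with a cookie library, since values like the one above contain characters such as parentheses and vertical bars. Both function names are made up for this example, and the sub-field splitting assumes the value uses a key=value|key=value layout.

```python
def parse_cookie_header(header_value):
    """Split a Cookie header value into a name -> value dictionary."""
    cookies = {}
    for pair in header_value.split(";"):
        pair = pair.strip()
        if not pair:
            continue
        # Split on the first "=" only, because values may contain "=" too.
        name, _, value = pair.partition("=")
        cookies[name.strip()] = value.strip()
    return cookies


def split_subfields(value, separator="|"):
    """Split a cookie value of the form key=value|key=value into a dictionary."""
    fields = {}
    for part in value.split(separator):
        key, found, sub_value = part.partition("=")
        if found:
            fields[key.strip()] = sub_value.strip()
    return fields
```

Looking at each cookie and sub-field separately makes it easier to spot values that identify a user or a session, or that record where the visitor came from.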
6.7 Google hacks
intext:
Uses a string in the body of a web page as the search condition. For example, entering "intext:Net" in Google returns all pages whose body contains "Net". allintext: is used in a similar way.
intitle:
Like intext above, but searches the page title for the string we are looking for. For example, searching intitle:Security Angel returns all pages whose titles contain "Security Angel". allintitle: is likewise similar to intitle.
cache:
Searches Google's cached copy of some content; sometimes you may find something good there.
define:
Searches for the definition of a word. For example, define:hacker returns definitions of hacker.
filetype:
I recommend this operator for collecting information about specific targets, whether for a web attack or for what we will talk about later. It searches for files of the specified type. For example, entering filetype:doc returns the URLs of all files ending in doc. Of course, you can also look for .bak, .mdb, or .inc files, which may yield even more information :)
info:
Returns basic information about the specified site.
inurl:
Searches for URLs that contain the specified string. For example, entering inurl:admin returns many links whose URLs contain admin.
link:
For example, searching link:www.4ngel.net returns all URLs that link to www.4ngel.net.
site:
This is also useful. For example, site:www.4ngel.net returns all URLs that belong to the 4ngel.net site.
related:
For example, related:www.sina.com returns pages related to the Sina page.
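The operators described so far are often combined in a single query. As a minimal sketch, the Python helper below joins plain keywords and operator:value terms into one query string and URL-encodes it. The function names are made up for this example, and the search address is an assumption based on Google's usual endpoint with the query in the q parameter.

```python
from urllib.parse import urlencode


def build_query(keywords=(), **operators):
    """Join plain keywords and operator:value terms into one query string."""
    terms = list(keywords)
    # Keyword arguments keep their order, so terms appear as they were given.
    terms += ["{}:{}".format(name, value) for name, value in operators.items()]
    return " ".join(terms)


def search_url(query):
    """Return a search URL for the query (endpoint is an assumption)."""
    return "https://www.google.com/search?" + urlencode({"q": query})
```

For example, build_query(["login"], site="www.4ngel.net", inurl="admin") gives the query login site:www.4ngel.net inurl:admin, which narrows the search to administrative URLs on a single site.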
Some other operators are also useful:
+ Includes in the query a word that Google might otherwise ignore
- Excludes a word
~ Synonyms
. Single-character wildcard
* Wildcard, which can represent multiple letters
"" Exact-phrase query