Objective
Before talking about crawlers, we have to talk about web pages, because a crawler written in Python is designed around web pages: parsing a page and extracting its data is the crawler's job. The pages we see every day are full of pictures, headlines, and text, but all of that is the result of the browser's rendering. We can think of the browser as a translator: it turns the original information, the page's source code, into the visual elements we see.
Viewing the source of a web page
To see this for ourselves, we can right-click a blank area of the page and click Inspect. The source code of the web page then appears. There are CSS styles there too, which we will set aside for now. Look at the page first: every label, piece of text, or picture corresponds to a piece of code. Click the arrow icon in the upper-left corner of the panel on the right, then click a picture on the page. A block of the source is highlighted, and that block contains the image's link.
The code is in fact divided into three parts. The first is the HTML part, the structure: the many tags such as div, section and so on are all HTML. A tag may be followed by a class, which names the style that the tag uses; the styles themselves are shown in the rightmost column. Every loaded page carries these two kinds of information, the HTML and the CSS styles. Then, toward the bottom of the source, there is a script tag, which loads the JavaScript code. These three, HTML, CSS, and JavaScript, make up the structure of the vast majority of web pages. Let's take an example.
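The three parts described above can be told apart in code as well. The following is only a minimal sketch using Python's standard html.parser module; the sample page and the PartSplitter class are made up for illustration and do not come from any real site.

```python
from html.parser import HTMLParser

# Hypothetical sample page, invented for this sketch.
SAMPLE_PAGE = """
<html>
<head>
<style>.title { color: red; }</style>
</head>
<body>
<div class="title">Hello</div>
<section><p>Some text</p></section>
<script>console.log("loaded");</script>
</body>
</html>
"""


class PartSplitter(HTMLParser):
    """Sorts a page into structure (HTML), style (CSS), and script (JavaScript)."""

    def __init__(self):
        super().__init__()
        self.tags = []        # HTML structure: tag names in document order
        self.classes = []     # class attributes, which tie tags to CSS rules
        self.css = []         # text found inside <style> tags
        self.javascript = []  # text found inside <script> tags
        self._current = None  # tag whose content is currently being read

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)
        self._current = tag
        for name, value in attrs:
            if name == "class" and value:
                self.classes.append(value)

    def handle_endtag(self, tag):
        self._current = None

    def handle_data(self, data):
        text = data.strip()
        if not text:
            return
        if self._current == "style":
            self.css.append(text)
        elif self._current == "script":
            self.javascript.append(text)


splitter = PartSplitter()
splitter.feed(SAMPLE_PAGE)
print("tags:", splitter.tags)
print("classes:", splitter.classes)
print("css:", splitter.css)
print("javascript:", splitter.javascript)
```

A real page usually loads most of its CSS and JavaScript from separate files, so this sketch only covers the inline case.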
HTML
The relationship between the three is like the room we live in. HTML is the structure of the house: it distinguishes which part is used for what, such as the difference between the living room and the bedroom.
CSS
The CSS part is the decoration of the room. It is the style: it determines what color the walls are, what the ceiling looks like, and how the room is decorated overall.
JavaScript
The JavaScript part can be thought of as the function of the electrical appliances in the room: the TV, the lighting, and the like.
When we learn web crawling, most of the content we want from a page is in its HTML and CSS, so a crawler has relatively little to do with JavaScript. For that reason, we will only briefly introduce HTML and CSS here.
First of all, in the example we just saw, the web page uses a lot of div tags.
The <div> tag represents a region of the web page, a place where content can be put. As a metaphor: wherever there is a div tag, the page has a corresponding fixed area waiting to be filled with content such as the picture or the title we saw. Tags can be nested, so we can put text inside the div with a <p> tag, for example. A real web page is not that plain, though, so we add CSS styles to decorate it. That is the basic usage of HTML and CSS on a website. In short, the <div> tag is a regional frame that holds other content.
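For a crawler, the fact that a div is a region holding nested tags means we often want "all the text inside this region". The following is a rough sketch using only the standard html.parser module; the snippet, the class names, and the region_text helper are hypothetical, and the class comparison is a simple exact match.

```python
from html.parser import HTMLParser

# Hypothetical snippet: one <div> region with nested tags, and one unrelated <div>.
SNIPPET = """
<div class="card">
  <h1>Title</h1>
  <p>First paragraph.</p>
  <div class="inner"><p>Nested paragraph.</p></div>
</div>
<div class="footer"><p>Outside the card.</p></div>
"""


class RegionText(HTMLParser):
    """Collects text inside every <div> whose class equals target_class."""

    def __init__(self, target_class):
        super().__init__()
        self.target_class = target_class
        self.depth = 0   # number of <div> levels open inside the target region
        self.texts = []

    def handle_starttag(self, tag, attrs):
        if tag != "div":
            return
        if self.depth > 0:
            self.depth += 1  # a nested div inside the region
        elif dict(attrs).get("class") == self.target_class:
            self.depth = 1   # entering the target region

    def handle_endtag(self, tag):
        if tag == "div" and self.depth > 0:
            self.depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if self.depth > 0 and text:
            self.texts.append(text)


def region_text(html, target_class):
    """Returns the text pieces found inside the matching <div> regions."""
    parser = RegionText(target_class)
    parser.feed(html)
    return parser.texts


card_texts = region_text(SNIPPET, "card")
print(card_texts)
```

Counting div depth is what lets the nested "inner" div stay part of the region while the separate "footer" div is left out.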
There are some other common tags. <li> is a list item, like the 1, 2, 3, 4, 5 lists of daily life. <img> inserts a picture. <h1> to <h6> are headings, each with a different font size.
<a href="#"> is the link tag: the many links we see in web pages are made with it. These few simple tags are already enough to build a simple web page. The best way to learn about websites is to write one. We will not write a page together here, but you can explore that on your own.
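These common tags are exactly what a crawler looks for. As a small sketch, again with the standard html.parser module and a made-up snippet (the CommonTags class and the paths in the snippet are hypothetical), we can pull out link targets, image addresses, headings, and count list items:

```python
from html.parser import HTMLParser

HEADING_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6"}

# Hypothetical snippet built from the tags discussed above.
SNIPPET = """
<h1>News</h1>
<ul>
  <li><a href="/story/1">First story</a></li>
  <li><a href="/story/2">Second story</a></li>
</ul>
<img src="/static/cover.png" alt="cover">
<h2>More</h2>
"""


class CommonTags(HTMLParser):
    """Collects link targets, image sources, headings, and a list-item count."""

    def __init__(self):
        super().__init__()
        self.links = []       # href values of <a> tags
        self.images = []      # src values of <img> tags
        self.headings = []    # (tag, text) pairs for <h1>..<h6>
        self.list_items = 0   # number of <li> tags seen
        self._heading = None  # heading tag currently open, if any

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and "href" in attrs:
            self.links.append(attrs["href"])
        elif tag == "img" and "src" in attrs:
            self.images.append(attrs["src"])
        elif tag == "li":
            self.list_items += 1
        elif tag in HEADING_TAGS:
            self._heading = tag

    def handle_endtag(self, tag):
        if tag == self._heading:
            self._heading = None

    def handle_data(self, data):
        text = data.strip()
        if self._heading and text:
            self.headings.append((self._heading, text))


page = CommonTags()
page.feed(SNIPPET)
print(page.links)
print(page.images)
print(page.headings)
print(page.list_items)
```

In practice many crawlers use a third-party parsing library instead, but the idea is the same: find the tag, then read its attribute or its text.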
I have written a blog post about web pages that you can learn from: https://www.cnblogs.com/liudi2017/p/7614919.html
Web page knowledge to have before learning web crawlers