First published on my personal blog; updates, corrections, and replies happen there.
The demo is here, and the code is here.
This is a fit calculator for DotA players and heroes (see it in action). The code has two parts:
1. A Python Scrapy crawler. The overall idea is page -> model -> result: extract data from web pages, turn it into a meaningful data structure, and then do something with that structure.
In this project the crawler grabs hero and item data and pictures from an online DotA database and stores them on the local disk. The data is saved as JSON so that the web application can use it directly.
2. A web application. It takes the DotA hero data, the trait data I wrote up for my friends, and a fit formula I wrote myself, and calculates which heroes suit each friend best.
The algorithm mainly uses two kinds of hero data: "hero tags" and "hero ability scores". The tags say whether a hero is melee or ranged, what the primary attribute is, and whether the hero can play support, gank, carry the late game, and so on. The scores rate the hero's ability at DPS, ganking, support, and other roles.
Both kinds of data come from the web, and I adjusted them after crawling them to the local disk. For example, the original "stun" and "control" tags became three tags: "small control", "small group control", and "group control". The "late game" tag became two tags, "late game" and "big late game", and I added traits such as "unstable", "multi-line", and "special". This makes the calculated fit somewhat more accurate, although it is still far from perfect. For instance, Crystal Maiden and Shadow Priest both have a slow, but the two slows are obviously not equivalent, and I have not subdivided that tag any further.
I have also been revising and supplementing the heroes' tags, and that work is not finished yet.
I find the places that need changing while I run the fit calculation. The goal is for the calculated results to match reality, but without forcing the numbers to come out right.
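The post does not spell out the fit formula (it lives in the project's code), so the following is only a hypothetical sketch of how tags and ability scores could be combined. The player profile layout, the weights, the 0.5 tag bonus, and the names `fit_score` and `best_heroes` are all assumptions made for illustration; only the hero field names and values are taken from the sample hero record in this post.

```python
# Hypothetical sketch, NOT the project's actual formula.
SCORE_FIELDS = ("dps", "gank", "support", "push", "meat", "war")

def fit_score(hero, player):
    """Weighted average of ability scores, adjusted by liked/disliked tags."""
    weights = player["weights"]  # assumed: score field -> how much the player values it
    total_weight = sum(weights.get(f, 0.0) for f in SCORE_FIELDS)
    if total_weight == 0:
        base = 0.0
    else:
        base = sum(hero[f] * weights.get(f, 0.0) for f in SCORE_FIELDS) / total_weight
    hero_tags = set(hero["tags"].split(","))
    bonus = 0.5 * len(hero_tags & set(player["liked_tags"]))
    penalty = 0.5 * len(hero_tags & set(player["disliked_tags"]))
    return base + bonus - penalty

def best_heroes(heroes, player, top=3):
    """Return the `top` heroes with the highest fit for this player."""
    return sorted(heroes, key=lambda h: fit_score(h, player), reverse=True)[:top]

# Hero values copied from the sample record; the player profile is made up.
TERRORBLADE = {
    "name": "恐怖利刃", "tags": "敏捷,近战,后期,减速",
    "dps": 8.9, "gank": 7.2, "support": 5.1, "push": 8.5, "meat": 6.0, "war": 8.1,
}
SAMPLE_PLAYER = {
    "weights": {"dps": 1.0, "push": 1.0},  # this player cares about damage and pushing
    "liked_tags": ["后期"],                 # likes late-game heroes
    "disliked_tags": [],
}
score = fit_score(TERRORBLADE, SAMPLE_PLAYER)
```

A weighted average keeps the score on the same 0 to 10 scale as the ability scores, which makes it easier to compare the output against how a friend actually plays.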
About the technology
Only the crawler is worth talking about. The web page part is just an experiment for fun, so there is not much to say about it; for the algorithm, just read the code.
Scrapy is a very powerful crawler framework that covers all the basic needs. The logic flow is already defined, and the programmer only writes the definition code for specific components at specific points. This approach is familiar because every framework plays it this way: the control flow is defined and stays in the framework's hands, and the programmer supplies an implementation only where business logic is involved.
The Chinese documentation is quite good. Follow the tutorial to write a HelloWorld, get it running, and then change the HelloWorld bit by bit into what you need. When you run into trouble, search the documentation or Baidu.
Key classes: the Scrapy crawler class, the Item model class, and the Pipeline processor class.
The crawler component fetches the HTML string from the specified URL, parses the DOM, collects information from it, and constructs item objects.
The framework hands each constructed item object to the processor components, which process it based on its content.
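The division of labor can be imitated in a few lines of plain Python. This is a toy, not Scrapy itself: `toy_crawler`, `run`, and `HeroNamePipeline` are made-up names, and the "pages" are just strings. The only thing borrowed from Scrapy is the shape of the pipeline hook, `process_item(item, spider)`.

```python
# Toy imitation of the control flow; it does not import or use Scrapy.

class HeroNamePipeline:
    """Processor component: acts on each item it is handed."""

    def __init__(self):
        self.seen = []

    def process_item(self, item, spider):
        self.seen.append(item["name"])
        return item  # pass the item on to the next pipeline


def toy_crawler(pages):
    """Crawler component: turns raw page text into item objects."""
    for html in pages:
        # Real code would parse the DOM here; this "page" is only a name.
        yield {"name": html.strip()}


def run(crawler, pipelines):
    """Stand-in for the framework: it owns the loop, we only fill in the parts."""
    for item in crawler:
        for pipeline in pipelines:
            item = pipeline.process_item(item, spider=None)
```

Calling `run(toy_crawler([" a ", "b"]), [pipeline])` with a `HeroNamePipeline` instance leaves the cleaned names in `pipeline.seen`. The point is that neither the crawler nor the pipeline ever calls the other; the loop in `run` does, which is the role the framework plays.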
In this project, that means:
1. The crawler fetches the HTML string from the DotA website and parses the DOM to get each hero's name, picture path, and detail page URL, as well as each item's name and picture path.
2. The crawler also fetches the HTML string of each hero detail page and parses the DOM to get the hero's tags and various stats.
3. Each hero's name, picture path, tags, and stats make up a Hero model object.
4. Each item's name and picture path make up an Item model object.
5. The picture-saving processor saves the hero and item pictures to the local disk and renames them after the hero or item (the file names are not meaningful before renaming).
6. The hero data processor saves the hero data to a text file on the local disk in JSON format.
7. The item data processor saves the item data to a text file on the local disk in JSON format.
Obviously the first two steps are fetch and parse, the next two build the model objects, and the last three process the model objects. The order is very clear.
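A rough sketch of the parse, model, and save steps follows. It uses only the standard library so that it runs without Scrapy installed, and it is not the project's code. The HTML layout (an `a` tag with class `hero` wrapping an `img`), the class names `HeroListParser` and `JsonSavePipeline`, and the output path are all assumptions; the real site's markup will differ.

```python
import json
from html.parser import HTMLParser


class HeroListParser(HTMLParser):
    """Collects name, picture path, and detail URL from a hypothetical list page."""

    def __init__(self):
        super().__init__()
        self.heroes = []
        self._current = None  # the hero entry being built, if inside one

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and attrs.get("class") == "hero":
            self._current = {"detail_url": attrs.get("href"), "name": "", "image": None}
        elif tag == "img" and self._current is not None:
            self._current["image"] = attrs.get("src")

    def handle_data(self, data):
        if self._current is not None:
            self._current["name"] += data.strip()

    def handle_endtag(self, tag):
        if tag == "a" and self._current is not None:
            self.heroes.append(self._current)
            self._current = None


class JsonSavePipeline:
    """Pipeline-shaped class that collects items and writes them out as JSON."""

    def __init__(self, path):
        self.path = path  # assumed output file, chosen by the caller
        self.items = []

    def process_item(self, item, spider):
        self.items.append(dict(item))
        return item

    def close_spider(self, spider):
        # ensure_ascii=False keeps Chinese hero names readable in the file
        with open(self.path, "w", encoding="utf-8") as f:
            json.dump(self.items, f, ensure_ascii=False, indent=2)
```

To try it, feed a page string to `HeroListParser().feed(...)`, pass each entry in `.heroes` to `process_item`, and call `close_spider` once at the end to write the file.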
I am not familiar with Python, so I wrote the code while looking up basic syntax and library usage as I went. Some parts may be written rather clumsily; please don't mind those details.
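Format of the hero data
{
    "blue_add": 1.75,        // intelligence growth (blue is intelligence...)
    "gank": 7.2,             // gank ability score
    "speed": 315,            // initial movement speed
    "id": "132",             // hero id
    "blue": 19,              // initial intelligence
    "armor": 7.22,           // initial armor
    "excute_after": 0.51,    // cast backswing
    "support": 5.1,          // support ability score
    "green_add": 3.2,        // agility growth
    "attack_after": 0.6,     // attack backswing
    "ballistic": 0,          // projectile speed
    "war": 8.1,              // team fight ability score
    "red": 15,               // initial strength (red)
    "tags": "敏捷,近战,后期,减速", // tags (agility, melee, late game, slow)
    "hp": 435,               // initial HP
    "attack_before": 0.3,    // attack windup
    "attack_max": 54,        // initial maximum attack damage
    "name": "恐怖利刃",       // hero name (Terrorblade)
    "dps": 8.9,              // DPS ability score
    "meat": 6.0,             // tank ability score ("meat", haha)
    "attack_min": 48,        // initial minimum attack damage
    "red_add": 1.9,          // strength growth
    "excute_before": 0.5,    // cast windup
    "range": 128,            // attack range
    "green": 22,             // initial agility (green)
    "mp": 247,               // initial mana
    "push": 8.5              // push ability score
}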
(Go ahead and laugh at my English.)
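Strict JSON has no comment syntax, so the `//` notes in the sample are presumably annotations for the reader rather than part of the stored file (an assumption on my part). Under that assumption, a record can be read with the standard `json` module. The helper name `load_hero` and the extra `tag_set` key are made up for this sketch, and the record is trimmed to a few of the fields from the sample.

```python
import json

# A few fields from the sample record, without the annotations.
RAW = '{"id": "132", "name": "恐怖利刃", "tags": "敏捷,近战,后期,减速", "dps": 8.9, "range": 128}'


def load_hero(text):
    """Parse one hero record and expose its tags as a set."""
    hero = json.loads(text)
    # Tags are stored as one comma-separated string; a set is easier to query.
    hero["tag_set"] = set(hero["tags"].split(","))
    return hero


hero = load_hero(RAW)
is_melee = "近战" in hero["tag_set"]  # "近战" is the melee tag
```

Splitting the tag string once at load time keeps later checks, such as whether a hero is melee or late game, down to a simple set lookup.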
Here's a picture:
Introductions to project cooperation opportunities are welcome on a long-term basis; 10% of the project income goes to the introducer as a reward. Sina Weibo: @ Cold Mirror, QQ: 908789432.
DotA player and hero fit calculator: using the Python Scrapy crawler