In-depth research on the keyword matching project (2): introduction of table sharding

Source: Internet
Author: User

(2) Introduction of table sharding

Recent articles:

1) High-concurrency data collection architecture (Redis applications)

2) High-availability data collection platform (how to work with three languages: PHP + .NET + AAuto)

The series "Teach you how to create a keyword matching project" is basically complete. The "In-depth research" series analyzes the system's performance and the changes that certain environments force on it.

Teach you how to create a keyword matching project (search engine): day 1 through day 22 (22 articles in total).

In-depth research: the previous installment covered the introduction of filters into the keyword matching project.

Each article is divided into the cause of the problem, the solution, and any necessary implementation scheme.

The article proper starts here.

Cause of the problem

With automatically collected data growing explosively, the dictionary keeps expanding. It has surged to several million entries, and Shuai feels increasingly powerless as he watches the database queries.

On top of that, what Ding says to Shuai most often is: "When will word selection get faster? Every time I wait ages and nothing happens. It is driving me crazy."

Shuai was anxious too, and truly felt that this was a challenge. With no other option, he went to the boss again and asked him for a clever solution.

Boss Yu patted Shuai on the shoulder: "Young man, now you know how hard a project can be!"

Shuai replied: "Don't tease me. I already feel it deeply. I think my heart can barely take it."

Boss Yu: "If you can't take this, there is plenty more in store for you later."

Shuai: "Boss, enough of the empty talk. Hurry up and solve the problem."

Boss Yu: "What's the rush? Come here and I will show you a clear path."

"Doesn't every item have a category attribute? Of those millions of records, how many words actually belong to a given category? Suppose we only use the dictionary for that one category: could our project then stay stable?"

Solution

Depending on the business requirements, data tables can be split vertically or horizontally to improve performance.

A vertical split is also called a column split. It moves rarely used columns or long fields into a separate table so that the main table stays at a suitable size. A common form is a one-to-one association between the two tables.

A horizontal split is also called a row split. Records are stored in different tables according to a business-specific rule. A common example is sharding tables by date.

In this case, a horizontal split is used: the data is split by category.

Implementation Scheme

To avoid changing the structure of the data tables, the table name is used to distinguish which table the project reads: each category gets its own table, named keywords_ followed by the category ID. This keeps the changes small, and only a slight modification of the code is needed.
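The table-name routing described above can be sketched as a small helper. This is a minimal sketch, not the author's code: the function name keywordTableFor and its validation rule are assumptions added for illustration, and only the keywords_ prefix comes from the article's queries.

```php
<?php
// Hypothetical helper: maps a category ID to the name of its sharded table.
// Only the "keywords_" prefix comes from the article; the rest is assumed.
function keywordTableFor($cid): string
{
    // A table name cannot be bound as a query parameter, so accept only
    // plain positive integers (as int or digit string) before building it.
    $isScalarId = is_int($cid) || is_string($cid);
    if (!$isScalarId || !preg_match('/^[1-9][0-9]*$/D', (string) $cid)) {
        throw new InvalidArgumentException("invalid category id");
    }
    return "keywords_" . $cid;
}

echo keywordTableFor(16), "\n"; // prints keywords_16
```

Validating the ID matters here because it is interpolated directly into the SQL text rather than passed as a bound value.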

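Modify the Keyword code to add a data source (the beginning of the class is cut off in the source):

    // ... word')";
            return mysql_query($sql, $this->getDbConn());
        }

        // $limit is the starting row and $offset is the number of rows,
        // following MySQL's "LIMIT start, count" order.
        public static function getWordsSource($cid, $limit = 0, $offset = 40) {
            $cid = (int) $cid; // the category ID becomes part of the table name
            $sql = "SELECT * FROM keywords_$cid LIMIT $limit,$offset";
            return DB::MakeArray($sql);
        }

        // Total number of words in the category's table.
        public static function getWordsCount($cid) {
            $cid = (int) $cid;
            $sql = "SELECT count(*) FROM keywords_$cid";
            return DB::QueryScalar($sql);
        }
    }

A QueryScalar method is added to the DB class to return the total row count.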

 

Modify the Selector code for word selection (the beginning of the class is cut off in the source):

    // ... "BacklistCharListHandle", "synonym" => "LinklistCharListHandle");

        public static function select($num_iid) {
            $selectorItem = SelectorItem::createFromApi($num_iid);
            Logger::trace($selectorItem->props_name);

            $charlist = new CharList();
            foreach (self::$charListHandle as $matchKey => $className) {
                $handle = self::createCharListHandle($className, $charlist, $selectorItem);
                $handle->exec();
            }

            $selectWords = array();
            // Read only the dictionary of the item's category, page by page.
            $wordsCount = Keyword::getWordsCount($selectorItem->cid);
            $offset = 40; // rows per page
            $page = ceil($wordsCount / $offset);
            for ($i = 0; $i < $page; $i++) {
                $limit = $i * $offset; // starting row of this page
                $keywords = Keyword::getWordsSource($selectorItem->cid, $limit, $offset);
                foreach ($keywords as $val) {
                    $keywordEntity = SplitterApp::split($val["word"]);
                    if (MacthExector::macth($keywordEntity, $charlist)) {
                        $selectWords[] = $val["word"];
                    }
                }
            }
            return $selectWords;
        }

        public static function createCharListHandle($className, $charlist, $selectorItem) {
            if (class_exists($className)) {
                return new $className($charlist, $selectorItem);
            }
            throw new Exception("class not exists", 0);
        }
    }

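The paging loop in the Selector code can also be looked at in isolation. The following is a minimal sketch under stated assumptions: eachKeywordBatch, $countFn and $fetchFn are hypothetical names, with the two callables standing in for Keyword::getWordsCount and Keyword::getWordsSource so that the example runs without a database. The 95 rows are made-up data.

```php
<?php
// Hypothetical sketch: walk one category's keyword table in fixed-size pages.
// $countFn and $fetchFn are stand-ins for Keyword::getWordsCount and
// Keyword::getWordsSource, injected so the sketch needs no database.
function eachKeywordBatch(callable $countFn, callable $fetchFn, int $cid, int $pageSize = 40): Generator
{
    $total = (int) $countFn($cid);
    $pages = (int) ceil($total / $pageSize);
    // Looping while $i < $pages visits every row once and never asks
    // for a page past the end of the table.
    for ($i = 0; $i < $pages; $i++) {
        // Starting row and row count, as in MySQL's "LIMIT start, count".
        yield $fetchFn($cid, $i * $pageSize, $pageSize);
    }
}

// In-memory stand-in for a keywords_<cid> table (made-up data).
$rows = [];
for ($n = 1; $n <= 95; $n++) {
    $rows[] = ["word" => "word$n"];
}
$countFn = function (int $cid) use ($rows): int {
    return count($rows);
};
$fetchFn = function (int $cid, int $start, int $count) use ($rows): array {
    return array_slice($rows, $start, $count);
};

$batchSizes = [];
foreach (eachKeywordBatch($countFn, $fetchFn, 16, 40) as $batch) {
    $batchSizes[] = count($batch);
}
echo implode(",", $batchSizes), "\n"; // prints 40,40,15
```

Reading in pages keeps memory use bounded by the page size rather than by the size of the category's dictionary, which is the point of combining sharding with batched reads.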
Summary

Shuai has picked up another new piece of knowledge. Does this mean he owes the boss a treat? Or should the boss be treating him in return!
