Keyword Matching Project, In-Depth Research (2): Introducing Table Sharding

This article is part 2 of the in-depth research on the keyword matching project. It introduces the idea of table sharding.

You have already completed the keyword matching project. The in-depth research series analyzes the system's performance and makes the changes that become necessary as real-world conditions push on it.

Original series: Teach You How to Build a Keyword Matching Project (Search Engine), day 1 through day 22 (22 articles in total).

Previous in-depth research article: Keyword Matching Project, In-Depth Research (1): Introducing the Filter.

Each article is divided into the cause of the problem, the solution, and the necessary implementation.

Let's begin.

Cause of the problem

With the explosive growth of automatic data collection, the dictionary keeps swelling. It has surged to several million entries, and Shuai feels increasingly powerless as he watches the database queries.

On top of that, the complaint Shuai hears most often from Ding is: "When will word selection get faster? Every time, I wait ages and nothing happens. It's driving me crazy."

Shuai is anxious and worried too, and he truly feels that this is a challenge. With no other option, he went to find Boss Yu again and asked him for a clever way out.

Boss Yu patted Shuai on the shoulder: "Young man, now you know how hard a project can be!"

Shuai replied: "Stop teasing me. I already feel it deeply. I think my heart is about to give out."

Boss Yu: "If you can't take even this, there will be plenty more for you to endure later."

Shuai: "Big brother, enough with the empty talk, please. Hurry up and help me solve the problem."

Boss Yu: "What's the rush? It's not that urgent. Come here and I'll show you a clear path."

"Doesn't every item have a category attribute? Of those millions of entries, how many words actually belong to a given category? Suppose we only use the dictionary for the item's own category. Could our project stay stable then?"


Solution

Depending on the business need, we can split a data table vertically or horizontally to improve performance.

Vertical splitting, also called column splitting, moves rarely used columns or long fields into a separate table so that the main table stays at a suitable size. A common form is a one-to-one association between the two tables.

Horizontal splitting, also called row splitting, stores records in different tables according to a business criterion. A common example is sharding tables by date.

In this case, we use horizontal splitting and shard the data by category.
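
As a rough illustration of horizontal splitting by category, the sketch below groups dictionary rows into buckets, one bucket per target table. The row layout (`cid`, `word`), the function name `partitionByCategory`, and the sample words are assumptions made for this example. Only the `keywords_<cid>` naming follows the project code in this article.

```php
<?php
// Hypothetical helper: group the rows of one big dictionary by category id,
// so that each group can be written to its own keywords_<cid> table.
function partitionByCategory(array $rows): array
{
    $buckets = [];
    foreach ($rows as $row) {
        $table = 'keywords_' . (int) $row['cid'];
        $buckets[$table][] = $row['word'];
    }
    return $buckets;
}

// Sample data, invented for illustration.
$rows = [
    ['cid' => 16, 'word' => 'summer dress'],
    ['cid' => 30, 'word' => 'running shoes'],
    ['cid' => 16, 'word' => 'lace dress'],
];

$buckets = partitionByCategory($rows);
foreach ($buckets as $table => $words) {
    echo $table, ': ', implode(', ', $words), PHP_EOL;
}
```

A lookup for an item in category 16 then only has to scan the `keywords_16` bucket instead of the whole dictionary.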

Implementation Scheme

To avoid changing the structure of the data table, the table name is used to distinguish which table the project reads from. This keeps the changes small: only a slight modification to the code is needed.
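
Because the category id becomes part of the table name, it ends up interpolated directly into the SQL string. A minimal sketch of a helper that builds the name defensively might look like this. `keywordTable` is a hypothetical function, not part of the original project, and it assumes that category ids are positive integers.

```php
<?php
// Hypothetical helper: build the per-category table name.
// Assumes category ids are positive integers; anything else is rejected
// so that arbitrary text can never reach the SQL string.
function keywordTable($cid): string
{
    if (!is_int($cid) && !(is_string($cid) && ctype_digit($cid))) {
        throw new InvalidArgumentException('category id must be a positive integer');
    }
    $cid = (int) $cid;
    if ($cid <= 0) {
        throw new InvalidArgumentException('category id must be a positive integer');
    }
    return 'keywords_' . $cid;
}

// The id 16 is a sample value, invented for illustration.
$sql = 'SELECT count(*) FROM ' . keywordTable(16);
echo $sql, PHP_EOL;
```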

Modify the Keyword code to add a data source.

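 word')";
        return mysql_query($sql, $this->getDbConn());
    }

    // Read one page of the category's dictionary.
    // MySQL's LIMIT takes (start row, row count), so $limit here is the
    // start row and $offset is the page size.
    public static function getWordsSource($cid, $limit = 0, $offset = 40) {
        $sql = "SELECT * FROM keywords_$cid LIMIT $limit,$offset";
        return DB::MakeArray($sql);
    }

    // Total number of words in the category's dictionary.
    public static function getWordsCount($cid) {
        $sql = "SELECT count(*) FROM keywords_$cid";
        return DB::QueryScalar($sql);
    }
}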

A QueryScalar method is added to the DB class to fetch the total row count.
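
The DB class itself is not shown in this article. As a sketch of what a scalar query helper could look like, here is a standalone version built on PDO. The function name `queryScalar`, the use of PDO, and the in-memory SQLite table used for the demonstration are all assumptions, not the project's actual code.

```php
<?php
// Sketch only: run a query and return the first column of the first row,
// or null when the query returns no rows.
function queryScalar(PDO $pdo, string $sql)
{
    $stmt = $pdo->query($sql);
    $value = $stmt->fetchColumn();   // false when there is no row
    return $value === false ? null : $value;
}

// Demonstration against an in-memory SQLite database (requires pdo_sqlite).
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE keywords_16 (word TEXT)');
$pdo->exec("INSERT INTO keywords_16 (word) VALUES ('summer dress'), ('lace dress')");

$count = (int) queryScalar($pdo, 'SELECT count(*) FROM keywords_16');
echo $count, PHP_EOL;
```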


Modify the Selector code for word selection:

 "BacklistCharListHandle", "synonym" => "LinklistCharListHandle");

    public static function select($num_iid) {
        $selectorItem = SelectorItem::createFromApi($num_iid);
        Logger::trace($selectorItem->props_name);

        $charlist = new CharList();
        foreach (self::$charListHandle as $matchKey => $className) {
            $handle = self::createCharListHandle($className, $charlist, $selectorItem);
            $handle->exec();
        }

        $selectWords = array();
        $wordsCount = Keyword::getWordsCount($selectorItem->cid);
        $offset = 40;                          // page size
        $page = ceil($wordsCount / $offset);   // number of pages
        // $i < $page: pages are numbered from 0, so <= would fetch one extra, empty page.
        for ($i = 0; $i < $page; $i++) {
            $limit = $i * $offset;             // start row of this page
            $keywords = Keyword::getWordsSource($selectorItem->cid, $limit, $offset);
            foreach ($keywords as $val) {
                $keywordEntity = SplitterApp::split($val["word"]);
                if (MacthExector::macth($keywordEntity, $charlist)) {
                    $selectWords[] = $val["word"];
                }
            }
        }
        return $selectWords;
    }

    public static function createCharListHandle($className, $charlist, $selectorItem) {
        if (class_exists($className)) {
            return new $className($charlist, $selectorItem);
        }
        throw new Exception("class not exists", 0);
    }
}

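The loop in `select` walks the category's dictionary one page at a time. The arithmetic can be isolated in a small sketch. `pageWindows` is a hypothetical helper, and the default page size of 40 mirrors the value used in `select`. Each pair is the (start row, row count) that MySQL's `LIMIT start,count` expects.

```php
<?php
// Hypothetical helper: list the [startRow, rowCount] windows needed to
// read $total rows in pages of $pageSize.
function pageWindows(int $total, int $pageSize = 40): array
{
    if ($pageSize <= 0) {
        throw new InvalidArgumentException('page size must be positive');
    }
    $windows = [];
    $pages = (int) ceil($total / $pageSize);
    for ($i = 0; $i < $pages; $i++) {
        $windows[] = [$i * $pageSize, $pageSize];
    }
    return $windows;
}

// Print the LIMIT clause for each page of a 100-row table.
foreach (pageWindows(100) as [$start, $count]) {
    echo "LIMIT $start,$count", PHP_EOL;
}
```

Using `<` rather than `<=` as the loop bound avoids requesting one extra page that would always come back empty.
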
Shuai has learned something new once again. Does this mean it is time to treat the boss to a meal in return?


