Keyword matching project, in-depth research (2): introducing the idea of table sharding
Recent articles: 1) Architecture and application of high-concurrency data collection (Redis in practice)
2) A high-availability data collection platform (how to combine three languages: PHP + .NET + AAuto)
The "Teach you how to create a keyword matching project" series is basically complete. The in-depth research series analyzes the system's performance and covers the changes that certain environments force on it.
Teach you how to create a keyword matching project (search engine): Day 1 through Day 22 (22 articles in total).
In-depth research: the previous installment covered the introduction of filters into the keyword matching project.
Each article is divided into the cause of the problem, the solution, and the necessary implementation scheme.
Now, on to the article itself.
Cause of the problem
With the explosive growth of automatic data collection, the dictionary is ballooning, with millions upon millions of entries pouring in, and Shuai feels increasingly powerless watching the database queries.
On top of that, the thing Ding says to Shuai most often is: "When will word selection get faster? Every time I wait ages and nothing happens. It is driving me crazy."
Shuai was anxious too, and truly felt that this was a challenge. With no other option, he went to find Boss Yu again and asked him for a trick.
Boss Yu patted Shuai on the shoulder: "Young man, now you know how hard the project is!"
Shuai replied: "Don't tease me. I already feel it deeply. I don't think my heart can take much more."
Boss Yu: "You can't take this? There will be plenty more for you in the future."
Shuai: "Big brother, enough with the empty talk. Hurry up and solve the problem."
Boss Yu: "What's the rush? It's not that urgent. Come here and let me show you a clear path."
"Doesn't every item have a category attribute? Out of those millions of entries, how many words actually belong to a given category? Suppose we only use the dictionary for that category: could our project stay stable then?"
Solution
Depending on business needs, we can split data tables vertically or horizontally to optimize performance.
Vertical splitting, also called column splitting, moves rarely used columns or long fields into a separate table so that the main table stays at a reasonable size. A common form of vertical splitting is a one-to-one association.
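As a rough illustration of vertical splitting, here is a minimal in-memory sketch. The function name splitRowVertically and the sample columns (id, word, cid, description) are assumptions made up for this example and do not come from the project.

```php
<?php
// Hypothetical illustration: split one wide row into a "hot" part
// (frequently read columns) and a "cold" part (long or rarely used columns).
// Both parts keep the key column so they can be joined one-to-one.
function splitRowVertically(array $row, string $keyColumn, array $coldColumns): array
{
    $hot = [];
    $cold = [$keyColumn => $row[$keyColumn]];
    foreach ($row as $column => $value) {
        if ($column !== $keyColumn && in_array($column, $coldColumns, true)) {
            $cold[$column] = $value;   // goes to the secondary table
        } else {
            $hot[$column] = $value;    // stays in the main table
        }
    }
    return ['hot' => $hot, 'cold' => $cold];
}

$row = ['id' => 7, 'word' => 'dress', 'cid' => 16, 'description' => 'a long text field'];
$parts = splitRowVertically($row, 'id', ['description']);
// $parts['hot']  holds id, word, cid
// $parts['cold'] holds id, description
```

In a real database the two parts would be two tables sharing the same primary key.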
Horizontal splitting, also called row splitting, stores records in different tables according to a business-specific rule. A common example is sharding tables by date.
In this case, we use horizontal splitting and shard the data by category.
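To make the idea concrete, the following sketch groups keyword records by category in memory, which is the array analogue of keeping one table per category. The function name shardByCategory and the sample data are assumptions for illustration only; the cid and word field names follow the code used later in this article.

```php
<?php
// Hypothetical illustration: route each keyword record to a per-category
// bucket, the in-memory analogue of one table per category.
function shardByCategory(array $records): array
{
    $shards = [];
    foreach ($records as $record) {
        $shards[$record['cid']][] = $record['word'];
    }
    return $shards;
}

$records = [
    ['cid' => 16, 'word' => 'dress'],
    ['cid' => 30, 'word' => 'jacket'],
    ['cid' => 16, 'word' => 'skirt'],
];
$shards = shardByCategory($records);
// Matching an item in category 16 now only needs $shards[16].
```

The point of the split is that matching an item only has to scan the words of its own category instead of the whole dictionary.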
Implementation Scheme
To avoid changing the table structure, the table name alone distinguishes which data table the project uses. This keeps the changes small: the code only needs a slight modification to solve the problem.
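Because the category ID ends up inside the table name, it is worth validating before it reaches the SQL string. The helper below is only a sketch under that assumption: keywordTableName is a hypothetical name, and only the keywords_ prefix comes from the queries in this article.

```php
<?php
// Hypothetical helper: build the per-category table name.
// Only positive integer IDs are accepted, because the value is
// interpolated into SQL and table names cannot be bound as parameters.
function keywordTableName($cid): string
{
    $isDigits = is_string($cid) && preg_match('/^[0-9]+$/', $cid) === 1;
    if (!is_int($cid) && !$isDigits) {
        throw new InvalidArgumentException("invalid category id");
    }
    $cid = (int) $cid;
    if ($cid <= 0) {
        throw new InvalidArgumentException("invalid category id");
    }
    return "keywords_" . $cid;
}

$table = keywordTableName(16);   // "keywords_16"
```

A query could then be assembled from keywordTableName($cid) rather than from the raw value.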
Modify the Keyword code to add a data source.
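word')";
        return mysql_query($sql, $this->getDbConn());
    }

    // Read one page of keywords from the table of the given category.
    // $limit is the starting row, $offset is the number of rows per page.
    public static function getWordsSource($cid, $limit = 0, $offset = 40) {
        $sql = "SELECT * FROM keywords_$cid LIMIT $limit,$offset";
        return DB::MakeArray($sql);
    }

    // Count all keywords in the table of the given category.
    public static function getWordsCount($cid) {
        $sql = "SELECT count(*) FROM keywords_$cid";
        return DB::QueryScalar($sql);
    }
}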
A QueryScalar method is added to the DB class to return the total row count.
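With the row count from getWordsCount and the LIMIT clause in getWordsSource, a category table can be read in fixed-size pages. The sketch below checks that arithmetic in isolation; pageStarts is a hypothetical helper, and the page size of 40 matches the default used in this article.

```php
<?php
// Hypothetical helper: list the LIMIT start positions needed to read
// $total rows in pages of $pageSize rows.
function pageStarts(int $total, int $pageSize): array
{
    if ($pageSize <= 0) {
        throw new InvalidArgumentException("page size must be positive");
    }
    $pages = (int) ceil($total / $pageSize);
    $starts = [];
    for ($i = 0; $i < $pages; $i++) {
        $starts[] = $i * $pageSize;
    }
    return $starts;
}

$starts = pageStarts(100, 40);
// Three pages: rows 0-39, 40-79, and 80-99.
```

Iterating while $i is less than the page count covers every row without requesting an empty trailing page.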
Modify the Selector code for word selection:
"BacklistCharListHandle", "synonym" => "LinklistCharListHandle");

    public static function select($num_iid) {
        $selectorItem = SelectorItem::createFromApi($num_iid);
        Logger::trace($selectorItem->props_name);
        $charlist = new CharList();
        // Run every registered handler to fill the character list.
        foreach (self::$charListHandle as $matchKey => $className) {
            $handle = self::createCharListHandle($className, $charlist, $selectorItem);
            $handle->exec();
        }
        $selectWords = array();
        // Only the dictionary of the item's own category is scanned.
        $wordsCount = Keyword::getWordsCount($selectorItem->cid);
        $offset = 40;
        $page = ceil($wordsCount / $offset);
        // "<" rather than "<=" avoids one extra, empty page query.
        for ($i = 0; $i < $page; $i++) {
            $limit = $i * $offset;
            $keywords = Keyword::getWordsSource($selectorItem->cid, $limit, $offset);
            foreach ($keywords as $val) {
                # code...
                $keywordEntity = SplitterApp::split($val["word"]);
                # code...
                if (MacthExector::macth($keywordEntity, $charlist)) {
                    $selectWords[] = $val["word"];
                }
            }
        }
        return $selectWords;
    }

    public static function createCharListHandle($className, $charlist, $selectorItem) {
        if (class_exists($className)) {
            return new $className($charlist, $selectorItem);
        }
        throw new Exception("class not exists", 0);
    }
}
Summary
Shuai has learned something new once again. Is it time to reward the boss? How about treating me to a meal!