標籤:http io ar for strong sp 資料 on 問題
沒有太多時間進行全文翻譯,就把重點挑出來,用自己的話串起來,名曰 選擇性翻譯。 以後可能會比較多的採用這種方式。
社會書籤的tag儲存一直是一個比較麻煩的問題。
一個好的資料表設計,不但要能準確查出tag,還應該支援tag的AND/OR/NOT查詢。我們來看看解決方案。
“MySQLicious” solution
表結構
儲存執行個體
Intersection (AND)
“search+webservice+semweb”類的查詢:
SELECT *
FROM `delicious`
WHERE tags LIKE "%search%"
AND tags LIKE "%webservice%"
AND tags LIKE "%semweb%"
Union (OR)
“search|webservice|semweb”類的查詢:
SELECT *
FROM `delicious`
WHERE tags LIKE "%search%"
OR tags LIKE "%webservice%"
OR tags LIKE "%semweb%"
Minus
“search+webservice-semweb”類的查詢
SELECT *
FROM `delicious`
WHERE tags LIKE "%search%"
AND tags LIKE "%webservice%"
AND tags NOT LIKE "%semweb%"
優點:
只有一個表 SQL比較直接
可以用mysql的全文檢索索引來做,效率更高
缺點:
tag的數量受到限制,通常我們都用varchar,這種欄位只256個位元組長。否則,你需要用text類型,速度會變慢。(Easy注,PHP(PHP培訓 php教程 )more的tag用的就是TinyText)
Like ‘%things%’不精確,當然某些應用中,這反而是需要的
“Scuttle” solution
資料表
Intersection (AND)
Query for “bookmark+webservice+semweb”:
SELECT b.*
FROM scBookmarks b, scCategories c
WHERE c.bId = b.bId
AND (c.category IN (’bookmark’, ‘webservice’, ’semweb’))
GROUP BY b.bId
HAVING COUNT( b.bId )=3
首先,所有書籤-tag組合被搜出來 (c.category IN (‘bookmark‘, ‘webservice‘, ‘semweb‘)), ,然後選擇其中包含三個的(HAVING COUNT(b.bId)=3)
Union (OR)
Query for “bookmark|webservice|semweb”:
只需要去掉?AND查詢中的HAVING子句:
SELECT b.*
FROM scBookmarks b, scCategories c
WHERE c.bId = b.bId
AND (c.category IN (’bookmark’, ‘webservice’, ’semweb’))
GROUP BY b.bId
Minus (Exclusion)
Query for “bookmark+webservice-semweb”, that is: bookmark AND webservice AND NOT semweb.
SELECT b. *
FROM scBookmarks b, scCategories c
WHERE b.bId = c.bId
AND (c.category IN (’bookmark’, ‘webservice’))
AND b.bId NOT
IN (SELECT b.bId FROM scBookmarks b, scCategories c WHERE b.bId = c.bId AND c.category = ’semweb’)
GROUP BY b.bId
HAVING COUNT( b.bId ) =2
好處: 我覺得這個方案比前一個方案好的最大理由是,可以有無限個tag。
“Toxi” solution
資料表
Intersection (AND)
Query for “bookmark+webservice+semweb”
SELECT b.*
FROM tagmap bt, bookmark b, tag t
WHERE bt.tag_id = t.tag_id
AND (t.name IN (’bookmark’, ‘webservice’, ’semweb’))
AND b.id = bt.bookmark_id
GROUP BY b.id
HAVING COUNT( b.id )=3
Union (OR)
Query for “bookmark|webservice|semweb”
SELECT b.*
FROM tagmap bt, bookmark b, tag t
WHERE bt.tag_id = t.tag_id
AND (t.name IN (’bookmark’, ‘webservice’, ’semweb’))
AND b.id = bt.bookmark_id
GROUP BY b.id
Minus (Exclusion)
Query for “bookmark+webservice-semweb”, that is: bookmark AND webservice AND NOT semweb.
SELECT b. *
FROM bookmark b, tagmap bt, tag t
WHERE b.id = bt.bookmark_id
AND bt.tag_id = t.tag_id
AND (t.name IN (’Programming’, ‘Algorithms’))
AND b.id NOT IN (SELECT b.id FROM bookmark b, tagmap bt, tag t WHERE b.id = bt.bookmark_id AND bt.tag_id = t.tag_id AND t.name = ‘Python’)
GROUP BY b.id
HAVING COUNT( b.id ) =2
Leaving out theHAVING COUNTleads to the Query for “bookmark|webservice-semweb”.
好處:
你可以給每個tag添加額外的資訊
這是最規範的方案,第三範式。
壞處:
刪除tag時,你要從多個表中刪除(Easy注,Mysql5的話,可以用trigger來做)
然後我們把視線從功能轉移到效能上。
A+B
250個tag
999個tag
A OR B
250個tag
添加速度比較
測試代碼下載?Download the source code (PHP) LGPL協議。
PHP標籤Tag的設計模式