A technique for checking whether data belongs to multiple categories

Source: Internet
Author: User
Tags: decimal to binary

Classification algorithms are used in several systems I have developed independently. Each data entry may be assigned to several categories. The detailed scores can simply be placed in a redundant table, but the categories of each data entry must be recorded in the original data table. This raises two difficult problems: one is searching by category, and the other is adding categories. There are several solutions:

1) Record each category as its own field that flags whether the entry belongs to it: a query then just adds a WHERE condition on that field. Searching for multiple categories requires several conditions, and the query efficiency is average. Adding a category is troublesome, however: you must modify both the database fields and the program.

2) Record all categories in one field: adding a category is easy, but searching is inconvenient. Searching for a single category can still be done with LIKE, but searching for several categories at once requires several LIKE clauses in the SQL, which is even less acceptable. A compromise is to store the categories in a fixed order, so that a single LIKE can handle the search.

3) Likewise record all categories in one field, but query it with a full-text index: this also works for compound queries. However, MySQL's FULLTEXT index was long available only for MyISAM tables (InnoDB gained it in MySQL 5.6), and using Lucene or Sphinx requires additional configuration, which is a little troublesome.
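The compromise in solution 2 can be sketched as follows. This is only an illustration under stated assumptions: the single-letter tags, the `CATEGORY_ORDER` string, the `entries` table, and the `like_pattern` helper are all hypothetical, and SQLite (from Python's standard library) stands in for MySQL so that the example runs on its own.

```python
import sqlite3

# Hypothetical fixed ordering of category tags. Every stored value lists its
# tags in this order, e.g. "ac" for an entry in categories a and c.
CATEGORY_ORDER = "abc"

def like_pattern(wanted):
    """Build one LIKE pattern matching entries that contain all wanted tags."""
    ordered = sorted(wanted, key=CATEGORY_ORDER.index)
    return "%" + "%".join(ordered) + "%"

# SQLite is used here only as a stand-in for MySQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE entries (id INTEGER PRIMARY KEY, categories TEXT)")
conn.executemany(
    "INSERT INTO entries (id, categories) VALUES (?, ?)",
    [(1, "a"), (2, "b"), (3, "c"), (4, "ac")],
)

# Searching for entries in both a and c needs only one LIKE condition.
pattern = like_pattern(["c", "a"])
rows = conn.execute(
    "SELECT id FROM entries WHERE categories LIKE ? ORDER BY id", (pattern,)
).fetchall()
matching_ids = [row[0] for row in rows]
```

With multi-character category names a delimiter between tags would also be needed, which this sketch leaves out.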

I was not satisfied with any of the above methods. Today I suddenly thought of a common technique from algorithm problems I had solved before: use binary to solve the problem. If membership in a category is a 0 or 1 flag, then multiple categories can be recorded as multiple bits. That solves the problem.

Assume that there are currently three categories, written in the order C, B, A (why the order is reversed is explained below). There are four data entries whose category values are 001, 010, 100, and 101 respectively. The last entry shows that some data belongs to more than one category.
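The encoding above can be sketched in Python. The constant names, the `entries` dictionary, and the `category_names` helper are assumptions made for illustration; only the bit values come from the text.

```python
# One bit per category; A is the lowest (rightmost) bit, so the digits read C, B, A.
A = 0b001
B = 0b010
C = 0b100

# The four example entries, keyed by a hypothetical entry id.
entries = {1: A, 2: B, 3: C, 4: C | A}

def category_names(mask):
    """List the category names whose bit is set in mask."""
    names = {"A": A, "B": B, "C": C}
    return [name for name, bit in names.items() if mask & bit]

# Entry 4 is stored as 101, meaning it belongs to both C and A.
entry4_bits = format(entries[4], "03b")
entry4_names = category_names(entries[4])
```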

Category search method: since the categories are now binary, it is natural to use bitwise operations. Here we take MySQL syntax as an example. To search for data in category A, the condition is categories & b'001', which matches data 1 and data 4. Searching for multiple categories is just as easy. To search for data in category A or category B, the condition is categories & b'011'. To search for data in both A and B, the condition is categories & b'011' = b'011'.
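The three search conditions can be checked against the four example entries with a small Python sketch. The helper names `in_any` and `in_all` are hypothetical; they mirror the MySQL conditions rather than reproduce them.

```python
A, B, C = 0b001, 0b010, 0b100

# The four example entries from the text, keyed by entry number.
entries = {1: 0b001, 2: 0b010, 3: 0b100, 4: 0b101}

def in_any(mask, wanted):
    """True if mask has at least one of the wanted bits (categories & wanted)."""
    return (mask & wanted) != 0

def in_all(mask, wanted):
    """True if mask has every wanted bit (categories & wanted = wanted)."""
    return (mask & wanted) == wanted

ids_in_a = [i for i, m in entries.items() if in_any(m, A)]
ids_in_a_or_b = [i for i, m in entries.items() if in_any(m, A | B)]
ids_in_a_and_b = [i for i, m in entries.items() if in_all(m, A | B)]
ids_in_a_and_c = [i for i, m in entries.items() if in_all(m, A | C)]
```

None of the four sample entries is in both A and B, so that search comes back empty, while the A and C search finds entry 4.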

Method of adding a category: because the low-order bits are always on the right, a newly added category must take the next higher-order bit. For example, after adding a category D, the existing data is by default not in category D, so the existing data does not need to be modified.
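A minimal sketch of adding category D, assuming the same four example entries. The entry with id 5 is made up to show a new record using the new bit.

```python
A, B, C = 1 << 0, 1 << 1, 1 << 2
entries = {1: 0b001, 2: 0b010, 3: 0b100, 4: 0b101}

# The new category takes the next higher bit; stored values are left untouched.
D = 1 << 3

# No existing entry has the D bit set, so none is in D by default.
ids_in_d = [i for i, m in entries.items() if m & D]

# A hypothetical new entry that belongs to D and A.
entries[5] = D | A
ids_in_a = [i for i, m in entries.items() if m & A]
ids_in_d_after = [i for i, m in entries.items() if m & D]
```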

This method is very efficient for determining whether an entry belongs to a category: the computer only needs a single operation to get the result. However, MySQL limits the number of bits: an integer value holds at most 64 bits, so at most 64 categories fit in one value. Keep this in mind when using the method. You can use the BIN(n) function to convert a decimal number to binary.
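For checking values outside the database, a rough Python counterpart of the decimal-to-binary conversion and of the 64-bit limit might look like this. The function names and `MAX_BITS` are assumptions for illustration.

```python
MAX_BITS = 64  # assumed width of the integer column holding the categories

def to_binary(n):
    """Return the binary digits of a non-negative integer as a string."""
    return format(n, "b")

def from_binary(bits):
    """Parse a string of binary digits back into an integer."""
    return int(bits, 2)

def fits_in_64_bits(mask):
    """True if mask can be stored in an unsigned 64-bit value."""
    return 0 <= mask < (1 << MAX_BITS)

five_as_bits = to_binary(5)
highest_category = 1 << (MAX_BITS - 1)  # the 64th and last usable category bit
one_too_many = 1 << MAX_BITS            # a 65th category would not fit
```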

 
