Learning SQL database-based algorithms

Source: Internet
Author: User

Algorithms are a major research area in computer science and the key to solving complex problems; they appear everywhere in computing. A database is a place for storing data and performing bulk computation, and simple SQL commands are routinely used to store, query, and aggregate data for real-world problems. As data volumes grow sharply and business rules become more complex, dedicated methods are increasingly needed to meet efficiency and accuracy requirements. How to convert complex algorithms into commands that a database can execute is therefore one aspect of research on database application technology. This article uses Microsoft SQL Server (MSSQL) commands for its examples.

A database stores sets of entity data. For computation it normally works in batches: data is read from storage in batches, processed, and written back in batches. Some databases also provide cursors, which fetch one row at a time so that each field can be read, run through complex business-rule calculations, and written back to the database. Compared with cursors, batch processing consumes fewer resources; cursors consume considerably more, are slow and inefficient, and come with restrictions such as locking.

For example, suppose the table student_score (SNO, CNO, score, level) is stored in the database, with score ranging from 0 to 100. The level field describes the quality of the score: A for 90 or higher, B for 80 to 89, C for 60 to 79, and D for below 60. Any of the following algorithms can fill in this field:

1. Define a cursor that selects all records in the student_score table, a variable @cur_score to hold the score of the current record, and a variable @cur_level to hold the level for that score. The algorithm is as follows: while the cursor still has records, fetch the score of the current record from the cursor, determine which score band it falls in, store the result in @cur_level, and update the level field of the current record with the value of @cur_level. Each record requires at least two database accesses, one read to fetch it and one write to update it, and every record must go through this process, so efficiency is relatively low.

2. Update the database in batches: set the level field of all records to D, then update records with scores greater than or equal to 60 to C, and then update B and A in turn. One disadvantage is that the level field of some records is updated several times. For example, a record whose final level is A is first set to D and then updated to C, B, and A in turn. These repeated updates can be eliminated by improving the algorithm: update the level field of records with scores from 0 to 59 to D, and then update each remaining score band in turn, so that every record is written only once. The SQL statements for this algorithm are not hard to write; a BETWEEN ... AND ... expression can express the selection condition for a band such as 80 to 89.

3. Building on the end of the second method, where a BETWEEN ... AND ... expression selects a score band: an update can reference another table at the same time, so the bands and their corresponding levels can be stored in a table, level_about, and referenced while updating student_score. During the calculation, the contents of level_about are read and used to determine each level. Over the whole computation this trades some space and a little efficiency for ease of expression, but given the speed of current computers and the small size of the level_about table, the cost is negligible. The level_about table contains at least three fields: start_score (the starting score of a band), end_score (the ending score of the band), and level (the level assigned to scores between the starting and ending scores). The data in the table should look similar to the following:

start_score  end_score  level
0            59         D
60           79         C
80           89         B
90           100        A

To update the records in the student_score table, use start_score and end_score to determine the level of the score in each record. The SQL statement in MSSQL is:

UPDATE student_score SET student_score.level = level_about.level
FROM level_about
WHERE student_score.score BETWEEN level_about.start_score AND level_about.end_score
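The lookup-table update can be tried out with a small runnable sketch. It uses Python's standard-library sqlite3 module as a stand-in for MSSQL, which is an assumption made only so the example is self-contained. The function name grade_with_lookup and the sample rows are hypothetical, and the MSSQL UPDATE ... FROM form is swapped for a correlated subquery, which expresses the same range lookup.

```python
import sqlite3

def grade_with_lookup(scores):
    """Assign a level to each (sno, cno, score) row via the level_about range table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE student_score (sno INTEGER, cno INTEGER, score INTEGER, level TEXT)"
    )
    conn.execute(
        "CREATE TABLE level_about (start_score INTEGER, end_score INTEGER, level TEXT)"
    )
    # Score bands taken from the table in the article.
    conn.executemany(
        "INSERT INTO level_about VALUES (?, ?, ?)",
        [(0, 59, "D"), (60, 79, "C"), (80, 89, "B"), (90, 100, "A")],
    )
    conn.executemany(
        "INSERT INTO student_score (sno, cno, score) VALUES (?, ?, ?)", scores
    )
    # One set-based statement: every row looks up the band that contains its score.
    conn.execute(
        """
        UPDATE student_score
        SET level = (
            SELECT la.level
            FROM level_about AS la
            WHERE student_score.score BETWEEN la.start_score AND la.end_score
        )
        """
    )
    rows = conn.execute(
        "SELECT sno, cno, score, level FROM student_score ORDER BY sno, cno"
    ).fetchall()
    conn.close()
    return rows
```

Because the bands live in a table, changing a boundary means editing a row in level_about rather than rewriting the UPDATE statement.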

Comparing the three methods shows that different algorithms for the same purpose differ in their effect.
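To make the difference between the first two methods concrete, here is a minimal sketch that counts the UPDATE statements each one issues. It is written in Python with the standard-library sqlite3 module standing in for MSSQL (an assumption for runnability). The row-by-row loop only imitates a cursor, and the table is simplified to hypothetical columns sno, score, and level.

```python
import sqlite3

def make_db(scores):
    """Create an in-memory student_score table with one row per score."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE student_score (sno INTEGER PRIMARY KEY, score INTEGER, level TEXT)"
    )
    conn.executemany(
        "INSERT INTO student_score (sno, score) VALUES (?, ?)",
        list(enumerate(scores, start=1)),
    )
    return conn

def levels(conn):
    return [row[0] for row in conn.execute("SELECT level FROM student_score ORDER BY sno")]

def grade_row_by_row(scores):
    """Method 1 (cursor style): one read and one UPDATE per record."""
    conn = make_db(scores)
    updates = 0
    for sno, cur_score in conn.execute("SELECT sno, score FROM student_score").fetchall():
        if cur_score >= 90:
            cur_level = "A"
        elif cur_score >= 80:
            cur_level = "B"
        elif cur_score >= 60:
            cur_level = "C"
        else:
            cur_level = "D"
        conn.execute("UPDATE student_score SET level = ? WHERE sno = ?", (cur_level, sno))
        updates += 1
    result = levels(conn)
    conn.close()
    return result, updates

def grade_in_batches(scores):
    """Improved method 2: one UPDATE per score band, each row written once."""
    conn = make_db(scores)
    bands = [(0, 59, "D"), (60, 79, "C"), (80, 89, "B"), (90, 100, "A")]
    updates = 0
    for low, high, level in bands:
        conn.execute(
            "UPDATE student_score SET level = ? WHERE score BETWEEN ? AND ?",
            (level, low, high),
        )
        updates += 1
    result = levels(conn)
    conn.close()
    return result, updates
```

Both functions return the same levels; the row-by-row version issues one UPDATE per record, while the batch version always issues four, however many records the table holds.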

Some simple algorithms can be applied directly in the database without modification. For example, suppose the business runs a settlement every night, and the system must automatically settle bonuses twice a week, every Monday and Thursday. To make settlement automatic, create a job with the system's task scheduler and have it run every night (because the settlement interval may change, the scheduling function of the job itself cannot be used to pick the days). To settle on Monday and Thursday, create a table MISC in the database whose fields act as global variables. The table has only one record, and one of its fields (days) records the current settlement count, that is, the number of days since the system started running. Each time the system runs the job, it updates days in the MISC table: update MISC set days = days + 1.

The business requirement is to settle bonuses every Monday and Thursday. It is not hard to see that the odd-numbered settlements are 7 days apart, the even-numbered settlements are also 7 days apart, and adjacent odd and even settlements are 3 days apart, so the remainder (modulo) operation handles both cases uniformly. If the remainder of days divided by 7 is 0, or the remainder of (days - 3) divided by 7 is 0, the current day is a settlement day. The implementation is:

1. Read the current number of days into a variable: declare @days int set @days = (select days from MISC)

2. Determine whether the settlement condition is met: if @days % 7 = 0 or (@days - 3) % 7 = 0 begin ... end

Such a simple algorithm can be directly applied to the database without any problems.
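The day counter and the remainder test can be simulated in a few lines. This is a sketch under stated assumptions: Python's standard-library sqlite3 module stands in for MSSQL, the counter is assumed to start at 0, and the function names are hypothetical. The body of the settlement itself, which the article elides, is not modeled.

```python
import sqlite3

def is_settlement_day(days):
    """Remainder test from the article: settle when days or (days - 3) is a multiple of 7."""
    return days % 7 == 0 or (days - 3) % 7 == 0

def run_nightly_job(conn):
    """Increment the single-row counter in MISC, then report whether tonight settles."""
    conn.execute("UPDATE misc SET days = days + 1")
    (days,) = conn.execute("SELECT days FROM misc").fetchone()
    return days, is_settlement_day(days)

def simulate(nights):
    """Run the nightly job for the given number of nights; return the settlement days."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE misc (days INTEGER)")
    conn.execute("INSERT INTO misc VALUES (0)")  # assumed starting value
    settled = []
    for _ in range(nights):
        days, hit = run_nightly_job(conn)
        if hit:
            settled.append(days)
    conn.close()
    return settled
```

Over two weeks the settlement days come out as 3, 7, 10, and 14, that is, two per week with gaps of 4 and 3 days. Which weekdays these correspond to depends on when the counter is started.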

Complex business rules require complex algorithms. A complex rule is already complicated when applied to a single variable with a specific value; when a large number of arbitrary values are stored in the database and must be processed in batches, the algorithm needs considerable adjustment.

For example, a business rule states that for every 4000 yuan of bonus an employee accumulates, 400 yuan is deducted for repeat consumption, and the deduction is taken from the last 400 yuan of each 4000. Each repeat consumption is rewarded with one product, and a table (award_repeat) in the database records each repeat consumption that is generated. If the amount deducted so far is less than 400 yuan, the deduction continues at the next settlement until the full 400 yuan has been deducted; a product is then awarded and the next cycle begins. For example, if the total bonus reaches 3600 yuan, nothing is deducted. If it reaches 3700 yuan, 100 yuan is deducted. If it reaches 7700 yuan, 410 yuan is deducted and one repeat consumption is generated.

To implement this rule, record each employee's total bonus ([total_award]) and number of repeat consumptions ([repeat_num]) in the employee table (member). In a transition table (award_day), record each bonus and the repeat-consumption deduction taken from it. Finally, the bonus table (award) holds the actual bonus, obtained at settlement after subtracting the repeat consumption that must be deducted. Using the batch method, after the bonuses have been calculated and the repeat-consumption deductions made, the current bonus is added to the employee's [total_award] field, which records the sum of all bonuses and is used to determine repeat consumption. The algorithm for generating repeat consumptions defines a condition F1: the total bonus in the member table is at least (repeat_num + 1) multiplied by 4000. If any record meets condition F1, select the primary key of each such record together with the current date (days) and insert them into the repeat-consumption table (award_repeat), then increase repeat_num by 1 for the records in member that meet F1. Repeat this check until no record in the member table meets condition F1.
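The repeat-until-no-match loop for condition F1 can be sketched as follows. Several things here are assumptions: Python's standard-library sqlite3 module stands in for MSSQL, F1 is read as total_award >= (repeat_num + 1) * 4000, and the column names id and member_id and the function name settle_repeats are hypothetical. The sketch covers only the generation of repeat-consumption records; the 400 yuan deduction amounts and the award_day and award tables are not modeled.

```python
import sqlite3

THRESHOLD = 4000  # bonus total per repeat-consumption cycle, from the article

# Condition F1 under the assumed reading described in the lead-in.
F1 = f"total_award >= (repeat_num + 1) * {THRESHOLD}"

def settle_repeats(totals, days):
    """Generate repeat-consumption rows in batches until no member meets F1.

    totals maps a member id to that member's total_award; days is the current day counter.
    Returns (repeat_num per member, sorted award_repeat rows).
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE member (id INTEGER PRIMARY KEY, total_award INTEGER, "
        "repeat_num INTEGER DEFAULT 0)"
    )
    conn.execute("CREATE TABLE award_repeat (member_id INTEGER, days INTEGER)")
    conn.executemany(
        "INSERT INTO member (id, total_award) VALUES (?, ?)", list(totals.items())
    )
    # Each pass handles every qualifying member at once (batch style, no cursor).
    while conn.execute(f"SELECT EXISTS (SELECT 1 FROM member WHERE {F1})").fetchone()[0]:
        conn.execute(
            f"INSERT INTO award_repeat (member_id, days) SELECT id, ? FROM member WHERE {F1}",
            (days,),
        )
        conn.execute(f"UPDATE member SET repeat_num = repeat_num + 1 WHERE {F1}")
    repeat_nums = dict(conn.execute("SELECT id, repeat_num FROM member").fetchall())
    repeats = sorted(conn.execute("SELECT member_id, days FROM award_repeat").fetchall())
    conn.close()
    return repeat_nums, repeats
```

The number of passes equals the largest number of new repeat consumptions owed to any single member, not the number of members, which is what makes the loop suitable for batch processing.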

Conclusion: Researching and implementing algorithms in databases is difficult and challenging. As business rules in the real world grow more complex, the algorithms needed to implement them in database application software grow more complex as well. Applying complex algorithms to databases calls for a unified approach: on the premise of being familiar with the business rules, finding steps suited to database batch computing, based on the characteristics of the database and the commands it can execute, is the key to solving the problem.

 
