Using upgrade scripts to manage and publish database versioning

Source: Internet
Author: User
Tags define definition include microsoft sql server range versions visual studio visual studio 2010

1. Introduction

Database upgrades are often dragged to the "finalize phase" of the release task, which is often left at the end of the project or the last sprint before the release, which is not ideal because:

Every time the software is released on a test environment, the database is often rebuilt, which means that testers lose their test data each time.

If the duration of the project is longer, the upgrade scripts are sometimes delayed until a few months after the initial database changes occur, and the idea of how to migrate the data may have long been forgotten or missing.

If the upgrade script does not undergo thorough and repetitive testing before publishing to the production environment, the risk of failure is very high.

The time required to write an upgrade script is unpredictable, which increases the risk of publishing by scheduled dates and costs.

To avoid these problems, experience has taught me that a good database versioning and publishing strategy is required for most enterprise-class projects. That's how we handled it when we were objectivity.

2. About Agile

Our project takes the agile approach, meaning that the application is developed incrementally and iteratively, and the database is part of the software development process. The first thing to do is to define the "completion criteria" (Definition of Done–dod), which is very important for each efficient team. The completion criteria for the user story level should include a "Story" condition, which means that we only need to consider the completion of the user story, which can then be automatically published through the script. Of course there are many other conditions in the completion criteria (writing database upgrade scripts is one of them), but this topic can be an article.

The completion criteria developed in this manner also have an impact on the sprint plan and the estimate, which serves as a checklist to verify that all major tasks have been implemented. On the database side, each team member should know how to write an upgrade script in accordance with the rules that the team follows: What format is used? Are you using some kind of template? Where is the file saved? What are the naming conventions to follow? Such.

In the development process, the developer completes the Code and database modification in parallel. In addition to making changes to the database project, team members also need to write upgrade scripts, then check in version control with other code, and test user stories in a separate environment.

When a sprint is finished, if you decide to deploy this part of the software functionality to the production environment, the scripts join the entire installation process along with the other necessary steps.

3, version management of the way

Each project has different implementation details for version management of the database, but contains similar key elements, for example:

The database is managed by version control – this is an obvious starting point. If we can't point to the changes in the database, how can we write an upgrade script for them? We have successfully applied the database project in Visual Studio 2010 or Redgate SQL Source control to manage the structure of the database, both of which support the TFS version library. This section has now been supported by a number of tools.

Database version information is kept in the database itself – the database itself needs to be marked in order to check the version of the database schema that is running in a specified environment. There are many ways to do this (user-defined functions, extended attributes, specially named objects, and so on), but in objectivity, we always use a dedicated database table to save version information, usually this table is called Dbversion. The advantage of this approach is that database tables are standard database objects that are widely understood and used by developers and administrators and can be easily accessed from the code side. The database can choose to save the current version or the entire version history. You can see a sample table structure definition in table 1 below.

Check the version of the database when the application starts-contains the check functionality for the database version in the application's code and verifies it when it is initialized. If a condition is not met, the application displays some kind of error message and stops running. This best practice keeps the development team from being stuck with major deployment errors and minimizes the risk of wasting test time.

Writing upgrade scripts in parallel with development – When developers modify the database schema, the SQL scripts that change the database are prepared beforehand. We prepared a set of templates for this script (see the example in Table 2). The previous rows of the template check the expected version of the database and start a transaction automatically once the version is correct. When specific database changes, such as those written by the developer, are completed, the template updates the database version table, submits the transaction, and displays the success information. This practice solves the 2nd and 4th issues mentioned in the Introduction section of this article.

4. Mixed solution

Sometimes, if the number of objects in a database is too large (not exponential), the upgrade script can become bloated, especially if we use a stored procedure or custom function. One approach is to try to control the upgrade scripts only on certain object types, usually the objects that store the data (such as tables). and reload other types of objects during the last phase of the upgrade process. If a team is just touching the database upgrade process and the database contains a lot of business logic, I highly recommend this hybrid solution.

5, how to deal with data?

Data can basically be divided into two groups of processing:

Initializing data, which is essential to running or starting an application, such as referencing data, data dictionaries, and so on.

Business data, which is the sample data that is created by an application's interface, imported by an external data source, or designed to allow developers and testers to start working.

The recommended approach is to separate the two sets of data from the outset of the project to avoid finding problems during the "finale phase".

When initializing a database, we write each set of data as a script or CSV file, put it in a different folder, or embed the initialization data within the upgrade script (which simplifies deployment within a small system). In addition to putting the data in different folders, it's best to be mindful of writing scripts so that the scripts you write can run multiple times (to avoid side effects). Another problem you need to deal with is the order in which database tables are inserted. In a complex database schema (such as a database that contains circular dependencies), it is not possible to accurately set the order of database tables, so the best practice is to prevent foreign key relationships when data is inserted, and then reopen the data after it is inserted.

6. Version Management best Practices

The following practices are not necessary, but I find them very useful and you should at least consider applying them to new projects.

Use a three-part version string

We find that the database version is very flexible with the following format string:

< major version number >.< sub version number >.< revision >

The first part will be changed during the important release or major phase of the system, such as once every few months. The following two sections are controlled by the developer. Versioning changes mean that destructive changes (such as new required fields) are added to the database, which makes the "old" application no longer compatible with the new database schema. The revision number is incremented each time a non-destructive change occurs (such as a new index, a new table, a new optional field, and so on).

Writing scripts that do not depend on the environment

In theory, the authoring of the upgrade scripts should allow them to run in different environments without having to make changes. This means that it should not include paths, database instance names, SQL user names and associated server settings, and so on. In Microsoft SQL Server, you can use the SQLCMD variable to achieve this goal. More information can be viewed here.

If multiple teams are developing in one database at the same time, dividing the entire database into multiple schemas

If you divide a large database into multiple schemas, it becomes very effective for multiple concurrent teams to develop in the database at the same time. Each schema includes its own version and update scripts, which minimizes the conflict between code merges. Of course, the Dbversion table also needs to be modified to allow for the storage of schema versions (new fields). We can separate two schemas: shared schemas and exclusive schemas. When a team plans to change the shared schema, it must consult other teams to ensure that the structure of the shared objects is modified correctly. The exclusive schema is fully controlled by a team.

Alternatively, if a database is a legacy system and we are unable to introduce a new schema, we can divide the database objects into several virtual parts and have each part versioned separately.

After you check in the upgrade scripts, never modify them

When the database shows its current version, you will naturally use it. As a developer, you don't typically compare it to the original version. This is also difficult to discover if different versions of the upgrade scripts are applied to a database instance at the same time. If you have incorrectly modified something in your upgrade script, write a new script to offset previous modifications-do not modify the original script, as it may have been applied to some environment.

When multiple versions are developed at the same time, a certain range of version numbers is retained to simplify the merge operation

When multiple teams develop multiple release numbers in parallel in the same system and database, it is best to agree beforehand to reserve a range of version numbers for each team to avoid possible consolidation problems.

For example, the team a currently in Release Number 1 can use the 2.x.x version number for the shared schema, use the 1.x.x version for the order schema, and the team currently in Release Number 2 uses the 3.x.x version number for the shared schema and uses the 1.x.x version number for the report schema.

Automated upgrade process

One drawback of writing an upgrade script during development is that it is too numerous. As a result, automation is the ideal solution because it saves a lot of time for developers and release managers and others. In addition, it accelerates the entire release process, making the process more resilient. and automating the upgrade process is also easy to incorporate into the continuous integration process.

In objectivity we use the Psake module (PowerShell) to automate the process. PowerShell is the Microsoft Task Automation framework, which includes a scripting language built on top of the. NET Framework. Psake, on the other hand, is a domain-specific language written in PowerShell that uses a dependency pattern similar to rake or MSBuild to create a build. A Psake build consists of multiple tasks. Each task is a direction in which you can define dependencies on other task methods. Our upgrade script is written as a separate Psake task.

This is our database upgrade step:

Check the current version of the database

Review the upgrade scripts associated with the current version (this step relies on the file naming conventions corresponding to the database version)

If the file is found, the file content is executed and the output is validated, and exits if an error occurs

If no script is found, exit directly

Repeat step 1

You can find an example implementation in table 3.

Validate the upgrade script in your continuous integration process

We often find in the objectivity project that developers who are unfamiliar with database upgrade processes sometimes break the rules used in the project when they write upgrade scripts. Therefore, it is a good idea to verify the consistency of your upgrade scripts after each check-in to the continuous integration server, including checking the following:

File naming conventions-we use the following format for file names:

< prefix >_< the current version number in the database version table >_< target version number >_< additional information about upgrades >.sql,

For example: Upgrade_1.0.1_1.0.2_rename_column.sql

If multiple schemas are used, the schema name is included in the prefix.

File Content – You can check the header and tail content of the script to make sure that the correct template is used, and you can compare the file name with the version number in the content.

The validation process can be done before the code is actually built. Once the violation is detected, the build fails.

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.