Introduction to Database Refactoring


Original link: by Scott W. Ambler; http://www.tdan.com/view-articles/5010/

Published: July 1, 2006. Published in TDAN.com, July 2006.

Material for this article is modified from Refactoring Databases: Evolutionary Database Design by Scott Ambler and Pramod Sadalage (Addison Wesley 2006). www.ambysoft.com/books/refactoringDatabases.html

Abstract:

A database refactoring is a small change to a database schema which improves its design without changing, at a practical level, the semantics of the database. In other words, it is a simple database transformation which neither adds nor breaks anything. The process of database refactoring defines how to safely evolve a database schema in small steps. Database refactoring enables data professionals to work in an evolutionary manner, just as modern application developers do. It also provides a coherent strategy for organizations to dig their way out of the legacy database hole.

1. What is Database Refactoring?

In the seminal text Refactoring, Martin Fowler [1] describes the programming technique called refactoring, which is a disciplined way to restructure code in small steps. Refactoring enables you to evolve your code slowly over time, to take an evolutionary (iterative and incremental) approach to programming. A critical aspect of a refactoring is that it retains the behavioral semantics of your code. You do not add functionality when you are refactoring, nor do you take it away. A refactoring merely improves the design of your code, nothing more and nothing less.

A database refactoring [2, 3] is a simple change to a database schema that improves its design while retaining both its behavioral and informational semantics. In other words, you cannot add new functionality or break existing functionality, you cannot add new data, and you cannot change the meaning of existing data. A database schema includes both structural aspects, such as table and view definitions, and functional aspects, such as stored procedures and triggers. I use the term code refactoring to refer to traditional refactoring as described by Martin Fowler, and database refactoring to refer to the refactoring of database schemas. The process of database refactoring is the act of making these simple changes to your database schema.

2. Why Database Refactoring?

There are two fundamental reasons why you want to adopt database refactoring:

  1. To repair existing legacy databases. Database refactoring enables you to safely evolve your database design in small steps, making it an important technique for improving the legacy assets within your organization. This is clearly much less risky than a "big bang" approach where you rewrite all of your applications, rework your database schema, and release them all to production at once. Furthermore, it is much better than the "let's try not to allow things to get any worse" strategy currently employed by the vast majority of data management groups which I've run into, a strategy which has no hope of success because all it takes is one development team to go around the data management group and do an imperfect database design.
  2. To support evolutionary software development. Modern software development processes, including the Rational Unified Process (RUP), Extreme Programming (XP), Agile Unified Process (AUP), Scrum, and Dynamic System Development Method (DSDM), are all evolutionary in nature. Craig Larman [4] summarizes the evidence, as well as the overwhelming support among the thought leaders within the IT community, in support of evolutionary approaches. Unfortunately, most data-oriented techniques are serial in nature, relying on specialists performing relatively narrow tasks, such as logical data modeling or physical data modeling. Therein lies the rub: the two groups need to work together, but both want to do so in different manners. I believe that data professionals need to adopt evolutionary techniques, such as database refactoring, which enable them to be relevant to modern development teams. Luckily these techniques exist [3], and they work quite well; it is now up to data professionals to choose to adopt them.

3. Implementing a Database Refactoring

Sometimes a project team finds itself in a relatively simple, "single-application database" situation, and if so they should consider themselves lucky. In this architecture database refactoring is fairly simple: you merely change your database schema and update your application to use the new version of the schema. What is more typical is to have many external programs interacting with your database, some of which are beyond the scope of your control. In this situation you cannot assume that all the external programs will be deployed at once, and must therefore support a transition period (also referred to as a deprecation period) during which both the old schema and the new schema are supported in parallel. For the rest of this article I'll assume that you're in this situation.

To put database refactoring into context, let's step through a quick example. You have been working on a banking application for a few weeks and have noticed something strange about the Customer table depicted in Figure 1: one of the column names isn't easy to understand. You decide to apply the Rename Column refactoring to the FName column to rename it to FirstName.

Figure 1. The initial database schema for Customer.

Agilists typically work together in pairs; one person should have application programming skills and the other data skills, and ideally both people have both sets of skills. The pair begins by determining whether the database schema needs to be refactored. Perhaps the programmer is mistaken about the need to evolve the schema, or about how best to go about the refactoring. The refactoring is first developed and tested within the developer's sandbox. When it is finished, the changes are promoted into the project-integration environment, and the system is rebuilt, tested, and fixed as needed.

To apply the Rename Column refactoring in the development sandbox, the pair first runs all the tests to see that they pass. Next, they write a test because they are taking a Test-Driven Development (TDD) approach [5, 6, 7]. A likely test is to access a value in the FirstName column. After running the tests and seeing them fail, they implement the actual refactoring. To do this they introduce the FirstName column and the SynchronizeFirstName trigger, as you see in Figure 2.
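As a rough illustration of such a test, the following Python sketch uses the standard sqlite3 module. The Customer table layout (a CustomerID key plus FName) and the helper names read_first_name and first_name_test_passes are assumptions made for this example, since Figure 1 is not reproduced here. Before the refactoring the query fails because the FirstName column does not exist yet, which is the failing test the pair expects to see.

```python
import sqlite3


def read_first_name(conn, customer_id):
    """Return the FirstName value for one customer, or None if there is no such row."""
    row = conn.execute(
        "SELECT FirstName FROM Customer WHERE CustomerID = ?", (customer_id,)
    ).fetchone()
    return row[0] if row else None


def first_name_test_passes(conn, customer_id, expected):
    """Hypothetical test: True only when FirstName exists and holds `expected`."""
    try:
        return read_first_name(conn, customer_id) == expected
    except sqlite3.OperationalError:
        # SQLite reports a missing column as an OperationalError.
        return False
```

Run against the initial schema, first_name_test_passes returns False; once the column has been added and populated it returns True.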

Figure 2. The database schema during the transition period.

The trigger is required to keep the values in the two columns synchronized: each external program accessing the Customer table will at most work with one of the columns, not both. At first, all production applications will work with FName, but over time they will be reworked to access FirstName instead. There are other options to do this, such as views or synchronization after the fact, but I find that triggers work best.
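A minimal sketch of the transition schema, written for SQLite through Python's sqlite3 module, might look like the following. It is an illustration under assumptions rather than the schema from Figure 2: the CustomerID key column is assumed, and because an SQLite trigger handles a single event, the article's SynchronizeFirstName trigger is split here into three triggers with hypothetical suffixes. The WHEN clauses stop the triggers from firing again once the two columns already agree.

```python
import sqlite3

# Assumed starting point: Customer(CustomerID INTEGER PRIMARY KEY, FName TEXT).
TRANSITION_DDL = """
ALTER TABLE Customer ADD COLUMN FirstName TEXT;

-- New rows: copy whichever column the writing application supplied.
CREATE TRIGGER SynchronizeFirstName_Insert
AFTER INSERT ON Customer
WHEN NEW.FName IS NULL OR NEW.FirstName IS NULL
BEGIN
    UPDATE Customer
       SET FirstName = COALESCE(NEW.FirstName, NEW.FName),
           FName     = COALESCE(NEW.FName, NEW.FirstName)
     WHERE CustomerID = NEW.CustomerID;
END;

-- Old applications update FName; mirror the value into FirstName.
CREATE TRIGGER SynchronizeFirstName_UpdateOld
AFTER UPDATE OF FName ON Customer
WHEN NEW.FName IS NOT NEW.FirstName
BEGIN
    UPDATE Customer SET FirstName = NEW.FName
     WHERE CustomerID = NEW.CustomerID;
END;

-- Reworked applications update FirstName; mirror the value into FName.
CREATE TRIGGER SynchronizeFirstName_UpdateNew
AFTER UPDATE OF FirstName ON Customer
WHEN NEW.FirstName IS NOT NEW.FName
BEGIN
    UPDATE Customer SET FName = NEW.FirstName
     WHERE CustomerID = NEW.CustomerID;
END;
"""


def start_transition(conn):
    """Add the FirstName column and the synchronization triggers."""
    conn.executescript(TRANSITION_DDL)
```

With this in place, a program that only knows FName and a program that only knows FirstName each see the other's writes. Other database products have different trigger syntax, so the statements would need to be adapted.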

The FirstName column must also be populated with the values from the FName column. You need to run both columns in parallel during a "transition period" of sufficient length to give the development teams time to update and redeploy all of their applications. This transition period could be several years in length, depending on the ability of your project teams to get new releases into production. In this case we have decided that the transition period will run to November.
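Populating the new column is a one-time data migration. A minimal sketch in Python with sqlite3 follows, assuming a Customer table that already has both FName and FirstName; the function name backfill_first_name is hypothetical. Restricting the update to rows where FirstName is still empty makes the script safe to rerun.

```python
import sqlite3


def backfill_first_name(conn):
    """Copy FName into FirstName for rows not yet populated; return rows changed."""
    with conn:  # commit on success, roll back on error
        cursor = conn.execute(
            "UPDATE Customer SET FirstName = FName "
            "WHERE FirstName IS NULL AND FName IS NOT NULL"
        )
    return cursor.rowcount
```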

The pair reruns the tests and sees that they now pass. They then refactor the existing tests to work with the FirstName column rather than the FName column. Once the database refactoring is completed in their development environment, the pair promotes their work into the team's integration sandbox, where they rebuild and rerun the tests, fixing any problems which they find. To update the database schema, the pair runs the relevant migration scripts in the appropriate order.
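One common way to keep scripts in order across sandboxes is to number them and record which ones each database has already received. The sketch below is an assumption about how that could be done, not a mechanism described in this article: the SchemaVersion table, the MIGRATIONS list, and the function names are all hypothetical.

```python
import sqlite3

# Hypothetical ordered change scripts: (version number, SQL statement).
MIGRATIONS = [
    (1, "CREATE TABLE Customer (CustomerID INTEGER PRIMARY KEY, FName TEXT)"),
    (2, "ALTER TABLE Customer ADD COLUMN FirstName TEXT"),
    (3, "UPDATE Customer SET FirstName = FName WHERE FirstName IS NULL"),
]


def applied_version(conn):
    """Return the highest script version recorded in this database (0 if none)."""
    conn.execute("CREATE TABLE IF NOT EXISTS SchemaVersion (Version INTEGER NOT NULL)")
    row = conn.execute("SELECT MAX(Version) FROM SchemaVersion").fetchone()
    return row[0] or 0


def migrate(conn, migrations=MIGRATIONS):
    """Apply, in version order, every script newer than the recorded version."""
    current = applied_version(conn)
    applied = []
    for version, sql in sorted(migrations):
        if version <= current:
            continue  # this database already has the change
        with conn:  # commit on success, roll back on error
            conn.execute(sql)
            conn.execute("INSERT INTO SchemaVersion (Version) VALUES (?)", (version,))
        applied.append(version)
    return applied
```

Running migrate against the integration sandbox, and later against pre-production, brings each database forward by only the scripts it is missing.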

This promotion strategy continues into your pre-production integration testing environment and then eventually into production. Depending on your need, you could implement and then deploy the refactoring within a single day, although more realistically it will be several months until the next major release of your application, so you would deploy the refactoring along with any other updates that you have made.

After the transition period, you remove the original column plus the trigger(s), resulting in the final database schema of Figure 3. You remove these things only after sufficient testing to ensure that it is safe to do so. At this point, your refactoring is complete.
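The clean-up step could be sketched as follows in Python with sqlite3. The CustomerID column, the single trigger name used in the DROP TRIGGER statement, and the function name finish_rename_column are assumptions for illustration. The table is rebuilt rather than altered in place because older SQLite versions cannot drop a column directly, and a simple consistency check stands in for the "sufficient testing" the text calls for.

```python
import sqlite3


def finish_rename_column(conn):
    """Remove FName and the synchronization trigger once the transition is over."""
    # Refuse to proceed if any row still disagrees between the two columns.
    mismatches = conn.execute(
        "SELECT COUNT(*) FROM Customer WHERE FName IS NOT FirstName"
    ).fetchone()[0]
    if mismatches:
        raise RuntimeError(f"{mismatches} row(s) have FName and FirstName out of sync")

    conn.executescript("""
        BEGIN;
        DROP TRIGGER IF EXISTS SynchronizeFirstName;
        CREATE TABLE Customer_new (
            CustomerID INTEGER PRIMARY KEY,
            FirstName  TEXT
        );
        INSERT INTO Customer_new (CustomerID, FirstName)
            SELECT CustomerID, FirstName FROM Customer;
        -- Dropping the old table also drops any triggers still defined on it.
        DROP TABLE Customer;
        ALTER TABLE Customer_new RENAME TO Customer;
        COMMIT;
    """)
```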

Figure 3. The final database schema for Customer.

There is a little more to successfully implementing a database refactoring than what I've described. You need a way to coordinate the refactoring efforts of all the development teams within your organization, clearly something that may prove quite difficult. You also need to get good at deploying refactorings into production, once again coordinating the efforts of several teams. In Refactoring Databases [3], my co-author Pramod Sadalage and I discuss several strategies for doing each of these things.

4. Why Not Just Get it Right to Begin With?

I am often told by existing data professionals that the real solution is to model everything up front, and then you would not need to refactor your database schema. Although that is an interesting vision, and I have seen it work in a few situations, experience from the past three decades has shown that this approach does not seem to be working well in practice for the overall IT community. The traditional approach to data modeling does not reflect the evolutionary approach of modern methods such as the RUP and XP, nor does it reflect the fact that business customers are demanding new features and changes to existing functionality at an accelerating rate. The old ways simply aren't sufficient any more, if they ever were [8].

I suggest that you take an Agile Model-Driven Development (AMDD) approach [9, 10], in which you do some high-level modeling to identify the overall "landscape" of your system, and then model storm the details on a just-in-time (JIT) basis. Take advantage of the benefits of modeling without suffering from the costs of over-modeling, over-documentation, and the resulting bureaucracy of trying to keep too many artifacts up-to-date and synchronized with one another. Your application code and your database schema evolve as your understanding of the problem domain evolves, and you maintain quality through refactoring both.

5. In Conclusion

Database refactoring is a database implementation technique, just like code refactoring is an application implementation technique. You refactor your database schema to ease additions to it. You often find that you have to add a new feature to a database, such as a new column or stored procedure, but the existing design is not the best one possible to easily support this new feature. You start by refactoring your database schema to make it easier to add the feature, and after the refactoring has been successfully applied, you then add the feature. The advantage of this approach is that you are slowly, but constantly, improving the quality of your database design. This process not only makes your database easier to understand and use, it also makes it easier to evolve over time; in other words, you improve your overall development productivity.

My experience is that data professionals can benefit from adopting modern evolutionary techniques similar to those of developers, and that database refactoring is one of several important skills that data professionals require. Evolutionary development has arguably become the norm within the IT community, and agile software development approaches extend evolutionary methods to become more effective. My advice to data professionals is to take evolutionary and agile concepts and techniques seriously: they're real, they work, and they're here to stay.

6. References and Recommended Reading
    1. Fowler, M. (1999). Refactoring: Improving the Design of Existing Code. Menlo Park, California: Addison Wesley Longman, Inc.
    2. Ambler, S.W. (2003). Agile Database Techniques: Effective Strategies for the Agile Software Developer. New York: John Wiley & Sons. www.ambysoft.com/books/agileDatabaseTechniques.html
    3. Ambler, S.W. and Sadalage, P.J. (2006). Refactoring Databases: Evolutionary Database Design. Boston: Addison Wesley. www.ambysoft.com/books/refactoringDatabases.html
    4. Larman, C. (2004). Agile and Iterative Development: A Manager's Guide. Boston: Addison-Wesley.
    5. Astels, D. (2003). Test Driven Development: A Practical Guide. Upper Saddle River, NJ: Prentice Hall.
    6. Beck, K. (2003). Test Driven Development: By Example. Boston, MA: Addison Wesley.
    7. Ambler, S.W. (2004). Introduction to Test Driven Development (TDD). www.agiledata.org/essays/tdd.html
    8. Ambler, S.W. (2004). The Agile Data Home Page. www.agiledata.org
    9. Ambler, S.W. (2002). Agile Modeling: Best Practices for the Unified Process and Extreme Programming. New York: John Wiley & Sons. www.ambysoft.com/books/agileModeling.html
    10. Ambler, S.W. Agile Model Driven Development (AMDD). www.agilemodeling.com/essays/amdd.htm
