How to choose a database platform wisely

Source: Internet
Author: User
Tags ibm db2 informix sybase sybase database most popular database

There is no silver bullet in the database world. It is up to you to choose which data you want to store.
, The operating system used by your application
And language platform, your budget, and whether you need a data warehouse, Bi or decision support system.

  Background

Many of my friends, developers, application and system architects, and sometimes even System Administrators often ask me the same type of questions: "Bo, should I use SQL Server or Oracle Database in my applications? ", "Bo, should I migrate my Sybase Database to SQL Server ?", "Which database (Platform) is the best ?", "Why don't we use open-source databases, such as MySQL or PostgreSQL ?". All these problems are basically the same-which database platform is most suitable for an application or which database platform is most meaningful to IT and business?

To answer these questions, I will ask them more questions (usually more than 10) Before I recommend any database platform ), many times, people do not understand why a simple problem causes so many problems.

In fact, many people do not know why he/she needs a database, what is a relational database, the difference between the database and the workbook (and their respective advantages and disadvantages), the Access database or the SQL Server database should be used, or what benefits applications can obtain from the database platform.

Although I am not here to publicize what databases are, I know that it and non-IT personnel are not the same thing about databases. I don't want to be called a database-centric nerd, however, I think that in the 21st century, people have been talking about Bi (Business Intelligence). I also think that the most flexible, complex, and easy-to-maintain system is data-driven.
The database is used to store the generated data. Next, I will give a brief introduction to the most popular database platform. Based on my experience and interactions with other it friends, I will introduce how people use these databases.

  The following databases I have used over the past 19 years:

• DBASE/FoxBASE/FoxPro in the DOS/Novell/Windows 3.x world

• Ms SQL Server (6.5, 7.0, 2008, and)

• Oracle (7, 8/8i, 9i, 10g)

• Sybase SQL/ASE servers (4.9.2, 10, 11, 11.9.2, 12/12. 5, 15.x)

• Sybase IQ 12.6/12.7

• Informix/dynamic (6, 7)

• MySQL (4, 5/5. 1)

• PostgreSQL (8.3.5)/enterprisedb

In addition, I participated in a series of database selection evaluations, such as vhayu and onetick. I also had memory in various industries.
Database usage experience, such as Oracle TimesTen and Sybase ASE memory version (part of Sybase rap commercial version, it has become a strong competitor in the field of time series databases, especially after the integration with CEP at the beginning of last year, ).

I have worked as a full-time employee in the company or as a consultant for different projects in different companies. I really feel that there is no silver bullet in the database world. What database to choose depends on the following factors:

• Who are you;

• Goals you want to achieve (Business/functional requirements, performance/reliability/scalability/availability requirements );

• How much data is stored in the current database;

• The operating system and language platform to be selected for the application;

• What is your budget;

• Do you want/need to build a data warehouse, Bi or decision support system.

Comparing different database systems is like comparing different development platforms and operating system platforms:

• Comparison between Java/J2EE and (C #, VB)/. net;

• Comparison between Windows and Linux/Unix;

• Comparison between Linux and private UNIX systems such as AIX/HP-UX/Solaris/Tru64/IRIX;

• Compare UNIX with openmvs/Unisys/tandem/IBM mainframe.

If we do not need databases in front-end office application systems, such as order entry, banking or telecommunications systems (mainly worried about lock/blocking, latency and low I/O ), we only want to use a database to store fact data or use a database to back up data.

Of course, all relational databases provide different functionality, even though they all claim to comply with ansi SQL and SQL-92 standards, but actually all have their own set of SQL languages, this is like the relationship between WebSphere/WebLogic/Oracle app server/SYBASE eserver/JBoss. They are different implementations of J2EE and are compatible with J2EE specifications, but they also have their own extensions and changes.

  Key tasks (OLTP) and high-performance systems

 
If we need databases to play a more important role in key business systems and want to achieve no-fault time, high availability, and cluster reliability (performance and data volume ), disaster replication, automatic report generation, and other goals, or the need to achieve extremely high-speed and low-latency (HPC, high-performance computing) results, we may need enterprise-level database features, but the cost is also higher.

  HA (high availability) Cluster Function

The cluster functions of each database platform are as follows:

• Oracle RAC is best suited for HA clusters and Server Load balancer
And expansion;

• Ms SQL Server provides ha failover without load balancing;

• The Sybase ASE 15 cluster has the same functions as Oracle RAC;

• Teradata is a distributed database platform;

• The MySQL cluster claims that the TPC-C performance benchmark is higher than Oracle RAC;

• IBM DB2/UDB: IBM pseries/power servers running AIX hacmp (high-availability cluster multi-processing)
And Veritas VCs/Sun clusters run well.

  High performance/low latency/memory
Database

• Oracle provides a TimesTen memory database;

• MySQL clusters use memory storage
Engine;

• Sybase provides the ASE memory database;

• In addition, we can also use ramsan/SSD (SSD)
) Install the SQL Server database. All disk I/O related to the database occur on the SSD instead of the traditional hard disk. Therefore, we can greatly reduce the disk I/O latency, provides I/O throughput and improves the overall performance of the database;

• There are other memory databases, such as SQLite or ExtremeDB, but most of them either only support embedded systems or only support single users or single connections at the same time.

  Scalability/performance

• When using Oracle RAC and Sybase ASE 15 clusters, if the existing hardware cannot meet the performance requirements, we only need to add a more powerful server to the Cluster
And gradually replace other nodes;
• When using an SQL server cluster, because it only supports active/passive mode, there is no scalability at the instance level. If the existing hardware cannot meet the requirements, we must replace the entire cluster;

• MySQL clusters also support adding nodes to clusters where all nodes are active, but they are non-shared clusters. The more nodes, the more database copies, more storage and data replication between all nodes require more networks
Traffic.

  Copy

• SYBASE: First of all, I am not a child care specialist in Sybase. Sybase has the best replication solution, which exceeds the replication solution of Oracle Data guard and SQL Server (that is, using its image/log transfer/transaction replication, SQL Server replication is still far worse than Sybase's replication server );

Sybase's heterogeneous replication and homogeneous replication were one of the main reasons why Sybase was excluded by Wall Street in the 1990s s but used by many large companies.

Sybase replication can be performed at the database level (active/active replication, better than SQL Server images), transaction level (similar to SQL Server transaction replication), table level, and stored procedure level, function-level, MSA (available for multiple sites), and log transfer (replication of solutions for the poor in the world, its image activator, combined with EMC/srdf, can truly ensure zdl (zero data loss) during planned and unplanned downtime ).

• SQL server: SQL Server replication can be achieved through log transmission and database mirroring (active/passive, R/W or read-only), and supports object-level (table) Transaction replication.

• ORACLE: Oracle Data guard is one of the best solutions for database cloning and disaster recovery (using RMAN ).

• MYSQL: MySQL provides a non-shared cluster. Basically, all active nodes are synchronized. Its asynchronous replication also supports transaction and Object-level replication.

• Storage-level solutions: Use EMC's BCV, netapp's snapshot manager, Hitachi's snapshot, and Veritas's volume replicaser to replicate data at the disk block level and volume level.

  Hybrid System

Generally, backend database storage
When a large amount of data is retrieved (data sorting, grouping, aggregation, and computing), performance issues may occur. When the running speed slows down, the data writing process may be affected.

Therefore, for these types of applications, we hope that the database can handle a lot of data and read/write blocking.

Oracle

Oracle is the best choice for such a system, because in Oracle, read/write programs do not block each other, and data is consistent throughout the read process/session/transaction, system overhead mainly occurs in memory
Lock and redo log.

SQL Server

New Snapshot isolation of SQL Server 2005 uses the same row-based version function of Oracle, but more Io is generated in tempdb.

Sybase ASE

Sybase ASE's read/write programs block each other, But Sybase IQ uses snapshot version-based management to solve this blocking problem.

Other Databases

Some other databases do not have blocking problems, but some do not even comply with acid (atomicity, consistency, isolation and durability) standards.

  Data Warehouse

Enterprises establish data warehouse storage
/Archive all historical data and the data they think needs to be stored in the database. Normally, the nonstandard data enters the data warehouse and the multidimensional data is OLAP and Bi) the solution provides quick retrieval.

 

Address: http://kb.cnblogs.com/page/59817/

 

The data warehouse is also used for new or enhanced business strategic backtracking tests. In addition, the data in the data warehouse grows exponentially but not linearly.

  Advantages of column-based relational databases

A column-based relational database has great advantages because it is inherently multidimensional and each column is self-indexed (B tree ).

Typical column-based databases include Google's big table, Sybase IQ, vertica, and MySQL-based kickfire.

On average, these column-based databases are better than traditional row-based relational databases (such as Oracle, SQL Server, Sybase ASE, teradata, DB2, and MySQL) the data retrieval speed is 60 to 100 times faster, but the speed is slow when data is updated based on rows. So far, Sybase IQ seems to be the leader in this field, sybase IQ is the world's largest Data Warehouse (more than 1 p bytes.

  Visit the websites of these databases for more details:

Http://www.kickfire.com/Solutions/Resource-Center

Http://www.sybase.com/products/datawarehousing/sybaseiq

  More information about column-based relational databases

When Oracle and SQL Server are used as data warehouses, multi-dimensional data cubes need to be created, and more storage space is required. I personally think that, data Warehouses using row-based relational databases will soon be replaced by column-based data warehouses.

In general, row-based relational data (SQL Server, Oracle, Informix, DB2/UDB, MySQL, Sybase ASE, etc.) is suitable for OLTP applications, while column-based relational databases (Sybase IQ, kickfire, vertica, etc.) is more suitable for OLAP/DSS and data warehouse applications.

Note: those time series databases such as onetick, KDB +, and vhayu cannot be considered relational databases. Their storage structures are file-based, each column of data uses a separate file storage, which is basically a column-based database. The column-based database can also be used for backtracking testing. In addition, the data in the row-based Data Warehouse (cube) increases exponentially, but the data in the column-based Data Warehouse increases linearly.

Traditionally, Oracle has a leading advantage in the data warehouse field, but it has lost its advantage in column-based relational databases (such as Sybase IQ, vertica, and kickfire, the Big Table that Google is using is also a column-based database/storage system.

  Migration considerations

Affordability and existing investments in existing databases are another consideration.

I have seen many successful cases of migrating data from Sybase to SQL Server, and vice versa, but few of them see migrating data from Sybase/SQL Server to Oracle, and vice versa.
Depending on the database size and application scope, it may take a long time to migrate the Sybase/SQL Server Stored Procedure, rewrite the application and perform tests, even with the help of the Oracle migration workbench (now we can use SQL developer), each stored procedure must be rewritten if the application only uses ODBC/jdbc/ADO. and no stored procedures/temporary tables, it is easier to migrate.

  Native programming languages

As mentioned earlier, Oracle has the best support for Java. We can store Java code in Oracle and use it as a native Oracle function, this makes Oracle highly scalable (a bit like open source code). In addition, Sybase also supports Java code in the native. SQL Server and Sybase support extended stored procedures. They basically refer to an interface in the database that calls an external dynamic library. They can be implemented in different programming languages.

  Cost of Ownership

When studying the functions of these databases, we must remember that the cost and time range, even if you have an experienced database team, it makes no sense if more than three database platforms are selected. In addition, it does not make much sense to merge all databases into a single database platform, because most heterogeneous databases can use the SQL Server Connection Service, Oracle Gateway
Or use some database replication and ETL tools. When considering the database maintenance cost, do not isolate the database cost. Instead, consider it as part of your applications and infrastructure.

I hope this article will be helpful to those who will or are selecting databases for their applications and will not regret their choice.

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.