How to customize an ODBC driver based on rest service

Source: Internet
Author: User
Keywords Data Analysis Data Warehouse rest
Tags access an application analysis api application applications aspx based

The REST service can help developers to provide services to end users with a simple and unified interface. However, in the application scenario of data analysis, some mature data analysis tools (such as Tableau, Excel, etc.) require the user to provide an ODBC data source, in which case the REST service does not meet the user's need for data usage. This article provides a detailed overview of how to develop a custom ODBC driver based on the existing rest service from an implementation perspective. This article focuses on the implementation of the ODBC driver principle, combined with the code details of the ODBC and Rest service data interaction, and at the end of the article introduced the ODBC client program to invoke the ODBC API principle, as well as the actual development of debugging environment to build.

Readers who may benefit

The current mainstream data analysis tools, such as Tableau,microstrategy,excel, can only be accessed by ODBC Driver to access the underlying data source. In other words, in the process of developing a database or data warehouse, even if we have implemented a data access interface that conforms to the SQL specification, even if we provide our own JDBC driver, we still cannot guarantee that the data users can use our data effectively. To do this, we need to customize an ODBC Driver for the data source.

If your data source happens to be a common database product like Mongodb,hbase, you might consider buying some commercial products directly, such as Simba ODBC driver, to address your needs once and for all, but it will mean a lot of spending. The more difficult case is that your data source is not that mainstream, and there is no driver that can be purchased directly to apply to it, then customizing your own ODBC driver may be your best bet. Even if you are a developer who knows nothing about ODBC driver, this article will bring you more or less help.

Our situation

Simply put, our team has developed a special SQL engine in Java. At the beginning of the project we had only JDBC drivers and a rest Server for Web page clients, but we didn't have ODBC drivers, so most customers couldn't really use our products to do their job.

To solve this problem, we designed the following figure solution: We use rest server to uniformly accept requests from all clients, including Web page clients and clients using ODBC driver. The REST server uses the JDBC driver to access our database. Of course, if your client is a Java program, you can simply access our database directly through JDBC, saving the cost of these steps. This is not shown in this picture.

At the client, we have a deep customization of a proprietary ODBC Driver that provides a standard ODBC API to the top application, encapsulating the logic of all implementations. On the underlying implementation, it calls the rest Library of C + +, encapsulates the SQL query request sent over the application as a rest request, sends it to our rest Server, and returns it to the upper-level application in an ODBC-compliant manner after the results are obtained.

  

Starting with Hello World

For developers who have never had access to ODBC, understanding the behavior of an ODBC client helps to understand what specific APIs are required to customize an ODBC driver. The following illustration shows the implementation of a simple ODBC client program, each with a detailed annotation explaining its behavior, reading through the code, and having an intuitive understanding. To simplify the code, we omitted all the code for error checking. All SQLXXX-formatted functions are standard APIs defined by ODBC.

We divided the program into five areas, labeled A~e respectively. The area A and area B initializes three ODBC-related handles, respectively:

Environnement handle (HENV): contains one or more connection handle. At the same time, some global information is included, such as the ODBC version required by the client, and diagnostic information at the environmental level.

Connection handle (Hconn): Represents a connection to a dbms/data source that contains information about the connection level, such as the time-out period for the connection, the isolation level, and the diagnostic information at the connection level.

Statement handle (HSTMT): It can be viewed as a specific query request, such as SELECT * from employee.

It is worth mentioning that the ODBC specification only defines how the data source exposes the data access interface, but it does not specify how to implement it, which also includes the implementation of the three class handles. In fact, in code these three types of handles are passed by the Sqlhandle type, and Sqlhandle is essentially a void * type, pointing to the corresponding structure of our custom.

ODBC provides applications with a series of C-language-style APIs to support access queries. Unlike an object-oriented language driver, an application that uses an ODBC driver needs to prepare the memory area for the data to be returned in advance, and in this sense the task of ODBC is to correctly move the data that the user needs into the user-specified memory area (possibly with some data transformations, For example, if the application needs to support Unicode, then ODBC driver may need to convert the source data of the char type to the WCHAR type. A series of handles and variables are initialized in area A of the following figure, where line No. 305 to No. 307 opens up some of the memory areas used as caching on the program's stack. In fact, in the E region, we passed in the reference to the variable x and I, so we can think of the two numeric variables in row No. 308 to No. 309 as the memory region where the result is stored.

  

In zone C, we call the SQLDriverConnect function, passing in the Hconn handle and the user name, password, driver name, and so on that the data source is connected to. In the implementation of the ODBC driver, we complete a series of assignment operations to HCONN (its initialization has already been done in zone B), making Hconn a usable connection handle. When the return value of the SQLDriverConnect equals Sql_success, a dbms/data source connection is formally established.

In zone D, the client program first initializes the statement handle HSTMT and then queries directly using the SQLExecDirect API. The second parameter of the API receives a string that is the SQL for the query request.

In the last e zone, the client gets the query results. In this program, the client first extracts the column Name property of the first column using the SQLColAttribute interface in line No. 331, where the second parameter specifies the first class to return the result, and the third constant parameter sql_desc_name specifies the desired property label (column name). Next, the client takes advantage of the SQLBindCol interface to tell the ODBC driver that it wants to populate the first column with the return result in the memory that Szcoldata points to, and use constant arguments to SQL_C_ TCHAR tells the ODBC client that the return type you want to see is the Char data type, which is called binding (BIND). After everything is ready, the client invokes the SQLFetch interface to get the first column of the first row of results, and because no other return column is bound, SQLFetch actually returns the contents of the first column.

In a more realistic client code, it is possible to call the Sqlnumresultcols interface first to know how many columns a total number of returns results. For each returned column, the client uses a more convenient SQLDescribeCol interface than the SQLColAttribute interface to get all the basic information about the column, including the column name, type, length, and so on. Depending on the column information returned, the client adjusts the parameters of the SQLBindCol to properly accept the corresponding return result. After everything is ready, the client invokes the SQLFetch interface to obtain the desired query results. Because each call to the SQLFetch returns a row in the result set, the client program needs to call SQLFetch again until SQLFetch no longer returns SQL_SUCCESS, but instead returns SQL_NO_DATA, indicating that more rows are no longer available to return. Depending on the requirements, the client can consider whether to reuse the same block of memory to accept the different rows in the result (take a row, using one line, or at the outset, request a chunk of memory that can hold all the rows, and each bind to a different memory location (after all the rows are taken and the result data is used together).

Start customizing ODBC Driver

With a certain understanding of the interfaces required by the ODBC Driver in the previous section, if we need to write a complete ODBC Driver from scratch, we need to do two tasks:

1. Implement all api,msdn required by the client gives a detailed definition of each API in the ODBC specification (http://msdn.microsoft.com/en-us/library/ms714562 (v=vs.85). aspx, Thankfully we didn't have to implement every interface, just to find the smallest necessary set according to the client's behavior, and in Windows we packaged all of the API implementations into an executable module, usually a DLL file.

2. Let the program client program find our driver correctly, in short, we need to properly install the ODBC driver on the machine where the client program is running.

Since the second step is significantly lower than the first step in the implementation complexity, the introduction of the second step also helps us to have a holistic understanding of the ODBC driver. So while the second step actually relies on the completion of the first step, we are still giving priority to the implementation of step two. Here we can assume that we have implemented all the necessary APIs, and that the implementations of these APIs are packaged in a file called Driver.dll.

First step: Install ODBC Driver

Understanding ODBC schemas

In the ODBC schema (http://msdn.microsoft.com/en-us/library/aa266933 (v=vs.60). aspx), there are four key modules, respectively:

API: Connects the data source by invoking the ODBC interface, sending and receiving data, and closing the connection. The API here is just an interface, not implemented, and the implementation needs to be done in the driver module.

Driver Manager: Provide applications with information such as available data sources, dynamically load drivers on demand, provide parameter checking, and more.

Driver: A function method that handles ODBC to manage all interactions between an application and a particular dbms/data source. If necessary, driver also converts the request statements in standard SQL format to the native SQL format of the target data source.

Data Source: Consists of a database engine and its databases.

  

Where APIs and driver manager are already in general the operating system itself. On Windows, we can install MDAC (Microsoft Data Access rs, http://www.microsoft.com/en-us/download/details.aspx?id= 5793) to get all the required header files already relevant to the tool resources. There are similar unixodbc in UNIX environments. In the following, we only consider ODBC development under Windows. When developing ODBC driver, the underlying data source is generally ready. So we just need to register the ODBC driver into Driver Manager.

Register ODBC Driver

Driver Manager learns from the registry the list of all available ODBC Driver and their respective details. The exact location is (assuming the target machine has 64-bit Windows installed):

32-bit drive:

Hkey_local_machine\software\wow6432node\odbc\odbcinst. INI\ODBC Drivers

64-bit drive:

Hkey_local_machine\software\odbc\odbcinst. INI\ODBC Drivers

Take 32-bit Windows as an example, we open the ODBC drivers key in the registry, we can see all the 32-bit ODBC drivers installed in the system, we name our ODBC as Ebayodbcdriver and Add a line to driver:

  

After learning the name of the ODBC Driver, Driver Manager looks for details of the appropriate ODBC Driver on the parent node of ODBC Driver, or Odbcinst.ini. The prerequisite information includes the driver property and the Setup property, telling the ODBC manager where to look for executable programs for driver and setup. Where Driver corresponds to the ODBC Driver we are going to implement, and the Setup program includes APIs for setting up DSN, which, according to the usual practice, is compiled with the ODBC Driver API in the same DLL file. So we see that under Ebayodbcdriver, driver and setup point to the same DLL file. You can also define some other properties here, but these are optional.

  

With everything in place, we can create DSN for our ODBD driver in control panel--administrator tools--data Sources (ODBC). For 32-bit ODBC driver, we need to use C:\Windows\SysWOW64\odbcad32.exe this 32-bit version of Data Sources (ODBC). It is worth mentioning that in the process of creating a DSN using Data Sources (ODBC), we used the interfaces in the Setup program mentioned above, especially the ConfigDSN interface. (http://msdn.microsoft.com/en-us/library/ms709275 (v=vs.85). aspx)

To simplify these installation steps, we can wrap the logic of all these registered ODBC driver in the form of Windows Installer, allowing users to simply install and register ODBC driver by installing an EXE.

  

Step two: Implement the ODBC API

Descriptors

In MSDN's Exposition of ODBC architecture, the core function of the ODBC driver module is to manage all interactions between applications and specific dbms/data sources. The carrier of the interaction is the data, and the data means that it needs memory space to store it. As mentioned earlier, the ODBC driver itself is positioned on a data porter, which forwards the application's request to the data source and moves the data returned by the data source to the application line by row. In this process, the ODBC driver requires at least two areas of memory, or buffer: one to cache the results returned from the data source, and the other to cache the results of the handover to the application. These two pieces of buffer include not only the data itself, but also the description of the data. For example, in the data returned to the application, ODBC driver not only needs to maintain the column data itself, but also maintains the type, length, and other information of the column data, between the ODBC driver and the application, collectively known as creator Row Buffer Descriptor (ARD). Correspondingly, the data source to the ODBC driver is not just the data itself, but also includes a description of the meta information for each returned column, which is referred to as implementation Row Buffer descriptor (IRD).

  

In fact, the memory area where the data itself is stored in ARD is passed in by the application when it calls SQLBindCol, ARD is not responsible for the application and release of the memory, and the memory required for storing other information is maintained by the ODBC driver. When an application invokes an interface such as Sqlnumresultcols,sqlcolattribute,sqldescribecol, the ODBC driver finds the corresponding content in ARD and returns it to the caller; When the application invokes SQLBindCol and SQLFetch, the ODBC driver learns the location (pointer) where the returned data should be stored by ARD, reads the latest row of data from the IRD, and exerts some necessary types of conversion to the specified location.

The ODBC standard, which corresponds to Ard,ird, also provides two other types of buffer, namely, creator parameter descriptor (APD) and implementation parameter descriptor (IPD) , which is used to deal with the parameters in dynamic query, the two types of buffer are not described in detail in this paper. These four types of buffer form the four most important descriptor in the ODBC world. For more information, readers can refer to (http://msdn.microsoft.com/en-us/library/ms716262 (v=vs.85). aspx).

In a specific implementation, ARD and IRD are defined as special structural bodies (struct) that are stored in the structure genodbcstmt (Generic ODBC Statement) that represent Statement. The following illustration shows some members of one of our genodbcstmt structures: first, to identify their type of label (to differentiate it from the structure representing Environnement and representative connection), and then to the four different uses of descriptor mentioned above, Then there are some attribute information at the statement level, SQL statements, and so on.

  

We take ARD For example detailed analysis, the concrete implementation of ARD is not bound by the ODBC specification, can realize freely. In our implementation, we use the struct genodbcard to represent the corresponding ARD of a statement. Each ARD includes some information that returns the column share, and contains different details for different return columns, represented by a more granular structure genodbcarditem.

  

Observing the member variables in Genodbcarditem, it is easy to generate one by one correspondence with the ODBC API. For example, the dataconcisetype corresponds to the column type information returned by SQLColAttribute, and the DATAPTR member corresponds to a pointer to the memory space passed in by the SQLBindCol. In summary, the implementation of most ODBC APIs is essentially the access and modification of different member variables of ARD and IRD.

  

Accessing a data source with the rest API

ARD is responsible for the interaction between the ODBC driver and the application, and its initialization is done during application invocation sqlbindcol. The IRD is responsible for the interaction between the ODBC driver and the data source, and its initialization needs to be done in the process of exchanging data with the data source.

We first define the interface for the rest request:

Std::unique_ptr Restquery (

wchar_t* Rawsql, Char serveraddr, char username);

The Sqlresponse class encapsulates an SQL request for all returned content, and for each SQL query, REST Server returns an instance of Sqlresponse. The Columnmetas member in the instance contains information about each returned column, while the results member saves each row of the returned results as a string.

  

ODBC Driver The returned Sqlresponse instance to the IRD of the statement so that IRD actually has the return results of the SQL query. When an API such as SQLFetch is invoked, the ODBC driver only needs to find the Sqlresponse instance in IRD to parse the information to return the information the caller needs.

  

Other implementations

The realization of Sqltables,sqlcolumns

The application calls these two APIs to get all the tables and columns that the current connection can query. Since this part of the information is common throughout the database connection, we send requests for all table and column meta information to the rest server when the application SQLDriverConnect and establishes the connection handle. Similar to Sqlresponse, we encapsulate this information with Metadataresponse and give the returned instance to the Genodbcconn structure representing the connection handle for maintenance. When the Sqltable,sqlcolumns request arrives, the ODBC driver extracts the required information from the Genodbcconn metadataresponse and returns it to the caller.

The realization of SQLDriverConnect

Check the incoming connection string, and if it explicitly gives information about the address, username, password, etc. required to connect to the database, confirm that the rest server survives directly with the rest request and get the table and the Metadataresponse. If the above information is incomplete, the message is filled in with the Application pop-up dialog box. Of course, the caller can also specify the already configured DSN directly in the connection string to complete the connection directly.

The realization of SQLGetInfo

When an application cannot confirm the source of an ODBC driver through the name of an ODBC driver, it invokes a series of sqlgetinfo interfaces to obtain some of the features of the ODBC driver, such as the version, supported functions, supported data types, and so on. The tableau and other BI tools choose different behaviors based on these return results, such as tableau when generating SQL queries that cannot be determined by whether a marker, such as a table name, is enclosed in single or double quotes, and it calls SQLGetInfo to get Sql_identifier_ Quote_char property to determine. The ODBC driver implementation requires an analysis of the behavior of the application to determine the results of the SQLGetInfo return.

Unicode support

Starting with ODBC 3.5, ODBC supports both Unicode and ANSI-encoded APIs. ODBC uses the suffix W to represent interfaces that support Unicode, such as SQLDriverConnect and SQLDRIVERCONNECTW. Details can be referred to (http://msdn.microsoft.com/en-us/library/ms716246 (v=vs.85). aspx).

Client Principles and Diagnostics

Client principle

Creating a standard ODBC client program in Visual Studio requires:

The introduction of the <sqlext.h> header file, the reference to the file will indirectly refer to <sql.h>, <sqltypes.h>, <sqlucode.h> and other headers, These header files contain declarations of various constants and methods in the ODBC standard, such as SQLDriverConnect methods and Sql_success, Sql_error constants, and so on.

In the linker of the project, the dependency on Odbc32.lib and Odbccpp32.lib is added. In Visual Studio 2012, newly created C + + projects rely on these two static link libraries by default, as shown in the following illustration:

  

The required header file resources and static link library resources for these two departments are provided by ODBC Manager (or the Windows operating system). After these two steps are complete, we can invoke the ODBC API arbitrarily, as we did in the client sample program at the beginning of the article. The client program is able to successfully access the corresponding API in the corresponding ODBC driver, thanks to the help of ODBC Manager in the middle. We can note that during compile and link, the client program does not have any dependencies on our ODBC driver, and instead, the client program relies only on the header files and static link libraries provided by ODBC Manager. The calling principle is shown in the following figure Purple arrows:

  

ODBC Manager gives the "implementation" of all ODBC APIs in Odbc32.lib and Odbccpp32.lib so clients can complete link without relying on any specific ODBC driver. However, this "implementation" is in fact only a simple forwarding, ODBC Pret the client's ODBC API request to the appropriate ODBC driver based on the driver name specified in the client program.

For example, the client invokes SQLDriverConnect, because the client program and the Odbc32.lib are link together, the program will first enter the implementation of SQLDriverConnect in Odbc32.lib, which is implemented through the explicit Linking (can be understood as run-time dynamic link, reference http://msdn.microsoft.com/en-us/library/784bt7z7.aspx), you can find and invoke our real ODBC driver DLL ( Referred to as Driver.dll) in the SQLDriverConnect method, the end client program calls to the SQLDriverConnect method in Driver.dll.

ODBC client program, ODBC MANAGER,ODBC driver These three role assignments can gracefully complete the decoupling between client programs and specific drivers, ensuring that we can freely add and specify new ODBC drivers in the system. However, this design also poses problems for development and debugging: we cannot simply trace every step of the program running in the debug mode in Visual Studio, because the client does not have a dependency on our driver, so visual Studio could not find the source code for the driver for us.

Diagnosing client Programs

The workaround is to remove the dependency of the C + + project in Visual Studio on the static link library provided by the ODBC Pret, instead relying directly on the static link library of the Custom ODBC driver ( For short driver.lib, this static link library will be produced as a by-product when we compile driver.dll. In the property pages->configuration properties->linker->additional dependencies, we first cancel the Inherit from parent or Project defaults option, and then inherited values except Odbc32.lib and odbccpp32.lib into additional dependencies. This ensures that we do not lose the default dependencies on other items, but also on the client's reliance on ODBC Manager's Odbc32.lib and Odbccpp32.lib. Finally, we add the Driver.lib path to the client program to show that it relies on the implementation of our ODBC driver.

  

It is worth mentioning that we still need to refer to the header file <sqlext.h> In Visual Studio's view, the only difference is that the ODBC in the header file The implementation of the API is no longer in odbc32.lib and Odbccpp32.lib, but in our driver driver.lib. Visual Studio can automatically navigate to the source code of the driver based on this understanding, so that we can freely trace function calls between programs in Visual Studio's debug mode.

Other Debug Tools

The debug of ODBC driver is relatively cumbersome, and in addition to printing some key steps to a log file, there are two other ways to add it:

1. Use the tracing feature of the ODBC Data Source Administrator to capture logs for ODBC manager.

  

2. Include windows.h, use the OutputDebugString method to output debug statements, with Dbmon viewer output. This approach can be supplemented when log logs are not working, and will not incur additional overhead if the Dbmon is not turned on. See http://msdn.microsoft.com/en-us/library/windows/desktop/aa363362 (v=vs.85). aspx specifically.

Summary

This article describes in detail how to complete an ODBC driver that takes the rest API as a way to access the background data, with the rest service in view. The work of this article can help the rest service's offering team to package their services in an ODBC format so that their users can consume their services on more commercial platforms.

Related Article

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.