Powerful Features of Python and SQL Server 2017


Python is new to SQL Server 2017. Its main purpose is to enable Python-based machine learning inside SQL Server, but it can be used for far more than that, with any Python library or framework. To give an example of what is possible, Hitendra shows how to use the feature securely to provide an intelligent application cache, where SQL Server can automatically indicate when a data change should trigger a cache refresh.

SQL Server 2017 adds the Advanced Analytics Extension, now called Machine Learning Services, which lets SQL Server execute Python scripts from T-SQL. In essence, it gives the database programmer a way to pass data directly to Python. Its usefulness is not limited to machine-learning data analysis: Python has many readily available modules and frameworks for other problems, such as heavy computation over data structures, data analysis, networking, database operations, or local and network file-system operations. Obviously, many of these jobs are best done in middleware, but there are times when it is more convenient for the database to communicate directly with an external system, rather than relying on an external process that polls the data source in order to perform the task. This makes sense as long as it causes no security issues and there is no objection to having such a solution in the database or data tier.
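To make the mechanism concrete, here is a minimal sketch (not part of the article's solution) of how T-SQL hands a row set to Python through sp_execute_external_script and reads a data frame back; the query and column names are invented for the illustration.

-- Minimal sketch: pass rows to Python and read a data frame back.
-- Requires Machine Learning Services (Python) and 'external scripts enabled'.
EXEC sp_execute_external_script
    @language = N'Python',
    @script = N'
# InputDataSet arrives from SQL Server as a pandas DataFrame;
# whatever is assigned to OutputDataSet is returned as a result set.
OutputDataSet = InputDataSet
OutputDataSet["Doubled"] = OutputDataSet["Value"] * 2
',
    @input_data_1 = N'SELECT 1 AS Value UNION ALL SELECT 2 AS Value'
WITH RESULT SETS ((Value INT, Doubled INT));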

Here, we will demonstrate an example of using Python with the Advanced Analytics Extension, showing how a database can trigger an external process to act on the data supplied to it as a parameter. It takes into account security, data reliability, and transaction response time.

Use Case for Python

Some tasks can be done far more easily by calling a Python script from SQL rather than relying on middleware, especially when the task is initiated by an event in the database. Such tasks might include:

1. Sending data to, or receiving data from, a network-based system over TCP/HTTP/SOAP.

2. Leveraging local platform resources such as the file system, networking, or the GPU.

3. Building real-time integration between one or more systems using common data formats such as JSON, XML, or YAML.

4. Generating data or files by communicating with external applications.

Of course, there are a few potential disadvantages:

1. If your use of Python requires internet access, there is a risk that data which must be kept secure is accidentally shared over the internet. Any internet access must be strictly regulated by the network infrastructure.

2. Allowing Python scripts to execute on the server by enabling external scripts exposes a security risk.

3. Resource-intensive Python scripts on the same server can affect the performance of ongoing transactions in a large OLTP system.

When you weigh up these pros and cons, Python can still play a useful role, provided the risks are minimized. As an example, let's look at how to use Python to build a data-caching system for the application layer.

The Sample Caching Solution

Caching data can be an effective way to improve application performance. At the cost of the storage overhead of the cache, we gain a useful performance benefit by avoiding chatty network traffic to the database and the heavy resource consumption that repeated queries place on it. When we build a caching infrastructure, the common problem we face is deciding when to invalidate the cached content. We tend to adopt the simple approach of rebuilding the cache after a certain interval, but this is very inefficient. A better practice is to invalidate the cache when the data changes, and to refresh only the content that has changed, at the moment data is created, updated, or deleted. There are a number of tools and frameworks that address the refresh problem, but they struggle to determine exactly what has changed in the data and when the change occurred. The database is best placed to do that.

Here is what will provide our caching system; we will restrict ourselves to the Microsoft stack, apart from Python itself:

• Microsoft SQL Server 2017 (CTP)

• Service Broker, to isolate the caching activity from the transactional database.

• Python, to execute a script that can update the cache over HTTP (the Python 3.5 executable and libraries from the installed Python distribution).

• .NET Framework 4.5.2

• ASP.NET MVC, for our sample web UI.

• ASP.NET Web API, hosting the cache store for our sample solution.

The following is a diagrammatic representation of the sample caching solution:

• WebApplication, which provides a user interface to read and update data.

• RESTful.Cache, the cache store application for our sample solution, built with ASP.NET Web API 2 and a JSON content type. The HTTP GET operation serves data from a local cache (a static collection).

• SQL Server 2017 (CTP), the database server, with:

  • TransDB, the OLTP database, busy processing transactions.

  • Cacher, the agent database that does the work of executing the Python script, with the 'external scripts enabled' option turned on. Refer to Microsoft Docs: external scripts enabled Server Configuration Option.

  • Service Broker, SQL Server's reliable messaging framework, which bridges TransDB and the Cacher agent. Messages received by the Cacher agent are processed to update the cache.

• Python, the scripting language integrated into the SQL Server 2017 (CTP) database system.

The architecture of the solution

In our solution, we cache the names of the ProductType entity in RESTful.Cache. WebApplication has functionality to create a new ProductType entry and to read product types back from RESTful.Cache.

Prerequisites

Before we go further, there are some prerequisites and pieces of information to consider:

1. The SQL instance hosting CacheDB must have the Machine Learning Services (Python) feature installed.

2. To execute Python scripts from T-SQL in CacheDB, the SQL Server Launchpad service (MSSQLLaunchpad) should be running. Refer to Microsoft Docs: Microsoft Machine Learning Services.

3. External script execution must be enabled with sp_configure; refer to Microsoft Docs: external scripts enabled Server Configuration Option.

EXEC sp_configure 'external scripts enabled', 1;
RECONFIGURE;

4. The instances hosting TransDB and Cacher should have a Service Broker endpoint created on them; if the databases are hosted on two different SQL instances, each instance needs its own endpoint (a sketch of creating one follows this list).

5. The TransDB and Cacher databases should have the broker enabled. Refer to Microsoft TechNet: How to: Activate Service Broker Message Delivery in Databases.

ALTER DATABASE TransDB SET ENABLE_BROKER;
GO
ALTER DATABASE CacheDB SET ENABLE_BROKER;
GO
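The article does not show the endpoint DDL, but as a rough sketch for item 4, assuming Windows authentication and an arbitrary port (4022 here), a Service Broker endpoint could be created on each instance like this; the endpoint name matches the one used in the grant scripts later on.

-- Sketch only: a Service Broker endpoint on a SQL instance (the port number is an example)
CREATE ENDPOINT ServiceBrokerEndPoint
    STATE = STARTED
    AS TCP (LISTENER_PORT = 4022)
    FOR SERVICE_BROKER (AUTHENTICATION = WINDOWS, ENCRYPTION = REQUIRED);
GO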

Web applications

WebApplication has two main MVC actions: one uses the HTTP verb POST to add a new entity to TransDB, and the other returns the list of product types from the cache using the HTTP verb GET.

RESTful.Cache also has two actions: one updates the cache with the newly added ProductType entity via the HTTP verb POST, and the other gets all cached product types from the local cache.

For our sample solution, both applications are hosted under IIS with separate application pool identities to keep them secure. In a real implementation, however, the hosting environment could be separate web servers on a local network, or on the internet.

RESTful.Cache's authorization rules allow only two service accounts to make HTTP requests:

ABC\WebApp_svc and ABC\CacherAgent_svc. The ABC\CacherAgent_svc service account lets the Python script in SQL Server reach the application over HTTP to refresh the cache.

The ABC\WebApp_svc account is used by the web application, which has an authorization rule allowing it access to the RESTful.Cache application.

SQL Databases and Service Broker

The OLTP database TransDB has a number of objects, including tables, stored procedures, and Service Broker objects.

For our purposes, the UpdateProductType procedure updates the ProductType table with a new record, and the AcknowledgeProductTypeCache procedure is the activation procedure for the queue: it receives the acknowledgement message from the target, that is, from the Cacher database. It also handles exceptions and records them in the CacheIntegrationError table.

More information about Service Broker can be found at Microsoft Docs: SQL Server Service Broker.

For our sample solution, TransDB is the source database, where an update-cache message is created whenever a new ProductType record is created. The message carries the action to perform: it has the UpdateMessage message type and a CacheIntegration contract for sending it with the CacheSource service from the database. The service has a CacheQueue, which the Service Broker component uses for reliable messaging, and the ToCacheTarget route holds the information needed to deliver the message to its destination.
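The article does not list the DDL for these Service Broker objects, so the following is only a sketch using the names just mentioned; the XML payload shape, the validation setting, and the route address are assumptions made for illustration.

-- Sketch: Service Broker objects in TransDB (names from the article, definitions assumed)
CREATE MESSAGE TYPE UpdateMessage VALIDATION = WELL_FORMED_XML;
CREATE CONTRACT CacheIntegration (UpdateMessage SENT BY ANY);
CREATE QUEUE CacheQueue;
CREATE SERVICE CacheSource ON QUEUE CacheQueue (CacheIntegration);
CREATE ROUTE ToCacheTarget
    WITH SERVICE_NAME = 'CacheTarget',
         ADDRESS = 'LOCAL';   -- or 'TCP://<host>:4022' when Cacher lives on another instance
GO

-- Sketch: how a procedure such as UpdateProductType could hand the new row to the cache integration
DECLARE @handle UNIQUEIDENTIFIER,
        @message XML = N'<ProductType Id="1" Name="Electronics" />';   -- example payload

BEGIN DIALOG CONVERSATION @handle
    FROM SERVICE CacheSource
    TO SERVICE 'CacheTarget'
    ON CONTRACT CacheIntegration
    WITH ENCRYPTION = OFF;

SEND ON CONVERSATION @handle
    MESSAGE TYPE UpdateMessage (@message);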

To avoid any chance of lengthening transaction processing time, and to keep any security risk away from the data residing in the transactional database, our sample solution isolates the cache update process in an agent database called Cacher. The Service Broker messaging infrastructure connects the TransDB and Cacher databases, and event-based message processing lets us update a cache store that resides on a network-based system. When an update message arrives, the Cacher database plays the agent's role and performs the cache refresh by executing the Python script.

The Cacher database has its own objects:

1. The CacheLog and CacheIntegrationError tables, which keep track of when the cache was refreshed and record any errors that occur during a cache refresh.

2. The PerformCacheUpdate procedure, which receives the incoming message from TransDB through Service Broker. If the message type is UpdateMessage, it executes another procedure, UpdateWebCache, which runs the Python script (a skeleton of such an activation procedure is sketched after this list).

a. The execution result of UpdateWebCache is held in a table variable and then inserted into the CacheLog table at the end of the message conversation.

b. The procedure also ends the conversation when the received message has the error or end-dialog message type, and for the error type it writes an exception log to the CacheIntegrationError table.

3. The UpdateWebCache procedure extracts the Id and Name values from the incoming XML message into parameters and embeds them in the Python script text. The script's execution result set is structured with the table type UpdateCacheLog.
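The article does not reproduce PerformCacheUpdate itself, so here is only a skeleton of what such an activation procedure can look like; the XML element and attribute names, the parameter names of UpdateWebCache, and the omission of the CacheLog bookkeeping are all simplifications or assumptions.

-- Skeleton of an activation procedure in the style of PerformCacheUpdate (details assumed)
CREATE PROCEDURE PerformCacheUpdate
AS
BEGIN
    DECLARE @handle UNIQUEIDENTIFIER,
            @msgType SYSNAME,
            @body XML;

    WHILE (1 = 1)
    BEGIN
        WAITFOR (
            RECEIVE TOP (1)
                @handle  = conversation_handle,
                @msgType = message_type_name,
                @body    = CAST(message_body AS XML)
            FROM CacheQueue
        ), TIMEOUT 5000;

        IF (@@ROWCOUNT = 0) BREAK;   -- queue drained, let the activated task end

        IF (@msgType = N'UpdateMessage')
        BEGIN
            -- Pull the values out of the XML and refresh the cache via the Python script
            DECLARE @Id INT = @body.value('(/ProductType/@Id)[1]', 'INT'),
                    @Name VARCHAR(50) = @body.value('(/ProductType/@Name)[1]', 'VARCHAR(50)');
            EXEC UpdateWebCache @Id = @Id, @Name = @Name;
            -- The full solution also records the result in CacheLog at the end of the conversation
            END CONVERSATION @handle;
        END
        ELSE IF (@msgType = N'http://schemas.microsoft.com/SQL/ServiceBroker/EndDialog'
                 OR @msgType = N'http://schemas.microsoft.com/SQL/ServiceBroker/Error')
        BEGIN
            -- End the dialog; the error type is also logged to CacheIntegrationError
            END CONVERSATION @handle;
        END
    END
END;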

Cacher's Service Broker objects are largely the same as TransDB's, namely the UpdateMessage message type and the CacheIntegration contract. Its CacheQueue has an activation procedure called PerformCacheUpdate, the service is named CacheTarget, and the route holds the information about the TransDB service and its endpoint address.

For our sample solution, the maximum queue readers setting is 1 for both databases' queues. It can be increased if needed, for example when the rate of data modification is very high and the cache needs to be refreshed more often.
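If the refresh rate ever needs to go up, the reader count is a queue-level setting; as a sketch, using the activation procedure named above:

-- Sketch: allow more concurrent activation readers on the Cacher queue
ALTER QUEUE CacheQueue
    WITH ACTIVATION (
        STATUS = ON,
        PROCEDURE_NAME = PerformCacheUpdate,
        MAX_QUEUE_READERS = 4,   -- the sample solution uses 1
        EXECUTE AS OWNER
    );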

Service Broker Endpoints

For our solution, the databases run on the same instance, so they both use the same Service Broker endpoint to send and receive messages.

However, if we wanted to host the databases on separate instances, each SQL instance would need its own Service Broker endpoint, and the two SQL instances would have to allow each other to send messages to those endpoints. The authorization and the connection grant can be completed with the following sets of T-SQL commands. Note that in this messaging infrastructure one party is the sender and the other the receiver; as mentioned earlier, if a SQL instance is on both the sending and the receiving side, each instance should have its own process identity. The picture below shows each SQL Server service running under its own identity.

This is the SQL code that, on the SQL instance hosting the Cacher database, authorizes the endpoint to, and grants connection to, the service account [identity] of the TransDB SQL instance.

ALTER AUTHORIZATION ON ENDPOINT::ServiceBrokerEndPoint TO [ABC\TransDB_svc];
GO
GRANT CONNECT ON ENDPOINT::ServiceBrokerEndPoint TO [ABC\TransDB_svc];
GO

Similarly, here is the code that, on the SQL instance hosting the TransDB database, authorizes the endpoint to, and grants connection to, the service account [identity] of the Cacher SQL instance.

ALTER AUTHORIZATION ON ENDPOINT::ServiceBrokerEndPoint TO [ABC\CacherAgent_svc];
GO
GRANT CONNECT ON ENDPOINT::ServiceBrokerEndPoint TO [ABC\CacherAgent_svc];
GO

Python script

Here is the Python script, kept as text in the T-SQL variable @UpdateCache. It has an updatecache method that performs an HTTP POST call to RESTful.Cache, passing a data object with Name and Id fields, which the script receives as input parameters. It gets a JSON object back and returns it as output to the caller.

At the end of the script, the returned object is converted to an array so that it can be structured as a SQL result set.

DECLARE @UpdateCache NVARCHAR(MAX) = N'
import pandas as pnd    #data structure package

def updatecache(name, id):
    import requests as http    #http request package
    #Perform HTTP POST to update cache
    httpRequest = http.post("http://localhost/RESTful.Cache/ProductType/UpdateCache", {"name": name, "id": id})
    cacheLog = httpRequest.json()
    return cacheLog

#Update cache and build log element
log = [updatecache(''' + @Name + ''', ' + CAST(@Id AS VARCHAR(10)) + ')]

#Return data frame i.e. table structure to SQL
OutputDataSet = pnd.DataFrame(data=log)
';

There are a few things to note when using Python scripts in SQL Server.

1. We can write the script as sequential statements, or group statements into methods as we have done in this solution. Alternatively, we could create an inline class, or build a package and import it into Python with the pip command at a command prompt.

2. In the CTP version of SQL Server, an import statement only brings a package into the scope in which it appears, which is why the import of requests sits inside the updatecache method, while pandas is imported at the top of the script and used in its last line.

3. The output of the updatecache method is immediately wrapped in an array so that pandas.DataFrame can convert it into a data structure that SQL Server can easily interpret as a table with rows and columns.

4. The data structure assigned to the OutputDataSet object is made available to SQL Server by the T-SQL execution context.

5. The last line of the procedure UpdateWebCache uses WITH RESULT SETS (AS TYPE dbo.UpdateCacheLog); the user-defined table type dbo.UpdateCacheLog helps keep the underlying columns in order and avoids any mismatch while the result set is generated from the received data structure (a sketch follows this list). An alternative approach is to build a column structure in Python that maps onto the result set.
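As a sketch of how notes 4 and 5 fit together, the table type and the call that executes the script text might look like the following; the column list of dbo.UpdateCacheLog and the surrounding variable names are assumptions, since the article only describes them.

-- Sketch: a user-defined table type shaping the Python result (columns assumed)
CREATE TYPE dbo.UpdateCacheLog AS TABLE
(
    Id   INT,
    Name VARCHAR(50)
);
GO

-- Sketch: inside UpdateWebCache, run the Python text held in @UpdateCache
EXEC sp_execute_external_script
    @language = N'Python',
    @script   = @UpdateCache
WITH RESULT SETS (AS TYPE dbo.UpdateCacheLog);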

Database security

TransDB is an OLTP database and we do not want to expose it to any attack on the system, so in our sample solution such a database can be hosted on a SQL instance that does not have Machine Learning Services installed. Cacher is an agent that reaches out to a network-based system, so it can sit on the SQL instance where Machine Learning Services is installed. The two SQL instances can each have a separate service account identity that is authorized to connect to the other's Service Broker endpoint only on a specific port. Another way to secure the communication is to authenticate it with certificates. For Service Broker endpoint authorization, refer to Microsoft TechNet: How to: Allow Service Broker Network Access by Using Certificates (Transact-SQL).
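As a sketch of the certificate option (the certificate name and subject are illustrative, and this replaces the Windows authentication shown earlier), each instance holds a certificate in master and authenticates its endpoint with it; the partner instance then imports the public key and is granted CONNECT.

-- Sketch: certificate-based authentication for a Service Broker endpoint (names illustrative)
USE master;
CREATE MASTER KEY ENCRYPTION BY PASSWORD = '<strong password here>';
CREATE CERTIFICATE BrokerLocalCert
    WITH SUBJECT = 'Service Broker endpoint certificate';

CREATE ENDPOINT ServiceBrokerEndPoint
    STATE = STARTED
    AS TCP (LISTENER_PORT = 4022)
    FOR SERVICE_BROKER (AUTHENTICATION = CERTIFICATE BrokerLocalCert, ENCRYPTION = REQUIRED);

-- On the partner instance: create a login and user, import this certificate's public key
-- (CREATE CERTIFICATE ... FROM FILE), and GRANT CONNECT ON ENDPOINT to that login.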

Putting all the components together

With all the components in place, here is our web application, which lets us create a new ProductType and, via RESTful.Cache HTTP calls, list the same product types from the refreshed cache. Behind the scenes, invisible to the front-end applications, are the components that manage the data and the cache.

Conclusion

Applications such as e-commerce and healthcare can benefit from a good cache implementation. By extending the use of our familiar technologies, we can get an easy-to-maintain solution without having to learn new frameworks or features.

Our sample solution meets these needs:

• When an OLTP transaction creates or modifies data, the system refreshes the network-based cache system that serves read access.

• It refreshes the cache with asynchronous events, in near real time, without affecting the performance of the originating transaction.

• It draws a security boundary between the transactional system and the cache system reached over HTTP, keeping the data secure in the OLTP database.

• It enables minimal monitoring: the cache log and exception log could be further built up into a management console.

• With the Service Broker message-delivery component and asynchronous message processing, the solution is flexible enough to trigger, or reach out to, a network-based system. In other words, a database integrated with SQL Service Broker messaging can, based on the data received, perform an operation that fetches data from, or sends data to, external systems outside the data tier.

• By using the Service Broker messaging mechanism, the triggering of external systems is isolated in a dedicated database, which helps keep the OLTP database's transactions and data secure.

The source code for this project is available on GitHub.
