SQL Server on Linux: How? Introduction

SQL Server Blog

Official News from Microsoft's Information Platform

https://blogs.technet.microsoft.com/dataplatforminsider/2016/12/16/sql-server-on-linux-how-introduction/

This post is authored by Scott Konersmann, Partner Engineering Manager, SQL Server, Slava Oks, Partner Group Engineering Manager, SQL Server, and Tobias Ternstrom, Principal Program Manager, SQL Server.

Introduction

We first announced SQL Server on Linux in March, and recently released the first public preview of SQL Server on Linux (SQL Server v.Next CTP1) at the Microsoft Connect(); conference. We've been pleased to see the positive reaction from our customers and the community; in the two weeks following the release, there were more than 21,000 downloads of the preview. A lot of you are curious to hear more about how we made SQL Server run on Linux (and some of you have already figured out and posted interesting articles about part of the story with "Drawbridge"). We decided to kick off a blog series to share technical details on this very topic, starting with an introduction to the journey of offering SQL Server on Linux. Hopefully you'll find it as interesting as we do! :)

Summary

Making SQL Server run on Linux involves introducing what is known as a Platform Abstraction Layer ("PAL") into SQL Server. This layer is used to align all operating system or platform specific code in one place and allow the rest of the codebase to stay operating system agnostic. Because of SQL Server's long history on a single operating system, Windows, it never needed a PAL. In fact, the SQL Server database engine codebase has many references to libraries that are popular on Windows to provide various functionality. In bringing SQL Server to Linux, we set strict requirements for ourselves to bring the full functional, performance, and scale value of the SQL Server RDBMS to Linux. This includes the ability for an application that works great on SQL Server on Windows to work equally great against SQL Server on Linux.

Given these requirements and the fact that the existing SQL Server OS dependencies would make it very hard to provide a highly capable version of SQL Server outside of Windows in reasonable time, it was decided to marry parts of the Microsoft Research (MSR) project Drawbridge with SQL Server's existing platform layer, the SQL Server Operating System (SOS), to create what we call the SQLPAL. The Drawbridge project provided an abstraction between the underlying operating system and the application for the purposes of secure containers, and SOS provided robust memory management, thread scheduling, and IO services. Creating SQLPAL enabled the existing Windows dependencies to be used on Linux with the help of the parts of the Drawbridge design focused on OS abstraction, while leaving the key OS services to SOS. We are also changing the SQL Server database engine code to bypass the Windows libraries and call directly into SQLPAL for resource intensive functionality.

Requirements for supporting Linux

SQL Server is Microsoft's flagship database product, with decades of development behind it. At a high level, the list below represents our requirements as we designed the solution to make the SQL Server RDBMS available on multiple platforms:

    1. Quality and security must meet the same high bar we set for SQL Server on Windows
    2. Provide the same value in terms of functionality, performance, and scale
    3. Application compatibility between SQL Server on Windows and Linux
    4. Enable a continued fast pace of innovation in the SQL Server code base and make sure new features and fixes appear immediately across platforms
    5. Put in place a foundation for future SQL Server suite services (such as Integration Services) to come to Linux

To make SQL Server support multiple platforms, the engineering task is essentially to remove or abstract away its dependencies on Windows. As you can imagine, after decades of development against a single operating system, there are plenty of OS-specific dependencies across the code base. In addition, the code base is huge. There are tens of millions of lines of code in SQL Server.

SQL Server depends on various libraries, and their functions and semantics, that are commonly used in Windows development and fall into three categories:

    • "Win32" (e.g. user32.dll)
    • NT Kernel (ntdll.dll)
    • Windows application libraries (such as MSXML)

You can think of these as core library functions; most of them have nothing to do with the operating system kernel and only execute in user mode.

While SQL Server has dependencies on both Win32 and the Windows kernel, the most complex dependency is that on the Windows application libraries that have been added over the years in order to provide new functionality. Here are some examples:

    • SQL Server's XML support uses MSXML, which is used to parse and process XML documents within SQL Server.
    • SQLCLR hosts the Common Language Runtime (CLR) for system types as well as user defined types and CLR stored procedures.
    • SQL Server has some components written in COM, like the VDI interface for backups.
    • Heterogeneous distributed transactions are controlled through the Microsoft Distributed Transaction Coordinator (MS DTC).
    • SQL Server Agent integrates with many Windows subsystems (shell execution, Windows Event Log, SMTP Mail, etc.).

These dependencies are the biggest challenge for us to overcome in order to meet our goals of bringing the same value and having a very high level of compatibility between SQL Server on Windows and Linux. As an example, re-implementing something like SQLXML would take a significant amount of time, would run a high risk of not providing the same semantics as before, and could potentially break applications. The option of completely removing these dependencies would mean we must also remove the functionality they provide from SQL Server on Linux. If the dependencies were edge cases that only impacted very few customer visible features, we could have considered it. As it turns out, removing them would force us to remove tons of features from SQL Server on Linux, which would go against our goals around compatibility and value across operating systems.

We could take the approach of doing this re-implementation piecemeal, bringing value little by little. While this would be possible, it would also go against the requirements because it would mean that there would be a significant gap between SQL Server on Linux and Windows for years. The resolution lies in the right platform abstraction layer.

Building a PAL

Software that is supported across multiple operating systems always has an implementation of some sort of Platform Abstraction Layer (PAL). The PAL is responsible for abstracting the calls and semantics of the underlying operating system and its libraries away from the software itself. The next couple of sections consider some of the technology that we investigated as solutions for building a PAL for SQL Server.

SQL Operating System (SOS or SQLOS)

In the SQL Server 2005 release, a platform layer called the SQL Operating System (SOS) was created between the SQL Server engine and Windows. This layer is responsible for user mode thread scheduling, memory management, and synchronization (see SQLOS for reference). A key reason for the creation of SOS was that it allowed a centralized set of low level management and diagnostics functionality to be provided to customers and support (a subset of Dynamic Management Views/DMVs and Extended Events/XEvents). This layer allowed us to minimize the number of system calls involved in scheduling execution by running non-preemptively and letting SQL Server do its own resource management. While SOS improved performance and greatly helped supportability and debugging, it did not provide a proper abstraction layer from the OS dependencies described above, i.e. Windows semantics were carried through SOS and exposed to the database engine.

In the scenario where we would completely remove the dependencies on the underlying operating system from the database engine, the best option was to grow SOS into a proper Platform Abstraction Layer (PAL). All the calls to Windows APIs would be routed through a new set of equivalent APIs in SOS, and a new host extension layer would be added on the bottom of SOS to interact with the operating system. While this would resolve the system call dependencies, it would not help with the dependencies on the higher-level libraries.
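
The routing idea in this option can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `HostExtension`, `PosixHostExtension`, `SosPal`, `alloc`, and `write` are hypothetical and are not actual SOS or SQL Server APIs. The engine-facing layer never touches the operating system itself; only the host extension at the bottom does.

```python
import os
import tempfile
from abc import ABC, abstractmethod


class HostExtension(ABC):
    """Hypothetical bottom layer: the only code that talks to the host OS."""

    @abstractmethod
    def allocate(self, size: int) -> bytearray: ...

    @abstractmethod
    def write_file(self, path: str, data: bytes) -> int: ...


class PosixHostExtension(HostExtension):
    """One possible host extension, built on POSIX-style calls from `os`."""

    def allocate(self, size):
        return bytearray(size)

    def write_file(self, path, data):
        fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
        try:
            return os.write(fd, data)
        finally:
            os.close(fd)


class SosPal:
    """Hypothetical platform layer the engine calls instead of OS APIs."""

    def __init__(self, host: HostExtension):
        self._host = host
        self.bytes_allocated = 0  # central accounting lives in this layer

    def alloc(self, size):
        self.bytes_allocated += size
        return self._host.allocate(size)

    def write(self, path, data):
        return self._host.write_file(path, data)
```

Swapping in a different `HostExtension` subclass would retarget the engine-facing code without changing it. As the paragraph above notes, a layer like this only covers system calls; it does nothing for higher-level libraries such as MSXML.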

Drawbridge

Drawbridge was a Microsoft Research project (see Drawbridge for reference) that focused on drastically reducing the virtualization resource overhead incurred when hosting many virtual machines on the same hardware. The research involved two ideas. The first idea was a "picoprocess", which consists of an empty address space, a monitor process that interacts with the host operating system on behalf of the picoprocess, and a kernel driver that allows a driver to populate the address space at startup and implements a host Application Binary Interface (ABI) that allows the picoprocess to interact with the host. The second idea was a user mode Library OS, sometimes referred to as LibOS. Drawbridge provided a working Windows Library OS that could be used to run Windows programs on a Windows host. This Library OS implements a subset of the 1500+ Win32 and NT ABIs and stubs the rest to either succeed or fail depending on the type of call.

Our needs didn't align with the original goals of Drawbridge. For instance, the picoprocess idea isn't something needed for moving SQL Server to other platforms. However, there were a couple of synergies that stood out:

    1. Library OS implemented most of the 1500+ Windows ABIs in user mode, and only 45-50 ABIs were needed to interact with the host. These ABIs were for address space and memory management, host synchronization, and IO (network and disk). This made for a very small surface area that needs to be implemented to interact with a host. That is extremely attractive from a platform abstraction perspective.
    2. Library OS is capable of hosting other Windows components. Enough of the Win32 and NT layers were implemented to host CLR, MSXML, and other APIs that the SQL suite depends on. This meant we could get more functionality to work without rewriting whole features.
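
To make the "wide API on top, narrow ABI underneath" shape concrete, here is a minimal Python sketch. Everything in it is an assumption made for illustration: `HostABI`, `LibraryOS`, the call names, and the stub policy are hypothetical and are not taken from Drawbridge. It shows three kinds of calls: ones that need the host, ones served entirely in user mode, and ones stubbed to succeed or fail.

```python
class HostABI:
    """Hypothetical narrow host interface (the post cites only 45-50 real ABIs)."""

    def __init__(self):
        self.calls = []   # record of host crossings, for illustration
        self._files = {}

    def host_write(self, name, data):
        self.calls.append("host_write")
        self._files[name] = self._files.get(name, b"") + data
        return len(data)

    def host_read(self, name):
        self.calls.append("host_read")
        return self._files.get(name, b"")


class LibraryOS:
    """Hypothetical user mode library exposing many calls over a small HostABI."""

    def __init__(self, host):
        self._implemented = {
            # These two need the host.
            "write_file": host.host_write,
            "read_file": host.host_read,
            # This one is served purely in user mode, with no host crossing.
            "upper_case": lambda text: text.upper(),
        }
        # Assumed stub policy: calls listed here report success without
        # doing anything; every other unknown call reports failure.
        self._stub_succeeds = {"set_window_title"}

    def call(self, api, *args):
        if api in self._implemented:
            return self._implemented[api](*args)
        if api in self._stub_succeeds:
            return "OK"
        return "NOT_SUPPORTED"
```

In this sketch only `write_file` and `read_file` ever reach `HostABI`, so porting to a new host would mean re-implementing just those two methods.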

There were also some risk and reward tradeoffs:

  1. The Microsoft Research project was complete and there was no support for Drawbridge. Therefore, we needed to take a source snapshot and modify the code for our purposes. The risks were around the costs to ramp up a team on the Library OS, modify it to be suitable for SQL Server, and make it perform comparably with Windows. On the positive side, this would mean everything is in user mode and we would own all the code within the stack. Performance critical code can be optimized because we can modify all layers of the stack, including SQL Server, the Library OS, and the host interface, as needed to make SQL Server perform. Since there are no real boundaries in the process, it is possible for SQL Server to call Linux.
  2. The original Drawbridge project was built on Windows and used a kernel driver and monitor process. This would need to be dropped in favor of a user mode only architecture. In the new architecture, the host extension (referred to as PAL in the Drawbridge design) on Windows would move from a kernel driver to just a user mode program. Interestingly enough, one of the researchers had developed a rough prototype for Linux that proved it could be done.
  3. Because the technologies were created independently, there was a large amount of overlapping functionality. SOS had subsystems for object management, memory management, threading/scheduling, synchronization, and IO (disk and network). The Library OS and Host Extension also had similar functionality. These systems would need to be rationalized down to a single implementation.

     Technologies compared: SOS, Library OS, Host Extension
     Subsystems compared: Object Management, Memory Management, Threading/Scheduling, Synchronization, I/O (Disk, Network)

Meet SQLPAL

As a result of the investigation, we decided on a hybrid strategy. We would merge SOS and the Library OS from Drawbridge to create the SQL PAL (SQL Platform Abstraction Layer). Areas of the Library OS that SQL Server does not need would be removed. To merge these architectures, changes were needed in all layers of the stack.

The new architecture consists of a set of SOS direct APIs which don't go through any Win32 or NT syscalls. Code without SOS direct APIs will go through either a hosted Windows API (like MSXML) or NTUM (NT User Mode API: the 1500+ Win32 and NT syscalls). All the subsystems like storage, network, or resource management will be based on SOS and will be shared between the SOS direct and NTUM APIs.
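
The two call paths and the shared subsystems can be pictured with a small Python sketch. All names here (`SosSubsystems`, `SqlPal`, `sos_direct_write`, `ntum_write_file`, `HostedXmlLibrary`) are hypothetical stand-ins chosen for this illustration, not real SQLPAL entry points. The point is only that both paths land in the same SOS-based subsystem.

```python
class SosSubsystems:
    """Hypothetical shared services (only storage is modeled here)."""

    def __init__(self):
        self.storage = {}
        self.trace = []  # which path each request arrived through

    def store(self, key, value, via):
        self.trace.append(via)
        self.storage[key] = value


class SqlPal:
    """Hypothetical PAL exposing two kinds of entry points over one core."""

    def __init__(self):
        self.sos = SosSubsystems()

    def sos_direct_write(self, key, value):
        # Path 1: SOS direct API, no Win32/NT-style call involved.
        self.sos.store(key, value, via="sos_direct")

    def ntum_write_file(self, key, value):
        # Path 2: a Win32/NT-style entry point, implemented on the
        # same SOS subsystem as the direct path.
        self.sos.store(key, value, via="ntum")


class HostedXmlLibrary:
    """Stand-in for a hosted Windows library; it only knows the NTUM-style API."""

    def __init__(self, pal):
        self._pal = pal

    def save(self, key, text):
        self._pal.ntum_write_file(key, text.encode("utf-8"))
```

Engine code written against `sos_direct_write` and an unmodified hosted library calling `ntum_write_file` end up sharing one storage subsystem, which is what allows resource management to stay in one place.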

This architecture provides some interesting characteristics:

  • Everything running in process boils down to the same platform assembly code. The CPU can't tell the difference between the code that is providing Win32 functionality to SQL Server and native Linux code.
  • Even though the architecture shows layering, there are no real boundaries within the process (there is no spoon!). If performance critical code running in SQL Server needs to call Linux, it can do so directly, with a very small amount of assembler via the SOS direct APIs to set up the stack correctly and process the result. An example where this has been done is the disk IO path. There is a small amount of conversion code left to convert from the Windows scatter/gather input structure to the Linux vectored IO structure. Other disk IO types don't require any conversions or allocations.
  • All resources in the process can be managed by SQLPAL. In SQL Server, before SQLPAL, most resources such as memory and threads were controlled, but there were some things outside its control. Some libraries and Win32/NT APIs would create threads on their own and do memory allocations without using the SOS APIs. With this new architecture, even the Win32 and NT APIs will be based on SQLPAL, so every memory allocation and thread will be controlled by SQLPAL. As you can see, this also benefits SQL Server on Windows.
  • For SQL Server on Linux we are using about 81 MB of uncompressed Windows libraries, so it's a tiny fraction (less than 1%) of a typical Windows installation. SQLPAL itself is currently around 8 MB.
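
The disk IO conversion mentioned in the list above can be illustrated with a short Python sketch. It is not the SQLPAL code, which the post does not show. It assumes the Windows-side input is a list of page-sized segments plus a total byte count (the shape used by calls such as `WriteFileGather`) and turns it into the buffer list that `os.writev` passes to the Linux `writev` call. `PAGE_SIZE`, `segments_to_iovecs`, and `gather_write` are names invented for this example, and it needs a Unix-like host.

```python
import os
import tempfile

PAGE_SIZE = 4096  # assumed segment size for this sketch


def segments_to_iovecs(segments, total_bytes):
    """Convert page-sized segments plus a byte count into a writev buffer list."""
    iovecs = []
    remaining = total_bytes
    for seg in segments:
        if remaining <= 0:
            break
        if len(seg) != PAGE_SIZE:
            raise ValueError("each segment must be exactly one page")
        take = min(PAGE_SIZE, remaining)
        # A memoryview slice is a view, so no data is copied here.
        iovecs.append(memoryview(seg)[:take])
        remaining -= take
    if remaining > 0:
        raise ValueError("not enough segments for the requested byte count")
    return iovecs


def gather_write(path, segments, total_bytes):
    """Write the segments to `path` with a single vectored write."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        return os.writev(fd, segments_to_iovecs(segments, total_bytes))
    finally:
        os.close(fd)
```

The conversion is just a re-description of the same memory: the final segment is trimmed to the requested length and nothing is copied, which matches the post's remark that only a small amount of conversion code is needed.
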
Process Model

The following diagram shows what the address space looks like when running. The host extension is simply a native Linux application. When the host extension starts, it loads and initializes SQLPAL; SQLPAL then brings up SQL Server. SQLPAL can launch software isolated processes, which are simply collections of threads and allocations running within the same address space. We use this for things like SQLDumper, an application that is run when SQL Server encounters a problem in order to collect an enlightened crash dump.

One point to reiterate is that even though this might look like a lot of layers, there aren't any hard boundaries between SQL Server and the host.
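
The startup order and the idea of a software isolated process can be sketched as follows. This is a toy model under loud assumptions: `SoftwareIsolatedProcess`, `SqlPal`, `launch`, and `host_extension_main` are hypothetical names, and Python threads stand in for whatever threading SQLPAL actually uses. What it shows is that each "process" is only bookkeeping over threads and allocations that all live in one real process.

```python
import threading


class SoftwareIsolatedProcess:
    """A named group of threads and allocations inside one real process."""

    def __init__(self, name):
        self.name = name
        self.threads = []
        self.allocations = []

    def alloc(self, size):
        buf = bytearray(size)
        self.allocations.append(buf)  # tracked per logical process
        return buf

    def spawn(self, target, *args):
        thread = threading.Thread(target=target, args=args)
        self.threads.append(thread)
        thread.start()
        return thread

    def join(self):
        for thread in self.threads:
            thread.join()


class SqlPal:
    """Hypothetical PAL that owns every software isolated process."""

    def __init__(self):
        self.initialized = False
        self.processes = {}

    def initialize(self):
        self.initialized = True

    def launch(self, name, entry_point):
        proc = SoftwareIsolatedProcess(name)
        self.processes[name] = proc
        proc.spawn(entry_point, proc)
        return proc


def host_extension_main(entry_points):
    """Assumed startup order: host extension -> SQLPAL -> hosted programs."""
    pal = SqlPal()
    pal.initialize()
    for name, entry_point in entry_points.items():
        pal.launch(name, entry_point)
    for proc in pal.processes.values():
        proc.join()
    return pal
```

Because every thread and buffer is created through the PAL's bookkeeping, the PAL can account for all of them, which is the property the post relies on for tools like SQLDumper.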

Evolution of SQLPAL

At the start of the project, SQL Server was built on SOS and the Library OS was independent. The eventual goal is to have a merged SOS and Library OS as the core of SQLPAL. For public preview, this merge wasn't fully completed, but the heart of SQLPAL had been replaced with SOS. For example, threads and memory already use SOS functionality instead of the original Drawbridge implementations.

As a result, there are two instances of SOS running inside the CTP1 release: one in SQL Server and one in SQLPAL. This works fine because the SOS instance in SQL Server is still using Win32 APIs, which call down into SQLPAL. The SQLPAL instance of the SOS code has been changed to call the host extension ABIs (i.e. the native Linux code) instead of Win32.

Now we are working on removing the SOS instance from SQL Server. We are exposing the SOS APIs from SQLPAL. Once this is completed, everything will flow through the single SQLPAL SOS instance.

More Posts

We are planning more of these posts to tell you about our journey, which we believe has been amazing and a ton of fun worth sharing. Please provide comments if there are specific areas you are interested in us covering!

Thanks!
