Introduction to OpenStack Architecture: Foreword
- Understand the logical relationships among OpenStack components;
- Understand the communication and deployment relationships among the various OpenStack components;
- Understand the OpenStack workflow;
This part covers:
- The logical relationships between OpenStack components;
- The OpenStack API;
- The communication relationships between OpenStack components;
- The several different types of storage in OpenStack;
- The OpenStack workflow;
- The deployment architecture of OpenStack;
The relationships between OpenStack components fall into several categories: logical relationships, communication relationships, deployment relationships, and so on.
1. The logical relationship between OpenStack components
OpenStack is a constantly evolving system, so its architecture keeps evolving as well. For example:
- The E (Essex) release has 5 components:
Compute is Nova; Image is Glance, which provides the image storage service for Nova; Object Storage is Swift; the dashboard is what we usually call Horizon; Identity is Keystone.
- The F (Folsom) release has 7 core components:
These seven components are enough to build a relatively complete cloud computing environment (Heat, Sahara, and others are optional). The two additions relative to the E release are Block Storage (Cinder) and Networking (Neutron); neither has a direct dependency on Glance or Swift. Both actually developed out of Nova subsystems, Compute Network and Compute Volume. Neutron does not simply replace Compute Network: it is a relatively independent, and also very famous, SDN project that provides network connectivity and network resource management services for Compute. Block Storage (i.e. Cinder) provides volume services for Compute and replaced Compute Volume.
2. OpenStack's API
The logic of OpenStack is to transfer information between components, and this transfer is realized mainly through API calls between them. As an operating system and as a framework, OpenStack's API is of great significance.
OpenStack's APIs are RESTful Web APIs based on the HTTP protocol.
What is REST?
The full name is Representational State Transfer. It was proposed by Dr. Roy Fielding (the first president of the Apache Foundation, one of the principal designers of HTTP 1.0 and 1.1, and an author of the Apache server software). REST manipulates the state of resources by manipulating representations of those resources.
Another kind of Web service interface protocol is SOAP.
The difference between the two: calling a RESTful Web API is very simple, whereas SOAP is usually used through a framework; on platforms such as .NET and Java the SOAP tooling is already mature, so we do not feel the complexity of the underlying mechanism. Compared with SOAP, REST is very concise. On the other hand, REST describes an architectural style, a set of principles, so it does not prescribe a concrete implementation or protocol.
The most common implementation today is the RESTful Web API based on the HTTP protocol, which is what we use in OpenStack. Operations on resources in the REST architecture (fetch, create, modify, and delete) correspond to the GET, POST, PUT, and DELETE methods provided by the HTTP protocol, so it is convenient to implement REST with HTTP.
The following are the three main points of a RESTful Web API:
- The resource address, i.e. the resource's URI, for example: http://example.com/resources/
- The representation of the transmitted resource: the Internet media types that the Web service accepts and returns, for example JSON and XML. JSON is lightweight, and lightweight protocols have been very popular with the rapid development of the mobile Internet, so JSON is widely used.
- Operations on a resource: the set of request methods the Web service supports on that resource, such as POST, GET, PUT, and DELETE.
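As a concrete sketch of these three points, the snippet below builds the four kinds of HTTP requests a RESTful client would issue against a hypothetical resource collection. The endpoint and field names are made up for illustration, and the requests are only prepared, never actually sent.

```python
import json
import urllib.request

# Hypothetical resource collection, for illustration only.
BASE = "http://example.com/resources"

def build_request(operation, resource_id=None, body=None):
    """Map a CRUD operation onto the HTTP method and URI used by a
    RESTful Web API. The request is built but not sent."""
    methods = {"fetch": "GET", "create": "POST",
               "modify": "PUT", "delete": "DELETE"}
    url = BASE if resource_id is None else f"{BASE}/{resource_id}"
    data = json.dumps(body).encode() if body is not None else None
    req = urllib.request.Request(url, data=data, method=methods[operation])
    req.add_header("Content-Type", "application/json")
    return req

req = build_request("create", body={"name": "demo"})
print(req.method, req.full_url)   # POST http://example.com/resources
```

The point is that the resource is always named by its URI; only the HTTP method changes with the operation.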
The following is an example of an OpenStack Swift interface:
First, use the curl command to access an HTTP service:
curl -i -X GET http://storage.clouddrive.com/v1/my_account?format=json \
  -H "X-Auth-User: jdoe" -H "X-Auth-Key: jdoepassword"
The response begins with the header of the HTTP response, which carries details about the account, including how many containers it holds, how many objects, how many bytes of storage are consumed, and so on.
Then comes the body in JSON format, which lists the containers under this account.
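The same request can be prepared with Python's standard library. The sketch below only rebuilds the request from the curl example above (URL and credentials are taken from it) without sending it over the network.

```python
import urllib.request

# Rebuild the curl request from the example above as a urllib request.
# It is only prepared here, not actually sent.
url = "http://storage.clouddrive.com/v1/my_account?format=json"
req = urllib.request.Request(url, method="GET")
req.add_header("X-Auth-User", "jdoe")
req.add_header("X-Auth-Key", "jdoepassword")

print(req.method, req.full_url)
# Sending it with urllib.request.urlopen(req) would return the HTTP
# headers (the account details) plus a JSON body listing the containers.
```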
Several ways to invoke and debug the API:
- The first way: the curl command (a Linux command-line tool that sends an HTTP request and receives the response); it is less convenient and more cumbersome.
- The second way: the commonly used OpenStack command-line clients; each OpenStack project has a command-line client written in Python.
- The third way: a REST client in the Firefox or Chrome browser (a graphical interface, installed as a browser plug-in).
- The fourth way: an OpenStack SDK. With an SDK you can send HTTP requests without hand-writing code to call the REST interface, and it also takes care of chores such as managing the Token, so you can easily develop against OpenStack. OpenStack officially provides a Python SDK, and of course there are third-party SDKs, such as the well-known jclouds for Java, as well as SDKs for Node.js, Ruby, .NET, and more.
OpenStack also offers an additional set of APIs that are compatible with Amazon's EC2, allowing for easy application migration between the two systems.
3. Communication relationships between OpenStack components
The communication between OpenStack components is divided into four categories:
- Based on the HTTP protocol (RESTful Web APIs)
- Based on the AMQP protocol (a message queuing protocol)
- Based on database connections (mainly SQL communication)
- Based on Native APIs (third-party APIs)
The figure shows:
- The Compute node is the node that actually runs the virtual machines;
- The Block Storage node is mainly the storage backend (storage devices) that Cinder connects to;
- The Network node is usually a gateway, carrying a number of networking devices such as routers.
3-1. Communication based on the HTTP protocol
The communication relationships established through the APIs basically all fall into this category, and they are all RESTful Web APIs. The most common case is the communication that occurs when the components are operated through Horizon or the command-line clients, and when the components go to Keystone for user identity verification. Other examples are the call to the Glance API when Nova Compute fetches an image, and the reads and writes of Swift data; these are likewise done through RESTful Web APIs over the HTTP protocol.
3-2. Communication based on the Advanced Message Queuing Protocol (AMQP)
Communication based on the AMQP protocol happens mainly between components within a single project, for example between Nova Compute and the Nova Scheduler, or, within Cinder, between the Cinder Scheduler and Cinder Volume.
It should be noted that Cinder evolved from Nova Volume, so there is also an AMQP communication relationship between Cinder and Nova. Thanks to the AMQP protocol this is also a service-oriented architecture: although most components that communicate through AMQP belong to the same project, they are not required to be installed on the same node. This is a great advantage for scaling the system, because each component can scale out according to its own load; since they are not tied to one node, each can be hosted by a different number of nodes.
(AMQP is a protocol; OpenStack does not specify how it must be implemented. RabbitMQ is commonly used, but users can choose other message middleware according to their own situation.)
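To make the two messaging patterns used over AMQP concrete, here is a minimal in-memory sketch using Python's standard library queues in place of a real broker. Real OpenStack components do this through the oslo.messaging library and a broker such as RabbitMQ; the worker name, message fields, and return values here are invented for illustration.

```python
import queue
import threading

# In-memory stand-ins for AMQP queues; real OpenStack talks to a broker
# such as RabbitMQ through the oslo.messaging library.
requests = queue.Queue()

def worker():
    """A consumer (think nova-scheduler) taking requests off its queue."""
    while True:
        msg = requests.get()
        if msg is None:                      # shutdown sentinel
            break
        if msg["pattern"] == "call":
            # rpc.call: compute a result and put it on the caller's
            # private reply queue; the blocked caller then continues.
            msg["reply_to"].put({"host_id": "compute-01"})
        # rpc.cast: fire-and-forget, nothing is sent back.

t = threading.Thread(target=worker)
t.start()

# rpc.call: send the request, then block waiting for the reply.
reply_q = queue.Queue()
requests.put({"pattern": "call", "method": "select_host", "reply_to": reply_q})
result = reply_q.get()
print(result)                                # {'host_id': 'compute-01'}

# rpc.cast: send the request and return immediately; no reply expected.
requests.put({"pattern": "cast", "method": "start_instance"})

requests.put(None)                           # stop the worker
t.join()
```

Because sender and receiver only share a queue, they need not run on the same node, which is exactly the decoupling described above.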
3-3. SQL-based communication
Communication through a database connection also happens mostly within a project. The database does not need to be installed on the same node as the project's other components; it can be installed separately, and you can even deploy a dedicated database server and have the services communicate with it over SQL-based connections. OpenStack does not specify which database must be used, although MySQL is usually used.
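As an illustration of this kind of SQL communication, the sketch below uses Python's built-in sqlite3 with a made-up, minimal stand-in for an instances table; real OpenStack services usually talk to MySQL through SQLAlchemy, and the real schema is far richer.

```python
import sqlite3

# A made-up, minimal stand-in for part of Nova's instances table.
db = sqlite3.connect(":memory:")
db.execute("""CREATE TABLE instances (
                  id INTEGER PRIMARY KEY,
                  name TEXT,
                  host TEXT,
                  vm_state TEXT)""")

# The API service records the new instance over a SQL connection.
db.execute("INSERT INTO instances (name, host, vm_state) VALUES (?, ?, ?)",
           ("demo-vm", None, "building"))

# The scheduler later fills in the chosen host, again over SQL.
db.execute("UPDATE instances SET host = ? WHERE name = ?",
           ("compute-01", "demo-vm"))

row = db.execute("SELECT name, host, vm_state FROM instances").fetchone()
print(row)   # ('demo-vm', 'compute-01', 'building')
```

Nothing in this exchange requires the database to live on the same node as the services; they only share a connection string.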
3-4. Communicating through the native API
Native API communication appears between OpenStack components and third-party software or hardware, for example between Cinder and its storage backend, or between Neutron agents or plug-ins and network devices. These cases need to call the APIs of third-party devices or third-party software, which we call Native APIs; this is the communication based on third-party APIs mentioned earlier.
4. Several different kinds of storage in OpenStack
OpenStack's storage services are divided into three kinds: Glance, Swift, and Cinder.
- Glance (image storage) is an image storage management service; it does not itself have the capability to store the images.
- Cinder (block storage) provides the block storage interface. It does not store data itself; it needs a storage backend behind it, such as EMC block devices, Huawei storage devices, or NetApp storage devices, all of which can serve as its backend. There is also a popular open-source distributed storage system called Ceph; Ceph also provides a block storage service and can likewise be used as a Cinder backend. Cinder's role is to provide the block storage interface for OpenStack, and an important function is volume management: a virtual machine does not use the storage devices directly (it does not use the backend storage system directly) but uses a block device attached to the virtual machine, called a volume. What Cinder actually does is create and manage these volumes and attach them to virtual machines. Cinder became independent from Nova Volume, and it is popular with storage vendors, who can join the OpenStack ecosystem by writing a Cinder driver.
- Swift (object storage) provides the object storage service. It is similar in function to Amazon AWS S3, providing access to data through RESTful APIs, and it solves two problems. First, clients can access storage directly, without forwarding the data through our own Web server, which would otherwise put load on that server. Second, in our big-data era, when the volume of data is especially large, a file system runs into a problem: the number of files explodes and storage performance drops sharply. Object storage solves exactly this problem: it discards the directory-tree structure and manages data with a flat structure. Swift in fact has only a three-level structure: Account, Container, and Object. The Object is the final data, that is, the file; above it are two levels of management: the Container, into which Objects are put (effectively classifying the objects), and above that the Account, which is associated with a user account and holds the Containers.
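The flat Account/Container/Object namespace can be sketched in a few lines. This toy model (invented helper names, an in-memory dict) only illustrates that objects are addressed by a three-part key rather than by walking a directory tree.

```python
# A toy sketch of Swift's flat three-level namespace: objects are looked
# up by an (account, container, object) key, not via a directory tree.
store = {}

def put_object(account, container, name, data):
    store[(account, container, name)] = data

def list_container(account, container):
    """List object names in one container, as a Swift GET would."""
    return sorted(name for (a, c, name) in store
                  if (a, c) == (account, container))

put_object("jdoe", "photos", "cat.jpg", b"...")
put_object("jdoe", "photos", "dog.jpg", b"...")
put_object("jdoe", "docs", "resume.pdf", b"...")

print(list_container("jdoe", "photos"))   # ['cat.jpg', 'dog.jpg']
```

Because the namespace is flat, lookup cost does not grow with any notion of directory depth, which is what lets object stores keep performing as the number of objects explodes.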
Three storage concepts: file storage, block storage, and object storage.
- File storage has a POSIX interface or a POSIX-like interface; you can regard it as a file system. A typical example is a distributed file system such as HDFS in Hadoop.
- A disk in a computer is a file system after it is formatted; before formatting it is a block device, that is, block storage. In data centers, many devices from EMC, some of the Huawei devices known as SAN devices, and some NetApp devices, bulk storage devices in general, provide block storage.
- The typical representative of object storage is Amazon's AWS S3. Its interface is neither POSIX nor a block interface like a hard disk's; it is accessed through RESTful Web APIs. The advantage for application developers is that it is easy to access the data held in the store. The data stored in an object store is usually called an object; it is actually a file, but it is called an object to distinguish object storage from file systems.
5. OpenStack Workflow
Here is an example, creating a virtual machine, that shows how OpenStack works; the following figure is the whole process of OpenStack creating a virtual machine:
Here is a brief description of the text:
There are 28 steps in total, mainly involving Nova. In fact there are many smaller steps inside Keystone, Glance, Neutron, Cinder, Swift, and the other components; added together there are probably more than 50. Here are the 28 basic steps, most of which come from an article on the ilearnstack website about the process of creating a virtual machine from the Dashboard.
- Step 1: first, Horizon or the command-line client sends a REST call to Keystone, passing the user name and password to Keystone;
- Step 2: Keystone receives the user name and password and generates a Token, which is carried in the subsequent REST calls to the other components;
- Step 3: Horizon or the command-line client launches the virtual machine, either by sending a "nova boot" command or by typing it on the command line; the request, including the Token obtained in the previous step, is sent as a RESTful Web API call to nova-api;
- Step 4: after receiving the request, nova-api asks Keystone to verify the legitimacy of the Token sent by the client, which means there is an interaction between Nova and Keystone;
- Step 5: after verifying the Token, Keystone returns the user's roles and permissions to nova-api;
- Step 6: nova-api interacts with the Nova database;
- Step 7: the Nova database creates a record for the new virtual machine instance and returns the result to nova-api;
- Step 8: nova-api sends a synchronous remote call request to nova-scheduler via AMQP. In the AMQP protocol this synchronous remote call is called an rpc.call request: after sending it, the caller blocks and waits until the message queue returns the result, here the entry for the new virtual machine instance and the host ID, which is the ID of the host on which the virtual machine will later be launched;
- Step 9: nova-scheduler takes the request out of the message queue. Each of Nova's components contains an AMQP client, so the interaction between Nova's components is basically all done via AMQP; in practice the AMQP implementation is usually RabbitMQ;
- Step 10: nova-scheduler interacts with the Nova database and picks out a suitable host on which to launch the virtual machine (the selection process is complex);
- Step 11: nova-scheduler returns the call to the earlier nova-api via AMQP, sending it the host's ID;
- Step 12: nova-scheduler sends an asynchronous call request through the message queue to nova-compute to start the virtual machine on that host. This is an asynchronous call: nova-scheduler puts the request on the message queue and returns immediately, without waiting for the result; in AMQP this is an rpc.cast request;
- Step 13: nova-compute takes the request off the queue;
- Step 14: nova-compute sends an rpc.call synchronous call through the message queue to nova-conductor to fetch the virtual machine's information (including the host's ID and the virtual machine's configuration: memory size, CPU configuration, hard disk size, and so on);
- Step 15: nova-conductor takes the above request off the queue;
- Step 16: nova-conductor interacts with the database;
- Step 17: nova-conductor returns the information requested earlier by nova-compute;
- Step 18: nova-compute takes the information returned by nova-conductor off the message queue; the synchronous call ends here;
- Step 19: nova-compute sends a REST request carrying the Token to glance-api to request the image data;
- Step 20: glance-api asks Keystone to verify the legitimacy of the Token, similar to the earlier step in which Nova verified the Token. In fact there are many details inside steps 19 and 20; for example, if Swift is used to store the Glance images, there is also interaction between Swift and Glance in the middle, and Glance has some internal interaction as well;
- Step 21: nova-compute obtains the image's metadata. The previous steps amount to fetching the image's metadata, learning where the image comes from and what it looks like. The following steps 22 to 24 prepare the network for the virtual machine, again centered on nova-compute;
- Step 22: nova-compute sends a REST request carrying the Token to the Neutron API for the network configuration and obtains the IP address assigned to the virtual machine to be created; Neutron in fact does a lot of work here, with many details;
- Step 23: Neutron Server asks Keystone to verify the legitimacy of the Token;
- Step 24: nova-compute obtains the network configuration information. The following steps prepare the virtual machine's hard disk, that is, the volume storage or block storage mentioned earlier;
- Step 25: nova-compute calls cinder-api's RESTful interface to attach a volume, that is, a block device or virtual hard disk, to the virtual machine;
- Step 26: similar to Glance and Neutron above, cinder-api also asks Keystone to verify the legitimacy of the Token;
- Step 27: nova-compute obtains the virtual machine's block storage information. After these 27 steps, all the information and conditions required for creating the virtual machine are finally ready;
- Step 28: nova-compute calls the Hypervisor's or Libvirt's interface to create the virtual machine; of course, inside Libvirt or the Hypervisor, creating a virtual machine is itself a very complex process.
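The 28 steps above can be condensed into a short sketch. Every function here is a simplified stand-in (invented names and return values, not real OpenStack API calls); it only shows the order of the interactions: authenticate, schedule, fetch image metadata, allocate the network, attach the volume, and finally hand off to the hypervisor.

```python
# A highly simplified, illustrative condensation of the create-VM flow.
# Function names and return values are stand-ins, not real OpenStack APIs.

def keystone_authenticate(user, password):
    return "token-123"                        # steps 1-2: issue a Token

def keystone_validate(token):
    assert token == "token-123"               # steps 4-5, 20, 23, 26

def nova_scheduler_select_host():             # steps 8-11: rpc.call
    return "compute-01"

def glance_get_image(token):                  # steps 19-21
    keystone_validate(token)
    return {"image": "cirros"}

def neutron_allocate_network(token):          # steps 22-24
    keystone_validate(token)
    return {"ip": "10.0.0.5"}

def cinder_attach_volume(token):              # steps 25-27
    keystone_validate(token)
    return {"volume": "vol-01"}

def boot(user, password):
    token = keystone_authenticate(user, password)
    host = nova_scheduler_select_host()
    image = glance_get_image(token)
    net = neutron_allocate_network(token)
    vol = cinder_attach_volume(token)
    # step 28: hand everything to the hypervisor (e.g. via libvirt)
    return {"host": host, **image, **net, **vol}

vm = boot("jdoe", "secret")
print(vm["host"], vm["ip"])   # compute-01 10.0.0.5
```

Note how the Token obtained once from Keystone is re-validated by every service, mirroring the repeated verification steps in the list above.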
Four stages of virtual machine creation (the instance's task states during the build):
- scheduling
- networking
- block_device_mapping
- spawning
How the several communication relationships appear in this workflow:
- The API calls between the various components are HTTP communication;
- The communication among the components within Nova, and within Neutron, uses the AMQP protocol;
- The frequent reads and writes of the database in between are communication over database connections;
- The interaction between Nova and the Hypervisor or Libvirt is Native API communication, that is, communication through a third-party interface; and when Cinder prepares a volume for the virtual machine it also needs to interact with the storage device, which likewise uses a Native API (a third-party interface).
6. OpenStack's deployment architecture
So far the relationships between OpenStack components have been analyzed from the logical relationship and the communication relationship, and OpenStack's API and storage have also been described.
The architectures mentioned above are basically logical, software-level architectures, but OpenStack is a distributed system, and we must solve the mapping from the logical architecture to the physical architecture: how OpenStack's projects and components are installed on the actual server nodes and the actual storage devices, and how they are connected to each other over the network. This is the deployment architecture of OpenStack.
The deployment of OpenStack is divided into:
- Single-node deployment, usually for learning or development
- Multi-node deployment (cluster deployment)
The deployment architecture of OpenStack is not static, but rather designs different implementations based on actual requirements.
In actual production we first need to plan the compute, network, and storage resources. Although we now use cloud computing technology, which involves less difficulty and workload in resource planning than a traditional IT architecture, a plan is still needed. Here we look at two cluster deployment architectures, a basic one and a complex one.
6-1. Simple deployment architecture
This is the most basic OpenStack deployment required for a production environment.
The following explains the "three kinds of networks and four kinds of nodes":
(1) Three kinds of networks: the green management network + the blue storage network + the yellow service network
- The management network is the network over which OpenStack's management node, or its management services, manage the other nodes; it carries traffic such as the API calls between the different components and virtual machine migration;
- The storage network is the network over which the compute nodes access the storage services; the traffic that reads and writes data on the storage devices basically all goes over the storage network;
- The service network is the network through which the virtual machines managed by OpenStack provide services to the outside. A server usually has several network cards and several network ports, so we can isolate the networks from one another. The advantage of isolation is that their traffic does not mix: for example, when we read and write the storage devices there may be especially heavy traffic on the storage network, but it will not affect the virtual machines providing services; likewise, during virtual machine migration the data traffic on the management network may be very heavy, but it does not affect the compute nodes' read and write performance against the storage devices.
(2) Four kinds of nodes:
- Control node (OpenStack's management node; most of OpenStack's services run on the control node, such as the Keystone authentication service and the virtual machine image management service Glance)
- Compute nodes (the nodes that actually run the virtual machines)
- Storage node (provides the object storage service: a node providing object storage for Swift, or a proxy node of a Swift cluster; it can also serve as the storage backend for other services)
- Network node (implements functions such as gateways and routing)
Some services can be deployed directly on the control node, but note that both the Nova and Neutron components must be deployed in a distributed fashion. For Nova: nova-compute controls and manages the virtual machines, so it must be deployed on the compute nodes, while Nova's several other services should be deployed on the control node. In particular, nova-compute and nova-conductor must not be deployed on the same node; separating the two is precisely what achieves the decoupling.
For Neutron: some of Neutron's plug-ins or agents need to be deployed on the network nodes and the compute nodes, while the others, such as Neutron Server, can be deployed on the control node.
6-2. Complex deployment architecture
There are three key points to be mastered:
- At larger scales, the various management services are deployed to different servers. We take these services apart and deploy them to different nodes, and even split components of the same service: for example, the Nova database can be deployed as a MySQL database cluster, and the Cinder Scheduler and Cinder Volume can be deployed to different nodes. In fact, because the Swift project has a certain degree of independence, Swift itself has production deployments that span geographic regions; such very large, cross-region deployments make its service highly available, so it naturally provides object storage with very high availability and data durability. In this way it is easy to scale these services out, and to scale out the whole OpenStack system, which gives OpenStack a relatively high load capacity and lets it reach a larger scale. All of this is due to the consistently service-oriented architecture (SOA) used in OpenStack's design, in which each component is distributed and can be scaled out horizontally.
- For high availability, in a production environment we deploy the same OpenStack service to different nodes, forming a highly available cluster with active-standby or multi-node hot standby (or a load-balanced cluster).
- In a complex data center environment there are also many third-party services, such as LDAP services and DNS services, and we must consider how to interface and integrate with these third-party services. For example, we may need to use an OpenLDAP server as the backend for Keystone authentication; this is typically done on the management network.
Online MOOC study notes, from 高校帮 "OpenStack 入门" (Introduction to OpenStack), lecturer 李明宇. By 胡飞, 2016/3/31 19:56:39.