Extended Development of the Ceph Management Platform Calamari

Tags: glusterfs, saltstack

I haven't written anything here for nearly half a year; maybe I am getting lazy, but writing things down helps them sink in, so let me record this. In the half year or so since I joined the company I have become familiar with some related work, and I am currently doing research and development on distributed systems, mainly at the management level rather than modifying the core code. Over the past six months I have gotten to know two very good distributed storage systems, GlusterFS and Ceph. The two products each have their strengths: the file service GlusterFS provides is not matched by the Ceph system, while the unified architecture Ceph offers across block devices, object storage, and file systems is something GlusterFS cannot provide. Each has its own advantages.

At the code level, the GlusterFS code is relatively simple: the layering is obvious and the stack-based processing flow is very clear, which makes it easy to extend the file system's functionality (you can add a processing module on either the client or the server). Although the server and client share one codebase, the code as a whole is clear and compact.

Ceph is developed in C++, and the system itself consists of multiple daemons. These processes form a large cluster, with smaller clusters inside it. Compared with GlusterFS the code is much more complex, but Ceph implements self-adjustment and self-repair, supports customized data placement, and locates objects through the CRUSH algorithm.

Ceph is currently the more popular of the two, but GlusterFS remains a good choice for file services.

Recently I have been working on a Ceph-related management platform, and I have become familiar with the official Calamari platform. Calamari currently focuses on managing a Ceph distributed storage system; on the whole it provides a web page for managing Ceph. Judging from the current implementation, the platform still has limitations and cannot yet deliver very powerful functionality, or rather the current version only provides some basic functions. But the Calamari framework itself is really good. Like Ceph, Calamari is open source, and it is assembled from a collection of open-source components, each of which provides one specific function. Although it is patchwork, the framework of the management platform is worth learning from.
Refer to http://www.openstack.cn/?p=2708.
Structural Diagram of Calamari

In the diagram, the components in red boxes are implemented by Calamari itself; the rest are third-party open-source frameworks that Calamari builds on.

The components installed on each Ceph server node are Diamond and the Salt minion. Diamond is responsible for collecting monitoring data. It supports many data types and metrics; each type of data is handled by a collector. Besides collecting Ceph status information, it also collects key resource usage and performance data, including CPU, memory, network, I/O load, and disk metrics. A collector gathers data by running local command lines and then reports it to Graphite.
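As a rough illustration of how collectors work, a minimal custom Diamond collector might look like the sketch below (the class name, metric name, and sampled command are illustrative assumptions; Calamari's real Ceph collector is considerably richer):

    import json
    import subprocess

    import diamond.collector


    class CephQuorumCollector(diamond.collector.Collector):
        """A minimal sketch of a Diamond collector (illustrative, not Calamari's own)."""

        def collect(self):
            # Collectors typically shell out to local command lines; here we ask the
            # local ceph CLI for quorum status in JSON form.
            out = subprocess.check_output(['ceph', 'quorum_status', '--format', 'json'])
            quorum = json.loads(out)
            # publish() hands the metric to Diamond, which forwards it to Graphite.
            self.publish('ceph.mons_in_quorum', len(quorum.get('quorum', [])))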

Graphite is not only an enterprise-grade monitoring tool, it can also render graphs in real time. Carbon-cache is a highly scalable, event-driven I/O backend process implemented in Python; it can communicate effectively with a large number of clients and handle a large amount of traffic with low overhead.

Whisper is similar to RRDtool: it provides a database library that applications use to manipulate and retrieve data stored in a special file format (time-series data points). Its most basic operations are creating a new Whisper file, updating the file by writing new data points, and fetching the stored data points.
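To make those basic operations concrete, the whisper Python library can be driven roughly as follows (a minimal sketch; the file path and retention schema are arbitrary examples):

    import time
    import whisper

    path = '/tmp/example.wsp'                      # arbitrary example file
    whisper.create(path, [(60, 1440)])             # one archive: 60 s per point, 1440 points (one day)
    whisper.update(path, 42.0, int(time.time()))   # write a single data point
    # Fetch everything stored over the last hour: returns ((start, end, step), values)
    (start, end, step), values = whisper.fetch(path, int(time.time()) - 3600)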

Graphite-web is the user interface that renders the graphs; users can access these graphs directly through URLs.

Calamari uses SaltStack to let the Calamari server communicate with the Ceph server nodes. SaltStack is an open-source automated operations and maintenance tool, similar in function to Chef and Puppet. The Salt master sends commands to the specified Salt minions to manage the Ceph cluster. After a Ceph server node is set up, the Salt minion syncs and installs a ceph.py file from the master; this file contains the Ceph operation API and ultimately talks to the Ceph cluster by calling librados or the command line.
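As a rough sketch of this master/minion pattern, the Calamari server side can target a Ceph node and call functions from the synced ceph.py module through Salt's local client (the minion ID below is a placeholder, and ceph.get_heartbeats is only an example function name):

    import salt.client

    # Master-side sketch: send commands to a Ceph server node through Salt.
    local = salt.client.LocalClient()
    print(local.cmd('ceph-node1', 'test.ping'))            # basic connectivity check
    print(local.cmd('ceph-node1', 'ceph.get_heartbeats'))  # call a function from the synced ceph.py module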

Calamari_rest provides the Calamari REST API; see the official documentation for the detailed interfaces. The Ceph REST API is a low-level interface in which each URL maps directly to an equivalent ceph CLI command, whereas the Calamari REST API offers a higher-level interface: API users can operate on objects with GET/POST/PATCH methods without knowing the underlying Ceph commands. The main difference is that the Ceph REST API requires its users to know Ceph very well, while the Calamari REST API is closer to a description of Ceph resources and is therefore better suited to upper-layer applications.

Cthulhu can be understood as the service layer of the Calamari server: it provides the interfaces behind the REST API and calls into the Salt master.

Calamari_clients is the set of user interface pages. During installation, the Calamari server first creates the /opt/calamari/webapp directory and places manage.py (the Django configuration) under webapp/calamari; all the content of calamari_web is then placed under /opt/calamari/webapp to serve the UI pages.

The files in the calamari-web package provide all web-related configuration for both calamari_rest and calamari_clients.

This framework relies on a lot of open-source software, but from the perspective of extensibility it is worth learning from. SaltStack implements the communication link between the management node and the server nodes and supports managing many nodes, so you do not need to design that communication yourself; on the server side you only need to implement the specific business logic, that is, the concrete management tasks. SaltStack is also written in Python, which makes rapid development easy and lets administrators debug and locate problems on site. Ceph itself provides a Python API that can be used directly to control the cluster, and using SaltStack allows management to scale to a cluster of some size. The SaltStack master is effectively the control endpoint on the management side, and the minion is the agent on the server side. In Calamari, heartbeat packets sent through SaltStack collect server and cluster information, and control commands are distributed the same way. Understanding the basic mode of SaltStack makes it much easier to understand how Calamari is developed and extended.

The other very important pair of open-source components in this framework is Diamond plus Graphite. Diamond collects information on the server side, while Graphite provides the chart data. Diamond currently offers collectors for the vast majority of open-source systems as well as basic server metrics (CPU, memory, disk, and so on); it is also implemented in Python and is very easy to extend and debug, and a Ceph collector already exists in Diamond. Graphite mainly provides time-series data to the front end, which saves you from rewriting that business logic yourself.

To get into Calamari you must first understand these basic components and what each of them does. The following describes how to extend Calamari at the code level.
1 Calamari Extension

New feature development based on Calamari is mainly divided into the following modules: the REST API, cthulhu, and the salt client extensions. The basic steps for extending a new feature are as follows:

> Extend the URL module: decide the URL parameters and the corresponding response interfaces in the ViewSet.

> Implement those interfaces in the ViewSet. This part mainly involves interacting with cthulhu and deciding how to obtain the data; in some cases you also need to add a serializer for the objects involved.

> Extend the corresponding type in the backend rpc.py, mainly for the POST operations.

> Extend cluster_monitor.py. For features that perform operations (create, update, delete, and so on) you must provide a corresponding RequestFactory and register it in cluster_monitor.py.

> Write the corresponding RequestFactory class, which encapsulates the command operations and constructs the corresponding request objects.

> Extend the Salt minion, which mainly means extending the ceph.py file; you can also provide a new xxx.py file instead.

The following uses the control and operation of PGs as a running example.

1.1 URL Module Extension

Calamari currently exposes its interface as a REST API built on the Django REST Framework; this part lives in the rest-api code directory. Django separates URLs from the code that handles them, so the URLs can be extended on their own.

Add the following PG-related URLs in rest-api/calamari-rest/urls/v2.py:

    url(r'^cluster/(?P<fsid>[a-zA-Z0-9-]+)/pool/(?P<pool_id>\d+)/pg$',
        calamari_rest.views.v2.PgViewSet.as_view({'get': 'list'}),
        name='cluster-pool-pg-list'),

    url(r'^cluster/(?P<fsid>[a-zA-Z0-9-]+)/pool/(?P<pool_id>\d+)/pg/(?P<pg_id>[0-9a-fA-F]+\.[0-9a-fA-F]+)/command/(?P<command>[a-zA-Z_]+)$',
        calamari_rest.views.v2.PgViewSet.as_view({'post': 'apply'}),
        name='cluster-pool-pg-control'),

These define two URLs:

    api/v2/cluster/<fsid>/pool/<pool_id>/pg
    api/v2/cluster/<fsid>/pool/<pool_id>/pg/<pg_id>/command/<command>

Both URLs map to PgViewSet: the GET method of the first URL corresponds to the list interface, and the POST method of the second corresponds to the apply interface. Both interfaces must be implemented in PgViewSet.
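Once these routes are wired up they can be exercised like any other Calamari v2 endpoint. A rough sketch with the requests library follows; the host name, fsid, pool id, PG id, and the (omitted) authentication step are all placeholders or assumptions:

    import requests

    base = 'http://calamari-server/api/v2'   # placeholder host
    s = requests.Session()
    # Authentication against the Calamari API is assumed to have been done on this session.

    # GET -> PgViewSet.list: list the PGs of pool 1 in the given cluster
    pgs = s.get(base + '/cluster/<fsid>/pool/1/pg').json()

    # POST -> PgViewSet.apply: ask PG 1.2f to scrub
    r = s.post(base + '/cluster/<fsid>/pool/1/pg/1.2f/command/scrub')
    print(r.status_code)   # 202 means the request was accepted for processing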

1.2 ViewSet Extension

After the URLs are extended, the corresponding response interfaces are extended, which means implementing the interface class named in the URLs. For PGs above, two different interfaces were specified: one for GET and one for operation commands. The corresponding code path is rest-api/calamari-rest/view/v2.py, and the code is as follows:

    class PgViewSet(RPCViewSet):
        serializer_class = PgSerializer

        def list(self, request, fsid, pool_id):
            pool_name = self.client.get(fsid, POOL, int(pool_id))['pool_name']
            pg_summary = self.client.get_sync_object(fsid, PgSummary.str)
            pg_pools = pg_summary['pg_pools']['by_pool'][int(pool_id)]
            for pg in pg_pools:
                pg['pool'] = pool_name

            return Response(PgSerializer(pg_pools, many=True).data)

        def apply(self, request, fsid, pool_id, pg_id, command):
            return Response(self.client.apply(fsid, PG, pg_id, command), status=202)

As the implementation shows, the code provides two interfaces, list and apply, corresponding to the GET and POST operations above. Both interact with the cthulhu backend: one fetches data, the other submits a request, and their return values differ accordingly.

The list interface also configures serialization via PgSerializer, which is implemented in rest-api/calamari-rest/serializer/v2.py.

1.2.1 Serialization

Data returned by the REST API is usually serialized. Serialization is not strictly required, but it is usually needed for the objects being returned or modified. The serialization of PGs looks like this:

    class PgSerializer(serializers.Serializer):
        class Meta:
            fields = ('id', 'pool', 'state', 'up', 'acting', 'up_primary', 'acting_primary')

        id = serializers.CharField(source='pgid')
        pool = serializers.CharField(help_text='pool name')
        state = serializers.CharField(source='state', help_text='pg state')
        up = serializers.Field(help_text='pg up set')
        acting = serializers.Field(help_text='pg acting set')
        up_primary = serializers.IntegerField(help_text='pg up primary')
        acting_primary = serializers.IntegerField(help_text='pg acting primary')

This step is not always necessary; some modules may not need it. With the three steps above, the REST API side is essentially extended; the main work is in the ViewSet, which is what actually mediates between the REST API and cthulhu.

The ViewSet extension talks to the backend over RPC, so the work on the cthulhu side is mainly to handle the corresponding RPC requests.

1.3 RPC Extension

All request operations are implemented in rpc.py, so newly extended operations need to be supported there as well. Continuing with PG as the example:

    def apply(self, fs_id, object_type, object_id, command):
        """
        Apply commands that do not modify an object in a cluster.
        """
        cluster = self._fs_resolve(fs_id)
        if object_type == OSD:
            # Run a resolve to throw exception if it's unknown
            self._osd_resolve(cluster, object_id)
            return cluster.request_apply(OSD, object_id, command)
        elif object_type == PG:
            return cluster.request_apply(PG, object_id, command)
        else:
            raise NotImplementedError(object_type)

The PG list is obtained through PgSummary, which already exists in the original implementation. The existing code is as follows:

    def get_sync_object(self, fs_id, object_type, path=None):
        """
        Get one of the objects that ClusterMonitor keeps a copy of from the mon, such
        as the cluster maps.

        :param fs_id: The fsid of a cluster
        :param object_type: String, one of SYNC_OBJECT_TYPES
        :param path: List, optional, a path within the object to return instead of the whole thing
        :return: the requested data, or None if it was not found (including if any element of ``path``
                 was not found)
        """
        if path:
            obj = self._fs_resolve(fs_id).get_sync_object(SYNC_OBJECT_STR_TYPE[object_type])
            try:
                for part in path:
                    if isinstance(obj, dict):
                        obj = obj[part]
                    else:
                        obj = getattr(obj, part)
            except (AttributeError, KeyError) as e:
                log.exception("Exception %s traversing %s: obj=%s" % (e, path, obj))
                raise NotFound(object_type, path)
            return obj
        else:
            return self._fs_resolve(fs_id).get_sync_object_data(SYNC_OBJECT_STR_TYPE[object_type])

1.4 cluster_monitor.py Extension

All requested operations are controlled per cluster, and this part is implemented in cluster_monitor.py. Again taking PG as the example:

    def __init__(self, fsid, cluster_name, notifier, persister, servers, eventer, requests):
        super(ClusterMonitor, self).__init__()

        self.fsid = fsid
        self.name = cluster_name
        self.update_time = datetime.datetime.utcnow().replace(tzinfo=utc)

        self._notifier = notifier
        self._persister = persister
        self._servers = servers
        self._eventer = eventer
        self._requests = requests

        # Which mon we are currently using for running requests,
        # identified by minion ID
        self._favorite_mon = None
        self._last_heartbeat = {}

        self._complete = gevent.event.Event()
        self.done = gevent.event.Event()

        self._sync_objects = SyncObjects(self.name)

        self._request_factories = {
            CRUSH_MAP: CrushRequestFactory,
            CRUSH_NODE: CrushNodeRequestFactory,
            OSD: OsdRequestFactory,
            POOL: PoolRequestFactory,
            CACHETIER: CacheTierRequestFactory,
            PG: PgRequestFactory,
            ERASURE_PROFILE: ErasureProfileRequestFactory,
            ASYNC_COMMAND: AsyncComRequestFactory
        }

        self._plugin_monitor = PluginMonitor(servers)
        self._ready = gevent.event.Event()

The point of this part is to bind each object type to its corresponding request factory class so that a suitable request can be generated for each kind of operation.
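For orientation, the dispatch from an object type and command name to a factory method looks roughly like the following sketch (an assumption based on the mapping above and on the request_apply calls shown in rpc.py, not the verbatim Calamari code):

    # Sketch of how ClusterMonitor might turn (object_type, object_id, command)
    # into a request via the registered factory classes (illustrative only).
    def get_request_factory(self, object_type):
        try:
            return self._request_factories[object_type](self)
        except KeyError:
            raise ValueError("{0} is not one of {1}".format(object_type, self._request_factories.keys()))

    def request_apply(self, object_type, object_id, command):
        factory = self.get_request_factory(object_type)
        # e.g. object_type=PG, command='scrub' ends up calling PgRequestFactory.scrub(pg_id)
        request = getattr(factory, command)(object_id)
        self._requests.submit(request, self._favorite_mon)
        return {'request_id': request.id}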

1.5 Writing the Factory Class

The factory class implements the concrete request interfaces for a given object type; different objects have different request classes. PG is used as the example:

    from cthulhu.manager.request_factory import RequestFactory
    from cthulhu.manager.user_request import RadosRequest
    from calamari_common.types import PG_IMPLEMENTED_COMMANDS, PgSummary


    class PgRequestFactory(RequestFactory):
        def scrub(self, pg_id):
            return RadosRequest(
                "Initiating scrub on {cluster_name}-pg{id}".format(cluster_name=self._cluster_monitor.name, id=pg_id),
                self._cluster_monitor.fsid,
                self._cluster_monitor.name,
                [('pg scrub', {'pgid': pg_id})])

        def deep_scrub(self, pg_id):
            return RadosRequest(
                "Initiating deep-scrub on {cluster_name}-osd.{id}".format(cluster_name=self._cluster_monitor.name, id=pg_id),
                self._cluster_monitor.fsid,
                self._cluster_monitor.name,
                [('pg deep-scrub', {'pgid': pg_id})])

        def repair(self, pg_id):
            return RadosRequest(
                "Initiating repair on {cluster_name}-osd.{id}".format(cluster_name=self._cluster_monitor.name, id=pg_id),
                self._cluster_monitor.fsid,
                self._cluster_monitor.name,
                [('pg repair', {'pgid': pg_id})])

        def get_valid_commands(self, pg_id):
            ret_val = {}
            file('/tmp/pgsummary.txt', 'a+').write(PgSummary.str + '\n')
            pg_summary = self._cluster_monitor.get_sync_object(PgSummary)
            pg_pools = pg_summary['pg_pools']['by_pool']
            pool_id = int(pg_id.split('.')[0])
            pool = pg_pools[pool_id]

            for pg in pool:
                if pg['pgid'] == pg_id:
                    ret_val[pg_id] = {'valid_commands': PG_IMPLEMENTED_COMMANDS}
                else:
                    ret_val[pg_id] = {'valid_commands': []}

            return ret_val

This class implements three different commands, each of which is mainly a wrapper around the underlying Ceph command. The keywords used here must match the parameters defined in the Ceph source code, so when writing this part you must refer to the JSON parameter names of the corresponding command in the Ceph source.
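For illustration, a command tuple such as ('pg scrub', {'pgid': ...}) corresponds to the JSON command that the monitors accept; a minimal standalone sketch using the rados Python binding (the pgid value and ceph.conf path are assumptions) looks like this:

    import json
    import rados

    # Sketch: submit the JSON form of `ceph pg scrub 1.2f` directly to the mons.
    cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')   # assumed config path
    cluster.connect()
    cmd = json.dumps({'prefix': 'pg scrub', 'pgid': '1.2f', 'format': 'json'})
    ret, outbuf, outs = cluster.mon_command(cmd, b'')
    print(ret, outs)
    cluster.shutdown()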

1.6 Salt-minion Extension

The Salt minion extension is an extension module for Salt. It is mainly used to obtain data and to execute operation commands; cthulhu runs those commands on the nodes through Salt. ceph.py provides interfaces such as rados_commands, which can execute Ceph commands, and the commands encapsulated in the factory class are ultimately executed through this interface.
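As a sketch of what such a minion-side module could look like, the following illustrative example.py (the module name, function, and use of the ceph CLI are assumptions; Calamari's real ceph.py works with librados and richer structures) could be dropped into the Salt _modules directory and synced to the minions:

    # /srv/salt/_modules/example.py -- illustrative custom Salt module (not Calamari's ceph.py)
    import subprocess


    def pg_command(pgid, command):
        """Run `ceph pg <command> <pgid>` on this minion and return its raw output."""
        return subprocess.check_output(['ceph', 'pg', command, pgid])

After syncing modules (for example with salt '*' saltutil.sync_modules), the master side could invoke it with salt.client.LocalClient().cmd(minion_id, 'example.pg_command', ['1.2f', 'scrub']).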

Summary
Overall, the code structure of Calamari is clear, and its open-source framework is worth learning from. A future distributed management system could likewise adopt the saltstack + diamond + graphite architecture: SaltStack implements the control logic, while the other two handle data collection and data storage and display.
