Time Series Database Showdown: OpenTSDB

Editor's note: Liu Bin is a back-end R&D engineer at OneAPM with more than 10 years of programming experience. He has worked on large-scale finance, telecommunications, and Android phone projects, and is familiar with Linux and back-end development. He has taken part in translating "The Docker Book", "Introduction and Practice of GitHub", "The Authoritative Guide to Web Application Security", "WEB+DB PRESS", and "Software Design", and is the instructor of the "Docker Introduction and Practice" course. This article belongs to the "time series database" series. In it the author summarizes what he worked out about aggregating, grouping, and filtering performance metrics for Cloud Insight, the product he is responsible for.

What is OpenTSDB

OpenTSDB is a time series database. It stores its data in HBase and takes full advantage of HBase's distributed column-oriented storage to support millions of reads and writes per second. Its main strengths are easy scalability and a flexible tag mechanism.

Introduction to Architecture

Here we take a brief look at its architecture.

The main component is the TSD (Time Series Daemon), the core process that receives data and stores it in HBase. In the architecture diagram, the servers marked with C (collector) are data sources that send data to the TSD service.

Installing OpenTSDB

To install OpenTSDB, you need the following:

    • Linux operating system

    • JRE 1.6 or later

    • HBase 0.92 or later

Installing Gnuplot

If you also want to use OpenTSDB's built-in web interface, you need Gnuplot 4.2 or later, as well as gd and gd-devel. Here we use Gnuplot 5.0.1.

Install any of the required packages that are not yet present:

$ sudo yum install -y gd gd-devel libpng libpng-devel

Then build and install Gnuplot:

$ tar zxvf gnuplot-5.0.1.tar.gz
$ cd gnuplot-5.0.1
$ ./configure
$ make
$ sudo make install

Installing HBase

First, make sure JAVA_HOME is set:

$ echo $JAVA_HOME
/usr

The rest needs little explanation. Just follow the quick start guide at https://hbase.apache.org/book.html#quickstart : download, unpack, edit the configuration file, and start.

Next, set HBASE_HOME as well:

$ echo $HBASE_HOME
/opt/hbase-1.0.1.1

You can then start HBase:

$ /opt/hbase-1.0.1.1/bin/start-hbase.sh
starting master, logging to /opt/hbase-1.0.1.1/logs/hbase-vagrant-master-localhost.localdomain.out

Installing OpenTSDB

This is also very simple. If the build fails, make, autotools, or a similar tool is most likely missing; install it with your package manager.

$ git clone https://github.com/OpenTSDB/opentsdb.git
$ cd opentsdb
$ ./build.sh

Create the tables that OpenTSDB requires:

$ env COMPRESSION=NONE ./src/create_table.sh
2016-01-08 06:17:58,045 WARN [main] util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
HBase Shell; enter 'help' for list of supported commands.
Type "exit" to leave the HBase Shell
Version 1.0.1.1, re1dbf4df30d214fca14908df71d038081577ea46, Sun May 12:34:26 PDT 2015
create 'tsdb-uid',
  {NAME => 'id', COMPRESSION => 'NONE', BLOOMFILTER => 'ROW'},
  {NAME => 'name', COMPRESSION => 'NONE', BLOOMFILTER => 'ROW'}
0 row(s) in 1.3180 seconds
Hbase::Table - tsdb-uid
create 'tsdb',
  {NAME => 't', VERSIONS => 1, COMPRESSION => 'NONE', BLOOMFILTER => 'ROW'}
0 row(s) in 0.2400 seconds
Hbase::Table - tsdb
create 'tsdb-tree',
  {NAME => 't', VERSIONS => 1, COMPRESSION => 'NONE', BLOOMFILTER => 'ROW'}
0 row(s) in 0.2160 seconds
Hbase::Table - tsdb-tree
create 'tsdb-meta',
  {NAME => 'name', COMPRESSION => 'NONE', BLOOMFILTER => 'ROW'}
0 row(s) in 0.4480 seconds
Hbase::Table - tsdb-meta

In the HBase shell, you can see that the tables have been created successfully:

> list
TABLE
tsdb
tsdb-meta
tsdb-tree
tsdb-uid
4 row(s) in 0.0160 seconds

Once the table is created, you can start the TSD service, just run the following command:

$ build/tsdb tsd

If you see the output:

2016-01-09 05:51:10,875 INFO [main] TSDMain: Ready to serve on /0.0.0.0:4242

the TSD has started successfully.

Saving data to OpenTSDB

With everything installed and all services running, let's try sending one data point.

The simplest way to save data is to use Telnet.

$ telnet localhost 4242
put sys.cpu.user 1436333416 23 host=web01 user=10001
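The same put line can also be sent from a program. The following is a minimal Python sketch, not an official client: the helper names format_put and send_put are made up for illustration, and the host and port simply repeat the localhost:4242 used in the Telnet example.

```python
import socket


def format_put(metric, timestamp, value, tags):
    """Build one line of the telnet-style put command."""
    tag_text = " ".join(f"{k}={v}" for k, v in tags.items())
    return f"put {metric} {timestamp} {value} {tag_text}\n"


def send_put(line, host="localhost", port=4242):
    """Send a single put line to a TSD (host and port are assumptions)."""
    with socket.create_connection((host, port), timeout=5) as conn:
        conn.sendall(line.encode("ascii"))


line = format_put("sys.cpu.user", 1436333416, 23,
                  {"host": "web01", "user": "10001"})
print(line, end="")
# To send it to a running TSD, call: send_put(line)
```

The sketch only builds and prints the line; the actual network call is left commented out so that it runs without a TSD.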

You can see this data point in the OpenTSDB web interface. Since sys.cpu.user has only one data point, only a single point shows up in the graph.

To use OpenTSDB's built-in query interface, open http://localhost:4242 in a browser.

Data storage structure in OpenTSDB

Next, let's look at an important OpenTSDB concept, the UID. We start from the data stored in HBase: which tables exist and what each of them is for.

tsdb: stores data points

hbase(main):003:0> scan 'tsdb'
ROW                           COLUMN+CELL
 \x00\x00\x01U\x9C\xAEP\x00\x column=t:q\x80, timestamp=1436350142588, value=\x17
 00\x01\x00\x00\x01\x00\x00\x
 02\x00\x00\x02
1 row(s) in 0.2800 seconds

As you can see, the table has only one row. It has just one column, whose value is 0x17, that is, decimal 23, the value of the metric.
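The column qualifier q\x80 in the scan output carries information too. The sketch below decodes it under an assumption that this article does not state: that second-resolution cells use a 2-byte qualifier whose upper 12 bits are the offset in seconds from the row's base time and whose lower 4 bits are flags (one float bit plus three bits for the value length minus one). The function name decode_cell is made up for illustration.

```python
def decode_cell(qualifier: bytes, value: bytes):
    """Decode a second-resolution cell into (offset_seconds, is_float, number).

    Assumed layout: 12 bits of offset, then 4 bits of flags.
    """
    q = int.from_bytes(qualifier, "big")
    offset_seconds = q >> 4          # upper 12 bits: seconds past the row's base time
    flags = q & 0x0F                 # lower 4 bits: value type and length
    is_float = bool(flags & 0x08)
    length = (flags & 0x07) + 1      # stored as length minus one
    if is_float:
        raise NotImplementedError("float values are not handled in this sketch")
    number = int.from_bytes(value[:length], "big", signed=True)
    return offset_seconds, is_float, number


# Qualifier q\x80 and value \x17 taken from the scan output.
offset, is_float, number = decode_cell(b"q\x80", b"\x17")
print(offset, is_float, number)
```

Under that assumption the offset comes out as 1816 seconds, which equals 1436333416 % 3600, and the value is the integer 23.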

The row key on the left is one of OpenTSDB's distinguishing features. It is built according to this rule:

metric + timestamp + tagk1 + tagv1… + tagkN + tagvN

Each of these fields except the timestamp is stored as the UID of the corresponding name.

The data point we added above is:

sys.cpu.user 1436333416 23 host=web01 user=10001

Five UIDs are involved in total: the metric sys.cpu.user, the two tagk names host and user, and their values web01 and 10001.

The row key for the above data is:

\x00\x00\x01U\x9C\xAEP\x00\x00\x01\x00\x00\x01\x00\x00\x02\x00\x00\x02

To see how this row key is calculated, let's look at the tsdb-uid table.

tsdb-uid: stores the mapping between names and UIDs

In the scan output of the tsdb-uid table, the blank lines between rows were added by hand to make it easier to read.

tsdb-uid stores the mapping between names and UIDs (for metric, tagk, and tagv). Mappings are stored in pairs: for a given name and UID, two records are saved, (name, UID) and (UID, name).
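The two-way mapping can be pictured with a small in-memory model. This is only a sketch of the idea, not how OpenTSDB implements it in HBase: the class UidTable and its methods are hypothetical, and it assumes that each kind (metric, tagk, tagv) has its own counter starting at 1, which is consistent with the UIDs in this article.

```python
class UidTable:
    """Toy in-memory model of tsdb-uid: a counter and two mappings per kind."""

    def __init__(self, width=3):
        self.width = width          # UID width in bytes (3 by default)
        self.name_to_uid = {}       # (kind, name) -> uid bytes
        self.uid_to_name = {}       # (kind, uid bytes) -> name
        self.max_id = {}            # kind -> last assigned integer id

    def get_or_create(self, kind, name):
        """Return the UID for a name, assigning the next free one if needed."""
        key = (kind, name)
        if key not in self.name_to_uid:
            next_id = self.max_id.get(kind, 0) + 1
            self.max_id[kind] = next_id
            uid = next_id.to_bytes(self.width, "big")
            self.name_to_uid[key] = uid          # forward record
            self.uid_to_name[(kind, uid)] = name  # reverse record
        return self.name_to_uid[key]

    def name_of(self, kind, uid):
        """Reverse lookup from UID to name."""
        return self.uid_to_name[(kind, uid)]


uids = UidTable()
uids.get_or_create("metric", "sys.cpu.user")
uids.get_or_create("tagk", "host")
uids.get_or_create("tagv", "web01")
uids.get_or_create("tagk", "user")
uids.get_or_create("tagv", "10001")
print(uids.name_of("tagk", b"\x00\x00\x02"))
```

Because every name is recorded in both directions, a lookup by name and a lookup by UID are each a single dictionary access in this model.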

We see a total of 8 rows of data.

As we saw in the tsdb table above, the row key for the data point is \x00\x00\x01U\x9C\xAEP\x00\x00\x01\x00\x00\x01\x00\x00\x02\x00\x00\x02 . Let's break it into its parts, joined with + signs (the name-to-UID mappings are the last 5 rows of the scan output):

\x00\x00\x01 + U\x9C\xAEP + \x00\x00\x01 + \x00\x00\x01 + \x00\x00\x02 + \x00\x00\x02
sys.cpu.user   1436333416   host         = web01          user         = 10001

As you can see, this matches the row key layout described earlier.

What deserves attention is how the timestamp is stored. Although the timestamp we specify is in seconds, the row key uses the timestamp rounded down to the hour, that is: 1436333416 - 1436333416 % 3600 = 1436331600 .

In hexadecimal, 1436331600 is 0x55 0x9C 0xAE 0x50, where 0x55 is the uppercase letter U and 0x50 is the uppercase letter P. This is how the 4-byte timestamp is stored. In other words, there is only one row key per hour, and within that row the data point for each second is stored as its own column, which greatly improves query speed.
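The row key construction can be reproduced in a few lines of Python. This is a sketch under the layout described in this article (metric UID, 4-byte hour-aligned timestamp, then tagk/tagv UID pairs); the function name row_key is made up, and the tag pairs are assumed to be passed in the order they appear in the key.

```python
import struct


def row_key(metric_uid, timestamp, tag_uids):
    """Concatenate metric UID, hour-aligned timestamp, and tagk/tagv UID pairs."""
    base_time = timestamp - timestamp % 3600      # round down to the hour
    key = metric_uid + struct.pack(">I", base_time)  # 4-byte big-endian timestamp
    for tagk_uid, tagv_uid in tag_uids:
        key += tagk_uid + tagv_uid
    return key


key = row_key(
    b"\x00\x00\x01",                       # sys.cpu.user
    1436333416,
    [(b"\x00\x00\x01", b"\x00\x00\x01"),   # host=web01
     (b"\x00\x00\x02", b"\x00\x00\x02")],  # user=10001
)
# Same bytes as the row key shown in the scan of the tsdb table.
assert key == b"\x00\x00\x01U\x9c\xaeP\x00\x00\x01\x00\x00\x01\x00\x00\x02\x00\x00\x02"
print(key)
```

Any timestamp within the same hour produces the same key, which is why all data points of one series for that hour land in a single row.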

The reverse lookup, from UID to name, works the same way. For example, to find the tagk whose UID is \x00\x00\x02, we can see from the results above that the row with key \x00\x00\x02 has 4 columns, and the value of column=name:tagk is user. Very simple and intuitive.

Important: the UIDs for metric, tagk, and tagv are each only 3 bytes long. This is OpenTSDB's default configuration. Three bytes can represent more than 16 million distinct values, which is enough for metric names and tagk, but not necessarily for tagv. If tagv holds IP addresses or phone numbers, for example, 3 bytes may not be enough. The width can be changed by modifying the source code and recompiling OpenTSDB. Note that after recompiling, old data cannot be used directly; you need to export it and import it again.

tsdb-meta: metadata table

The third table, tsdb-meta, stores time series indexes and metadata. It is an optional feature that is disabled by default and can be enabled in the configuration file, so we do not cover it in detail here.

tsdb-tree: tree table

The fourth table, tsdb-tree, represents metrics as a tree-like hierarchy. It is used only after the feature is enabled in the configuration file. We do not cover it here, but you can try it yourself.

Saving data via the HTTP interface

Besides the Telnet method used earlier, you can also save data through the HTTP API or the bulk import tool import ( http://opentsdb.net/docs/build/html/user_guide/cli/import.html ).

Here is a brief example using the HTTP API.

Suppose we have the following data, saved as the file mysql.json:

[
    {
        "metric": "mysql.innodb.row_lock_time",
        "timestamp": 1435716527,
        "value": 1234,
        "tags": {
           "host": "web01",
           "dc": "beijing"
        }
    },
    {
        "metric": "mysql.innodb.row_lock_time",
        "timestamp": 1435716529,
        "value": 2345,
        "tags": {
           "host": "web01",
           "dc": "beijing"
        }
    },
    {
        "metric": "mysql.innodb.row_lock_time",
        "timestamp": 1435716627,
        "value": 3456,
        "tags": {
           "host": "web02",
           "dc": "beijing"
        }
    },
    {
        "metric": "mysql.innodb.row_lock_time",
        "timestamp": 1435716727,
        "value": 6789,
        "tags": {
           "host": "web01",
           "dc": "tianjin"
        }
    }
]

Then execute the following command:

$ curl -X POST -H "Content-Type: application/json" http://localhost:4242/api/put -d @mysql.json

This saves the data to OpenTSDB.
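The same request can be made from Python with only the standard library. This is a minimal sketch rather than an official client: make_point and put_points are hypothetical helper names, and the URL repeats the http://localhost:4242/api/put endpoint from the curl command.

```python
import json
import urllib.request


def make_point(metric, timestamp, value, **tags):
    """Build one data point in the JSON shape used by mysql.json."""
    return {"metric": metric, "timestamp": timestamp, "value": value, "tags": tags}


def put_points(points, url="http://localhost:4242/api/put"):
    """POST the points as a JSON array and return the HTTP status code."""
    request = urllib.request.Request(
        url,
        data=json.dumps(points).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status


points = [
    make_point("mysql.innodb.row_lock_time", 1435716527, 1234, host="web01", dc="beijing"),
    make_point("mysql.innodb.row_lock_time", 1435716529, 2345, host="web01", dc="beijing"),
]
print(json.dumps(points, indent=2))
# With a TSD running locally, call: put_points(points)
```

The network call is left commented out so that the sketch runs without a TSD; it only prints the payload that would be sent.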

Querying data

After reading how to save the data, let's look at how to query the data.

Data is queried through the query interface, either with a GET query string or with a POST request that specifies the conditions in JSON. Here we use the data we just saved as an example.

First, save the following as search.json:

{
    "start": 1435716527,
    "queries": [
        {
            "metric": "mysql.innodb.row_lock_time",
            "aggregator": "avg",
            "tags": {
                "host": "*",
                "dc": "beijing"
            }
        }
    ]
}

Execute the following command to query:

$ curl -s -X POST -H "Content-Type: application/json" http://localhost:4242/api/query -d @search.json | jq .
[
  {
    "metric": "mysql.innodb.row_lock_time",
    "tags": {
      "host": "web01",
      "dc": "beijing"
    },
    "aggregateTags": [],
    "dps": {
      "1435716527": 1234,
      "1435716529": 2345
    }
  },
  {
    "metric": "mysql.innodb.row_lock_time",
    "tags": {
      "host": "web02",
      "dc": "beijing"
    },
    "aggregateTags": [],
    "dps": {
      "1435716627": 3456
    }
  }
]

As you can see, although we saved a data point with dc=tianjin, it is not returned by this query, because we specified the condition dc=beijing.
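The effect of the tags condition can be modeled in a few lines of Python. This is only an illustration of the matching rule seen in this example ("*" accepts any value, anything else must match exactly), not OpenTSDB's implementation; the function name matches is made up, and aggregation is left out entirely.

```python
def matches(series_tags, query_tags):
    """True if every queried tag is present and equal; "*" accepts any value."""
    for key, wanted in query_tags.items():
        if key not in series_tags:
            return False
        if wanted != "*" and series_tags[key] != wanted:
            return False
    return True


# The three tag combinations saved earlier.
series = [
    {"host": "web01", "dc": "beijing"},
    {"host": "web02", "dc": "beijing"},
    {"host": "web01", "dc": "tianjin"},
]
query_tags = {"host": "*", "dc": "beijing"}
selected = [tags for tags in series if matches(tags, query_tags)]
print(selected)
```

Only the two dc=beijing series are selected, which mirrors the two entries in the query result.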

Note that as of version 2.2, the tags parameter is no longer recommended; the filters parameter replaces it.
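As a rough sketch, the query from search.json might be rewritten with filters as follows. The field names and the filter types wildcard and literal_or are taken from the OpenTSDB 2.2 query API as I understand it, not from this article, so check them against the documentation for your version. Setting groupBy to true on both filters is an assumption meant to mirror the behaviour of the tags form.

```python
import json

# Hypothetical equivalent of search.json using the filters parameter.
query = {
    "start": 1435716527,
    "queries": [
        {
            "metric": "mysql.innodb.row_lock_time",
            "aggregator": "avg",
            "filters": [
                # host=* : any host value
                {"type": "wildcard", "tagk": "host", "filter": "*", "groupBy": True},
                # dc=beijing : exact match on one value
                {"type": "literal_or", "tagk": "dc", "filter": "beijing", "groupBy": True},
            ],
        }
    ],
}
print(json.dumps(query, indent=4))
```

The printed JSON can be saved to a file and posted to /api/query with the same curl command used for search.json.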

Summary

As you can see, OpenTSDB is easy to get started with, and the standalone version in particular is very simple to install. With HBase as its backend, queries are also fast, and many large companies, such as Yahoo, use it.

However, large-scale deployments with multiple TSDs and multiple storage nodes still require professional, careful operations and maintenance work.

Related reading

Other articles in this series:

    • Time Series Database Showdown: What Is a TSDB

    • Time Series Database Showdown: TSDB Directory Part 1

    • Time Series Database Showdown: TSDB Directory Part 2

    • Time Series Database Showdown: KairosDB

Cloud Insight integrates monitoring, management, computing, collaboration, and visualization to help all IT companies reduce human and time cost inputs to system monitoring, making operations more efficient and simple.

This article was transferred from OneAPM official blog
