In the first two parts of this time series database series we introduced a number of common TSDBs, and in the KairosDB installment we took an in-depth look at KairosDB. This article takes a detailed look at OpenTSDB.
OpenTSDB is a time series database built on top of HBase. It takes full advantage of HBase's distributed, column-oriented storage, supports millions of reads and writes per second, and is easy to scale, with a flexible tag mechanism.
Introduction to Architecture
Let's take a brief look at its architecture, as shown in the diagram:
The main component is the TSD (Time Series Daemon), which receives data and handles storing it in HBase. The servers marked with a C (collector) are the data sources that send metrics to the TSD service.
Installing OpenTSDB
To install OpenTSDB, you need the following:
- Linux operating system
- JRE 1.6 or later
- HBase 0.92 or later
- Gnuplot (optional, for the built-in graphing interface)
If you also want to use the built-in web interface, you will need Gnuplot 4.2 or later, as well as gd and gd-devel. Here we use Gnuplot 5.0.1.
Depending on what is already installed, install the required packages:
$ sudo yum install -y gd gd-devel libpng libpng-devel
Then build and install Gnuplot:
$ tar zxvf gnuplot-5.0.1.tar.gz
$ cd gnuplot-5.0.1
$ ./configure
$ make
$ sudo make install
Install HBase
First, make sure JAVA_HOME is set:
$ echo $JAVA_HOME
/usr
There is not much to say here, as it is very simple. Just follow the quick start at https://hbase.apache.org/book.html#quickstart: download, unzip, modify the configuration file, and start.
Likewise, make sure HBASE_HOME is set:
$ echo $HBASE_HOME
/opt/hbase-1.0.1.1
You can then start HBase:
$ /opt/hbase-1.0.1.1/bin/start-hbase.sh
starting master, logging to /opt/hbase-1.0.1.1/logs/hbase-vagrant-master-localhost.localdomain.out
Building OpenTSDB
This is also very simple. If the build fails, you are almost certainly missing make, autotools, or similar tools; install them with your package manager.
$ git clone git://github.com/OpenTSDB/opentsdb.git
$ cd opentsdb
$ ./build.sh
Next, create the tables OpenTSDB needs:
$ env COMPRESSION=NONE ./src/create_table.sh
2016-01-08 06:17:58,045 WARN  [main] util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
HBase Shell; enter 'help' for list of supported commands.
Type "exit" to leave the HBase Shell
Version 1.0.1.1, re1dbf4df30d214fca14908df71d038081577ea46, Sun May 17 12:34 PDT 2015

create 'tsdb-uid',
  {NAME => 'id', COMPRESSION => 'NONE', BLOOMFILTER => 'ROW'},
  {NAME => 'name', COMPRESSION => 'NONE', BLOOMFILTER => 'ROW'}
0 row(s) in 1.3180 seconds
Hbase::Table - tsdb-uid

create 'tsdb',
  {NAME => 't', VERSIONS => 1, COMPRESSION => 'NONE', BLOOMFILTER => 'ROW'}
0 row(s) in 0.2400 seconds
Hbase::Table - tsdb

create 'tsdb-tree',
  {NAME => 't', VERSIONS => 1, COMPRESSION => 'NONE', BLOOMFILTER => 'ROW'}
0 row(s) in 0.2160 seconds
Hbase::Table - tsdb-tree

create 'tsdb-meta',
  {NAME => 'name', COMPRESSION => 'NONE', BLOOMFILTER => 'ROW'}
0 row(s) in 0.4480 seconds
Hbase::Table - tsdb-meta
In the HBase shell, you can see that the tables were created successfully:
> list
TABLE
tsdb
tsdb-meta
tsdb-tree
tsdb-uid
4 row(s) in 0.0160 seconds
Once the tables are created, you can start the TSD service by running the following command:
$ build/tsdb tsd
If you see the output:
2016-01-09 05:51:10,875 INFO [main] TSDMain: Ready to serve on /0.0.0.0:4242
then the TSD has started successfully.
Saving data to OpenTSDB
With everything installed and started, let's try sending a data point.
The simplest way to save data is to use Telnet.
$ telnet localhost 4242
put sys.cpu.user 1436333416 23 host=web01 user=10001
This data point can then be seen in the OpenTSDB interface. Since sys.cpu.user has only one data point, the graph shows just a single point.
OpenTSDB ships with a built-in query interface; just visit http://localhost:4242.
Data storage structure in OpenTSDB
Let's look at OpenTSDB's important concept of the UID. Starting from the data stored in HBase, we will see which tables it creates and what each table does.
tsdb: Storing data points
hbase(main):003:0> scan 'tsdb'
ROW                                                                      COLUMN+CELL
 \x00\x00\x01U\x9C\xAEP\x00\x00\x01\x00\x00\x01\x00\x00\x02\x00\x00\x02  column=t:q\x80, timestamp=1436350142588, value=\x17
1 row(s) in 0.2800 seconds
As you can see, the table has just one row, with a single column whose value is 0x17, i.e. decimal 23, the value of our data point. Now let's look at the row key.
The row key on the left is one of OpenTSDB's defining features. It is constructed as:
metric + timestamp + tagk1 + tagv1… + tagkN + tagvN
where each component is the UID assigned to the corresponding name.
The data point we added above was:
sys.cpu.user 1436333416 23 host=web01 user=10001
A total of 5 UIDs are involved: the metric sys.cpu.user, the two tag keys host and user, and their values web01 and 10001.
The row key for the above data is:
\x00\x00\x01U\x9C\xAEP\x00\x00\x01\x00\x00\x01\x00\x00\x02\x00\x00\x02
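To make the layout concrete, here is a small Python sketch that assembles this row key from the UIDs. The 3-byte UID values are the ones assigned in this particular instance; another installation would assign different ones.

```python
import struct

# UIDs as assigned in our instance (3 bytes each, big-endian).
METRIC_UID = b"\x00\x00\x01"                   # sys.cpu.user
TAG_UIDS = [
    (b"\x00\x00\x01", b"\x00\x00\x01"),        # host = web01
    (b"\x00\x00\x02", b"\x00\x00\x02"),        # user = 10001
]

def row_key(metric_uid, timestamp, tag_uids):
    """Build a tsdb row key: metric UID + hour-aligned time + tagk/tagv UIDs."""
    base = timestamp - timestamp % 3600         # align the timestamp to the hour
    key = metric_uid + struct.pack(">I", base)  # 4-byte big-endian seconds
    for tagk, tagv in sorted(tag_uids):         # tag pairs, ordered by tagk UID
        key += tagk + tagv
    return key

key = row_key(METRIC_UID, 1436333416, TAG_UIDS)
print(key.hex())  # 000001559cae50000001000001000002000002
```

The hex output is exactly the row key shown above (0x55, 0x9C, 0xAE, 0x50 are the bytes printed as U\x9C\xAEP by the HBase shell).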
To see how this row key is computed, let's look at the tsdb-uid table.
tsdb-uid: Storing the mapping between names and UIDs
tsdb-uid stores the mapping between names and UIDs (for metric, tagk, and tagv) in both directions: whenever a name is assigned a UID, both a (name, UID) record and a (UID, name) record are written. For our single data point, the table contains 8 rows of data in total.
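As a toy model (not OpenTSDB's actual code), this bidirectional bookkeeping can be sketched as follows; the id:*/name:* column labels mirror the real schema's id and name column families, and the kind names (metrics, tagk, tagv) follow OpenTSDB's conventions:

```python
def assign_uids(names):
    """Toy model of tsdb-uid: each assignment writes a forward and a reverse row."""
    table = {}     # row key -> {column: value}, mimicking one HBase table
    counters = {}  # one sequential counter per kind (metrics / tagk / tagv)
    for kind, name in names:
        counters[kind] = counters.get(kind, 0) + 1
        uid = counters[kind].to_bytes(3, "big")          # 3-byte UID, default width
        table.setdefault(name, {})["id:" + kind] = uid   # (name, UID) record
        table.setdefault(uid, {})["name:" + kind] = name # (UID, name) record
    return table

table = assign_uids([
    ("metrics", "sys.cpu.user"),
    ("tagk", "host"), ("tagv", "web01"),
    ("tagk", "user"), ("tagv", "10001"),
])
print(table["host"]["id:tagk"])             # b'\x00\x00\x01'
print(table[b"\x00\x00\x02"]["name:tagk"])  # user
```

This also shows why lookups are cheap in both directions: resolving a name to a UID and a UID back to a name are each a single row read.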
As we saw in the tsdb table above, the row key of our data point is \x00\x00\x01U\x9C\xAEP\x00\x00\x01\x00\x00\x01\x00\x00\x02\x00\x00\x02. Breaking it into its components and joining them with + signs (the name-to-UID mappings are the last 5 rows of the tsdb-uid table):
\x00\x00\x01 + U\x9C\xAEP + \x00\x00\x01 + \x00\x00\x01 + \x00\x00\x02 + \x00\x00\x02
sys.cpu.user   1436331600    host           web01          user           10001
As you can see, this matches the row key layout described above.
Pay particular attention to how the timestamp is stored. Although we specified the time in seconds, the row key stores an hour-aligned base timestamp: 1436333416 - 1436333416 % 3600 = 1436331600.
1436331600 in hexadecimal is 0x55 0x9C 0xAE 0x50. 0x55 is the ASCII capital letter U and 0x50 is the capital letter P, which is exactly the 4-byte timestamp we saw in the row key. In other words, there is only one row key per hour; each second's data within that hour is stored as a separate column, which greatly speeds up queries.
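We can check this arithmetic, and decode the column qualifier q\x80 from the earlier scan, in a few lines of Python. The 2-byte qualifier format assumed here (seconds offset << 4, with the low 4 bits as value-format flags) is the second-resolution encoding from OpenTSDB's schema documentation:

```python
# Hour-aligned base timestamp stored in the row key:
ts = 1436333416
base = ts - ts % 3600
print(base)                       # 1436331600
print(bytes.fromhex("559cae50"))  # b'U\x9c\xaeP': the U...P bytes in the row key

# Decode the column qualifier t:q\x80 from the scan output:
qualifier = b"q\x80"                           # 'q' is 0x71, so this is 0x7180
delta = int.from_bytes(qualifier, "big") >> 4  # seconds past the hour base
print(base + delta)               # 1436333416: the original timestamp
```

So the row key pins down the hour, and the column qualifier recovers the exact second within it.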
The reverse lookup, from UID to name, works the same way. For example, to find the name of the tagk with UID \x00\x00\x02, we look at the row with key \x00\x00\x02, which has 4 columns; its column name:tagk has the value user. Very simple and intuitive.
Important: as we saw above, metric, tagk, and tagv UIDs are only 3 bytes each. This is OpenTSDB's default configuration; three bytes can represent more than 16 million distinct values, which is plenty for metric names and tag keys. For tag values it is not necessarily enough, for example if a tagv is an IP address or a phone number. The UID width can be changed in the source code, after which OpenTSDB must be recompiled. Note that after recompiling, old data cannot be used directly; it must be exported and re-imported.
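A quick sanity check of the capacity claim, and of why IP-valued tags can be a problem:

```python
uid_width = 3                    # OpenTSDB's default UID width, in bytes
max_uids = 1 << (8 * uid_width)  # distinct UIDs available per kind
print(max_uids)                  # 16777216, i.e. about 16.7 million

# An IPv4-valued tagv could in principle need far more distinct values
# than the 3-byte UID space can hold:
ipv4_space = 1 << 32
print(ipv4_space > max_uids)     # True
```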
tsdb-meta: Metadata table
The third table, tsdb-meta, stores time series indexes and metadata. This is an optional feature that is off by default and can be enabled through the configuration file, so we will not cover it in detail here.
tsdb-tree: Tree table
The fourth table, tsdb-tree, organizes metrics into a tree-like hierarchy. It is only used after the feature is enabled in the configuration file; we will not cover it here, but feel free to try it.
Saving data via HTTP interface
Besides the Telnet method we used earlier, you can also save data via the HTTP API or the bulk import tool import (http://opentsdb.net/docs/build/html/user_guide/cli/import.html).
Suppose we have the following data, saved as the file mysql.json:
[
  {
    "metric": "mysql.innodb.rowlocktime",
    "timestamp": 1435716527,
    "value": 1234,
    "tags": { "host": "web01", "dc": "beijing" }
  },
  {
    "metric": "mysql.innodb.rowlocktime",
    "timestamp": 1435716529,
    "value": 2345,
    "tags": { "host": "web01", "dc": "beijing" }
  },
  {
    "metric": "mysql.innodb.rowlocktime",
    "timestamp": 1435716627,
    "value": 3456,
    "tags": { "host": "web02", "dc": "beijing" }
  },
  {
    "metric": "mysql.innodb.rowlocktime",
    "timestamp": 1435716727,
    "value": 6789,
    "tags": { "host": "web01", "dc": "tianjin" }
  }
]
Then execute the following command:
$ curl -X POST -H "Content-Type: application/json" http://localhost:4242/api/put -d @mysql.json
This saves the data to OpenTSDB.
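For completeness, the same request can be built with Python's standard library. This sketch only constructs the request object; calling urllib.request.urlopen(req) would actually send it to a running TSD (localhost:4242 is assumed, as above):

```python
import json
import urllib.request

def make_put_request(points, base_url="http://localhost:4242"):
    """Build (but do not send) a POST to OpenTSDB's /api/put endpoint."""
    body = json.dumps(points).encode("utf-8")
    return urllib.request.Request(
        base_url + "/api/put",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

points = [{
    "metric": "mysql.innodb.rowlocktime",
    "timestamp": 1435716527,
    "value": 1234,
    "tags": {"host": "web01", "dc": "beijing"},
}]
req = make_put_request(points)
print(req.get_full_url())  # http://localhost:4242/api/put
```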
Querying data
Now that we have seen how to save data, let's look at how to query it.
Queries go through the query interface, either as a GET with a query string or as a POST with the criteria specified in JSON. Here we use the data we just saved as an example.
First, save the following as search.json:
{
  "start": 1435716527,
  "queries": [
    {
      "metric": "mysql.innodb.rowlocktime",
      "aggregator": "avg",
      "tags": { "host": "*", "dc": "beijing" }
    }
  ]
}
Execute the following command to query:
$ curl -s -X POST -H "Content-Type: application/json" http://localhost:4242/api/query -d @search.json | jq .
[
  {
    "metric": "mysql.innodb.rowlocktime",
    "tags": { "host": "web01", "dc": "beijing" },
    "aggregateTags": [],
    "dps": {
      "1435716527": 1234,
      "1435716529": 2345
    }
  },
  {
    "metric": "mysql.innodb.rowlocktime",
    "tags": { "host": "web02", "dc": "beijing" },
    "aggregateTags": [],
    "dps": {
      "1435716627": 3456
    }
  }
]
As you can see, the data point we saved with dc=tianjin is not returned by this query, because we specified the condition dc=beijing.
It is worth noting that in version 2.2, the tags parameter is deprecated in favor of the new filters parameter.
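As a sketch of what that looks like, here is the earlier search expressed with the 2.2-style filters syntax. The filter types wildcard and literal_or are taken from OpenTSDB 2.2's documented filter set; check against your version before relying on this:

```python
import json

# Equivalent of search.json above, using 2.2's "filters" instead of "tags".
query = {
    "start": 1435716527,
    "queries": [{
        "metric": "mysql.innodb.rowlocktime",
        "aggregator": "avg",
        "filters": [
            # group by every host value, like "host": "*" in the tags form
            {"type": "wildcard", "tagk": "host", "filter": "*", "groupBy": True},
            # restrict to dc=beijing, like "dc": "beijing" in the tags form
            {"type": "literal_or", "tagk": "dc", "filter": "beijing", "groupBy": False},
        ],
    }],
}
print(json.dumps(query, indent=2))
```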
Summary
As you can see, OpenTSDB is quite easy to get started with, especially the standalone version, whose installation is very simple. With HBase behind it, queries are also very fast, and many large companies, such as Yahoo, use it.
For large-scale use, however, with multiple TSDs and multiple storage nodes, professional and careful operations work will be needed.
Original: Time Series Database Showdown: OpenTSDB