OpenTSDB: Writing Data

Last Update: 2014-10-17 Source: Internet

Author: User

Tags: rrd, opentsdb

Writing Data

You may want to jump right in and start throwing data into your TSD, but to really take advantage of OpenTSDB's power and flexibility, you may want to pause and think about your naming schema. After you've done so, you can proceed to pushing data over the Telnet or HTTP APIs, or use an existing tool with OpenTSDB support such as 'tcollector'.
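As a rough illustration of what pushing a data point over either API involves, the sketch below only formats the payloads and does not open a connection. The article names the Telnet and HTTP APIs without showing their formats, so the `put` line layout and the JSON field names (`metric`, `timestamp`, `value`, `tags`) are assumptions taken from the OpenTSDB documentation as I recall it; check them against the version you run. The helper names are hypothetical.

```python
import json


def format_put_line(metric, timestamp, value, tags):
    """Format one data point as a telnet-style `put` line (assumed layout)."""
    # Tags are sorted only to make the output deterministic.
    tag_part = " ".join(f"{k}={v}" for k, v in sorted(tags.items()))
    return f"put {metric} {timestamp} {value} {tag_part}"


def format_http_body(metric, timestamp, value, tags):
    """Format one data point as a JSON body for an HTTP put call (assumed fields)."""
    point = {"metric": metric, "timestamp": timestamp, "value": value, "tags": tags}
    return json.dumps(point, sort_keys=True)


tags = {"host": "webserver01", "cpu": "0"}
line = format_put_line("sys.cpu.user", 1356998400, 1, tags)
body = format_http_body("sys.cpu.user", 1356998400, 1, tags)

print(line)
print(body)
```

Keeping the formatting separate from the transport makes it easy to inspect what would be sent before pointing the code at a real TSD.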

Naming Schema

Many metrics administrators are used to supplying a single name for their time series. For example, systems administrators used to RRD-style systems may name their time series webserver01.sys.cpu.0.user. The name tells us that the time series is recording the amount of time spent in user space for CPU 0 on webserver01. This works great if you want to retrieve just the user time for that CPU core on that particular web server later on.

But what if the web server has 64 cores and you want to get the average time across all of them? Some systems allow you to specify a wildcard such as webserver01.sys.cpu.*.user that would read all 64 files and aggregate the results. Alternatively, you could record a new time series called webserver01.sys.cpu.user.all that represents the same aggregate, but you must now write 64 + 1 different time series. What if you had a thousand web servers and you wanted the average CPU time for all of your servers? You could craft a wildcard query like *.sys.cpu.*.user and the system would open all 64,000 files, aggregate the results and return the data. Or you could set up a process to pre-aggregate the data and write it to webservers.sys.cpu.user.all.
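To make the cost of the single-name approach concrete, the following sketch counts how many series names each wildcard would touch. It uses Python's `fnmatchcase` as a stand-in for whatever wildcard matching an RRD-style system offers, which is an assumption, and the host names are generated for illustration only.

```python
from fnmatch import fnmatchcase

CORES = 64
SERVERS = 1000

# One dotted name per (server, core) pair, in the style described above.
names = [
    f"webserver{server:02d}.sys.cpu.{core}.user"
    for server in range(1, SERVERS + 1)
    for core in range(CORES)
]


def files_touched(pattern):
    """Count the names a wildcard pattern would have to read."""
    return sum(1 for name in names if fnmatchcase(name, pattern))


one_host = files_touched("webserver01.sys.cpu.*.user")
all_hosts = files_touched("*.sys.cpu.*.user")
# Writing a per-host ".all" series adds one series on top of the per-core ones.
series_per_host_with_rollup = CORES + 1

print(one_host, all_hosts, series_per_host_with_rollup)
```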

OpenTSDB handles things a bit differently by introducing the idea of 'tags'. Each time series still has a 'metric' name, but it's much more generic, something that can be shared by many unique time series. Instead, the uniqueness comes from a combination of tag key/value pairs that allows for flexible queries with very fast aggregations.

Note

Every time series in OpenTSDB must have at least one tag.
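One way to picture the idea that a series is identified by its metric name plus its tag key/value pairs is the small model below. It is a conceptual sketch only and says nothing about how OpenTSDB stores series internally; `series_key` is a hypothetical helper, and the check for an empty tag set mirrors the note above.

```python
def series_key(metric, tags):
    """Return a hashable identity for a series: metric name plus tag pairs."""
    if not tags:
        # Mirrors the rule that every time series needs at least one tag.
        raise ValueError("a time series needs at least one tag")
    # frozenset makes the identity independent of the order tags were given in.
    return (metric, frozenset(tags.items()))


a = series_key("sys.cpu.user", {"host": "webserver01", "cpu": "0"})
b = series_key("sys.cpu.user", {"cpu": "0", "host": "webserver01"})
c = series_key("sys.cpu.user", {"host": "webserver01", "cpu": "1"})

print(a == b)  # same metric and same tags, so the same series
print(a == c)  # same metric, different cpu tag, so a different series
```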

Take the previous example where the metric was webserver01.sys.cpu.0.user. In OpenTSDB, this may become sys.cpu.user host=webserver01, cpu=0. Now if we want the data for an individual core, we can craft a query like sum:sys.cpu.user{host=webserver01,cpu=42}. If we want all of the cores, we simply drop the cpu tag and ask for sum:sys.cpu.user{host=webserver01}. This would give us the aggregated results for all 64 cores. If we want the results for all servers, we simply request sum:sys.cpu.user. The underlying data schema stores all of the sys.cpu.user time series next to each other, so aggregating the individual values is very fast and efficient. OpenTSDB is designed to make these aggregate queries as fast as possible, since most users start off at a high level and then drill down for detailed information.
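The move from a dotted name to a metric with tags, and the three levels of query described above, can be sketched as follows. The helper names are hypothetical, and `split_rrd_name` assumes the fixed layout `<host>.sys.cpu.<core>.user` used in the example; a real migration would need its own mapping rules.

```python
def split_rrd_name(name):
    """Split '<host>.<metric parts>.<core>.<leaf>' into a metric and tags."""
    host, *middle, core, leaf = name.split(".")
    metric = ".".join(middle + [leaf])
    return metric, {"host": host, "cpu": core}


def build_query(aggregator, metric, tags=None):
    """Build a query string in the 'agg:metric{k=v,...}' form shown in the text."""
    if not tags:
        return f"{aggregator}:{metric}"
    tag_part = ",".join(f"{k}={v}" for k, v in tags.items())
    return f"{aggregator}:{metric}{{{tag_part}}}"


metric, tags = split_rrd_name("webserver01.sys.cpu.0.user")

one_core = build_query("sum", metric, {"host": "webserver01", "cpu": "42"})
one_host = build_query("sum", metric, {"host": "webserver01"})
everything = build_query("sum", metric)

print(metric, tags)
print(one_core)
print(one_host)
print(everything)
```

Dropping a tag from the filter widens the query, which is the drill-down pattern in reverse: start with the bare metric, then add tags to narrow the result.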

Aggregations

While the tagging system is flexible, some problems can arise if you don't understand how the querying side of OpenTSDB works, hence the need for some forethought. Take the example query above: sum:sys.cpu.user{host=webserver01}. We recorded 64 unique time series for webserver01, one time series for each of the CPU cores. When we issued that query, all of the time series for metric sys.cpu.user with the tag host=webserver01 were retrieved, aggregated, and returned as one series of numbers. Let's say the resulting value was 50 for timestamp 1356998400. Now suppose we were migrating from another system to OpenTSDB and had a process that pre-aggregated all 64 cores so that we could quickly get the aggregate value, and it simply wrote a new time series sys.cpu.user host=webserver01. If we run the same query, we'll get a value of 100 at 1356998400. What happened? OpenTSDB aggregated all 64 time series and the pre-aggregated time series to get to that 100. In storage, we would have something like this:

sys.cpu.user host=webserver01 1356998400 50
sys.cpu.user host=webserver01,cpu=0 1356998400 1
sys.cpu.user host=webserver01,cpu=1 1356998400 0
sys.cpu.user host=webserver01,cpu=2 1356998400 2
sys.cpu.user host=webserver01,cpu=3 1356998400 0
...
sys.cpu.user host=webserver01,cpu=63 1356998400 1
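The double counting can be reproduced with a toy in-memory model. This is a sketch, not OpenTSDB code: it uses a hypothetical host with only four cores whose values were chosen to add up to 50, rather than the 64 values elided in the listing above, and `sum_query` is an illustrative function that sums every series whose tags satisfy the filter.

```python
TS = 1356998400

# (metric, tags, timestamp, value); the per-core values are made up.
stored = [
    ("sys.cpu.user", {"host": "webserver01"}, TS, 50),  # pre-aggregated series
    ("sys.cpu.user", {"host": "webserver01", "cpu": "0"}, TS, 10),
    ("sys.cpu.user", {"host": "webserver01", "cpu": "1"}, TS, 15),
    ("sys.cpu.user", {"host": "webserver01", "cpu": "2"}, TS, 5),
    ("sys.cpu.user", {"host": "webserver01", "cpu": "3"}, TS, 20),
]


def sum_query(metric, tag_filter, points):
    """Sum values per timestamp over all series matching the tag filter."""
    totals = {}
    for point_metric, tags, ts, value in points:
        if point_metric != metric:
            continue
        # A series matches when it carries every requested tag pair;
        # any extra tags it has are ignored.
        if all(tags.get(k) == v for k, v in tag_filter.items()):
            totals[ts] = totals.get(ts, 0) + value
    return totals


with_rollup = sum_query("sys.cpu.user", {"host": "webserver01"}, stored)
without_rollup = sum_query("sys.cpu.user", {"host": "webserver01"}, stored[1:])

print(with_rollup)     # the pre-aggregated series is counted on top of the cores
print(without_rollup)  # only the per-core series
```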

OpenTSDB will automatically aggregate all of the time series for the metric in a query if no tags are given. If one or more tags are defined, the aggregate will 'include all' time series that match on that tag, regardless of other tags. For example, the query sum:sys.cpu.user{host=webserver01} would include all of the following:

sys.cpu.user host=webserver01,cpu=0

sys.cpu.user host=webserver01,cpu=0,manufacturer=intel

sys.cpu.user host=webserver01,foo=bar

sys.cpu.user host=webserver01,cpu=0,datacenter=lax,department=ops

The moral of this example is: be careful with your naming schema.
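The matching rule can be written down in a few lines. The sketch below is a model of the behavior described in this section, not OpenTSDB's implementation; `parse_tags` and `matches` are hypothetical helpers, and the `webserver02` series is added only to show a non-match.

```python
def parse_tags(text):
    """Turn 'k1=v1,k2=v2' into a dict."""
    return dict(pair.split("=", 1) for pair in text.split(","))


def matches(series_tags, query_tags):
    """A series matches if it has every tag pair in the query; extras are ignored."""
    return all(series_tags.get(k) == v for k, v in query_tags.items())


series = [
    "host=webserver01,cpu=0",
    "host=webserver01,cpu=0,manufacturer=intel",
    "host=webserver01,foo=bar",
    "host=webserver01,cpu=0,datacenter=lax,department=ops",
    "host=webserver02,cpu=0",  # hypothetical series on another host
]


def select(query_text):
    """Return the series a tag filter would include; '' means no tags given."""
    query = parse_tags(query_text) if query_text else {}
    return [s for s in series if matches(parse_tags(s), query)]


by_host = select("host=webserver01")
by_host_and_cpu = select("host=webserver01,cpu=0")
no_tags = select("")

print(len(by_host), len(by_host_and_cpu), len(no_tags))
```

Adding the cpu tag to the filter drops the foo=bar series, while a query with no tags pulls in every series for the metric, including the one from the other host.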

Resources

1. http://opentsdb.net/docs/build/html/user_guide/writing.html

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
