SLS Evaluation Report


What is SLS?  

Simple Log Service (SLS) is a service for log collection, storage, querying, and analysis. Users simply configure where their logs are generated and in what format, and can then query massive volumes of logs in real time; SLS can also archive the logs to ODPS for further data analysis.

SLS provides APIs for writing and querying logs, with query expressions that support combinations of Boolean operations. In addition to the API, log collection can be done through Logtail, an easy-to-use log collection client.

SLS Concepts

1.1 Project Space (project)

The project space (project) is the basic unit of SLS management, and the project name is globally unique. A user can create one or more projects for log management.

1.2 Log type (category)

The log type (category) is defined under a project and is used to differentiate between different kinds of logs within a project (for example: access log accesslog, application log applog, system log syslog, etc.).

1.3 A single log entry (log)

Under a log type (category) of a project, a single log entry consists of the following parts:

Topic: A user-defined field that distinguishes logs under the same category (for example, differentiating a batch of access logs by site, or leaving the field blank to make no distinction). By default the field is blank (no distinction).

Time: A field representing when the log was produced (a Unix timestamp: the number of seconds elapsed since 1970-01-01 00:00:00 UTC), typically taken directly from the time recorded in the log.

Content: Records the specific contents of a log entry. The content consists of one or more content items, each a key-value pair, where the key is the name of the content item and the value is its specific content.

Source: The location the log came from, typically the IP address of the machine that generated it. By default the field is empty.
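
Putting the four parts together, a single log entry can be pictured like this (the values are hypothetical, chosen to match the examples later in this report):

  topic:   ""                     (blank: no distinction)
  time:    1412683453             (Unix timestamp, in seconds)
  source:  10.168.68.10
  content: status:404, request:"GET /robots.txt HTTP/1.1", ...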

SLS Features and Advantages  

Easy to use

Supports multiple log formats (Logtail access): With regular expressions, text in any format can be converted into semi-structured log data that the log system can process.

Easy management (Logtail access): Web-based configuration management; no need to log on to the machine.

Multiple query expressions: Query by time, source, and keyword, with support for the and, or, and not operators.

Reliable

High reliability (Logtail access): Capable of handling network outages, log file rotation, log file deletion, newly created log files, and many other situations.

Efficient

Real-time: A log can be queried less than 1 minute after it is generated.

Fast response: Responses within 100 milliseconds; a single query can process thousands of log entries within 1 second.

Evaluation: I. Creating an SLS Project

In the SLS control panel, select "Create Project" and enter a project name to create an SLS project called "sls-test-project".


After the creation is complete:

II. Creating the SLS Project Log Category

The SLS log type (category) classifies the logs to be monitored; the rule to follow is that, at a minimum, logs in the same category share the same log format.

Using the Nginx access log as an example, create an SLS log category named "nginx-access-log". Go to the management page of the "sls-test-project" project created in the previous step and select "Create category":

In the log data consumption mode options, for a more comprehensive evaluation of SLS, we also check the "offline archive to ODPS" option. Click "OK"; when creation completes, you will be prompted to collect logs either through the Logtail client or through the SLS SDK/API.

Logtail is a log collector that runs on Unix/Linux; the API allows writing, querying, and other operations through the interfaces SLS provides. Here we choose to create a Logtail configuration.

In the "specify log directory structure" step, fill in the Nginx log path according to your server's configuration. Because Nginx itself does not split its log files, only the access.log file is monitored here. For logs that are split automatically, a wildcard can be used for monitoring; for example, if the log is split into access.log, access-1.log, access-2.log, ..., use the wildcard access*.log.
In the log sample step, you can take one or more entries from an existing log file as the sample, then proceed to the next step.

In the "parse log" step, a regular expression is needed to structure the log. Before writing it, look at the detailed Nginx log structure: in nginx.conf the log style can be customized to your own requirements. Here it is configured as follows:
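
(The configuration screenshot is not reproduced here. A log_format directive producing the fields explained below would look roughly like the following; this is essentially Nginx's standard "combined" format, and the log path is only an assumption:)

  log_format  main  '$remote_addr - $remote_user [$time_local] "$request" '
                    '$status $body_bytes_sent "$http_referer" '
                    '"$http_user_agent"';
  access_log  /var/log/nginx/access.log  main;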

This configuration is explained below, next to the corresponding parts of a sample log entry:

Log field | Corresponding sample part | Explanation
$remote_addr | 113.64.66.49 | Client IP address
- | - | Delimiter only
$remote_user | | Client user name
[$time_local] | [17/Aug/2014:15:24:13 +0800] | Request time
"$request" | "GET /article/centos-64bit-nginx-php-fastcgi-opcache-mariadb HTTP/1.1" | Request method, requested URL, and HTTP protocol used
$status | | Request response status
$body_bytes_sent | 35177 | Response body size
"$http_referer" | "http://www.x86pro.com/" | Source page of the request (the page from which the link was followed)
"$http_user_agent" | "Mozilla/5.0 (X11; Linux i686; rv:24.0) Gecko/20100101 Firefox/24.0" | Client (browser) information

The regular expression rules that match the fields of the log are as follows:
^(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})\s-\s(.*)\s\[(.*)\]\s\"(.*)\"\s(\d{3,})\s(\d+)\s\"([^\s]*)\"\s\"(.*)\"$
Some regular expression basics are needed here. Readers who are not very familiar with regular expressions are advised to read an introductory article first, such as "Regular Expressions in 30 Minutes", to get a general understanding. Only the parts relevant to this rule are explained here:

^ Match the start of a string
$ Match the end of a string
() The bracketed part as a whole
\ Escape character
\d Match a number
{m,n} Repeats the preceding match m to n times; example: \d{1,3} matches a number 1 to 3 digits long
\s Matches any whitespace character, such as an ordinary space, \t, \n, \r
. Match any character other than line break
+ Repeat one or more times

In the time format conversion item: because different software writes log timestamps in different formats, the time format must be manually converted, following certain rules, into a format that SLS can recognize. For the detailed conversion modes and options, see the time format conversion section under the Simple Log Service FAQ.
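As a quick sanity check, the regular expression above can be tested outside SLS. The following minimal Python sketch matches a sample log line and also converts $time_local into a Unix timestamp of the kind described earlier. The sample line is reassembled from the field table above, and the status value 200 is an assumption, since the screenshot with the full entry is not reproduced:

  import re
  from datetime import datetime

  # Same rule as above; the escaped quotes \" are written as plain quotes here.
  pattern = re.compile(
      r'^(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})\s-\s(.*)\s\[(.*)\]\s"(.*)"\s'
      r'(\d{3,})\s(\d+)\s"([^\s]*)"\s"(.*)"$'
  )

  # Sample entry reassembled from the field table; status 200 is hypothetical.
  sample = ('113.64.66.49 - - [17/Aug/2014:15:24:13 +0800] '
            '"GET /article/centos-64bit-nginx-php-fastcgi-opcache-mariadb HTTP/1.1" '
            '200 35177 "http://www.x86pro.com/" '
            '"Mozilla/5.0 (X11; Linux i686; rv:24.0) Gecko/20100101 Firefox/24.0"')

  m = pattern.match(sample)
  # Groups: remote_addr, remote_user, time_local, request, status,
  # body_bytes_sent, http_referer, http_user_agent.
  print(m.groups())

  # Convert $time_local into a Unix timestamp (seconds), as in the time field.
  ts = int(datetime.strptime(m.group(3), '%d/%b/%Y:%H:%M:%S %z').timestamp())
  print(ts)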
The next two steps are to create a machine group as prompted and then apply the configuration to that machine group. This is straightforward and is not repeated here; the process:



After about 1 minute, log data starts to appear.

SLS Log Query

In the search bar, you can execute SLS query statements. Sample statement: query all requests with a status of 404 within the last 15 minutes:



From the request field in the example's query results, it can be seen that the site at intranet IP 10.168.68.10 is missing the robots.txt file in its root directory, and the picture server 10.168.63.86 is missing the /images/productimages/Frontcol.jpg file, both resulting in 404s. Pages with such 404 problems can then be fixed promptly. Combined with the logical query syntax SLS provides, regular log analysis becomes very easy and convenient.

A note on the query statement: the query "status 404" matches log entries that contain both the keyword 'status' and the keyword '404'. That is, matching is done by keyword (and the keywords searched are not case-sensitive); it does not mean "logs whose 'status' field has the value '404'". This is very important; do not misunderstand it.

So in "status 404", are all the results of status 404? The answer is no, for example, in a log body_bytes_sent is 404, then the log will also be retrieved to display. :


This is not a bug, but a misunderstanding the author initially had, namely that "status 404" would look for 404 in the log's status field (in fact, including or omitting the 'status' keyword here has no effect on the results; it is used only to illustrate the issue). The author makes a related recommendation about richer query rules in the final evaluation suggestions, so the matter is set aside for now.

The SLS log query syntax currently reserves the keywords and, or, not (case-insensitive), the parentheses ( ), the double quotation mark ", and the backslash \. These keywords have precedence, ordered from high to low as: " > ( ) > and, not > or (and and not are at the same level). Precedence determines how keywords combine into a query, much like arithmetic: when a formula mixes multiplication with addition and subtraction, the multiplication is done first, and anything in parentheses is done before everything else. The usage of these query keywords can be illustrated with some examples (a note on precedence follows the list):

Query logs with a status of 404 whose requested resource is robots.txt: 404 and robots.txt
Query logs where the user's browser is Firefox or Chrome: Firefox or Chrome
Query logs whose request response status code is neither 200 nor 304: not 200 and not 304
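
To see why precedence matters (these queries are hypothetical combinations of the examples above): because and binds more tightly than or, the query

  404 and robots.txt or favicon.ico

is interpreted as (404 and robots.txt) or favicon.ico, which also returns non-404 logs that mention favicon.ico. To find 404s for either resource, parentheses are needed:

  404 and (robots.txt or favicon.ico)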

Use of double quotation marks: the content inside a pair of double quotes is a string, and anything within the string is treated as a whole; this can be used to escape a reserved keyword. For example:
Query logs containing the characters "and": "and"

Use of the escape character \: \ is used to escape a double quotation mark inside a keyword. A bare " is itself a reserved keyword in the SLS query syntax, but escaped as \" it represents the double-quote character. Example usage:
Query logs whose requests contain a double quote: \"

Where do the log keywords come from?

As mentioned several times above, SLS queries are keyword-based, and the keywords are produced by SLS's automatic tokenization, which currently cannot be manually adjusted. How, then, can you tell whether what you are searching for is actually a keyword when a search runs into problems? According to the author's testing: in the log results, move the mouse over the text of a result; any part over which the cursor changes to a hand (pointer) is an individual keyword:


Only by understanding this can you make better use of queries. For example, suppose we want to search for requests coming from Google, and Google's spiders come from http://www.google.com/. Searching for google or google.com will not display the expected results, because 'google' and 'google.com' are not classified as keywords by SLS; www.google.com is the correct keyword.

About Topic 

Offline Log Archiving to ODPS

When the "offline archive to ODPS" option is checked while configuring the SLS log data consumption mode, log data is automatically archived into the ODPS project sls_log_archive, after which the log data can easily be downloaded, backed up, analyzed, and so on.

Since sls_log_archive is a system project, it is not displayed in the project list of the ODPS control panel, so it has to be operated on through the ODPS client and the ODPS upload/download client. The ODPS client is currently available as a Java version and requires a Java JRE environment; if no JRE is installed, the official documentation recommends installing JRE 1.6 (http://www.oracle.com/technetwork/java/javase/downloads/java-archive-downloads-javase6-419409.html#jre-6u45-oth-JPR) before using the client. Besides the client, ODPS also supports programmatic operation through the Java SDK API (the ODPS Java SDK development kit).

See the ODPS User Manual for the complete ODPS reference documentation.

Configuring the ODPS Client

After extracting the downloaded ODPS client odps-cli-java.zip, go to the conf directory and edit the configuration file odps.conf: fill in an access key ID and access key secret pair generated under "Create access key", and specify the default ODPS project:

access.id=your access key ID
access.key=your access key secret
endpoint=http://service.odps.aliyun.com/api
default.project=sls_log_archive

After the configuration is complete, enter the bin directory of the ODPS client: on Windows execute odps.bat, on Linux execute odps. Taking the Windows environment as the example, simply double-click odps.bat to enter the ODPS command-line interface:

Configuring the ODPS Upload/Download Client

Extract the downloaded odps-dship.zip (the author's extraction path is D:\SlsEnv\odps-dship), go into the odps-dship directory, and edit odps.conf:

#Odps dship config
#Wed Oct 13:39:54 CST 2014
tunnel-endpoint=http://dt.odps.aliyun.com (on the Hangzhou node intranet, http://dt-ext.odps.aliyun-inc.com can be used)
key=your access key secret
project=sls_log_archive (the default project)
id=your access key ID

For tunnel-endpoint configuration instructions, see: ODPS data upload/download FAQ > How should the tunnel endpoint be set when downloading data with ODPS dship?

Open a command prompt and switch to the odps-dship directory: on Windows execute dship.bat, on Linux execute dship:

Viewing information about a specified table in the sls_log_archive project

Taking the log category nginx-access-log created earlier as an example, the corresponding table name in ODPS is sls_test_project_nginx_access_log (project name + category). On the ODPS command line, execute the command:
desc sls_test_project_nginx_access_log


An introduction to the columns of the table (excerpted from the documentation):

No. | Column name | Type | Note
1 | __source__ | string | Log source IP; SLS reserved field
2 | __time__ | bigint | Unix timestamp of log generation; SLS reserved field
3 | __topic__ | string | Log topic; SLS reserved field
4 | _extract_others_ | string | JSON string; the semi-structured user log is saved in this column in key:value form
5 | __partition_time__ | string | Readable date format; partition column computed from __time__, for example 2014_06_24_12_00

One thing to note here is the __partition_time__ field: it is a partition field, which can be thought of as something like a separate folder. It is named in the form 2014_10_08_12_00 and increments by the hour. This field is used later when downloading log data with the upload/download tool.
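
(To check which hourly partitions have been archived before downloading, the ODPS command line should accept a Hive-style statement along the following lines; this syntax is an assumption based on current MaxCompute documentation, so consult the ODPS User Manual if it differs:)

  show partitions sls_test_project_nginx_access_log;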

Downloading log data by __partition_time__ partition

The ODPS upload/download tool currently only supports downloading to a single file, and each download can only fetch one table or one partition into one file; for a partitioned table, the partition to download must be specified. Example: download the log data of the SLS nginx-access-log category for 20:00 on October 7, 2014, naming the file access_log_1410072000.txt:

dship download sls_log_archive.sls_test_project_nginx_access_log/__partition_time__="2014_10_07_20_00" access_log_1410072000.txt


To improve download efficiency and to perform batch downloads, it is recommended to write a download program with the ODPS SDK, or to write an executable script (bat, Python, etc.) that invokes the ODPS commands, as in the sketch below.
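
A minimal Python sketch of such a batch download, assuming dship is on the PATH and simply looping the dship command shown above over all hourly partitions of one day (the date and the output file naming are illustrative):

  import subprocess
  from datetime import datetime, timedelta

  TABLE = 'sls_log_archive.sls_test_project_nginx_access_log'
  start = datetime(2014, 10, 7, 0)  # download all 24 hourly partitions of this day

  for hour in range(24):
      t = start + timedelta(hours=hour)
      partition = t.strftime('%Y_%m_%d_%H_00')         # e.g. 2014_10_07_20_00
      outfile = 'access_log_%s.txt' % t.strftime('%y%m%d%H00')
      # Mirror the dship command used above, one partition per file.
      cmd = 'dship download %s/__partition_time__="%s" %s' % (TABLE, partition, outfile)
      subprocess.call(cmd, shell=True)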

View Original: http://bbs.aliyun.com/read/178909.html
