Summary: Scenarios Typically, we work with data, a column of data types either as strings or as numbers, which are primitive types of data.
Scene
In general, we work with data, a column of data types that are either strings or numbers, which are primitive types of data. In some complex business scenarios, we use complex formats in a column, such as array arrays, objects (map), JSON, and other formats to represent complex data, such as:
array_column
is an array type. If we want to count the aggregated values of all the values in the Array_column, then we have to iterate through each element in the array of each row.
Unnest syntax
Unnest (Array) as Table_alias (column_name)
Indicates that the array type is expanded to multiple rows, and the name of the row is column_name.
Unnest (map) as table (Key_name, Value_name)
表示把map类型展开成多行,key的名称为key_name, value的名称为value_name
Note that since Unnest receives data from an array or map type, if the user's input is a string type, it must first be converted to a JSON type and then converted to an array type or map type, in the waycast(json_parse(array_column) as array(bigint))
Iterate through each element of an array
Use SQL to expand the array into multiple lines:
* | select array_column, a from log, unnest( cast( json_parse(array_column) as array(bigint) ) ) as t(a)
The above SQL expands the array into multiple lines of numbers, unnest( cast( json_parse(array_column) as array(bigint) ) ) as t(a)
unnest syntax expands the array, names the newly generated table as T, and uses a to refer to the expanded column. Results such as:
* | select sum(a) from log, unnest( cast( json_parse(array_column) as array(bigint) ) ) as t(a)
* | select a, count(1) from log, unnest( cast( json_parse(array_column) as array(bigint) ) ) as t(a) group by a
Traverse Map
* | select map_column , a,b from log, unnest( cast( json_parse(map_column) as map(varchar, bigint) ) ) as t(a,b)
* | select key, sum(value) from log, unnest( cast( json_parse(map_column) as map(varchar, bigint) ) ) as t(key,value) GROUP BY key
Formatting displays the results of the Histogram,numeric_histogram
1.histogram
The histogram function is similar to the Count group by syntax. Syntax reference documentation.
Usually we see the results of histogram as follows:
* | select histogram(method)
is a bunch of JSON, unable to configure the view display, we can use the Unnest syntax to expand the JSON into a multi-line configuration view, for example:
* | select key , value from( select histogram(method) as his from log) , unnest(his ) as t(key,value)
Next, you can configure the visual view:
2. Numeric_histogram
The Numeric_histogram syntax is to assign a numeric column to a number of buckets, which is equivalent to a group by of a numeric column, a specific syntax reference document
* | select numeric_histogram(10,Latency)
The output of the Numeric_histogram is as follows:
To format this result, we write sql:
* | select key,value from(select numeric_histogram(10,Latency) as his from log) , unnest(his) as t(key,value)
The results are as follows:
At the same time configure the form of histogram display:
Original link
SQL old driver, calculate array & map & JSON data in SQL