Hawk Run Tpc-ds
Create folder, put TPC tool into
cd/tpcds/v2.1.0/tools/
./dsdgen-dir/opt/3t_data-scale 3000-parallel 20-child 20-terminate N
[Root@node2/]# mkdir tpcds_3t [Root@node2/]# ls Bin boot Cgroups_test Dev etc Hadoop home lib lib64 lost+found media mnt opt proc root sbin selinux srv SYS tmp TPCDS_3T usr var [Root@node2/]# CD tpcds_3t [Root@node2 tpcds_3t]# ls Dstools.zip [Root@node2 tpcds_3t]# |
Unpack the toolkit and enter the tools to compile
[Root@node2 tpcds_3t]# Unzip Dstools.zip ----- [Root@node2 tpcds_3t]# ls Dstools.zip TPCDSVersion1.3.1 [Root@node2 tpcds_3t]# CD TPCDSVERSION1.3.1/ [Root@node2 tpcdsversion1.3.1]# ls Answer_sets dbgen2 query_templates query_variants Specification Tools [Root@node2 tpcdsversion1.3.1]# CD Tools [Root@node2 tools]# make |
[Root@node2 tools]#./dsqgen–help |
Multithreading to generate data, running in the background
Nohup./dsdgen-dir/opt/3t_data-scale 3000-parallel 30-child 1-terminate N & |
View background processes
Modify the query1-99 template under Query_template, at the end of the line plus define _end = "";
#!/bin/bash Counter=1 While [$COUNTER-LT 100] Todo Echo $COUNTER echo "Define _end = \" \ ";" >>query$counter.tpl Counter= ' expr $COUNTER + 1 ' Done |
To generate a query statement
./dsqgen-output_dir/opt/tpc_3t_queries/-input/tpcds_3t/tpcdsversion1.3.1/query_templates/templates.lst-scale 3000-dialect Ansi-directory/tpcds_3t/tpcdsversion1.3.1/query_templates-rngseed 05092045000 |
[Root@node2 tools]# su gpadmin [Gpadmin@node2 tools]$ Psql Psql (8.2.15) Type ' help ' for help. gpadmin=# |
gpadmin=# CREATE database tpcds_3t; CREATE DATABASE gpadmin=# \l list of Databases name | owner | Encoding | Access Privileges -----------+---------+----------+------------------- gpadmin | Gpadmin | utf8 | postgres | gpadmin | utf8 | template0 | gpadmin | utf8 | template1 | gpadmin | utf8 | tpcds | gpadmin | utf8 | tpch | gpadmin | utf8 | (6 rows) gpadmin=# \c Tpcds Your are now connected to database "Tpcds" as User "gpadmin". tpcds=# |
Build table
tpcds=# \d List of relations Schema | Name | Type | Owner | Storage --------+-----------------------+-------+---------+------------- Public | customer_address | Table | Gpadmin | Append only Public | Customer_demographics | Table | Gpadmin | Append only Public | Date_dim | Table | Gpadmin | Append only Public | dbgen_version | Table | Gpadmin | Append only Public | Income_band | Table | Gpadmin | Append only Public | Inventory | Table | Gpadmin | Append only Public | Item | Table | Gpadmin | Append only Public | Promotion | Table | Gpadmin | Append only Public | Reason | Table | Gpadmin | Append only Public | Ship_mode | Table | Gpadmin | Append only Public | Store_returns | Table | Gpadmin | Append only Public | Store_sales | Table | Gpadmin | Append only Public | Time_dim | Table | Gpadmin | Append only Public | Warehouse | Table | Gpadmin | Append only Public | Web_page | Table | Gpadmin | Append only Public | Web_site | Table | Gpadmin | Append only (rows) |
Copy yaml file to data path
[Root@node2 ds_data]# pwd /opt/ds_data [Root@node2 ds_data]# Ls–s |
Batch modification of YAML files (database name, port number, data path, data file name, etc.)
[Root@node2 ds_data]# sed-i ' s/5432/5430/g ' *.yaml |
Load table
[Root@node2 ds_data]# gpload-f Call_center.yaml 2016-05-06 16:14:39|info|gpload Session started 2016-05-06 16:14:39 2016-05-06 16:14:39|info|setting schema ' public ' for table ' Call_Center ' 2016-05-06 16:14:39|info|started gpfdist-p 8081-p 8082-f "Data1g/call_center.dat"-t 30 2016-05-06 16:14:46|info|running time:6.75 seconds 2016-05-06 16:14:46|info|rows Inserted = 6 2016-05-06 16:14:46|info|rows Updated = 0 2016-05-06 16:14:46|info|data Formatting errors = 0 2016-05-06 16:14:46|info|gpload Succeeded [Root@node2 ds_data]# |
Bulk Load Script
#!/bin/bash For F in *.yaml Todo Gpload-f $f Done |
View table size after loading
Select Relname, Pg_size_pretty (Pg_relation_size (relname)) From Pg_stat_user_tables where schemaname = ' public ' Order by Pg_relation_size (relname) desc; |
Generate 99 SQL log files
#!/bin/bash Counter=1 While [$COUNTER-LT 100] Todo Echo $COUNTER Touch Query$counter.log Chown Gpadmin Query$counter.log Counter= ' expr $COUNTER + 1 ' Done |
Join \timing before each SQL
[Root@node2 query_templates]# sed-i-E ' 1i\\\timing ' query* |
Performing SQL batch Processing
#!/bin/bash Time For F in query* Todo LOG=${F} ". Log" Echo $log psql-d tpcds-f $f > $log; Done |
[Gpadmin@node2 query_templates]$./sql.sh |
Merging test results
[Root@node2 query_templates]# cat Query*.log > 1g_result.log |
Clear cache after run completes
Free–m Echo 3 >/proc/sys/vm/drop_caches |
Table loading, loader sending rate of about 120MB, receiving rate of about 50MB (so at least 8 hours, why not cut plus. )