Apache HAWQ: Create a Table Using SSD Disks

HDFS: Enable SSD Storage

1. Configure the DataNode data directories for HDFS
[DISK]/hadoop/hdfs/data,[SSD]/hadoop/hdfs/ssd
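The line above is the value of the dfs.datanode.data.dir property, with the storage type of each directory tagged in brackets. A minimal hdfs-site.xml sketch, assuming configuration files are edited by hand rather than through a manager such as Ambari:

<property>
  <name>dfs.datanode.data.dir</name>
  <value>[DISK]/hadoop/hdfs/data,[SSD]/hadoop/hdfs/ssd</value>
</property>

Storage policies also depend on dfs.storage.policy.enabled, which defaults to true in Hadoop 2.6 and later.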
Mount the SSD disk at /hadoop/hdfs/ssd on all DataNodes,
and make sure the owner of /hadoop/hdfs/ssd is hdfs:hadoop:
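For example (a sketch that assumes the SSD is already formatted and appears as /dev/sdb1, a hypothetical device name; run as root):

mkdir -p /hadoop/hdfs/ssd
mount /dev/sdb1 /hadoop/hdfs/ssd
chown -R hdfs:hadoop /hadoop/hdfs/ssd
chmod 750 /hadoop/hdfs/ssd
ls -ld /hadoop/hdfs/ssd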
drwxr-x--- 3 hdfs hadoop 4096 Oct 17 19:10 /hadoop/hdfs/ssd
Restart the DataNodes so the new data directory takes effect.
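A sketch for a manually managed cluster (hadoop-daemon.sh ships with Hadoop 2.x; on an Ambari-managed cluster use its restart action instead):

hadoop-daemon.sh stop datanode
hadoop-daemon.sh start datanode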
2. Create an HDFS path for SSD storage
hdfs dfs -mkdir /ssd
3. Set the storage policy of /ssd to ALL_SSD
hdfs storagepolicies -setStoragePolicy -path /ssd -policy ALL_SSD
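To verify that the policy took effect, read it back (the -getStoragePolicy subcommand is also listed in the appendix):

hdfs storagepolicies -getStoragePolicy -path /ssd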
HAWQ: Create a Tablespace

1. On the master node, create a filespace configuration file
$hawq filespace -o tpc_h_config
filespace:fs_tpc_h
fsreplica:3
dfs_url::mycluster/ssd/fs_tpc_h
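Here fs_tpc_h is the filespace name, fsreplica is the HDFS replication factor for its files, and dfs_url points at the HDFS location: mycluster is the cluster's nameservice (an HA-style URL, as in the source environment) and /ssd/fs_tpc_h is a directory under the SSD path created in the next step.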
2. Create the HDFS directory
$hdfs dfs -mkdir /ssd
$hdfs dfs -chown gpadmin:gpadmin /ssd
$hdfs dfs -ls /
3. Create the filespace
$hawq filespace -c tpc_h_config
4. Create a tablespace
psql
create tablespace ts_tpc_h filespace fs_tpc_h;
5. View all current tablespaces
SELECT spcname AS tblspc, fsname AS filespc, fsedbid AS seg_dbid, fselocation AS datadir
FROM pg_tablespace pgts, pg_filespace pgfs, pg_filespace_entry pgfse
WHERE pgts.spcfsoid = pgfse.fsefsoid
  AND pgfse.fsefsoid = pgfs.oid
ORDER BY tblspc, seg_dbid;
HAWQ: Create a Table

1. Create the table
create table region (
    r_regionkey integer,
    r_name char(25),
    r_comment varchar(152),
    r_extra char(1)
)
with (appendonly=true, orientation=parquet, compresstype=snappy)
tablespace ts_tpc_h
distributed by (r_regionkey);
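After loading some data, you can check that the table's blocks actually sit on SSD volumes. A sketch using fsck (on Hadoop 2.6+ the per-replica -locations output includes the storage type; the exact subdirectory depends on the OIDs queried in the next step):

hdfs fsck /ssd/fs_tpc_h -files -blocks -locations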
2. View the tablespace used by the table
select c.relname, d.dat2tablespace tablespace_id, d.oid database_id, c.relfilenode table_id
from pg_database d, pg_class c, pg_namespace n
where c.relnamespace = n.oid
  and d.datname = current_database()
  and n.nspname = 'public'
  and c.relname = 'region';
SELECT pgfs.oid fs_id, pgts.oid ts_id, spcname AS tblspc, fsname AS filespc, fsedbid AS seg_dbid, fselocation AS datadir
FROM pg_tablespace pgts, pg_filespace pgfs, pg_filespace_entry pgfse
WHERE pgts.spcfsoid = pgfse.fsefsoid
  AND pgfse.fsefsoid = pgfs.oid
ORDER BY tblspc, seg_dbid;
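Combining the two queries above, the HDFS directory backing a table should be <datadir>/<tablespace_id>/<database_id>/<table_id>, i.e. something like hdfs://mycluster/ssd/fs_tpc_h/<ts_id>/<db_id>/<relfilenode>. This layout is an assumption based on the Greenplum-style on-disk structure HAWQ inherits; verify it against the OIDs your own queries return.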
Maintenance
HAWQ accesses HDFS through the libhdfs3 library, which currently does not support storage policies, so blocks written by HAWQ may land on the default (DISK) tier. The data therefore has to be migrated after it is written:
hdfs mover -p /ssd/fs_tpc_h
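Because HAWQ keeps writing without regard to the policy, the mover must be re-run after significant loads; for example, a hypothetical cron entry on an HDFS client node:

# migrate /ssd/fs_tpc_h blocks to their policy-mandated storage, daily at 02:00
0 2 * * * hdfs mover -p /ssd/fs_tpc_h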
Appendix:
Storage Policy Commands
List all storage policies
hdfs storagepolicies -listPolicies
Set up a storage policy
hdfs storagepolicies -setStoragePolicy -path <path> -policy <policy>
For example
hdfs storagepolicies -setStoragePolicy -path /tmp -policy ALL_SSD
Unset a storage policy
hdfs storagepolicies -unsetStoragePolicy -path <path>
After unsetting, the directory or file inherits the storage policy of its parent directory; if it is the root directory, the policy falls back to HOT.
Get the storage policy of a path
hdfs storagepolicies -getStoragePolicy -path <path>