Business Background
By convention, HDFS data older than five days is treated as an outdated version. The goal is a script that deletes this outdated version data automatically.
$ hadoop fs -ls /user/pms/workspace/ouyangyewei/data
Found 9 items
drwxr-xr-x   - pms pms          0 … /user/pms/workspace/ouyangyewei/data/…
drwxr-xr-x   - pms pms          0 … /user/pms/workspace/ouyangyewei/data/…
drwxr-xr-x   - pms pms          0 … /user/pms/workspace/ouyangyewei/data/…
drwxr-xr-x   - pms pms          0 … /user/pms/workspace/ouyangyewei/data/…
drwxr-xr-x   - pms pms          0 … /user/pms/workspace/ouyangyewei/data/…
drwxr-xr-x   - pms pms          0 … /user/pms/workspace/ouyangyewei/data/…
drwxr-xr-x   - pms pms          0 … /user/pms/workspace/ouyangyewei/data/…
drwxr-xr-x   - pms pms          0 … /user/pms/workspace/ouyangyewei/data/…
drwxr-xr-x   - pms pms          0 … /user/pms/workspace/ouyangyewei/data/…
Script Implementation
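#---------------------------------------------------------
#
# Delete history versions (data older than five days is expired version data)
#
#---------------------------------------------------------
# strftime() and systime() are gawk extensions, so awk must be gawk here.
# Field $8 of each listing line is the full path; its 7th "/"-separated
# component is the date-named directory, compared as a string with the cutoff.
# NF>=8 skips the "Found N items" header line.
old_version=$(hadoop fs -ls /user/pms/workspace/ouyangyewei/data | awk 'BEGIN{ five_days_ago=strftime("%F", systime()-5*24*3600) } NF>=8 { split($8,arr,"/"); if (arr[7]<five_days_ago) { printf "%s\n", $8 } }')

# Split the newline-separated paths into an array
arr=($old_version)
for version in "${arr[@]}"
do
    # -rmr is deprecated in newer Hadoop releases in favor of -rm -r
    hadoop fs -rmr "$version"
done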
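The date filter in the script above can be tried without a Hadoop cluster. The following is a minimal sketch, not the author's script: the function name expired_paths, the /demo/data paths, and the sample dates are all made up for illustration. It assumes the date-named directory is the last path component and takes the cutoff date as an argument, so it runs on any POSIX awk instead of depending on gawk's strftime.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read `hadoop fs -ls`-style lines on stdin and print
# the paths whose last component is a YYYY-MM-DD date strictly earlier
# than the cutoff date given as $1.
expired_paths() {
    awk -v cutoff="$1" '
        NF >= 8 {                        # skip the "Found N items" header
            n = split($8, parts, "/")    # parts[n] is the last path component
            if (parts[n] ~ /^[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]$/ &&
                parts[n] < cutoff)       # ISO dates sort correctly as strings
                print $8
        }'
}

# Dry run on a made-up listing: only print what would be deleted.
expired_paths 2015-01-03 <<'EOF'
Found 4 items
drwxr-xr-x   - pms pms          0 2015-01-01 10:00 /demo/data/2015-01-01
drwxr-xr-x   - pms pms          0 2015-01-02 10:00 /demo/data/2015-01-02
drwxr-xr-x   - pms pms          0 2015-01-03 10:00 /demo/data/2015-01-03
drwxr-xr-x   - pms pms          0 2015-01-04 10:00 /demo/data/2015-01-04
EOF
# prints:
# /demo/data/2015-01-01
# /demo/data/2015-01-02
```

Against a real cluster, the listing would come from hadoop fs -ls and the cutoff could be computed with date -d '5 days ago' +%F, which assumes GNU date. Printing the candidate paths first and deleting only after checking them is a cautious way to use a filter like this.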
After execution
$ hadoop fs -ls /user/pms/workspace/ouyangyewei/data
Found 4 items
drwxr-xr-x   - pms pms          0 … /user/pms/workspace/ouyangyewei/data/…
drwxr-xr-x   - pms pms          0 … /user/pms/workspace/ouyangyewei/data/…
drwxr-xr-x   - pms pms          0 … /user/pms/workspace/ouyangyewei/data/…
drwxr-xr-x   - pms pms          0 … /user/pms/workspace/ouyangyewei/data/…
Copyright notice: This is an original article by the blogger and may not be reproduced without the blogger's permission.
[Linux] Using awk to delete HDFS data older than a specified date