This again concerns the XENIX system from the previous post: the household registration query and management system of a regional Public Security Department, holding the basic information of more than 500,000 residents. The system was developed in 1989 and was not built on a standard database (compatibility was never a design goal). It only allows querying, printing, or generating text output, and the text output looks roughly as follows:
See the attached example for the structure. Using Vim (or vi) plus a shell script, these text files can be reorganized into a standard database import format:
First, create a script named m.sh.
The contents are as follows:
#!/bin/sh
# Take the path of the file to process from the script's command line
vi "$1" <<'end' >/dev/null 2>&1
:" Remove every ^M carriage return (leaving standard Linux line endings)
:%s/\r//g
:" Delete the table border rows between records
:g/^.─.*/d
:" Put a delimiter in front of each record number so later steps can treat it uniformly
:%s/record number:/│ record number:/g
:" Remove the line breaks inside a record so that each record occupies a single line
:%s/\n[^$]/│/
:" Delete record number 00000, which is useless
:g/record number: 00000/d
:" Delete the table header and statistics description
:g/Public Security population Information Management System/d
:" Collapse extra spaces and repeated field delimiters
:%s/\s*││*\s*/│/g
:" Strip the field labels, e.g. "Name: John" becomes "John"
:%s/│[^│][^│]*:\s*/│/g
:" Delete the field delimiters at the start and end of each line
:%s/^│\s*//g
:%s/│$//g
:wq
end
Save the script above, make it executable (chmod +x m.sh), and then run the following command in the directory that holds the exported .txt files:
find . -maxdepth 1 -name "*.txt" -exec ./m.sh {} \;
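Because the script ends by writing each file back, the exports are overwritten in place. A cautious run might copy them somewhere first. The following is only a sketch: the helper name backup_exports and the directory name orig are assumptions for illustration, not part of the original procedure.

```shell
# Hypothetical helper: copy every .txt export in the current directory
# into a subdirectory named "orig" (an assumed name) before editing.
backup_exports() {
    mkdir -p orig && cp -p -- ./*.txt orig/
}
```

Calling backup_exports in the export directory before the find command keeps the untouched originals in orig/. Since find is limited to -maxdepth 1, the copies in that subdirectory are not processed.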
When it finishes, all record fields are delimited with "│" and each record occupies exactly one line.
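One quick way to check the result is to count the fields on every line: if all records were joined correctly, every line should report the same number of fields. This is a minimal sketch using awk; the function name field_histogram is hypothetical.

```shell
# Hypothetical helper: for a "│"-delimited file, print each distinct
# field count followed by the number of lines that have that count.
field_histogram() {
    awk -F'│' '{ count[NF]++ } END { for (n in count) print n, count[n] }' "$1" | sort -n
}
```

A single output line means the file is uniform. Several lines point to records that were split or merged incorrectly and are worth inspecting by hand.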
The rest of the work is much simpler: to migrate to another database, just import these text files.
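If the target database's import tool prefers CSV over a custom delimiter, the cleaned files can be converted in one more pass. The sketch below assumes plain quoted CSV is acceptable to the importer; the function name to_csv is hypothetical.

```shell
# Hypothetical helper: convert a "│"-delimited file to CSV on standard output.
# Every field is wrapped in double quotes, and embedded quotes are doubled.
to_csv() {
    awk -F'│' '{
        line = ""
        for (i = 1; i <= NF; i++) {
            field = $i
            gsub(/"/, "\"\"", field)
            line = line (i > 1 ? "," : "") "\"" field "\""
        }
        print line
    }' "$1"
}
```

For example, to_csv export.txt > export.csv (file names assumed) produces a file that most database import tools can read directly.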