Suppose you have a file that looks like this:
CHROM POS ID REF ALT QUAL FILTER INFO FORMAT samplename
1 3552841 . G . 32.995 . DP=1;MQ0F=0;AF1=0;AC1=0;DP4=1,0,0,0;MQ=40;FQ=-29.9912 GT:PL:DP 0/0:0:1
1 3552842 . T . 32.995 . DP=1;MQ0F=0;AF1=0;AC1=0;DP4=1,0,0,0;MQ=40;FQ=-29.9912 GT:PL:DP 0/0:0:1
2 3552843 . G . 32.995 . DP=1;MQ0F=0;AF1=0;AC1=0;DP4=1,0,0,0;MQ=40;FQ=-29.9912 GT:PL:DP 0/0:0:1
2 3552844 . T . 32.995 . DP=1;MQ0F=0;AF1=0;AC1=0;DP4=1,0,0,0;MQ=40;FQ=-29.9912 GT:PL:DP 0/0:0:1
3 3552845 . G . 32.995 . DP=1;MQ0F=0;AF1=0;AC1=0;DP4=1,0,0,0;MQ=40;FQ=-29.9912 GT:PL:DP 0/0:0:1
3 3552846 . C . 32.995 . DP=1;MQ0F=0;AF1=0;AC1=0;DP4=1,0,0,0;MQ=40;FQ=-29.9912 GT:PL:DP 0/0:0:1
4 3552847 . A . 32.995 . DP=1;MQ0F=0;AF1=0;AC1=0;DP4=1,0,0,0;MQ=40;FQ=-29.9912 GT:PL:DP 0/0:0:1
5 3552848 . C . 32.995 . DP=1;MQ0F=0;AF1=0;AC1=0;DP4=1,0,0,0;MQ=40;FQ=-29.9912 GT:PL:DP 0/0:0:1
6 3552849 . A . 32.995 . DP=1;MQ0F=0;AF1=0;AC1=0;DP4=1,0,0,0;MQ=40;FQ=-29.9912 GT:PL:DP 0/0:0:1
7 3552850 . C . 32.995 . DP=1;MQ0F=0;AF1=0;AC1=0;DP4=1,0,0,0;MQ=40;FQ=-29.9912 GT:PL:DP 0/0:0:1
To extract every row whose first column begins with "3", print all ten columns of those rows, and save the result, you can use the following command:
awk -F "\t" '{if ($1 ~ /^3/) print $1,$2,$3,$4,$5,$6,$7,$8,$9,$10}' sample.vcf > SAMPLECOL.VCF
Here -F "\t" sets the field separator to a tab, since VCF columns are tab-separated. The program {if ($1 ~ /^3/) print $1,$2,$3,$4,$5,$6,$7,$8,$9,$10} means: for each line of sample.vcf, if the first column ($1) matches (~) the regular expression /^3/, that is, it begins with "3", then print the ten columns to standard output. The redirection (>) saves that output to the file SAMPLECOL.VCF.
The output file looks like this:
3 3552845 . G . 32.995 . DP=1;MQ0F=0;AF1=0;AC1=0;DP4=1,0,0,0;MQ=40;FQ=-29.9912 GT:PL:DP 0/0:0:1
3 3552846 . C . 32.995 . DP=1;MQ0F=0;AF1=0;AC1=0;DP4=1,0,0,0;MQ=40;FQ=-29.9912 GT:PL:DP 0/0:0:1
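Two details of the command above are easy to miss. The pattern /^3/ is a prefix match, so it would also keep a first column such as 30, and print with commas joins the fields with a space rather than a tab. The following is only a minimal sketch of how one might handle both. The file names demo.vcf, prefix.vcf, exact.vcf and tabbed.vcf are hypothetical, and the three-column rows are made-up stand-ins for illustration, not real data.

```shell
#!/bin/sh
# Build a tiny tab-separated sample (hypothetical file, illustrative rows only).
# printf reuses its format string for each group of three arguments.
printf '%s\t%s\t%s\n' \
  CHROM POS REF \
  3 3552845 G \
  30 3552900 T \
  13 3552950 C > demo.vcf

# Prefix match: keeps rows whose first field starts with "3" (here 3 and 30).
# A pattern with no action prints the whole line unchanged, tabs included.
awk -F '\t' '$1 ~ /^3/' demo.vcf > prefix.vcf

# Exact match: keeps only rows whose first field is exactly "3".
awk -F '\t' '$1 == "3"' demo.vcf > exact.vcf

# If specific fields are listed, set OFS so the output stays tab-separated.
awk -F '\t' -v OFS='\t' '$1 == "3" {print $1, $2, $3}' demo.vcf > tabbed.vcf
```

Printing the whole line with a bare pattern is usually the simplest choice when every column is wanted, because it avoids listing $1 through $10 and leaves the original delimiters untouched.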
Linux: extracting rows by the characters in a specified column and printing all of their content (awk)