05. Anatomy of the CEL file: version formats and reading methods (non-R languages)

Source: Internet
Author: User

Compared with DAT files, CEL files are much better supported by the tools and resources available online. A CEL file holds the data extracted from the DAT image, and it takes up far less space than the DAT file. This section introduces the CEL file formats. There are three kinds: the text CEL file (version 3), the binary CEL file (version 4), and the generic CEL file (version 1).

1) version 3

Early CEL files are version 3. Because they are plain text, you can open one directly in Notepad and see its contents. The following is GSM2899.CEL:

[CEL]

Version=3

[HEADER]

Cols=640

Rows=640

......

DatHeader=[5..46118] afrgv01031201:cls=4733 rws=4733 xin=3 yin=3 ve=17 2.0 03/12/1 17:16:25 GridVerify=None HG_U95AV2.1SQ 6

......

CellHeader=X Y MEAN STDV NPIXELS

0 0 278.0 95.3 25

1 0 22909.3 5244.4 20

2 0 390.0 121.0 25

3 0 22530.0 5102.5 25

......

638 639 20835.5 3531.1 20

639 639 292.0 85.2 25

As you can see, Version=3, and both the number of columns (Cols) and the number of rows (Rows) are 640.

The DatHeader line contains many occurrences of an odd-looking character that acts as the field separator (it does not survive in the listing above, and it was the first time I had seen such a garbled character in C source code). The reader splits everything after "DatHeader=" into fields, finds the field that ends in ".1sq", here "HG_U95AV2.1SQ", and strips the ".1sq" suffix. That yields the chip model, HG_U95AV2.

In CellHeader=X Y MEAN STDV NPIXELS, X and Y are the X and Y coordinates of the probe (feature), MEAN is the probe intensity, STDV is the standard deviation, and NPIXELS is the number of pixels used to compute MEAN and STDV. Each line holds the data for one probe (feature). Because this is a 640*640 array, X runs from 0 to 639 and that cycle repeats 640 times, while Y also runs from 0 to 639 but each value is repeated on 640 consecutive lines. That gives exactly 640*640 lines.

The only data we need is the MEAN column. STDV and NPIXELS are not needed, and X and Y can be derived from the position of the line. So the probe at (0,0) has intensity 278.0, the probe at (1,0) has intensity 22909.3, the probe at (2,0) has intensity 390.0, and so on.
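The version 3 steps described above can be sketched in Java roughly as follows. This is only an illustration, not the code of any existing CEL reader: the class and method names (CelV3Sketch, chipType, mean, coordinates) are hypothetical, and because the actual separator character of DatHeader is not visible in the listing, the sketch assumes that splitting on any run of whitespace or control characters is good enough.

```java
import java.util.Locale;

// Hypothetical helper class; none of these names come from an existing library.
final class CelV3Sketch {

    /** Returns the chip model from a DatHeader line, or null if no field ends in ".1sq". */
    static String chipType(String datHeaderLine) {
        // Keep only the part after the first '=' (the value of DatHeader).
        String value = datHeaderLine.substring(datHeaderLine.indexOf('=') + 1);
        // Assumption: fields are separated by whitespace and/or control characters.
        for (String field : value.split("[\\s\\p{Cntrl}]+")) {
            if (field.toLowerCase(Locale.ROOT).endsWith(".1sq")) {
                return field.substring(0, field.length() - ".1sq".length());
            }
        }
        return null;
    }

    /** Extracts the MEAN column from one "X Y MEAN STDV NPIXELS" line. */
    static float mean(String intensityLine) {
        String[] columns = intensityLine.trim().split("\\s+");
        return Float.parseFloat(columns[2]);
    }

    /** Derives {x, y} from the zero-based position of a line in the intensity section. */
    static int[] coordinates(int lineIndex, int cols) {
        // X cycles fastest (0..cols-1); Y increases once per full cycle of X.
        return new int[] { lineIndex % cols, lineIndex / cols };
    }

    public static void main(String[] args) {
        String datHeader = "DatHeader=[5..46118] afrgv01031201:cls=4733 rws=4733 xin=3 yin=3 "
                + "ve=17 2.0 03/12/1 17:16:25 GridVerify=None HG_U95AV2.1SQ 6";
        System.out.println(chipType(datHeader));           // prints HG_U95AV2
        System.out.println(mean("1 0 22909.3 5244.4 20")); // the MEAN of probe (1,0)
        int[] xy = coordinates(641, 640);
        System.out.println(xy[0] + "," + xy[1]);           // prints 1,1
    }
}
```

The coordinates method is why X and Y do not have to be stored: with 640 columns, line 641 of the intensity section (counting from 0) can only be the probe at (1,1).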

2) version 4

Version 4 CEL files appeared later. They are binary files, so opening one directly in Notepad shows only garbage. You can use the CellFileConversionTool.exe tool to convert between version 3 and version 4. After converting version 3 to version 4, the file is much smaller because the X and Y columns have been removed. This version uses little-endian byte order. The following lists how to read each data type:

Integer:

If you read integer data in Java:

import java.io.DataInputStream;

import java.io.FileInputStream;

FileInputStream fin = new FileInputStream("path to CEL file");

DataInputStream din = new DataInputStream(fin);

......

/* First read 4 bytes; read() returns each byte as an unsigned value 0..255 */

int[] byteDataInt = new int[4];

for (int i = 0; i < 4; i++)

    byteDataInt[i] = din.read();

/* Shift byte i left by i*8 bits (byte 0 is the least significant one) */

for (int i = 0; i < 4; i++)

    byteDataInt[i] = byteDataInt[i] << 8 * i;

/* Then OR the four parts together */

int result = byteDataInt[0] | byteDataInt[1] | byteDataInt[2] | byteDataInt[3];

......

If you use C for the same work, it is more convenient (fread_int32 is a helper function, not part of the standard C library):

FILE *infile = fopen("path to CEL file", "rb");

......

int result;

fread_int32(&result, 1, infile);

......

In this way, one integer is read and stored in result.

Short:

If you are reading short-integer data in Java:

int[] byteDataInt = new int[2];

for (int i = 0; i < 2; i++)

    byteDataInt[i] = din.read();

for (int i = 0; i < 2; i++)

    byteDataInt[i] = byteDataInt[i] << 8 * i;

/* result holds the unsigned value 0..65535; cast it to (short) if a signed value is needed */

int result = byteDataInt[0] | byteDataInt[1];

In C:

fread_int16(&result, 1, infile);

Float:

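If you are reading floating-point data in Java:

int[] byteDataInt = new int[4];

for (int i = 0; i < 4; i++)

    byteDataInt[i] = din.read();

int symbol = byteDataInt[3] >> 7; // get the sign: the top bit of the last byte

int power = (((byteDataInt[3] & 127) << 1) | (byteDataInt[2] >> 7)) - 127; // get the exponent: the next 8 bits, minus the bias 127

int temp = byteDataInt[2] & 127; // set the 8th bit to 0, because it belongs to the exponent

int a = temp << 16 | byteDataInt[1] << 8 | byteDataInt[0]; // the 23-bit fraction

float result = 1; // the implicit leading 1 of a normalized value

for (int i = 1; i <= 23; i++)

{

    int x = a & (1 << (i - 1)); // keep the value of the i-th bit and set the other bits to 0

    int xx = x >> (i - 1); // move the i-th bit to the right end

    double addcount = xx * Math.pow(2, -(23 - (i - 1))); // compute the increment

    result += addcount;

}

result *= Math.pow(2, power); // do not cast the power of two to int: negative exponents would become 0

if (symbol == 1)

    result = -result;

/* Zero, denormals, infinities and NaN (stored exponent 0 or 255) are not handled here */

In C:

fread_float32(&result, 1, infile);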

The three examples above show that Java and C can do the same job, but Java is a lot more trouble, and experiments show that Java also takes much more time. For example, the probe intensities in version 4 are of type float. Reading every probe intensity of a 640*640 chip with Java takes a long time, whereas C finishes in less than 1 second.
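For comparison, the three Java conversions above can also be written with the standard java.nio.ByteBuffer class set to little-endian order. The sketch below is an alternative to the hand-written bit manipulation, not the method used in the examples above, and its class and method names (LittleEndianSketch, int32At, and so on) are hypothetical. It also includes the shift-and-OR assembly for reference, combined with Float.intBitsToFloat, which reinterprets the 32 assembled bits as a float. Whether any of this closes the speed gap reported above is not something the sketch measures.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical helper class for decoding little-endian values from a byte array.
final class LittleEndianSketch {

    private static ByteBuffer littleEndian(byte[] raw) {
        return ByteBuffer.wrap(raw).order(ByteOrder.LITTLE_ENDIAN);
    }

    /** Decodes a 4-byte little-endian integer starting at offset. */
    static int int32At(byte[] raw, int offset) {
        return littleEndian(raw).getInt(offset);
    }

    /** Decodes a 2-byte little-endian (signed) short starting at offset. */
    static short int16At(byte[] raw, int offset) {
        return littleEndian(raw).getShort(offset);
    }

    /** Decodes a 4-byte little-endian IEEE 754 float starting at offset. */
    static float float32At(byte[] raw, int offset) {
        return littleEndian(raw).getFloat(offset);
    }

    /** Shift-and-OR assembly; "& 0xFF" turns each signed Java byte into 0..255 first. */
    static int manualInt32(byte[] raw, int offset) {
        return (raw[offset] & 0xFF)
                | (raw[offset + 1] & 0xFF) << 8
                | (raw[offset + 2] & 0xFF) << 16
                | (raw[offset + 3] & 0xFF) << 24;
    }

    public static void main(String[] args) {
        // Made-up test bytes, least significant byte first:
        // int 640 (0x00000280), short 25 (0x0019), float 278.0 (0x438B0000).
        byte[] raw = {
            (byte) 0x80, 0x02, 0x00, 0x00,
            0x19, 0x00,
            0x00, 0x00, (byte) 0x8B, 0x43
        };
        System.out.println(int32At(raw, 0));   // prints 640
        System.out.println(int16At(raw, 4));   // prints 25
        System.out.println(float32At(raw, 6)); // prints 278.0
        // Same float via the manual assembly plus a bit-pattern reinterpretation.
        System.out.println(Float.intBitsToFloat(manualInt32(raw, 6))); // prints 278.0
    }
}
```

Unlike the loop-based float decoding, getFloat and Float.intBitsToFloat also cover zero, denormals, infinities and NaN, because they reinterpret the bit pattern instead of rebuilding the value arithmetically.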

3) version 1

Version 1 builds on version 3 and additionally drops the STDV and NPIXELS columns. In C it is read with functions such as fread_be_int32, fread_be_uint16 and fread_be_float32 (the "be" in these names presumably stands for big-endian). All of them have equivalent Java implementations, but reading CEL files with Java is always slow.
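As a rough illustration of what Java equivalents of fread_be_int32, fread_be_uint16 and fread_be_float32 could look like: assuming the "be" in those names means big-endian, the standard java.io.DataInputStream already reads multi-byte values in that order, so no manual shifting is needed. The class and method names below (BigEndianSketch, decodeRecord) are hypothetical, and the byte sequence is made-up test data, not the layout of a real version 1 file.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical demo class; the record layout is invented for illustration only.
final class BigEndianSketch {

    /** Reads a big-endian int32, uint16 and float32 in sequence and formats them. */
    static String decodeRecord(byte[] raw) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(raw))) {
            int i32 = in.readInt();           // counterpart of fread_be_int32
            int u16 = in.readUnsignedShort(); // counterpart of fread_be_uint16
            float f32 = in.readFloat();       // counterpart of fread_be_float32
            return i32 + " " + u16 + " " + f32;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Most significant byte first: int 640, unsigned short 65534, float 278.0.
        byte[] raw = {
            0x00, 0x00, 0x02, (byte) 0x80,
            (byte) 0xFF, (byte) 0xFE,
            0x43, (byte) 0x8B, 0x00, 0x00
        };
        System.out.println(decodeRecord(raw)); // prints 640 65534 278.0
    }
}
```

For a real file, wrapping the FileInputStream in a java.io.BufferedInputStream before the DataInputStream avoids touching the underlying file for every small read.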
