GFF3 Format Files

Source: Internet
Author: User

GFF3 is the current standard version of the GFF annotation file format. Each line of the file describes one feature of the genome and is divided into 9 columns, separated by tabs.

The columns are, in order:

1. Reference sequence (seqid)

The sequence that the annotation refers to, such as a chromosome, clone, or contig. A file can contain multiple reference sequences.

The ID must not begin with '>' and must not contain spaces.

2. Source

The source of the annotation, such as the program or database that produced it. If unknown, use a dot (.) instead.

3. Type

The type of the feature. It is recommended to use a name that follows the SO convention (Sequence Ontology, see the Sequence Ontology Project), such as gene, repeat_region, exon, or CDS.

4. Start position

The position on the reference sequence where the feature begins. Counting starts from 1.

5. End position

The position on the reference sequence where the feature ends. It must be greater than or equal to the start position.
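Because the start and end positions are 1-based and both inclusive, converting them to a 0-based, half-open Python slice is an easy place to make off-by-one errors. The following is a minimal sketch, assuming the reference sequence is already loaded as a Python string; the helper names `feature_slice` and `feature_length` are hypothetical and not part of any library.

```python
def feature_slice(sequence: str, start: int, end: int) -> str:
    """Return the bases covered by a feature.

    GFF3 coordinates are 1-based and inclusive on both ends, so the
    equivalent Python slice is [start - 1:end].
    """
    if start < 1 or end < start:
        raise ValueError(f"invalid GFF3 interval: {start}..{end}")
    return sequence[start - 1:end]


def feature_length(start: int, end: int) -> int:
    """Length in bases of a 1-based inclusive interval."""
    return end - start + 1


seq = "ACGTACGTAC"
print(feature_slice(seq, 3, 6))  # GTAC
print(feature_length(3, 6))      # 4
```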

6. Score

For features that can be quantified, a numeric value can be set here to express that quantity (for example, an alignment score or a confidence value). If there is no score, use a dot (.) instead.

7. Strand

"+" means the forward (positive) strand, "-" means the reverse (negative) strand, and "." means that the strand does not need to be specified.

8. Phase

For CDS features (protein-coding sequence), this column specifies where the next codon begins. It can be 0, 1, or 2, which is the number of bases to skip from the start of the feature before reaching the first base of the next codon.

For other feature types, use a dot (.) instead.
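To make the phase rule concrete, the sketch below skips the phase bases of a CDS segment and splits the remainder into codons. It assumes a forward-strand ("+") feature whose sequence has already been extracted; for reverse-strand features the bases are counted from the other end, which this sketch does not handle. The helper name `codons_from_cds` is hypothetical.

```python
def codons_from_cds(segment: str, phase: int) -> list[str]:
    """Split a forward-strand CDS segment into complete codons.

    `phase` is the number of bases to skip from the start of the
    segment before the first complete codon begins (0, 1, or 2).
    """
    if phase not in (0, 1, 2):
        raise ValueError("phase must be 0, 1, or 2")
    in_frame = segment[phase:]
    # Drop a trailing partial codon; it continues in the next CDS segment.
    usable = len(in_frame) - len(in_frame) % 3
    return [in_frame[i:i + 3] for i in range(0, usable, 3)]


print(codons_from_cds("GATGGCCTAA", 1))  # ['ATG', 'GCC', 'TAA']
```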

9. Attributes

A list of attributes of the feature. The format is "tag=value". Different attributes are separated by semicolons. Spaces are allowed, but the characters ",", "=" and ";" must be escaped using the URL escaping rule (percent-encoding), and a tab must be written as "%09". Tags that begin with an uppercase letter are reserved for predefined use, while tags that begin with a lowercase letter can be used freely by applications.

The following tags are predefined:

ID

Specifies a unique identifier for the feature. IDs are very useful for grouping features (for example, to find the exons that belong to a transcription unit).

Name

Specifies the display name of the feature. This is the name shown to the user, for example in a visualization tool, so it can be chosen freely to suit what should be displayed.

Alias

A secondary name for the feature, such as a nickname. Use this tag when the feature is known by other names.

Parent

Specifies the ID of the parent feature, that is, the feature one level up. It is used to group exons into transcripts and transcripts into genes.

Target

Specifies the target region of an alignment. It is generally used to describe sequence alignment results. The format is "target_id start end [strand]", where strand is optional ("+" or "-"). If the target_id contains spaces, they must be escaped as "%20".

Gap

The gap information of an alignment. It is used together with Target to describe sequence alignment results.

Note

A free-text description of the feature.

Is_circular

Indicates whether the feature is circular. It is used for circular genome sequences.

If a tag has multiple values, separate the values with commas, for example:

Parent=af2312,ab2812,abc-3

Alias=m19211,gna-12,gamma-globulin

The tags that can use multiple values are: Parent, Alias, Note, Dbxref and Ontology_term.
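The column layout and attribute rules described above can be pulled together in a small parser. This is only a sketch using the Python standard library: the function names, the example line, and its feature IDs are made up for illustration, and it skips details a full parser would need (comment and directive lines, stricter validation). As a simplification it splits every tag on commas, not only the tags listed as multi-valued.

```python
from urllib.parse import unquote

COLUMNS = ("seqid", "source", "type", "start", "end",
           "score", "strand", "phase", "attributes")


def parse_attributes(field: str) -> dict[str, list[str]]:
    """Parse column 9 into {tag: [value, ...]}."""
    attributes: dict[str, list[str]] = {}
    if field == ".":
        return attributes
    for pair in field.split(";"):
        if not pair:
            continue  # tolerate a trailing semicolon
        tag, _, value = pair.partition("=")
        # Split on commas first, then percent-decode each value, so an
        # escaped comma (%2C) inside a value is not treated as a separator.
        attributes[unquote(tag)] = [unquote(v) for v in value.split(",")]
    return attributes


def parse_gff3_line(line: str) -> dict:
    """Parse one feature line into a dict keyed by column name."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 9:
        raise ValueError(f"expected 9 tab-separated columns, got {len(fields)}")
    record = dict(zip(COLUMNS, fields))
    record["start"] = int(record["start"])
    record["end"] = int(record["end"])
    # A dot means "no value" in the score and phase columns.
    record["score"] = None if record["score"] == "." else float(record["score"])
    record["phase"] = None if record["phase"] == "." else int(record["phase"])
    record["attributes"] = parse_attributes(record["attributes"])
    return record


# Hypothetical example line; the IDs are invented for this sketch.
line = ("chr1\texample\texon\t1300\t1500\t.\t+\t.\t"
        "ID=exon00001;Parent=mrna0001,mrna0002;Note=first%2C short exon")
record = parse_gff3_line(line)
print(record["start"], record["end"])   # 1300 1500
print(record["attributes"]["Parent"])   # ['mrna0001', 'mrna0002']
print(record["attributes"]["Note"])     # ['first, short exon']
```

Decoding after splitting is the design choice that matters here: decoding the whole column first would turn "%2C" into a real comma and wrongly split the Note value in two.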

Reference: http://blog.sina.com.cn/s/blog_670445240102uxh2.html
