First Assignment

Source: Internet
Author: User

Answer 1.1: Data compression falls into two kinds: (1) lossy compression and (2) lossless compression. What gets compressed is one of the following: ① physical space, such as memory, disk, tape, USB flash drives, and other storage media; ② time intervals, such as the time required to transmit a given set of messages; ③ electromagnetic spectrum, such as the bandwidth required to transmit a given set of messages. In other words, compression reduces the space a set of signals occupies in the spatial, time, and frequency domains.

1.2: The volume of data a computer has to store is very large. Storing it directly takes up a great deal of storage and makes the computer slow to run and slow to access data, which is inefficient and inconvenient. In addition, media such as text, sound, animation, graphics, images, and video produce very large amounts of data once digitized; without compression, a computer system could not practically store, exchange, or transmit them. Data compression is a technique that reduces the amount of data in order to save storage space and improve transmission, storage, and processing efficiency. An algorithm or model exploits the regularity or repetition in the data to re-encode it, which reduces its redundancy and the storage it needs.

1.6: One classification is based on whether the original data can be fully recovered after decoding: (1) Lossless compression, also known as reversible compression, distortion-free coding, or entropy coding. Principle: redundant values are removed or reduced, but they can be reinserted during decompression, restoring the original data exactly. (2) Lossy compression, also known as irreversible compression or entropy compression. The information discarded during compression cannot be recovered.

The second classification is based on the technique the compression method uses: still-image coding, television coding, entropy coding, and so on.

Reference Books 1-4

1. Use the compression tool on your computer to compress different files. Study the effect of the original file's size on the ratio of the compressed file size to the original file size.

After compressing the files I found, some barely changed in size, but most became much smaller than the original, which saves a lot of time in transmission. The compressed files can also be restored losslessly.

2. Extract a few paragraphs from a popular magazine and compress them by removing all words that are not essential for comprehension. For example, in "This is the dog that belongs to my friend," we can remove is, the, that, and to and still convey the same meaning. Use the ratio of the number of words removed to the total number of words in the original text as a measure of the redundancy in the text, and repeat the experiment with text from a technical journal. Can we make any quantitative statements about the redundancy of text taken from different sources?

Redundancy characterizes the excess in the source's information rate; it is a quantity that describes the objective statistical properties of the source. Because the source contains redundancy, that is, information that does not need to be transmitted, its information rate can be compressed further. The greater the redundancy, the greater the potential for compression.

Put simply, redundancy is the repetition in the data.

Reference book "Introduction to Data Compression (4th edition)" Page 3, 5, 7 (a)

3. Given the symbol set $A = \{a_1, a_2, a_3, a_4\}$, find the first-order entropy in each of the following cases:

(a) $P(a_1) = P(a_2) = P(a_3) = P(a_4) = 1/4$

(b) $P(a_1) = 1/2$, $P(a_2) = 1/4$, $P(a_3) = P(a_4) = 1/8$

(c) $P(a_1) = 0.505$, $P(a_2) = 1/4$, $P(a_3) = 1/8$, $P(a_4) = 0.12$

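Solution (a):

$$
\begin{aligned}
H &= -\sum_{i=1}^{4} P(a_i)\log_2 P(a_i) \\
  &= -P(a_1)\log_2 P(a_1) - P(a_2)\log_2 P(a_2) - P(a_3)\log_2 P(a_3) - P(a_4)\log_2 P(a_4) \\
  &= -\tfrac{1}{4}\log_2\tfrac{1}{4} - \tfrac{1}{4}\log_2\tfrac{1}{4} - \tfrac{1}{4}\log_2\tfrac{1}{4} - \tfrac{1}{4}\log_2\tfrac{1}{4} \\
  &= 4 \cdot \tfrac{1}{2} \\
  &= 2 \text{ bits}
\end{aligned}
$$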

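Solution (b):

$$
\begin{aligned}
H &= -\sum_{i=1}^{4} P(a_i)\log_2 P(a_i) \\
  &= -P(a_1)\log_2 P(a_1) - P(a_2)\log_2 P(a_2) - P(a_3)\log_2 P(a_3) - P(a_4)\log_2 P(a_4) \\
  &= -\tfrac{1}{2}\log_2\tfrac{1}{2} - \tfrac{1}{4}\log_2\tfrac{1}{4} - \tfrac{1}{8}\log_2\tfrac{1}{8} - \tfrac{1}{8}\log_2\tfrac{1}{8} \\
  &= \tfrac{1}{2} + \tfrac{1}{2} + \tfrac{3}{8} + \tfrac{3}{8} \\
  &= 1.75 \text{ bits}
\end{aligned}
$$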

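Solution (c):

$$
\begin{aligned}
H &= -\sum_{i=1}^{4} P(a_i)\log_2 P(a_i) \\
  &= -P(a_1)\log_2 P(a_1) - P(a_2)\log_2 P(a_2) - P(a_3)\log_2 P(a_3) - P(a_4)\log_2 P(a_4) \\
  &= -0.505\log_2 0.505 - \tfrac{1}{4}\log_2\tfrac{1}{4} - \tfrac{1}{8}\log_2\tfrac{1}{8} - 0.12\log_2 0.12 \\
  &\approx 1.74 \text{ bits}
\end{aligned}
$$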

5. Consider the following sequence:

ATGCTTAACGTGCTTAACCTGAAGCTTCCGCTGAAGAACCTG

CTGAACCCGCTTAAGCTTAAGCTGAACCTTCTGAACCTGCTT

(a) Estimate the symbol probabilities from this sequence and compute its first-, second-, third-, and fourth-order entropy.

(b) Based on these entropies, can you infer what kind of structure this sequence has?

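(a) From the problem statement:

First-order entropy:

$P(A) = 21/84 = 1/4$, $P(G) = 16/84 = 4/21$, $P(C) = 24/84 = 2/7$, $P(T) = 23/84$

$$
\begin{aligned}
H &= -\sum_{i} P(a_i)\log_2 P(a_i) \\
  &= -P(A)\log_2 P(A) - P(C)\log_2 P(C) - P(G)\log_2 P(G) - P(T)\log_2 P(T) \\
  &= -\tfrac{1}{4}\log_2\tfrac{1}{4} - \tfrac{2}{7}\log_2\tfrac{2}{7} - \tfrac{4}{21}\log_2\tfrac{4}{21} - \tfrac{23}{84}\log_2\tfrac{23}{84} \\
  &\approx 0.500 + 0.516 + 0.456 + 0.512 \\
  &\approx 1.98 \text{ bits}
\end{aligned}
$$

This is just under 2 bits, the maximum possible for a four-symbol alphabet.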

7. Do an experiment to see how accurately a model can describe a source.

(a) Write a program that randomly selects letters from the 26-letter alphabet {a, b, ..., z} to form 100 four-letter words. How many of these words make sense?

#include <cstdlib>
#include <ctime>
#include <iostream>
using namespace std;

int main()
{
    char a[100][5];                      // 100 words: 4 letters plus the terminating '\0'
    srand((unsigned)time(NULL));         // seed the generator from the current time
    for (int i = 0; i < 100; i++)
    {
        for (int j = 0; j < 4; j++)
        {
            int r = rand() % 26;         // index of a letter, 0..25
            a[i][j] = 'a' + r;
        }
        a[i][4] = '\0';                  // terminate the string
        cout << i + 1 << " " << a[i] << "\t\t";
    }
    cout << endl;
    return 0;
}
