Learning Basic Data Structures and Algorithms (I)

Source: Internet
Author: User

Basic concepts and terms

1. Data
Data is the carrier of information about the external world. It can be recognized, stored, and processed by computers, and it is the raw material that computer programs work on. Programs process many kinds of data: numeric data such as integers, real numbers, and complex numbers, and non-numeric data such as characters, text, graphics, images, and sound.

2. Data Element and Data Item
A data element is the basic unit of data and is usually considered and processed as a whole in a program. Data elements are also called elements, nodes, vertices, or records. A data element can be composed of several data items. A data item is the smallest indivisible unit of data that has an independent meaning; it is also called a field. For example, in a database information processing system, a record in a data table is a data element, and the student ID, name, gender, place of origin, date of birth, score, and other fields in that record are data items. There are two kinds of data items. An elementary item, such as a student's gender or nationality, cannot be divided further during processing. A composite item, such as a student's score, can be divided into smaller items such as mathematics, physics, and chemistry.
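The student record described above can be sketched in C# as follows. This is only an illustration: the class names Score, Student, and DataItemDemo, the choice of fields, and all sample values are assumptions made for the example and do not come from the original text.

```csharp
using System;

// Composite data item: it can be divided into smaller items.
class Score
{
    public int Mathematics;
    public int Physics;
    public int Chemistry;

    public int Total()
    {
        return Mathematics + Physics + Chemistry;
    }
}

// Data element: one record (row) of the student table.
class Student
{
    public string StudentId;   // elementary item
    public string Name;        // elementary item
    public string Gender;      // elementary item
    public Score Score;        // composite item
}

class DataItemDemo
{
    // Builds one record; every value here is made-up sample data.
    public static Student MakeSample()
    {
        return new Student
        {
            StudentId = "S001",
            Name = "Alice",
            Gender = "F",
            Score = new Score { Mathematics = 90, Physics = 85, Chemistry = 80 }
        };
    }

    static void Main()
    {
        Student s = MakeSample();
        // Prints: S001 Alice: total score 255
        Console.WriteLine("{0} {1}: total score {2}", s.StudentId, s.Name, s.Score.Total());
    }
}
```

The program treats the whole Student object as one unit (the data element), while the composite item Score can still be taken apart into its three subject scores.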

3. Data Object
A data object is a collection of data elements of the same nature; it is a subset of data. For example, the integer data object is {0, ±1, ±2, ±3, ...}, and the character data object is {a, b, c, ...}.

4. Data Type
Data type is a concept from high-level programming languages: it is a value range together with the set of operations defined on those values. The data type specifies the properties of objects in a program, and every variable, constant, and expression result in a program belongs to some data type. Take the string type in C# (string, also written String) as an example. A string is an immutable sequence of characters; all possible character sequences make up the value range of the type, and operations such as taking the length, copying, and concatenating two strings are defined on it.
Data types fall into two groups. Atomic (non-structured) types, such as the basic types in C# (integer, floating-point, character, and so on), cannot be decomposed. Structured types are built from components and can be decomposed; a component may itself be atomic or structured. For example, the elements of an array in C# can be of a basic type such as int, or can themselves be arrays.

5. Data Structure
A data structure is a collection of data elements that have one or more specific relationships with one another. In any problem, data elements do not exist in isolation; the relationships among them are called the structure. Based on the nature of these relationships, there are usually four basic types of data structure:
(1) Set: as shown in Figure 1.1 (a), the data elements have no relationship other than belonging to the same set.
(2) Linear structure: as shown in Figure 1.1 (b), the data elements have a one-to-one relationship.
(3) Tree structure: as shown in Figure 1.1 (c), the data elements have a one-to-many relationship.
(4) Graph structure: as shown in Figure 1.1 (d), the data elements have a many-to-many relationship.

  

The relationship between elements in the set is very loose and can be expressed by other data structures. The concept of a set is described in section 1.3.1.
Formally, a data structure DS is defined as an ordered pair (a 2-tuple):
DS = (D, R)
Where: D is a finite set of data elements,
R is a finite set of relationships between data elements.
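To make the pair (D, R) concrete, the sketch below builds R for a linear structure, where each element is related only to its immediate successor. The class name DataStructureDemo, the method LinearRelation, and the sample elements are assumptions chosen for illustration.

```csharp
using System;
using System.Collections.Generic;

class DataStructureDemo
{
    // Builds R for a linear structure over D: each element is related
    // only to the element that immediately follows it (one-to-one).
    public static List<KeyValuePair<string, string>> LinearRelation(IList<string> d)
    {
        var r = new List<KeyValuePair<string, string>>();
        for (int i = 0; i + 1 < d.Count; i++)
        {
            r.Add(new KeyValuePair<string, string>(d[i], d[i + 1]));
        }
        return r;
    }

    static void Main()
    {
        // D: a finite set of data elements (sample values).
        var d = new List<string> { "a", "b", "c", "d" };

        // Prints the three pairs <a, b>, <b, c>, <c, d>, one per line.
        foreach (var pair in LinearRelation(d))
        {
            Console.WriteLine("<{0}, {1}>", pair.Key, pair.Value);
        }
    }
}
```

A tree or graph structure would use the same D but a different R: a tree allows several pairs with the same first component, and a graph places no such restriction on either component.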
The following examples illustrate the last three types of data structure.
[Example 1-1] The student information table (Table 1.1) is a linear data structure. Each row in the table is a record (in a database information processing system, a data element in a table is called a record). A record consists of data items such as student ID, name, administrative class, gender, and date of birth. The relationship between the data elements in the table is one-to-one.
Table 1.1 student information table

[Example 1-2] A family relationship is a typical tree structure. Figure 1.2 shows a family of three generations. In the figure, the grandfather, each son, the daughter, and each grandson and granddaughter is a node (in a tree structure, a data element is called a node), and the relationships between them are one-to-many. The grandfather has two sons and one daughter, a one-to-three relationship. One son has two sons (the grandfather's grandsons), a one-to-two relationship. The other son has a son and a daughter (the grandfather's grandson and granddaughter), also a one-to-two relationship. The daughter has three daughters (the grandfather's granddaughters), a one-to-three relationship. A tree structure has a strict hierarchy: the grandfather is at the top, the sons and the daughter are in the middle layer, and the grandsons and granddaughters are at the bottom. This order cannot be reversed: children never come before their parents, and grandchildren never come before their grandparents.

  

[Example 1-3] Figure 1.3 shows the highway map of four cities, a typical graph structure. Each city is a vertex (in a graph structure, a data element is called a vertex), and the relationships between vertices are many-to-many. Chengdu has direct roads to Dujiangyan and Ya'an; Dujiangyan has direct roads to Chengdu and Qingchengshan; Qingchengshan has direct roads to Dujiangyan, Chengdu, and Ya'an; and Ya'an has direct roads to Chengdu and Qingchengshan. These highways form a road network, which is why a graph structure is also called a network structure.

The definitions of data type and data structure show that the two are closely related. A data type can be viewed as a simple data structure: its value range corresponds to a finite set of data elements, and its set of operations corresponds to a set of relationships among those elements.
A data structure has both a logical structure and a physical structure. The definition given above describes the logical structure, which is a mathematical model abstracted from a specific problem to make the problem easier to discuss; it is independent of how the data is stored in a computer. However, the purpose of studying data structures is to operate on them in a computer, so we must also study how a data structure is represented and stored there, that is, its physical structure. The physical structure, also called the storage structure, is the representation of the data in the computer, covering both the data elements and the relationships between them.
There are two basic storage structures: sequential and linked. A sequential storage structure expresses the logical relationship between data elements through their relative positions in memory: logically adjacent elements are generally stored in physically adjacent storage units. In C#, arrays implement sequential storage; because the space allocated to an array is contiguous, arrays are naturally suited to it. A linked storage structure does not require logically adjacent elements to be physically adjacent. A data element in a linked storage structure is called a node, and each node carries an extra address field that stores the address of its logically adjacent node, which is how the logical relationship between nodes is realized. This address is called a reference, and the address field is called a reference field.
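The two storage structures can be compared with a minimal sketch. The array plays the role of sequential storage, and a hand-written singly linked Node plays the role of linked storage, with Next as its reference field. The names Node, StorageDemo, SumArray, SumList, and FromArray are assumptions introduced for this example.

```csharp
using System;

// Node of a singly linked list: the data plus a reference field
// that holds the reference to the logically next node.
class Node
{
    public int Data;
    public Node Next;

    public Node(int data)
    {
        Data = data;
        Next = null;
    }
}

class StorageDemo
{
    // Sequential storage: logical neighbours are neighbours in the array,
    // so the next element is found by incrementing the index.
    public static int SumArray(int[] items)
    {
        int sum = 0;
        for (int i = 0; i < items.Length; i++)
        {
            sum += items[i];
        }
        return sum;
    }

    // Linked storage: the next element is found by following the reference.
    public static int SumList(Node head)
    {
        int sum = 0;
        for (Node p = head; p != null; p = p.Next)
        {
            sum += p.Data;
        }
        return sum;
    }

    // Builds a linked list holding the same values in the same order.
    public static Node FromArray(int[] items)
    {
        Node head = null;
        for (int i = items.Length - 1; i >= 0; i--)
        {
            Node n = new Node(items[i]);
            n.Next = head;
            head = n;
        }
        return head;
    }

    static void Main()
    {
        int[] items = { 10, 20, 30 };
        Console.WriteLine(SumArray(items));            // 60
        Console.WriteLine(SumList(FromArray(items)));  // 60
    }
}
```

Both traversals visit the same logical sequence; only the way the successor is located differs, which is exactly the distinction between the two storage structures.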
From the late 1960s to the early 1970s, large programs appeared and software became relatively independent. People paid more and more attention to data structures and came to regard the essence of program design as choosing the data structure for the problem and designing a good algorithm for it, which is often summarized as "program = data structure + algorithm". The next section discusses algorithms.

Algorithm features
An algorithm is a description of the steps for solving a specific type of problem: a finite sequence of instructions, each of which denotes one or more operations. An algorithm should have the following five features:
1. Finiteness: an algorithm always terminates after a finite number of steps, so its execution time is bounded.
2. Definiteness: every step of an algorithm must have a precise meaning with no ambiguity, and the same input must always produce the same output.
3. Input: an algorithm has zero or more inputs, which are quantities supplied before the algorithm starts. These inputs are taken from data objects in a data structure.
4. Output: an algorithm has one or more outputs, which stand in a specific relationship to the inputs.
5. Feasibility: every step of an algorithm can be carried out by a finite number of basic operations that have already been implemented.
An algorithm is similar to a program, but the two differ. A program need not satisfy finiteness: an operating system, for example, never stops as long as the system is not damaged. In addition, a program must be written in a computer language, because its instructions have to be executable by a machine, whereas an algorithm can also be described in natural language, flowcharts, or pseudocode.
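As an illustration of the five features, here is Euclid's algorithm for the greatest common divisor. The original text does not mention this algorithm; it is chosen here only because each feature is easy to point to. The names GcdDemo and Gcd are assumptions.

```csharp
using System;

class GcdDemo
{
    // Euclid's algorithm.
    // Input: two non-negative integers a and b, not both zero.
    // Output: their greatest common divisor.
    public static int Gcd(int a, int b)
    {
        if (a < 0 || b < 0 || (a == 0 && b == 0))
        {
            throw new ArgumentException("Gcd expects non-negative integers, not both zero.");
        }

        // Finiteness: b is replaced by a % b, which is strictly smaller
        // than b, so b reaches 0 after a finite number of iterations.
        while (b != 0)
        {
            // Definiteness and feasibility: each step is a single,
            // unambiguous built-in operation.
            int r = a % b;
            a = b;
            b = r;
        }
        return a;
    }

    static void Main()
    {
        Console.WriteLine(Gcd(48, 18));  // 6
    }
}
```

The same steps could be written in natural language or as a flowchart and would still be the same algorithm; only this C# version is a program.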
For a given problem, different data structures generally call for different algorithm designs, and even with the same data structure different algorithms can be used. Deciding which algorithm suits a problem best, and how to improve an existing algorithm so that it fits the data structure better, is the task of algorithm evaluation. The main criteria are as follows:
1. Correctness. The result of executing an algorithm must satisfy the specified functional and performance requirements. This is the most important and most basic criterion. Correctness also requires that the input, output, and processing be described clearly and unambiguously.
2. Readability. An algorithm is written first for people to read and exchange, and only second for machines to execute. It should therefore be clear, well structured, simple, and easy to understand. Even after an algorithm has been turned into a machine-executable program, it should remain easy for people to read. A readable algorithm also makes hidden errors easier to find and makes the algorithm easier to port.
3. Robustness. An algorithm should tolerate faults: when given illegal input it should respond appropriately rather than cause serious consequences. Robustness requires that the algorithm consider all boundary and exceptional cases carefully and handle them properly, so that unexpected situations are avoided as far as possible.
4. Running time. Running time is the time the algorithm takes to run on a computer, equal to the sum of the execution times of all statements executed. When several algorithms are available for the same problem, prefer the one with the shorter running time; in general, the shorter the running time, the better the algorithm performs.
5. Storage space. The space an algorithm occupies in a computer includes the space for the algorithm itself, the space for its input and output data, and the space it uses temporarily while running. The storage requirement of an algorithm is the maximum storage needed during its execution. When several algorithms are available for a problem, prefer the one that needs less storage. In practice, time efficiency and space efficiency often conflict, and the trade-off must be decided according to actual needs: sometimes space is sacrificed for time, and sometimes time for space.
The storage space an algorithm occupies temporarily while running is generally called its space complexity. Space complexity is relatively easy to calculate: it consists mainly of the storage occupied by local variables and the stack space the system uses for recursion.
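The two contributions to space complexity named above, local variables and the recursion stack, can be seen by writing the same computation twice. The sketch below sums the integers 1 to n; the names SpaceDemo, SumIterative, and SumRecursive are assumptions for this example.

```csharp
using System;

class SpaceDemo
{
    // Iterative version: only the locals sum and i are needed,
    // whatever the value of n, so the temporary space is O(1).
    public static long SumIterative(int n)
    {
        long sum = 0;
        for (int i = 1; i <= n; i++)
        {
            sum += i;
        }
        return sum;
    }

    // Recursive version: each call waits for the next one to return,
    // so up to n + 1 stack frames exist at once and the temporary space is O(n).
    public static long SumRecursive(int n)
    {
        if (n <= 0)
        {
            return 0;
        }
        return n + SumRecursive(n - 1);
    }

    static void Main()
    {
        Console.WriteLine(SumIterative(100));  // 5050
        Console.WriteLine(SumRecursive(100));  // 5050
    }
}
```

Both versions execute about n additions, so their running times have the same order of magnitude; they differ only in the temporary space they need.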
If the algorithm is written in a computer language, the amount of program code should also be considered. For the same problem, when the five criteria above give the same result, less code is better: in practice, more code tends to mean more storage, longer running time, a higher chance of errors, and harder reading.
Of the criteria above, this book concentrates on running time and the space used during execution. Many factors affect running time, including the algorithm itself, the input data, and the computer system on which the program runs. The performance of the computer system in turn depends on the following factors:
1. Hardware. This includes the type and speed of the processor (for example, dual-core or single-core), the available memory (cache and RAM), and the available external storage.
2. The programming language used to implement the algorithm. As a general tendency, the higher the level of the language, the lower the execution efficiency.
3. The compiler or interpreter used. In general, compiled code runs more efficiently than interpreted code, while interpretation is more flexible.
4. The operating system. The main function of the operating system is to manage the software and hardware resources of the computer system and to provide an interface that makes the computer convenient to use. Language processors such as compilers and interpreters, as well as applications, all run under the control of the operating system.

Time complexity of the algorithm
The time complexity of an algorithm describes how its running time depends on the problem size. An algorithm consists of control structures and primitive operations, and its execution time depends on both. To make different algorithms for the same problem easier to compare, the number of times (the frequency) that the basic operation is executed is usually taken as the measure of time. The basic operation is generally the statement inside the innermost loop. Its frequency is a function f(n) of the problem size n, and the time complexity is written T(n) = O(f(n)). The symbol "O" means that as n grows, the execution time grows at the same rate as f(n); in other words, "O" expresses an order of magnitude. For example, if T(n) = n(n - 1)/2, then n(n - 1)/2 has the same order of magnitude as n^2, so T(n) = O(n^2).
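As a small check of how a frequency count leads to an order of magnitude, the sketch below counts how often the innermost statement of a double loop runs. The loop is chosen so that the count equals n(n - 1)/2, a frequency of order n^2. The names FrequencyDemo and CountBasicOperations are assumptions for this example.

```csharp
using System;

class FrequencyDemo
{
    // Counts how many times the innermost statement executes.
    // For each i, the inner loop runs i times, so the total is
    // 0 + 1 + ... + (n - 1) = n(n - 1)/2.
    public static long CountBasicOperations(int n)
    {
        long count = 0;
        for (int i = 0; i < n; i++)
        {
            for (int j = 0; j < i; j++)
            {
                count++;  // the basic operation
            }
        }
        return count;
    }

    static void Main()
    {
        int[] sizes = { 4, 10, 100 };
        foreach (int n in sizes)
        {
            long formula = (long)n * (n - 1) / 2;
            Console.WriteLine("n = {0}: counted {1}, formula {2}",
                n, CountBasicOperations(n), formula);
        }
    }
}
```

For n = 100 the count is 4950, already close to half of n^2 = 10000; the constant factor 1/2 and the lower-order term are dropped in the O notation.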

  

using System;

namespace ConsoleApplication1
{
    class Program
    {
        static void Main(string[] args)
        {
            int n = 4;

            // Segment 1: the loop body runs n times (4 times here),
            // so the time complexity is T(n) = O(n).
            int x = n;
            int y = 0;
            while (y < x)
            {
                y = y + 1;
            }
            Console.WriteLine("running result: x = {0}, y = {1}, loop {2} times", x, y, n);

            // Segment 2: the innermost statement runs n * n times (4 * 4 here),
            // so the time complexity is T(n) = O(n^2).
            int a = 0;
            for (int i = 0; i < n; i++)
            {
                for (int j = 0; j < n; j++)
                {
                    a = i * j;
                }
            }
            Console.WriteLine("{0}", a);

            // Segment 3: the loop body runs about sqrt(n) times (sqrt(4) = 2 here),
            // so the time complexity is T(n) = O(sqrt(n)).
            y = 0;
            while (x >= (y + 1) * (y + 1))
            {
                y = y + 1;
            }
            Console.WriteLine("{0}", y);

            // Segment 4 (matrix multiplication, shown as a fragment only):
            // for (i = 0; i < m; i++)
            //     for (j = 0; j < t; j++)
            //         for (k = 0; k < n; k++)
            //             c[i][j] = c[i][j] + a[i][k] * b[k][j];   // statement (1)
            //
            // Solution: this is a triple loop. The outermost loop runs m times,
            // the middle loop t times, and the innermost loop n times, so
            // statement (1) executes m * n * t times and the time complexity
            // of the segment is O(m * n * t).

            Console.Read();
        }
    }
}

 

  
