This article is excerpted from Chapter 1 of "New data structure exercises and analysis" (Li Chunbao et al.).
1. Basic concepts of data structure
1.1 Data
Data is a symbolic representation of an objective thing, which in computer science refers to all the symbols that can be entered into a computer and processed by a computer program. For example, integers, real numbers, and strings are data.
1.2 Data elements
A data element, also known as a node, is a basic unit of data that is usually considered and processed in a computer program as a whole.
1.3 Data Items
A data item is the smallest unit of data. Data elements can consist of several data items. For example, a student record is a data element that consists of a number, name, gender, and other data items.
1.4 Data Objects
1.5 Data structures
A data structure is a collection of data elements that stand in one or more relationships to each other; the relationships in question are mainly adjacency relationships. A data structure covers three aspects: the logical structure of the data, its storage structure, and the operations on the data.
1.6 Formal definition of data structure
The formal definition of a data structure is a two-tuple: DS = (D, R), where D is a finite set of data elements and R is a finite set of relationships on D. For examples, see "New data structure exercises and analysis."
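The two-tuple can be made concrete with a small sketch in C. The element values, the pairs in R, and the names `Pair`, `is_in_D`, and `count_successors` are all hypothetical choices made for illustration; they are not taken from the book.

```c
#include <stddef.h>

/* One ordered pair <from, to> of the relation R. */
typedef struct {
    char from;
    char to;
} Pair;

/* D: a finite set of data elements (hypothetical example values). */
static const char D[] = { 'a', 'b', 'c', 'd' };
static const size_t D_SIZE = sizeof D / sizeof D[0];

/* R: a finite set of relationships on D; here each element precedes the next. */
static const Pair R[] = { { 'a', 'b' }, { 'b', 'c' }, { 'c', 'd' } };
static const size_t R_SIZE = sizeof R / sizeof R[0];

/* Returns 1 if x is a member of D, 0 otherwise. */
int is_in_D(char x)
{
    for (size_t k = 0; k < D_SIZE; k++) {
        if (D[k] == x) {
            return 1;
        }
    }
    return 0;
}

/* Counts how many successors element x has under R. */
size_t count_successors(char x)
{
    size_t count = 0;
    for (size_t k = 0; k < R_SIZE; k++) {
        if (R[k].from == x) {
            count++;
        }
    }
    return count;
}
```

With this particular R, every element except 'd' has exactly one successor, which is the pattern of a linear structure.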
1.7 Logical structure of data
The logical structure of data corresponds to the "relationship" in the definition of data structure. It describes the logical relationships among the elements independently of how the data is stored, so the same logical structure can correspond to several storage structures. Logical structures fall into three major categories:
(1) Linear structure
In a linear structure there is a one-to-one relationship between elements. Its characteristic is that the start element and the terminal element are both unique, and every other element has exactly one predecessor and exactly one successor. A linear list is a typical linear structure.
(2) Tree structure
In a tree structure there is a one-to-many relationship between elements. Exactly one element is the start element (also known as the root node), there can be several terminal elements, each element has zero or more successors, and every element except the start element has exactly one predecessor.
(3) Graph structure
In a graph structure there is a many-to-many relationship between elements: each element can have several predecessors and several successors.
The tree structure and the graph structure are collectively referred to as nonlinear structures.
1.8 Physical structure of the data
The physical structure of the data, also known as the storage structure, is the storage form (aka image) of the logical structure of the data in the computer. It includes representations of data elements and representations of relationships. When a data element is composed of several data items, the representation of the data item is called the data field.
The relationships between data elements are represented in two different ways in the computer: sequential and non-sequential images. The corresponding two different storage structures are sequential storage structure and chained storage structure respectively.
A sequential image represents the logical relationship between data elements by their relative physical positions in memory, whereas a non-sequential image represents it by pointers that hold the storage addresses of data elements. In practice, four storage methods are commonly used for data structures:
(1) Sequential storage method
This method stores data elements in contiguous storage units, so that logically adjacent elements are also physically adjacent and the logical relationships between elements coincide with their physical relationships. The resulting storage representation is called a sequential storage structure and is typically described using the array type of a programming language.
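A minimal sketch of sequential storage in C, assuming integer elements and a fixed capacity; the names `SeqList`, `seq_append`, `seq_successor`, and `MAX_SIZE` are hypothetical. The logical successor of the element at index i is found purely from its position, with no stored pointer.

```c
#include <stddef.h>

#define MAX_SIZE 100   /* hypothetical fixed capacity */

/* A sequential list: elements occupy consecutive array slots. */
typedef struct {
    int data[MAX_SIZE];
    size_t length;
} SeqList;

/* Appends x; returns 1 on success, 0 if the list is full. */
int seq_append(SeqList *list, int x)
{
    if (list->length >= MAX_SIZE) {
        return 0;
    }
    list->data[list->length++] = x;
    return 1;
}

/* Stores the logical successor of element i in *out.
 * Returns 1 on success, 0 if element i has no successor. */
int seq_successor(const SeqList *list, size_t i, int *out)
{
    if (i + 1 >= list->length) {
        return 0;
    }
    *out = list->data[i + 1];   /* physically adjacent slot */
    return 1;
}
```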
(2) Chained storage method
This method may store data elements in arbitrary storage units and does not require logically adjacent nodes to be physically adjacent; the logical relationships between nodes are represented by additional pointer fields. The resulting storage representation, known as a chained storage structure, is typically described using the pointer type of a programming language.
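A minimal sketch of chained storage in C, assuming a singly linked list of integers; the names `Node`, `list_push_front`, `list_length`, and `list_free` are hypothetical. Each node carries a data field plus a pointer field that records where its logical successor lives.

```c
#include <stdlib.h>

typedef struct Node {
    int data;            /* data field */
    struct Node *next;   /* pointer field: address of the logical successor */
} Node;

/* Inserts a new node holding x at the head of the list.
 * Returns the new head, or the old head unchanged if allocation fails. */
Node *list_push_front(Node *head, int x)
{
    Node *node = malloc(sizeof *node);
    if (node == NULL) {
        return head;
    }
    node->data = x;
    node->next = head;
    return node;
}

/* Counts the nodes by following the pointer fields. */
size_t list_length(const Node *head)
{
    size_t n = 0;
    for (const Node *p = head; p != NULL; p = p->next) {
        n++;
    }
    return n;
}

/* Releases every node in the list. */
void list_free(Node *head)
{
    while (head != NULL) {
        Node *next = head->next;
        free(head);
        head = next;
    }
}
```

The nodes may end up anywhere on the heap; only the `next` pointers preserve the logical order.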
(3) Index storage method
In addition to storing the node information, this method builds an additional index table. Each item in the index table is called an index entry, and its general form is (key, address): the key uniquely identifies a node, and the address serves as a pointer to that node. A storage structure with an index table can greatly speed up data lookups.
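The (key, address) form of an index entry can be sketched in C as follows. Keeping the index sorted by key and searching it with binary search is an assumption made here for illustration, as are the names `Record`, `IndexEntry`, and `index_lookup`; the source does not prescribe how the index table is organized.

```c
#include <stddef.h>

/* A stored node (hypothetical record layout). */
typedef struct {
    int key;
    const char *name;
} Record;

/* An index entry of the form (key, address). */
typedef struct {
    int key;
    const Record *addr;
} IndexEntry;

/* Looks up key in an index table that is sorted by key in ascending order.
 * Returns the address of the matching record, or NULL if there is none. */
const Record *index_lookup(const IndexEntry *index, size_t n, int key)
{
    size_t lo = 0, hi = n;   /* search the half-open range [lo, hi) */
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (index[mid].key == key) {
            return index[mid].addr;
        }
        if (index[mid].key < key) {
            lo = mid + 1;
        } else {
            hi = mid;
        }
    }
    return NULL;
}
```

The records themselves can stay in any order; only the small index table has to be kept sorted.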
(4) Hash storage method
The basic idea of this method is to compute the storage address of a node directly from the node's key by means of a hash function. This storage method is essentially an extension of the sequential and chained storage methods.
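A minimal sketch of hash storage in C, assuming integer keys, a remainder-based hash function, and collision handling by chaining; these choices and the names `HashTable`, `hash_address`, `hash_put`, `hash_get`, and `hash_clear` are hypothetical. The bucket array is the sequential part and the per-bucket lists are the chained part.

```c
#include <stdlib.h>

#define TABLE_SIZE 7   /* hypothetical table size */

typedef struct HashNode {
    int key;
    int value;
    struct HashNode *next;
} HashNode;

typedef struct {
    HashNode *buckets[TABLE_SIZE];
} HashTable;

/* Hash function: maps a key directly to a bucket index in [0, TABLE_SIZE). */
size_t hash_address(int key)
{
    int r = key % TABLE_SIZE;
    if (r < 0) {
        r += TABLE_SIZE;   /* C's % can yield a negative remainder */
    }
    return (size_t)r;
}

/* Inserts or updates key; returns 1 on success, 0 if allocation fails. */
int hash_put(HashTable *t, int key, int value)
{
    size_t h = hash_address(key);
    for (HashNode *p = t->buckets[h]; p != NULL; p = p->next) {
        if (p->key == key) {
            p->value = value;
            return 1;
        }
    }
    HashNode *node = malloc(sizeof *node);
    if (node == NULL) {
        return 0;
    }
    node->key = key;
    node->value = value;
    node->next = t->buckets[h];
    t->buckets[h] = node;
    return 1;
}

/* Stores the value for key in *out; returns 1 if found, 0 otherwise. */
int hash_get(const HashTable *t, int key, int *out)
{
    for (const HashNode *p = t->buckets[hash_address(key)]; p != NULL; p = p->next) {
        if (p->key == key) {
            *out = p->value;
            return 1;
        }
    }
    return 0;
}

/* Frees every node and empties all buckets. */
void hash_clear(HashTable *t)
{
    for (size_t h = 0; h < TABLE_SIZE; h++) {
        while (t->buckets[h] != NULL) {
            HashNode *next = t->buckets[h]->next;
            free(t->buckets[h]);
            t->buckets[h] = next;
        }
    }
}
```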
1.9 Operations on data
The types and number of operations a data structure provides, and the number and types of the parameters of each operation, should be chosen according to how the data structure is actually used. Operations only acquire concrete meaning once they are implemented on a particular storage structure, so both the implementation and the efficiency of an operation depend on the storage structure.
1.10 Data types
A data type is the general term for a set of values together with a set of operations defined on those values. For example, int in C is an integer data type: an int variable takes an integer value within some range (for example, -32768 to 32767 on a 16-bit machine), and the operations defined on it include addition, subtraction, multiplication, and division.
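The idea that a data type is a bounded set of values plus operations on it can be illustrated with a short C sketch. The function name `checked_add` is hypothetical; `INT_MIN` and `INT_MAX` from `<limits.h>` give the actual range of int on the platform at hand, which need not be the 16-bit range mentioned above.

```c
#include <limits.h>

/* Adds a and b only if the result stays inside the value set of int.
 * Returns 1 and stores the sum in *sum on success, 0 on overflow. */
int checked_add(int a, int b, int *sum)
{
    if ((b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b)) {
        return 0;   /* a + b would fall outside [INT_MIN, INT_MAX] */
    }
    *sum = a + b;
    return 1;
}
```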
1.11 Abstract data type (ADT)
An abstract data type (ADT) is a data model together with a set of operations defined on that model. An ADT is typically a user-defined data model that represents the data of an application problem; it is built from basic data types and includes a set of related operations. Its distinguishing feature is the separation of use from implementation, achieved through encapsulation and information hiding: in ADT design, the declaration of the type is separated from its implementation.
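As an illustration of separating declaration from implementation, here is a sketch of a stack ADT in C. The stack itself, the array-based representation, and all the `stack_*` names are hypothetical choices for this example. In a real project the interface part would usually live in a header file and the implementation part in a separate source file.

```c
#include <stdlib.h>

/* ---- Interface: everything a user of the ADT needs to know ---- */
typedef struct Stack Stack;              /* representation is hidden */
Stack *stack_create(size_t capacity);    /* NULL on failure or zero capacity */
void stack_destroy(Stack *s);
int stack_push(Stack *s, int x);         /* 1 on success, 0 if full */
int stack_pop(Stack *s, int *out);       /* 1 on success, 0 if empty */
int stack_is_empty(const Stack *s);

/* ---- Implementation: could be replaced without changing the interface ---- */
struct Stack {
    int *items;
    size_t top;        /* number of stored elements */
    size_t capacity;
};

Stack *stack_create(size_t capacity)
{
    if (capacity == 0) {
        return NULL;
    }
    Stack *s = malloc(sizeof *s);
    if (s == NULL) {
        return NULL;
    }
    s->items = malloc(capacity * sizeof *s->items);
    if (s->items == NULL) {
        free(s);
        return NULL;
    }
    s->top = 0;
    s->capacity = capacity;
    return s;
}

void stack_destroy(Stack *s)
{
    if (s != NULL) {
        free(s->items);
        free(s);
    }
}

int stack_push(Stack *s, int x)
{
    if (s->top == s->capacity) {
        return 0;
    }
    s->items[s->top++] = x;
    return 1;
}

int stack_pop(Stack *s, int *out)
{
    if (s->top == 0) {
        return 0;
    }
    *out = s->items[--s->top];
    return 1;
}

int stack_is_empty(const Stack *s)
{
    return s->top == 0;
}
```

Code that uses only the five interface functions keeps working if the array is later swapped for a linked list.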
2. Algorithm and algorithm analysis
2.1 Algorithm
An algorithm is a description of the steps for solving a specific problem. It is a finite sequence of instructions, where each instruction represents one or more operations.
2.2 Characteristics of the algorithm
(1) Finiteness
An algorithm must always (for any legal input) terminate after a finite number of steps, and each step must complete in a finite amount of time.
(2) Definiteness
The algorithm specifies precisely what is to be done in every case, so that whoever executes or reads it can determine its meaning and how to carry it out. Under any given conditions, the algorithm has only one execution path.
(3) Feasibility
Every operation in the algorithm must be basic enough that it can be carried out by executing already implemented basic operations a finite number of times.
(4) Input
The inputs are the quantities that the algorithm processes; they usually appear in the algorithm as a set of variables. Some inputs have to be supplied while the algorithm runs, whereas some algorithms appear to have no input because the input is actually embedded in the algorithm.
(5) Output
The outputs are a set of quantities that stand in a definite relationship to the inputs. They are the result of the algorithm's processing, and this definite relationship is the function that the algorithm computes.
Note: An algorithm differs from a program in that a program need not satisfy finiteness. For example, an operating system stays in a "wait" loop while the user is idle, until a new user action occurs. Ordinary programs do terminate, however, so in many cases the terms "algorithm" and "program" are not strictly distinguished.
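The five characteristics can be checked against a small, well-known algorithm. The sketch below uses Euclid's algorithm for the greatest common divisor, chosen here purely for illustration (it is not taken from the source), with the hypothetical function name `gcd`.

```c
/* Euclid's algorithm.
 * Input:  two unsigned integers a and b.
 * Output: their greatest common divisor (gcd(a, 0) is a). */
unsigned gcd(unsigned a, unsigned b)
{
    while (b != 0) {
        unsigned r = a % b;   /* definite and feasible: one basic operation */
        a = b;
        b = r;                /* r < old b, so b strictly decreases: finiteness */
    }
    return a;
}
```

Because b is a non-negative integer that shrinks on every iteration, the loop must stop after finitely many steps, unlike the operating system's "wait" loop described in the note.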
2.3 Algorithm Description
An algorithm description expresses the instruction sequence of the algorithm in natural language or in some computer language.
2.4 Algorithm design goals
(1) Correctness
(2) Usability
(3) Readability
(4) Robustness
The algorithm should have good fault tolerance: it should check for invalid data and handle exceptions, so that it rarely terminates abnormally or hangs.
(5) High efficiency and low storage demand
The efficiency of an algorithm is described by its time complexity and its space complexity.
2.5 Algorithm Time complexity
The time complexity of an algorithm is measured by the number of times its basic operations are executed (called the frequency). In general, it is not necessary to compute the time complexity exactly; it is enough to determine its order of magnitude, such as O(1) or O(n).
The notation O is formally defined as follows: if f(n) is a function of the positive integer n, then T(n) = O(f(n)) means that there exist positive constants M and n0 such that |T(n)| <= M*|f(n)| whenever n >= n0. In other words, O(f(n)) gives an upper bound on the function T(n).
T(n) = O(1) when the time complexity T(n) is independent of n, and T(n) = O(n) when T(n) is linear in n. In general, the common time complexities are related as follows:
O(1) <= O(log2 n) <= O(n) <= O(n log2 n) <= O(n^2) <= O(n^3) <= ... <= O(n^k) <= O(2^n)
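As a worked illustration of the definition of O, consider a hypothetical frequency function (not taken from the book) and exhibit the two constants explicitly:

```latex
\begin{align*}
T(n) &= 3n^2 + 5n + 2 \\
     &\le 3n^2 + 5n^2 + 2n^2 = 10\,n^2 \qquad \text{for all } n \ge 1 .
\end{align*}
```

The definition is therefore satisfied with the constant 10 and n0 = 1, so T(n) = O(n^2).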
The method for determining the time complexity can be summarized as follows:
1) Determine the problem size n: it is usually given by the formal parameters.
2) Compute the statement frequency T(n): take the basic operation of the algorithm as the core (if there are loops, the statement in the innermost loop is the basic operation) and determine how many times it executes.
3) Express the result in big-O notation: keep only the highest-order term of T(n), and if the coefficient of that term is not 1, drop the coefficient.
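The three steps can be tried out on a small example. In the sketch below, the hypothetical function `count_basic_ops` runs a triangular double loop and counts how often the innermost statement executes; the loop shape is an assumption chosen for illustration.

```c
/* Step 1: the problem size is the formal parameter n.
 * Returns how many times the innermost statement executes. */
unsigned long count_basic_ops(unsigned long n)
{
    unsigned long count = 0;
    for (unsigned long i = 0; i < n; i++) {
        for (unsigned long j = i + 1; j < n; j++) {
            count++;   /* Step 2: basic operation, the deepest statement */
        }
    }
    return count;
}
```

The inner statement runs (n-1) + (n-2) + ... + 0 = n(n-1)/2 times, so T(n) = n^2/2 - n/2. Step 3 keeps the highest-order term n^2/2 and drops its coefficient 1/2, giving T(n) = O(n^2).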
2.6 Complexity of space
Space complexity is a measure of the storage space an algorithm requires. It mainly considers the temporary storage space the algorithm occupies while running, and it is usually given as an order of magnitude.
The temporary storage space of an algorithm is the space newly allocated inside the function body; it excludes the space occupied by the formal parameters. For example:
```c
// The function body allocates space only for the variables i and s, which is
// independent of n, so the space complexity is O(1).
// The space occupied by the parameter a is not counted.
int fun(int a[], int n)
{
    int i, s = 0;
    for (i = 0; i < n; i++)
    {
        s += a[i];
    }
    return s;
}
```
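For contrast, here is a sketch of a function whose temporary space does grow with n. The function `prefix_sums` is a hypothetical example, not part of the source; it allocates an array of n integers inside the function body, so its space complexity is O(n).

```c
#include <stdlib.h>

/* Returns a newly allocated array p with p[i] = a[0] + ... + a[i].
 * Returns NULL if n is 0 or allocation fails. The caller frees the result. */
int *prefix_sums(const int a[], size_t n)
{
    if (n == 0) {
        return NULL;
    }
    int *p = malloc(n * sizeof *p);   /* n ints of temporary space: O(n) */
    if (p == NULL) {
        return NULL;
    }
    p[0] = a[0];
    for (size_t i = 1; i < n; i++) {
        p[i] = p[i - 1] + a[i];
    }
    return p;
}
```

As before, the parameter array a is not counted; only the array allocated inside the function contributes to the O(n) bound.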