Data Structures and Algorithms Notes: Introduction
1. What is computation
2. Criteria for evaluating DSA (the ruler)
3. Scales for measuring DSA performance
4. Methods for measuring DSA performance
5. Designing and optimizing DSA performance
X1. The difference between theoretical models and actual performance
X2. The limits of DSA optimization (lower bounds)
Computers and algorithms
The core of computer science is the study of the laws governing computational methods and processes, not merely the computer itself as a computational tool. For this reason E. W. Dijkstra and his followers preferred to call the discipline computing science.
Computation = information processing
Computational model = computer = information-processing tool
1. The essence of the computer is computation: discovering the laws governing computational objects and the techniques for exploiting them. The goal of computation is to be efficient and low-cost.
2. An algorithm is a computation carried out with specific tools, under specific rules, in an explicit and mechanical form.
Definition of an algorithm: a sequence of instructions, based on a particular model of computation, designed to solve a given information-processing problem.
3. An algorithm must have the following properties:
Input and output
Input: a description of a specific instance of the problem to be solved
Output: the information obtained by the computation, i.e., the answer to the input problem instance
Determinacy and feasibility: the algorithm should be describable as a sequence of instructions made up of semantically well-defined basic operations, each of which can be realized in the corresponding computational model.
Finiteness: any algorithm should terminate after performing a finite number of basic operations and then produce its output
Correctness: the output given by the algorithm should satisfy the conditions determined in advance by the problem itself
Degeneracy and robustness: the various extreme input instances of an algorithm are degenerate (degeneracy) cases, and robustness requires that such cases be handled as thoroughly as possible.
Reusability: the general pattern of an algorithm can be generalized and applied to basic elements of different types
Proving that an algorithm is finite and correct: view the whole computation from an appropriate angle, and identify an invariant and a monotone property of it.
Monotonicity: the effective size of the problem decreases steadily as the algorithm advances
Invariance: the invariant should not only hold naturally in the initial state of the algorithm, but should also echo the final correctness: as soon as the effective problem size shrinks to 0, the invariant should immediately be equivalent to correctness
For example, the correctness proof of bubble sort: (invariance) after the k-th scan-and-swap pass, the largest k elements are in their final positions; (finiteness) after the k-th pass, the effective size of the problem still to be solved has shrunk to n - k.
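To make the argument concrete, here is a minimal bubble sort sketch in Python (my own illustration, not code from the original notes), with comments marking where the invariant and the shrinking effective size show up:

```python
def bubble_sort(a):
    """Sort list a in place; illustrates the invariant/monotonicity argument."""
    n = len(a)
    for k in range(1, n):            # pass number k = 1, 2, ..., n-1
        swapped = False
        for i in range(n - k):       # effective problem size: n - (k - 1)
            if a[i] > a[i + 1]:
                a[i], a[i + 1] = a[i + 1], a[i]
                swapped = True
        # invariance: after pass k, the k largest elements occupy a[n-k:]
        # monotonicity: the unsorted prefix has shrunk to size n - k
        if not swapped:              # already sorted: terminate early
            break
    return a

print(bubble_sort([5, 2, 4, 1, 3]))  # [1, 2, 3, 4, 5]
```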
Good algorithms: as efficient as possible (as fast as possible, using as little storage as possible) while also being correct (actually solving the problem), robust (fault-tolerant), and readable (easy to understand). As the saying goes: we want the horse to run fast and eat little grass.
Algorithms + Data Structures = Programs
(Algorithms + Data Structures) * Efficiency = Computation
Computational models
1. A good program must consider not only data structures and algorithms but also efficiency, namely: (data structures + algorithms) * efficiency = programs and applications.
2. Two important indicators in algorithm analysis (both need to be measured):
Correctness: whether the algorithm functionally matches the problem
Cost: time consumption and storage-space consumption
3. Definition: T(n) is the number of basic operations an algorithm performs in the worst case. The difference between algorithms lies mainly in the size of T(n). T(n) is a running measure on an idealized platform that shields differences in hardware, language, and compilation, expressed with notations such as big O, big Ω, and big Θ.
4. Common computational models are the Turing machine model and the RAM model; both reduce the running time of an algorithm to the number of basic operations it executes.
Turing machine model
Three constituent elements of a Turing machine
1. A finite alphabet: the symbols that can be stored in the tape cells
2. A read/write head: sits over the current cell only, and can read and write it
3. A state (transition) table: governs behavior based on the machine's current state
A Turing machine transition has the form (q, c; d, L/R, p):
q: the current state
c: the symbol currently in the cell under the read/write head
d: the symbol written back into that cell
L/R: move the head one cell to the left or right
p: the state after the transition
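To see the (q, c; d, L/R, p) transitions in action, here is a minimal Turing machine simulator sketch in Python. The machine below, which increments a binary number, is my own illustrative example; the state names and the '#' blank symbol are assumptions, not part of the notes.

```python
# transition table: (q, c) -> (d, move, p), i.e. the tuple (q, c; d, L/R, p)
delta = {
    ("inc", "1"): ("0", "L", "inc"),   # carry: 1 -> 0, keep moving left
    ("inc", "0"): ("1", "L", "halt"),  # absorb the carry: 0 -> 1, done
    ("inc", "#"): ("1", "L", "halt"),  # ran off the left end: new high bit
}

def run(tape, pos, state):
    tape = dict(enumerate(tape))       # sparse tape; '#' means blank
    while state != "halt":
        c = tape.get(pos, "#")         # read the current cell
        d, move, state = delta[(state, c)]
        tape[pos] = d                  # write d, then move the head
        pos += -1 if move == "L" else 1
    lo, hi = min(tape), max(tape)
    return "".join(tape.get(i, "#") for i in range(lo, hi + 1)).strip("#")

print(run("1011", pos=3, state="inc"))  # 1011 + 1 = 1100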
RAM model
1. Like the Turing machine, it assumes unbounded space
2. It consists of a series of sequentially numbered registers; the total number is unbounded
3. The running time of an algorithm is converted into the number of basic operations it executes
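In this spirit, running time can be estimated by instrumenting an algorithm to count its basic operations rather than measuring wall-clock time. A small sketch (the choice of insertion sort and of what counts as a "basic operation" is my own):

```python
def insertion_sort_counted(a):
    """Insertion sort that counts comparisons and moves as basic operations."""
    a = list(a)
    steps = 0
    for j in range(1, len(a)):
        key, i = a[j], j - 1
        while i >= 0:
            steps += 1                 # one comparison
            if a[i] <= key:
                break
            a[i + 1] = a[i]            # one move
            steps += 1
            i -= 1
        a[i + 1] = key
    return a, steps

for n in (10, 100, 1000):
    _, steps = insertion_sort_counted(range(n, 0, -1))  # worst case: reversed
    print(n, steps)   # ~ n(n-1): the count grows like n^2, regardless of hardware
```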
Data
[Figure: data structure schematic]
Data objects are made up of data elements; data elements are made up of data items; the data item is the most basic unit.
A data structure is the set of relationships among the data elements within a data object.
Data structures mainly study the operated-on objects, and the relationships among them, in non-numerical computational problems.
Logical structure of data structures
Collection structure
Linear structure
Tree structure
Graph structure
Physical structure of data structures
Sequential storage structure
Chained storage structure
Operations on data (a linked-list sketch follows this list)
Insert
Delete
Modify
Find
Sort
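As an illustration of a chained (linked) storage structure supporting these operations, here is a minimal singly linked list sketch in Python (class and method names are mine, not from the notes):

```python
class Node:
    def __init__(self, value, next=None):
        self.value, self.next = value, next

class LinkedList:
    """Minimal singly linked list: chained storage, one node per element."""
    def __init__(self):
        self.head = None

    def insert_front(self, value):          # insert: O(1) at the head
        self.head = Node(value, self.head)

    def find(self, value):                  # find: O(n) linear scan
        node = self.head
        while node and node.value != value:
            node = node.next
        return node

    def delete(self, value):                # delete: O(n) to locate, O(1) to unlink
        prev, node = None, self.head
        while node and node.value != value:
            prev, node = node, node.next
        if node:
            if prev:
                prev.next = node.next
            else:
                self.head = node.next

lst = LinkedList()
for v in (3, 1, 2):
    lst.insert_front(v)
lst.delete(1)
print(lst.find(3) is not None, lst.find(1) is None)  # True True
```

By contrast, a sequential storage structure (e.g., a Python list) keeps elements contiguous, trading O(1) indexed access for O(n) insertion at the front.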
Complexity metrics
1. The efficiency of an algorithm depends mainly on time consumption and storage-space consumption; here we set storage aside and consider only time.
2. Definition of big O: T(n) = O(f(n)) if there is a constant c > 0 such that T(n) < c * f(n) once n is sufficiently large; that is, big O notation gives an upper bound on T(n). Properties:
O(n) = O(c * n)
O(n^2 + n) = O(n^2)
3. Definition of big Ω: T(n) = Ω(f(n)) if there is a constant c > 0 such that T(n) > c * f(n) once n is sufficiently large; that is, big Ω notation gives a lower bound on T(n).
4. Definition of big Θ: T(n) = Θ(f(n)) if there are constants c1 > c2 > 0 such that c2 * f(n) < T(n) < c1 * f(n) once n is sufficiently large; that is, big Θ notation confines T(n) to a band, determining it up to constant factors.
5. Classes of big O bounds:
Constant: O(1), e.g., 2 or 2222222; efficient
Logarithmic: O(log n); the constant base does not matter, nor does a constant power inside the log (log(n^c) = c * log n); complexity close to constant, efficient
Polynomial: O(n^c)
Linear: O(n)
Exponential: O(2^n); for any constant c, n^c = O(2^n). Cost grows extremely fast; such algorithms are generally considered ineffective
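To get a feel for these classes, it helps to tabulate a few growth functions side by side; a small sketch (the particular functions and sizes are my own choices):

```python
import math

funcs = [
    ("O(1)",     lambda n: 1),
    ("O(log n)", lambda n: math.log2(n)),
    ("O(n)",     lambda n: n),
    ("O(n^2)",   lambda n: n ** 2),
    ("O(2^n)",   lambda n: 2 ** n),
]

for n in (10, 20, 40):
    row = "  ".join(f"{name}={f(n):.0f}" for name, f in funcs)
    print(f"n={n}: {row}")
# the exponential column explodes long before the others move much
```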
6. Time complexity T(n): the time a given algorithm needs to handle a problem of size n. Since different inputs of the same size n can take different amounts of time, we simplify:
take the longest running time over all inputs of size n as T(n), and measure the algorithm's time complexity by this worst case.
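Linear search makes this convention concrete: on inputs of size n its cost ranges from 1 comparison (target at the front) to n comparisons (target at the back or absent), and T(n) is taken from the worst of these. A minimal sketch (my own example):

```python
def linear_search(a, target):
    """Return the index of target in a, or -1; cost = comparisons made."""
    for i, x in enumerate(a):
        if x == target:
            return i          # best case: 1 comparison (target at a[0])
    return -1                 # worst case: n comparisons (absent or last)

a = list(range(100))
# best case touches one element; worst case scans all n = 100
print(linear_search(a, 0), linear_search(a, -5))  # 0 -1
```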
7. Asymptotic time complexity: focus on the overall trend of the time complexity as the problem size n grows
Big O notation (an asymptotic upper bound on T(n)):
If there exist a positive constant c and a function f(n) such that for any n >> 2, T(n) <= c * f(n), then once n is sufficiently large, f(n) gives an asymptotic upper bound on the growth rate of T(n), written T(n) = O(f(n))
8. Properties of big O notation:
For any constant c > 0, O(f(n)) = O(c * f(n)): a positive constant coefficient can be ignored, i.e., treated as 1
For any constants a > b > 0, O(n^a + n^b) = O(n^a): lower-order terms of a polynomial can be ignored
9. Space complexity measures the amount of storage an algorithm temporarily occupies while it runs. The storage an algorithm occupies in memory consists of three parts: the space for the algorithm itself, the space for its input and output data, and the space it occupies temporarily during execution.
The space complexity of an algorithm is obtained by calculating the storage it requires: S(n) = O(f(n)), where n is the problem size and f(n) is a function of n describing the storage occupied.
In general, when a program executes on a machine it needs, in addition to storage for its own instructions, constants, variables, and input data, extra storage units for operating on that data. The space occupied by the input depends only on the problem itself and is independent of the algorithm, so the analysis only needs to cover the auxiliary units the algorithm requires during execution. If the auxiliary space is constant relative to the amount of input data, the algorithm is said to work in place, and its space complexity is O(1).
On the question of O(1): O(1) means the number of temporary variables does not grow with the size of the data, not that only one temporary variable is defined. For example: if I define 100 variables regardless of the data size, the number of temporary variables is still independent of the data size, and the space complexity is still O(1).
When the space complexity of an algorithm is a constant, i.e., it does not change with the size n of the data being processed, it can be written O(1); when it is proportional to log2(n), it can be written O(log2(n)); and when it is proportional to n, it can be written O(n). If a formal parameter is an array, space is needed only for one address pointer passed in from the argument, i.e., one machine word; if the formal parameter is a reference, space is needed only for the address of the corresponding argument variable, through which the system automatically references that variable.
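A minimal sketch of the O(1)-versus-O(n) auxiliary-space distinction, using Python lists (the example is mine): reversing in place needs only a constant number of index variables, while building a reversed copy allocates space proportional to n.

```python
def reverse_in_place(a):
    """O(1) auxiliary space: only two index variables, however large a is."""
    i, j = 0, len(a) - 1
    while i < j:
        a[i], a[j] = a[j], a[i]
        i, j = i + 1, j - 1
    return a

def reversed_copy(a):
    """O(n) auxiliary space: allocates a whole second list."""
    out = []
    for x in reversed(a):
        out.append(x)
    return out

print(reverse_in_place([1, 2, 3, 4]))  # [4, 3, 2, 1], no extra list
print(reversed_copy([1, 2, 3, 4]))     # [4, 3, 2, 1], one extra list
```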
10. For a given algorithm, time complexity and space complexity often influence each other. Pursuing a better time complexity may worsen the space complexity, i.e., use more storage; conversely, pursuing a better space complexity may worsen the time complexity, i.e., lengthen the running time. All aspects of an algorithm's performance interact to some degree. Therefore, when designing an algorithm (especially a large one), one should weigh its performance, how frequently it will be used, the size of the data it handles, the characteristics of the description language, and the machine environment it will run in, in order to arrive at a better algorithm.
11. In general, "time complexity" refers to the running-time requirement and "space complexity" to the storage requirement. When "complexity" is used without qualification, it usually means time complexity.
Algorithm analysis
1. Algorithm analysis has two main tasks:
Correctness
Complexity
2. There are three ways to analyze complexity:
Iterative algorithms: series summation
Recursive algorithms: recursion trace + recurrence equations
Guess + verify
3. Arithmetic series: same order as the square of the last term
4. Power series: one order higher than the power
5. Geometric series (a > 1): same order as the last term
6. Convergent series: O(1)
7. A series that does not converge but has finitely many terms can still be bounded, e.g., the harmonic series 1 + 1/2 + ... + 1/n = O(log n)
8. Loops: the operation count generally grows by a multiplicative factor with each added level of nesting, i.e., exponentially in the nesting depth
9. Recursion trace analysis: examine each recursive instance and accumulate the time it needs (each statement is charged to the instance it belongs to); the total is the algorithm's execution time.
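A minimal recursion-trace example (my own illustration): summing an array recursively spawns n + 1 recursive instances, each doing O(1) work, so the trace totals T(n) = O(n); equivalently, the recurrence T(n) = T(n-1) + O(1), T(0) = O(1) solves to O(n).

```python
def rsum(a, n):
    """Sum the first n elements of a recursively."""
    if n == 0:                         # recursion base: O(1)
        return 0
    return rsum(a, n - 1) + a[n - 1]   # one instance: O(1) plus a smaller call

# recursion trace: rsum(a, n) -> rsum(a, n-1) -> ... -> rsum(a, 0)
# n + 1 instances, O(1) each  =>  T(n) = O(n)
print(rsum([1, 2, 3, 4], 4))  # 10
```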
Iteration and recursion
Two ways of thinking in algorithm design: divide and conquer, and decrease and conquer (both are sketched below).
Divide and conquer: split a problem into two subproblems of roughly equal size.
Decrease and conquer: reduce a problem to one trivial subproblem plus one subproblem of reduced size.
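Here is a sketch of both strategies applied to the same task, summing an array (function names are my own):

```python
def sum_decrease(a, lo, hi):
    """Decrease and conquer: one trivial element + one smaller subproblem."""
    if lo == hi:
        return 0
    return sum_decrease(a, lo, hi - 1) + a[hi - 1]

def sum_divide(a, lo, hi):
    """Divide and conquer: two subproblems of roughly equal size."""
    if hi - lo < 2:
        return a[lo] if hi > lo else 0
    mi = (lo + hi) // 2
    return sum_divide(a, lo, mi) + sum_divide(a, mi, hi)

a = [1, 2, 3, 4, 5]
print(sum_decrease(a, 0, 5), sum_divide(a, 0, 5))  # 15 15
```

Both run in O(n) time, but the divide-and-conquer version needs only O(log n) recursion depth, while the decrease-and-conquer version needs O(n).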
Dynamic programming
1. The purpose of dynamic programming:
Make it work (recursion gets us there)
Make it right (recursion gets us there)
Make it fast (iteration gets us there)
2. Recursion can be expensive in resources and its cost can be large; iteration can reduce the use of storage space and can sometimes also lower the asymptotic complexity.
3. Subsequence: a new sequence formed from several elements of the original sequence, kept in their original relative order.
4. The longest common subsequence (LCS) of two sequences is a longest sequence among their common subsequences; there may be more than one.
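As a minimal illustration (my own, not from the notes), the LCS length can be computed bottom-up with dynamic programming: the naive recursion recomputes overlapping subproblems exponentially often, while the iterative table fills each of its (n+1)*(m+1) entries once, for O(n*m) time.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of a and b, bottom-up."""
    n, m = len(a), len(b)
    # dp[i][j] = LCS length of a[:i] and b[:j]
    dp = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            if a[i - 1] == b[j - 1]:
                dp[i][j] = dp[i - 1][j - 1] + 1             # extend a match
            else:
                dp[i][j] = max(dp[i - 1][j], dp[i][j - 1])  # drop one tail
    return dp[n][m]

print(lcs_length("program", "algorithm"))  # 3, e.g. "grm"
```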