[Algorithm] Introduction to algorithms and data structures

Source: Internet
Author: User

Algorithms and algorithm analysis

First, a few words that are beside the point. In junior high school, when I first learned that a specialized discipline called CS existed, my first idea of CS was that it simply meant algorithms. That may be because the classmate who sat in front of me later became WJMZBMR, a legend of his generation in CS. Algorithms looked very high-end at the time, and the direction I later worked in had nothing to do with CS, so I always held the word "algorithm" in a kind of awe and felt it was the real substance of the whole field. After I decided to enter this line of work, my team lead told me that although algorithms matter a great deal, we do not use them much in daily work (our department mainly does operations and DevOps, which has little demand in this area), so I kept putting it off. But as I went deeper, I ran into algorithms and algorithmic terminology more and more often, both online and in material I came across elsewhere. I think it is time to study. So I bought a textbook from a university in Beijing, "Data Structures and Algorithms: Python Language Description". On the one hand I want to understand something about algorithms and data structures; on the other hand I can also learn Python. The book is not very thick, and the author says it does not cover very advanced material. I may not manage to learn it all, but I think trying is better than lying in bed scrolling Bilibili or playing games.

Problems, problem instances, and algorithms

To understand algorithms, it is important to distinguish three concepts. A problem corresponds to a requirement: through analysis and reasoning, one abstracts a problem that a computer needs to solve. A problem is general in nature; for example, "determine whether a positive integer is a prime number" is a problem. A problem instance is one specific case of a problem: it states a concrete input, and generally has a correct answer. For example, "is the positive integer 1013 a prime number?" is an instance of the problem above. Clearly, a problem captures what all of its instances have in common. An algorithm is a rigorous description of a process for solving a problem. Because an algorithm corresponds to a problem, it can solve every instance of that problem. For example, if algorithm A determines whether a positive integer is prime, then A corresponds to the problem above, and applying A to problem instances tells us whether 1013, or any other positive integer, is prime.
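To make the three concepts concrete, here is a minimal sketch. The function name is_prime is my own choice and stands in for "algorithm A"; it uses plain trial division, which is only one of many possible algorithms for this problem.

```python
def is_prime(n):
    """Stand-in for algorithm A: decide whether the positive integer n is prime."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:        # divisors above sqrt(n) never need checking
        if n % d == 0:
            return False
        d += 1
    return True


# Each call hands one problem instance to the same algorithm.
print(is_prime(1013))   # True
print(is_prime(1012))   # False
```

The problem is the question "is n prime?", each argument is a problem instance, and the function body is the algorithm that answers every instance.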

Properties of an algorithm

An algorithm is a precise description of a problem-solving process. To be rigorous and effective, an algorithm usually has the following properties:

Finiteness (of the description): the algorithm can be described with a finite amount of language, in particular with a finite number of instructions (imperative statements, or computer-language instructions).

Feasibility: each instruction in the algorithm must be clearly defined, and the process described can be executed mechanically by a machine.

Determinism: given a problem instance, the algorithm produces a unique sequence of actions, and carrying out that sequence on any instance of the problem yields the corresponding solution. (In practice a problem is usually presented through instances, and the general problem is abstracted by analyzing instances and testing candidate algorithms against them.)

Termination (finiteness of behavior): for any instance, the sequence of actions produced by the algorithm is finite.

Input/output: the algorithm has well-defined input and output.

Ways to describe an algorithm

Algorithms can be described in natural language, which is friendlier to people who do not know a programming language, but such descriptions are often verbose and prone to ambiguity.

An algorithm described in a programming language can be very precise (this is, after all, the form in which the algorithm finally becomes a program). But even a reader who knows computers needs some effort to read it, so it is not very friendly to the average reader.

A compromise is pseudocode. Pseudocode combines programming-language constructs (typically for the logical structure) with natural language (for the specific operations). It combines the friendliness of natural language with the conciseness and clarity of a programming language.

Algorithm design and analysis

Algorithm design means starting from a problem and, through analysis and thought, arriving at an algorithm that solves it. Some common design patterns are:

Enumeration: enumerate all possible solutions to the problem and filter out the ones that qualify. This approach takes advantage of the computer's raw computing power. What makes a computer "smarter" than a human brain is that it can repeat a large amount of similar or identical work very quickly.

Greedy method: use the information available in the problem to obtain a partial solution, then either take that partial solution as the answer or extend it step by step into a complete solution. The greedy method is often used when a problem is complex, and the solution it finds is usually not the optimal one.

Divide and conquer: break the problem into smaller problems, solve these one by one, and combine their solutions into a solution to the whole problem.

Backtracking: when there is no clear path to a solution, the program has to proceed by trial and error; when the current path turns out to be a dead end, it goes back to an earlier decision point and tries a new path.

Dynamic programming: when a problem is hard to solve directly from local information and more information is needed, the algorithm accumulates information as it goes; later steps use what earlier steps accumulated, and add more in turn.

Branch and bound: can be seen as an optimization of backtracking. During the search, some branches may turn out to be useless, and pruning them reduces the cost of finding a solution.

These design patterns were not derived rigorously; they were distilled by our predecessors from countless hours of practice. Of course, these descriptions are very abstract, and reading them alone may not convey much that is useful. Still, keep in mind that designing a real algorithm often means weighing and combining several patterns.
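As a small illustration of two of the patterns above, the sketch below applies a greedy method and dynamic programming to the same problem. The coin-change problem, the coin values, and the function names are my own choices for illustration, not taken from the book.

```python
def greedy_change(coins, amount):
    """Greedy: always take the largest coin that still fits."""
    used = []
    for coin in sorted(coins, reverse=True):
        while amount >= coin:
            amount -= coin
            used.append(coin)
    return used if amount == 0 else None


def dp_change(coins, amount):
    """Dynamic programming: best[a] is the fewest coins that sum to a."""
    best = [0] + [None] * amount
    for a in range(1, amount + 1):
        # Reuse the answers already accumulated for smaller amounts.
        options = [best[a - c] for c in coins
                   if c <= a and best[a - c] is not None]
        if options:
            best[a] = min(options) + 1
    return best[amount]


print(greedy_change([1, 3, 4], 6))   # [4, 1, 1] -> three coins
print(dp_change([1, 3, 4], 6))       # 2 (for example 3 + 3)
```

With these coin values the greedy method finds a valid answer quickly but misses the optimal one, while the dynamic-programming version finds the optimum by building on the results for smaller amounts.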

In addition, an algorithm has to be implemented as a program and run, and running it, as a form of information processing, necessarily consumes resources such as time and space. Besides the algorithm itself, this consumption also depends on the hardware, the operating environment, the implementation (for example, which language), and so on. When all those conditions are equal, the algorithm determines how much a program consumes: the fewer resources an algorithm consumes, the more efficiently it runs.

When we design an algorithm, we need to analyze whether it is efficient enough. Sometimes, because computers are so fast, the efficiency of an algorithm does not seem to matter much, but more often it decides whether the algorithm is worth having at all. For example, an algorithm that takes three days to compute tomorrow's weather forecast is worth something entirely different from one that takes three hours. To judge whether an algorithm is efficient, we need a way to measure it.

Units and methods of algorithm measurement

Different hardware spends different amounts of time and space running the same algorithm. For a measurement to be general (so that the efficiency of different algorithms can be compared), it has to abstract away from the hardware, for example through the following two assumptions:

1. The computing device provides a set of storage units, each of which can hold a fixed, finite amount of data (this standardizes space consumption).

2. Each basic operation the machine can perform takes one unit of time (this standardizes time consumption).

The size of a storage unit and the length of a unit of time may vary with hardware, environment, and other conditions, but algorithm measurement does not take that into account. When algorithms are compared, the default is to compare two programs that are identical in every respect except the algorithm. The two assumptions above therefore let us measure algorithms abstractly and uniformly.

Although an algorithm solves a problem, a machine cannot run on the description of a problem, so an algorithm is usually measured on specific problem instances. This brings in the concept of the size of a problem. For example, deciding whether 1013 is prime and deciding whether 10331310131 is prime are two instances to which the same algorithm obviously applies, yet they cost very different amounts. So whether such an algorithm is efficient cannot be decided from the cost of a single instance. The measure of an algorithm is therefore usually a function relating resource consumption to problem size. If the problem size is very small, every algorithm consumes about the same amount, all within an acceptable range, and measuring is not very meaningful. But if consumption grows faster and faster as the problem size increases, the algorithm is not efficient and should be avoided. For the two instances above, the size can be taken to be the value of the number, or its number of digits, and so on; in general, as long as instances are measured consistently, the exact choice of measure is not very important. In short, it should let you tell which instances are larger and which are smaller.

Note also that some algorithms consume different amounts on instances of the same size. For example, take 1013 and 1012: if the algorithm begins by checking whether the number is even, and immediately answers "no" for any even number greater than 2, the two instances cost very different amounts. For instances of the same size, we usually focus on the worst-case consumption (and sometimes on the average), and pay little attention to the most optimistic case.
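The difference between instances of the same size can be made visible by counting basic operations. The sketch below is my own illustration: the function name is_prime_counting and the choice to count divisibility tests as the "unit of time" are assumptions, not the book's method.

```python
def is_prime_counting(n):
    """Trial division that also reports how many divisibility tests it ran."""
    tests = 0
    if n < 2:
        return False, tests
    tests += 1
    if n % 2 == 0:               # the early even-number check
        return n == 2, tests
    d = 3
    while d * d <= n:            # odd divisors up to sqrt(n)
        tests += 1
        if n % d == 0:
            return False, tests
        d += 2
    return True, tests


print(is_prime_counting(1012))         # (False, 1): best case, stops at once
print(is_prime_counting(1013))         # (True, 16): every candidate is tried
print(is_prime_counting(10331310131))  # a larger instance of the same problem
```

1012 and 1013 have the same number of digits, yet one costs a single test and the other sixteen, which is why the worst case is the figure usually quoted.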

Complexity of the algorithm

Algorithmic complexity is one way of measuring an algorithm. As noted above, an abstract algorithm usually cannot be measured precisely, so what we do is estimate its complexity, that is, the order of magnitude of its consumption. (Whatever the external conditions, a single unit of time or space is tiny, so one unit more or less hardly matters.) When estimating the order of magnitude, constant factors are considered unimportant: 100n**2 and 3n**2 are both of order n**2 (n describes the problem size). This borrows the order-of-growth notation used in calculus, the big-O notation f(n) = O(g(n)). Here f(n) is the measure of the algorithm (its consumption as a function of problem size), and g(n) is a simple function of the problem size n such as n**2, log n, n, or 1 (the constant function). Writing g(n) inside the big O says that as n grows, f(n) grows no faster than a constant multiple of g(n). If two algorithms have the same g(n), they are of the same order of magnitude, and their complexity is considered essentially the same.

Commonly used g(n) are 1, log n, n, n log n, n**2, n**3, and 2**n, listed from slowest-growing to fastest-growing. Algorithms with these g(n) are said to have constant, logarithmic, linear, n log n, quadratic, cubic, and exponential complexity, respectively. If algorithm A1 has logarithmic complexity and A2 has quadratic complexity, then on instances of the same size A1 usually consumes far less than A2. (This is only "usually": as noted above, the measure looks only at the worst case, so if an instance happens to be a favorable case for A2, A2 may still finish quickly.)
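To get a feel for how differently these functions grow, the following sketch tabulates them for a few problem sizes. The helper name growth_row and the use of base-2 logarithms are my own choices; the base only changes a constant factor, which big-O ignores.

```python
import math


def growth_row(n):
    """Values of the common g(n) functions for one problem size n."""
    return [
        ("1", 1),
        ("log n", math.log2(n)),
        ("n", n),
        ("n log n", n * math.log2(n)),
        ("n**2", n ** 2),
        ("n**3", n ** 3),
        ("2**n", 2 ** n),
    ]


for n in (10, 100, 1000):
    print(n, [f"{name}={value:.3g}" for name, value in growth_row(n)])
```

Even at n = 100 the exponential entry is astronomically larger than the cubic one, which is why exponential algorithms are usually only practical for very small inputs.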

Algorithm analysis

Algorithm analysis is the process of working out the complexity of a known algorithm. Take time complexity as an example. At the algorithm level, an ordinary program is built from basic operations, sequential structures, loop structures, and selection structures.

Basic operations, such as assignment, arithmetic, and combinations of these, are usually considered to have constant complexity.

A sequential structure performs several operations one after another. Its complexity is the sum of the complexities of the steps.

The complexity of a loop structure is the complexity of the loop header (the number of iterations) multiplied by the complexity of the loop body.

The complexity of a selection structure is the largest complexity among its branches (which again reflects the worst case).

For example, a Python program like this:

    # Multiply the n-by-n matrices m1 and m2 into matrix m
    for i in range(n):                          # O(n)
        for j in range(n):                      # O(n)
            x = 0.0                             # O(1)
            for k in range(n):                  # O(n)
                x = x + m1[i][k] * m2[k][j]     # O(1)
            m[i][j] = x                         # O(1)

Its time complexity T(n) is:

T(n) = O(n) * O(n) * (O(1) + O(n) * O(1) + O(1)) = O(n) * O(n) * O(n) = O(n**3)

A statement such as for i in range(n) in Python runs its body n times, with O(1) loop overhead added each time, so the loop header counts as O(n). The O(n) of each loop header is multiplied by the complexity of its loop body, and nested loops appear as nested parentheses in the calculation. Simplifying the formula follows the arithmetic rules of big-O notation: inside parentheses only the highest-order term is kept and the lower-order terms added to it are dropped, and a product of big-O terms is the big O of the product, so O(n) * O(n) * O(n) = O(n**3).

So the algorithm has cubic complexity.
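The cubic result can be checked empirically with a self-contained version of the matrix multiplication that counts how often the innermost statement runs. The function name matmul_counting and the step counter are additions of mine for illustration.

```python
def matmul_counting(m1, m2):
    """Multiply two n-by-n matrices and count the innermost multiply-add steps."""
    n = len(m1)
    m = [[0.0] * n for _ in range(n)]
    steps = 0
    for i in range(n):
        for j in range(n):
            x = 0.0
            for k in range(n):
                x += m1[i][k] * m2[k][j]
                steps += 1
            m[i][j] = x
    return m, steps


identity = [[1, 0], [0, 1]]
a = [[1, 2], [3, 4]]
print(matmul_counting(identity, a))   # ([[1.0, 2.0], [3.0, 4.0]], 8)

# Doubling n multiplies the step count by 8, as O(n**3) predicts.
for n in (2, 4, 8):
    zero = [[0] * n for _ in range(n)]
    print(n, matmul_counting(zero, zero)[1])
```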

The complexity of Python

The complexity analysis above is general; Python has some special cases of its own. Python is a relatively high-level language (compared with lower-level languages) and provides many packaged operations that look basic. When using them, we sometimes assume that we are performing a basic operation when its complexity may not actually be O(1). Here are some brief notes; detailed analysis is left to the relevant chapters.

Basic arithmetic and assignment are basic operations, with complexity O(1).

Copying and slicing a sequence is O(n) in the number of elements copied, so it depends on the length of the sequence.

Element access on a list or tuple is O(1); so is assigning to an element of a list (tuples cannot be modified).

Constructing an empty object is O(1); constructing an object of a type such as list or str with content of length n is O(n).

For dict, adding a new key-value pair is O(n) in the worst case, but O(1) on average.
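The notes above can be read off a few lines of ordinary Python. This is only an annotated sketch: the variable names are arbitrary, and the complexities in the comments restate the list above instead of measuring anything.

```python
a = list(range(5))      # building a list with n elements: O(n)
b = a[:]                # slice copy: a new list, each element reference copied, O(n)
b[0] = 99               # list element assignment: O(1)
first = a[0]            # list element access: O(1)

ages = {}               # constructing an empty dict: O(1)
ages["alice"] = 30      # dict insertion: O(1) on average

print(a, b, b is a)     # [0, 1, 2, 3, 4] [99, 1, 2, 3, 4] False
```

The fact that b is a separate object whose change does not affect a is the visible sign that the slice really copied every element reference.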

All of these complexities concern time. For space consumption, note the following.

Python sets no preset maximum number of elements for its composite types (the higher-level built-in types such as str, list, tuple, and dict). In actual use, however, the memory held for the elements tends to grow and not shrink. For example, after li = [1, 2, 3], li holds 3 elements. After li.append(4) it holds 4. That is easy to understand. But after del li[3], although len(li) goes back to 3, the list object in memory may still hold storage for 4 elements, which is worth keeping in mind.
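One way to observe this is sys.getsizeof, which reports the bytes used by the list object itself (not by its elements). This is a sketch under the assumption that you are running CPython; the exact numbers depend on the interpreter version and platform, so none are shown here.

```python
import sys

li = [1, 2, 3]
size_3 = sys.getsizeof(li)          # size of the list object with 3 elements

li.append(4)
size_4 = sys.getsizeof(li)          # may include spare capacity for later appends

del li[3]
size_after_del = sys.getsizeof(li)  # len(li) is 3 again; the size may not drop back

print(len(li), size_3, size_4, size_after_del)
```

If the last number printed is larger than the first, the list is still holding spare storage even though its length is back to 3.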

Data structures in Python

What is a data structure

The book gives a number of more academic definitions. In my own words, a data structure is a format that we deliberately impose on data to make it easier to abstract a problem and program a solution. In set-theoretic terms, a data structure is generally D = (E, R), where E is a finite set of data elements and R is a set of relations among the elements of E. In other words, a specific data structure consists of concrete data together with the logical relationships among them.

Some typical data structures are:

Set structure: no relationship is specified among the data elements, that is, R is the empty set. Such a structure simply wraps the elements into a whole, and is the simplest kind of data structure.

Sequence structure: there is a definite order among the data elements. One element comes first, and every element except the last has exactly one successor. Sequence structures can be further divided into simple linear structures, ring structures, and ρ-shaped structures.

Hierarchy: the data elements belong to a number of levels, and an upper-level element can be related to one or more lower-level elements, so the relation R forms a hierarchy.

Tree structure: one kind of hierarchical structure.

Graph structure: data elements can be related to one another arbitrarily. Its R is complex and flexible, which makes it a complex kind of data structure. In fact, all the preceding structures can be regarded as simplified or restricted forms of the graph structure.
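The D = (E, R) view can be written down directly, with E as a Python set and R as a set of (element, successor) pairs. The element names, the three example relations, and the helper is_simple_linear are all hypothetical, chosen only to show how the shape of R distinguishes the structures listed above.

```python
E = {"a", "b", "c", "d"}

R_set = set()                                   # set structure: no relations
R_seq = {("a", "b"), ("b", "c"), ("c", "d")}    # simple linear sequence
R_tree = {("a", "b"), ("a", "c"), ("b", "d")}   # hierarchy: "a" has two children


def is_simple_linear(E, R):
    """True if R chains all elements of E into one sequence with a single first element."""
    succ = {}
    for u, v in R:
        if u in succ:              # an element with two successors
            return False
        succ[u] = v
    has_pred = {v for _, v in R}
    firsts = [e for e in E if e not in has_pred]
    if len(firsts) != 1:           # need exactly one first element
        return False
    seen, cur = 1, firsts[0]
    while cur in succ:
        cur = succ[cur]
        seen += 1
        if seen > len(E):          # walked into a cycle
            return False
    return seen == len(E)


print(is_simple_linear(E, R_seq))    # True
print(is_simple_linear(E, R_tree))   # False
print(is_simple_linear(E, R_set))    # False
```

The same set E yields a different data structure each time; only the relation R changes.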

By their characteristics, data structures can be further divided into structural data structures and functional data structures. Structural data structures (such as list, str, and tuple in Python) have a specific, prescribed structure. Functional data structures do not prescribe a fixed structure; they can be seen as containers that store data and support certain operations through their particular behavior. Examples of functional data structures are stacks, queues, and priority queues.
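The three functional structures named above are defined by how elements come out, not by how they are laid out. A minimal sketch using the standard library: a plain list as a stack, collections.deque as a queue, and heapq as a priority queue. The sample values are arbitrary.

```python
from collections import deque
import heapq

# Stack: last in, first out.
stack = []
for item in ("a", "b", "c"):
    stack.append(item)
top = stack.pop()                 # "c"

# Queue: first in, first out.
queue = deque()
for item in ("a", "b", "c"):
    queue.append(item)
front = queue.popleft()           # "a"

# Priority queue: the smallest priority comes out first.
tasks = []
heapq.heappush(tasks, (2, "write"))
heapq.heappush(tasks, (1, "read"))
heapq.heappush(tasks, (3, "sleep"))
most_urgent = heapq.heappop(tasks)    # (1, "read")

print(top, front, most_urgent)
```

All three store the same kind of data; what distinguishes them is the order in which the stored elements are handed back.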

Memory units and addresses

(I don't know why this part appears in the data structure chapter.)

The basic structure of memory is a linear array of storage units. Each unit has a unique number called its address, and accessing data in memory requires knowing the address of the relevant unit. Many computers can access the contents of several units at once: on today's common 64-bit computers, the CPU can access 8 bytes of data at a time, that is, 8 one-byte storage units at once.

As mentioned above, accessing an element of most composite data types is an O(1) operation. This reflects the fact that accessing a storage unit by its address is itself an O(1) operation, regardless of where the unit is located or how large the memory is.

Python objects and data structures

Variables and objects in Python

Beginners in Python often confuse these two concepts, but Python differs fundamentally from C, Java, and other languages in how it stores data. In Python, binding a variable to a value looks similar to assignment in C, but in fact Python first builds the value into an object stored in memory and then binds the address of that object to the variable. So in Python we do not need to declare a variable's type or size: whatever the variable refers to, the variable itself holds only an address, and all variables require the same amount of space. The real data lives in the storage unit the address points to (or in a block of memory starting at that unit). Implementing variables this way is called reference semantics. Storing the value directly in the variable, as C does, is called value semantics.

In Python, getting from a variable to its data is also O(1), so the overhead is not much greater than in a lower-level language.
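Reference semantics has visible consequences. In the short sketch below (variable names are arbitrary), assigning one variable to another copies only the reference, so both names end up bound to the same list object.

```python
a = [1, 2, 3]
b = a                # binds b to the same object; no data is copied
b.append(4)

print(a)             # [1, 2, 3, 4]: the change is visible through a
print(a is b)        # True: one object, two names

c = list(a)          # an explicit copy creates a different object
c.append(5)
print(a)             # [1, 2, 3, 4]: a is unaffected this time
print(a is c)        # False
```

Under value semantics, as in C, the assignment b = a for a plain value would have produced an independent copy instead.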

Representation of objects in Python

A representation is the way logical data is laid out as a data structure that the computer can work with. The representation of objects in Python has already been designed for us and does not need much attention, but knowing about it helps us work better.

The Python implementation is built on a carefully designed set of linked structures: the relationship between a variable and its value object is implemented as a link, and relationships between objects are also implemented as links. A complex object may contain several component parts, connected to one another by links. For example, for a list containing 10 strings, what the list actually stores in memory is links to the 10 string objects.
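That a list stores links to its elements, not the elements themselves, can be checked with the identity operator. The names below are placeholders of mine; the second half shows the classic side effect when two slots of a list link to the same inner object.

```python
strings = [f"item-{i}" for i in range(10)]
container = list(strings)        # a new list that links to the same 10 string objects

same_objects = all(x is y for x, y in zip(container, strings))
print(container is strings, same_objects)    # False True

row = [0, 0]
grid = [row, row]                # both slots link to one inner list
grid[0][0] = 1
print(grid)                      # [[1, 0], [1, 0]]
```

Copying the outer list created new links but no new strings, and changing the shared inner list through one slot shows up through the other.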

Composite objects in Python can be arbitrarily large; each object may occupy a different number of storage units and have a complex internal structure. Managing memory efficiently in such a situation is cumbersome. Fortunately, Python comes with a storage management system that manages available memory, frees memory that is no longer in use, and arranges storage for objects, so memory is managed flexibly and efficiently. When a program needs an object, the management system arranges storage for it; when an object is no longer in use, the system reclaims the memory it occupied. The storage management system hides the details of memory usage and reduces the burden on programmers.
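Reclamation can be observed indirectly with a weak reference, which watches an object without keeping it alive. This is a sketch assuming CPython, where an object is normally reclaimed as soon as its last reference disappears; other implementations may reclaim later. The class Node is a hypothetical placeholder.

```python
import gc
import weakref


class Node:
    """Hypothetical placeholder class; its instances support weak references."""


obj = Node()
probe = weakref.ref(obj)          # observes the object without owning it
alive_before = probe() is not None

del obj                           # drop the only strong reference
gc.collect()                      # ask the collector to run, in case reclamation is deferred
alive_after = probe() is not None

print(alive_before, alive_after)
```

The program never frees the object explicitly; once nothing refers to it, the storage management system is free to take the memory back.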
