The Seven Levels of Programming Ability

Objective

A programmer's skills grow gradually as experience accumulates. I think programming ability can be divided into several levels.

Below I discuss a hierarchy model of programming competency along two dimensions.

One dimension is the level of programming skill; the other is the level of domain knowledge.

Programming skill levels

Programming skill level refers to a programmer's ability to design and write programs. This is a programmer's foundation.

Level 0: Non-programmer

A beginner who, faced with a problem, is completely at a loss and has no idea how to write a program to solve it. In other words, this person is still a layman and cannot be called a "programmer". To them, the computer is still a mysterious black box.

Level 1: Basic programmer

After studying programming for a while, you can write a program to complete a given task.

The code normally works, but in real operation it produces all kinds of bugs when it hits special conditions. In other words, you can develop demo software, but if the software were really delivered to customers, I'm afraid they would curse you to death.

The program has been written, but the programmer does not know why it sometimes works and sometimes doesn't.

When a bug shows up in production or the requirements change, code has to be modified or added, and soon the program becomes chaotic, bloated, and riddled with bugs. Before long, even the original developers are reluctant to take over its maintenance.

Level 2: Data structures

After some programming practice, programmers come to understand the adage "data structures + algorithms = programs". They use algorithms to solve problems. Going further, they realize that algorithms essentially depend on data structures: once a good data structure is designed, a good algorithm follows.

Design the wrong data structure, and no good algorithm can grow out of it.

I remember a foreign sage once said: "Show me your data structures!"
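As a small illustration of how the data structure shapes the algorithm, here is a sketch of finding the first repeated value in a sequence. The function names are made up for this example. With only a list, the natural algorithm keeps rescanning earlier items; once a set is chosen to remember what has been seen, a simple one-pass algorithm falls out almost by itself.

```python
def first_repeat_with_list(items):
    """Quadratic: for every item, rescan everything that came before it."""
    for i, item in enumerate(items):
        if item in items[:i]:
            return item
    return None


def first_repeat_with_set(items):
    """Linear on average: a set remembers what has been seen so far."""
    seen = set()
    for item in items:
        if item in seen:
            return item
        seen.add(item)
    return None


data = [3, 1, 4, 1, 5, 9, 2, 6, 5]
print(first_repeat_with_list(data))
print(first_repeat_with_set(data))
```

Both functions give the same answer; only the second stays fast as the input grows.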

Level 3: Object-oriented

Next, programmers come to appreciate the power of object-oriented programming. Most modern programming languages support it. But programming in an object-oriented language, using classes, or even inheriting from classes does not mean you are writing object-oriented code.

I've seen a lot of procedural code written in Java, Python, and Ruby.

Only when you have mastered interfaces, polymorphism, and the relationships between classes and between objects have you truly mastered object-oriented programming.
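To make the difference concrete, here is a minimal sketch in Python. The shape example and all names in it are my own illustration. The first function is procedural code of the kind that can be written in any object-oriented language; the classes below it rely on an interface and polymorphism instead.

```python
import math
from abc import ABC, abstractmethod


def area_by_tag(shape):
    # Procedural style: behavior is chosen by inspecting a type tag.
    # Every new kind of shape means editing this function.
    if shape["kind"] == "circle":
        return math.pi * shape["radius"] ** 2
    if shape["kind"] == "rect":
        return shape["width"] * shape["height"]
    raise ValueError("unknown shape: %r" % shape["kind"])


class Shape(ABC):
    # The interface that callers depend on.
    @abstractmethod
    def area(self):
        ...


class Circle(Shape):
    def __init__(self, radius):
        self.radius = radius

    def area(self):
        return math.pi * self.radius ** 2


class Rect(Shape):
    def __init__(self, width, height):
        self.width = width
        self.height = height

    def area(self):
        return self.width * self.height


def total_area(shapes):
    # Polymorphism: each object supplies its own area(); no type checks here.
    return sum(shape.area() for shape in shapes)


print(total_area([Rect(2, 3), Rect(1, 1)]))
```

Adding a new shape to the second version means adding a class; total_area() does not change.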

Even if you're using a traditional non-object-oriented programming language, you can still develop object-oriented programs as long as you have "objects" in your mind.

For example, when programming in C, I consciously use object-oriented techniques to write and design programs. I use a struct to simulate a class, and group the functions that belong to the same concept together to simulate the class's methods. If you doubt that object-oriented code can be written in C, look at the Linux kernel: it is written in C, yet a strong "object" flavor comes through between the lines of its source code.

It's not easy to really master object-oriented programming techniques.

In my technical career, two hurdles gave me the biggest headaches.

The first hurdle was the move from DOS to Windows development. For a long time I could not understand the concept of a framework. In the DOS era you called function libraries, and your program invoked the functions actively. In the Windows era, libraries gave way to frameworks. Even your main program is actually called by the framework: the UI thread gets messages from the operating system and then dispatches them to your program for processing. The Spring framework familiar to Java programmers is also this kind of inverted-call framework.

Nowadays, because the word "framework" sounds impressive, many "class libraries" and "libraries" call themselves "frameworks". To me this is an abuse of the name.

With a "class library" or "library", the code I write calls it.

With a "framework", I register callback functions with it, and the framework calls the functions I wrote.
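The distinction can be sketched in a few lines of Python. TinyFramework below is a hypothetical toy, not any real framework's API; it only shows who calls whom. With math.sqrt my code calls the library; with TinyFramework I register a handler and the framework's loop calls me.

```python
import math

# Library style: my code is in control and calls the library.
root = math.sqrt(16)


class TinyFramework:
    """Hypothetical framework: it owns the main loop and calls my code."""

    def __init__(self):
        self._handlers = {}

    def on(self, event_name, handler):
        # Register a callback for an event name.
        self._handlers.setdefault(event_name, []).append(handler)

    def run(self, events):
        # The framework decides when each handler is invoked.
        for name, payload in events:
            for handler in self._handlers.get(name, []):
                handler(payload)


clicks = []
app = TinyFramework()
app.on("click", lambda pos: clicks.append(pos))
app.run([("click", (10, 20)), ("key", "a"), ("click", (5, 5))])
print(root, clicks)
```

The "key" event is silently ignored because nothing was registered for it; that decision also belongs to the framework, not to my code.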

The other hurdle was object orientation. For a long time I didn't know how to design the relationships between classes, and I couldn't design a good class hierarchy.

I remember reading a book by a foreign expert that described a very simple and practical object-oriented design technique: "Describe the problem in words. Then find the nouns and use them to build the classes. Find the verbs and use them to build the classes' methods." The technique is useful, but it is also rather crude: it has no theoretical basis and is not rigorous. If the problem is described poorly, the resulting class system will be flawed.

There are many ways to master object-oriented thinking. I drew my inspiration from relational databases to understand and master object-oriented design.

In my view, a table in a relational database is really a class, and each row is an instance of that class, that is, an object. The relationships between tables are the relationships between classes. O/R mapping technologies such as Hibernate map object-oriented code to database tables, which also shows that classes and tables are indeed logically equivalent.

Since database design and class design are equivalent, designing an object-oriented system only requires the design techniques of relational databases.

Designing the table structure of a relational database is very simple:

1. Identify the relationships between tables, that is, between classes: one-to-one, one-to-many, or many-to-many.

2. Identify the fields of each table. An object of course has countless attributes (for example: height, weight, gender, age, name, ID number, driver's license number, bank card number, passport number, Hong Kong-Macau pass number, employee number, medical history, marital status, and so on), but our program only needs to record the attributes we care about. Those attributes are the fields of the table, that is, the properties of the class. As the saying goes, "Of three thousand measures of water, I take only one ladleful to drink"!
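The two steps above can be sketched with Python dataclasses. The customers/orders schema is an assumption made up for this example: each class plays the role of a table, each instance the role of a row, and the list field models a one-to-many relationship.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Order:
    # One row of a hypothetical "orders" table.
    order_id: int
    amount: float


@dataclass
class Customer:
    # One row of a hypothetical "customers" table: only the attributes
    # this program cares about become fields.
    customer_id: int
    name: str
    # One-to-many: one customer has many orders (a foreign key in SQL terms).
    orders: List[Order] = field(default_factory=list)

    def total_spent(self):
        return sum(order.amount for order in self.orders)


alice = Customer(1, "Alice")
alice.orders.append(Order(100, 25.0))
alice.orders.append(Order(101, 17.5))
print(alice.total_spent())
```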

Level 4: Design patterns

I once saw this sentence online: "Until you've written 100,000 lines of code, don't talk to me about design patterns." I deeply agree.

I remember the first time I read the GoF Design Patterns book: I found that although I had not known about design patterns before, I had in fact already been using some of them on my own in real programming. Design patterns are objective laws of programming. Nobody invented them; some senior programmers simply discovered them first.

Without design patterns, you can still write programs that meet the requirements. But once the requirements change later, your program will not be flexible enough and will be hard to sustain. And in real projects, further requirements always come back after delivery to the customer. Later versions will certainly add requirements. This is a reality programmers cannot avoid.

When writing UI programs, whether web, desktop, mobile, or game, be sure to use the MVC design pattern. Otherwise your program will be unable to cope with later changes to UI requirements.

The most important idea in design patterns is decoupling, and decoupling is done through interfaces. That way, if the requirements change in the future, you only need to provide a new implementation class.
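Here is a minimal sketch of decoupling through an interface. The notification scenario and every name in it are hypothetical. The business function depends only on the Notifier interface, so a new requirement is met by adding SmsNotifier without touching place_order().

```python
from abc import ABC, abstractmethod


class Notifier(ABC):
    """The interface the business code depends on."""

    @abstractmethod
    def send(self, message):
        ...


class EmailNotifier(Notifier):
    def send(self, message):
        return "email: " + message


class SmsNotifier(Notifier):
    # Added later for a new requirement; place_order() is unchanged.
    def send(self, message):
        return "sms: " + message


def place_order(item, notifier):
    # Business logic knows only the interface, not the implementation.
    return notifier.send("order placed for " + item)


print(place_order("book", EmailNotifier()))
print(place_order("book", SmsNotifier()))
```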

The main design patterns are in fact object-oriented, so design patterns can be regarded as the advanced stage of object orientation. Only when you have mastered design patterns can you be said to have truly and thoroughly mastered object-oriented design.

Whenever I learn a new language (including non-object-oriented ones such as functional languages), once I understand its syntax I always look at how the various design patterns are implemented in it. This is also a trick for learning programming languages.

Level 5: Language expert

After some years of programming practice, programmers become quite proficient in one commonly used programming language. Some even become "language lawyers", adept at explaining the language's usage and its various pitfalls to other programmers.

Programmers at this stage are often devout believers in the language they use, and frequently argue in communities and forums with users of other languages about which language is best. They think their language is the best programming language in the world, bar none, and that it suits every scenario. In their eyes there is only a hammer, so every task looks like a nail.

Level 6: Multi-language expert

Programmers at this stage, either because of work or purely out of interest in technology, have learned and mastered several programming languages. Having experienced the different design philosophies of different languages, they understand each language's strengths and weaknesses better.

They now think that which programming language is used is not the most important thing; a programming language is just a basic skill.

They now choose different languages to solve problems according to the task and the resources available, and no longer complain about not getting to use their favorite language.

Programming languages come in many schools and philosophies, and some languages support several programming paradigms at once.

Statically typed programming paradigm

In statically typed languages, variables must have an explicitly specified type. Representative languages: C, C++, Pascal, Objective-C, Java, C#, VB.NET, Swift, Go.

The benefits are:

1. The compiler can find type errors at compile time.

2. Knowing the types at compile time lets the compiler improve performance.

This paradigm holds that programmers must know the type of every variable; if you don't know a variable's type, you have no business programming! The program fails with an error at compile time.

Swift and Go are both statically typed languages, but they do not require the type to be written explicitly; the compiler can infer it automatically.

Dynamically typed programming paradigm

In dynamically typed languages, variables do not need an explicitly specified type. Any variable can refer to an object of any type. Representative languages: Python, Ruby, JavaScript.

The philosophy of dynamic typing can be summed up by the concept of duck typing. It can be expressed with James Whitcomb Riley's duck test: "When I see a bird that walks like a duck and swims like a duck and quacks like a duck, I call that bird a duck."

This paradigm holds that programmers must know the type of each variable and the methods and properties it supports; if you don't, you have no business programming! The program will crash at run time! Whose fault is the crash? Your own: you are not a qualified programmer!

The benefits of dynamic typing are:

There is no need to explicitly define interfaces and abstract types. As long as a type supports the required methods and properties, it is OK. Programs are quite flexible and simple. The interfaces and base classes that C++, Java, and C# regard as their lifeblood count for nothing in dynamic languages!

The disadvantages are:

1. If a type is wrong, the compiler cannot find the error; instead the program crashes at run time.

2. Because the compiler does not know the types of variables, it cannot optimize performance.
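Both the benefit and the first disadvantage show up in a short Python sketch. The class names are invented for illustration. No interface is declared anywhere, yet any object with a quack() method works; an object without one is only caught when the call actually runs.

```python
class Duck:
    def quack(self):
        return "Quack!"


class Robot:
    # Not related to Duck by inheritance, but it has the required method.
    def quack(self):
        return "Beep-quack!"


class Stone:
    pass  # No quack() method at all.


def make_it_quack(thing):
    # No interface or base class is declared; any object with quack() works.
    return thing.quack()


print(make_it_quack(Duck()))
print(make_it_quack(Robot()))

try:
    make_it_quack(Stone())
except AttributeError as error:
    # The mistake is only detected when this line actually runs.
    print("runtime failure:", error)
```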

Object-oriented programming paradigm

The object-oriented programming paradigm began to emerge in the late 1970s. It supports classes and instances of classes as the modules that encapsulate code. Representative languages: Smalltalk, C++, Objective-C, Java, C#, VB.NET, Swift, Go, Python, Ruby, ActionScript, OCaml.

Early programming languages were procedural: sequences, conditionals, and loops made up functions. As code grew in size, people found it necessary to modularize it. The code for one concept was placed in one file, which made concurrent development and code management easier.

People also discovered the law "program = data structures + algorithms". So the data structures and functions for one concept should be put in one file. This is the concept of a class.

The object-oriented paradigm really did greatly improve productivity and came to be widely used, which is why so many languages support it at the language level.

Although C does not support object-oriented programming at the language level, modern C development applies object-oriented modular thinking: data structures and functions of the same kind are put in one file and given similar names.

Still, C does not support object orientation at the language level, so many programmers have tried to add object-oriented support to it. The best-known results are C++ and Objective-C.

C++ is a new language, but most of its language elements are compatible with C.

Objective-C is fully compatible with C. It adds a thin layer of syntactic sugar on top of C to support interfaces (classes in other languages) and protocols (interfaces in other languages). The first implementation of Objective-C was even a preprocessor in front of a C compiler. Frankly, apart from the added syntax not fitting the C style, Objective-C's object-oriented design is quite elegant. Steve Jobs recognized its value early, but because it was confined to the Apple/NeXTSTEP ecosystem, few people knew about it. Only with the popularity of iOS has Objective-C become well known in recent years.

Functional programming paradigm

The functional programming paradigm is a programming paradigm invented by mathematicians, who regard a program as a mathematical function. Representative languages: Lisp, Erlang, JavaScript, OCaml, Prolog.

Many prominent experts have strongly advocated functional programming languages, calling them revolutionary. But I think they overestimate the power of the functional paradigm, and I don't think it is as clever as the object-oriented paradigm.

The core of a functional language is the function; there is no concept of a class. But its functions are not the functions of traditional procedural languages: they support the concept of a "closure".

In my view, the function of a functional language, that is, the "closure", is, to put it plainly, really a "class". Programming languages have developed to the point where modularity is required, which means combining "data structures" with "algorithms". Whatever the language, a programming style that does not combine them has no future.

Object-oriented languages use classes to combine "data structures" and "algorithms". The core of a class is the "data structure", that is, its "attributes", rather than the "algorithm", that is, its "functions". In a class, the functions are attached to the attributes.

Functional languages use closures to combine "data structures" and "algorithms": a function can capture variables from its enclosing scope. Here the "attributes" are attached to the "function".

"Class" is essentially equivalent to "closure". Many object-oriented languages have now added support for closures. Looking at how they are implemented, we can see that they actually use "classes" to implement "closures".

Which is easier to use, "classes" or "closures"? Obviously "classes".
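To make the equivalence concrete, here is the same counter written both ways in Python, as a sketch with made-up names. In the class, the function is attached to the data; in the closure, the data is attached to the function.

```python
class CounterClass:
    """Class form: the function (increment) is attached to the data (count)."""

    def __init__(self, start=0):
        self.count = start

    def increment(self):
        self.count += 1
        return self.count


def make_counter(start=0):
    """Closure form: the data (count) is attached to the function."""
    count = start

    def increment():
        nonlocal count  # capture the variable from the enclosing scope
        count += 1
        return count

    return increment


obj = CounterClass()
fn = make_counter()
print(obj.increment(), obj.increment())
print(fn(), fn())
```

Each call to make_counter() creates an independent captured variable, just as each CounterClass() creates an independent object.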

"Closures" are more concise, so in object-oriented languages "closures" are often used in place of anonymous classes. For a class with only one function, writing it as a class is too cumbersome; writing it as a closure is more concise.

A gripe about OCaml: its predecessor, Caml, was already a very good functional language, but a complete object-oriented mechanism was forcibly bolted onto it. Supporting both the object-oriented and functional paradigms at once makes it as prone to split-brain as C++.

Many object-oriented language fans also find JavaScript distasteful and keep wanting to add object-oriented support to it. ActionScript is one such attempt. I have used it, and it really is not much different from Java.

Another gripe, this time about ExtJS. I compared ExtJS and jQuery when choosing a web front-end development framework.

ExtJS was obviously developed by Java experts, who forcibly used JavaScript to imitate the design of Swing and build a UI library.

jQuery's developers clearly understood JavaScript's functional paradigm and built a UI library around JavaScript's characteristics as a dynamic functional language, beating ExtJS hands down.

The story of ExtJS and jQuery shows how important multi-language ability is. The author of ExtJS was proficient in and devoted to Java, so he used the scalpel that is JavaScript as if it were the hammer that is Java, with a thankless result.

Functional languages also have some special techniques, such as tail recursion. A tail-recursive call can be executed without growing the stack, which prevents stack overflow in deep recursion.
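Here is a sketch of what tail recursion looks like, written in Python only to show the shape of the code. Note the assumption: CPython itself does not eliminate tail calls, so the recursive version still uses one stack frame per call. The loop version shows what a compiler with tail-call elimination effectively turns the function into.

```python
def sum_to_tail(n, acc=0):
    # Tail-recursive form: the recursive call is the last action, so
    # nothing remains to be done in this frame after it returns.
    if n == 0:
        return acc
    return sum_to_tail(n - 1, acc + n)


def sum_to_loop(n):
    # What tail-call elimination effectively produces: the same frame
    # is reused, with the parameters updated in place.
    acc = 0
    while n != 0:
        n, acc = n - 1, acc + n
    return acc


print(sum_to_tail(100))
print(sum_to_loop(100000))
```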

Template programming paradigm

Template programming means taking types as parameters, so that one set of functions can support any number of types. Representative language: C++.

The need for template programming arose when container libraries were being developed in C++. Because a container has to hold objects of any type, generics were needed.

C++ templates generate the code for each concrete type at compile time, according to how the template is used in the source code. Besides the C++ approach, Java and C# have a similar mechanism called generics, but it is implemented differently from C++ templates. The Java compiler, for example, does not generate new code for each type; it erases the type parameters and inserts type casts instead (C# keeps the generic type information at run time).

In a language without templates or generics, how do you store objects in a container? You store them as the common base class type (Java, C#) or as void* pointers (C), and cast them back to the actual type when you take them out. Dynamically typed languages don't care about types at all: just throw any object into the container and use it directly when you take it out.

Some C++ experts went on to invent "template metaprogramming" on top of templates. Since templates are processed by the C++ compiler, template metaprogramming makes the compiler perform computation, so that the result is already worked out when compilation finishes. Apart from research and showing off, I don't know what use this has.

Summary

Whether a language is worth learning depends, I think, on several criteria:

1. Whether you need it for work. If you do, you have to learn it, no question. After all, we all have to eat.

2. Whether its language features give you a refreshing feeling. If they do, it is worth the price of admission. For example, Go does away with exceptions and returns multiple values instead. I deeply agree; in fact I have deliberately avoided exceptions for years. My reasoning is that since C gets along perfectly well without exceptions, why do we need them? On error, return an error code. On an unrecoverable error, simply abort the program. Moreover, exceptions actually violate a principle of procedural programming: a function should have only one entry and one exit. Throwing an exception adds another exit.

3. Whether it excels in a particular domain. If you have only a hammer in your hand, you can only treat every task as a nail. But with many tools in your toolbox, facing different tasks becomes much easier.
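The style described in item 2, returning an error alongside the result instead of throwing, can be imitated in Python with a tuple. This is only a sketch of the convention; parse_port is a made-up helper, and idiomatic Python would normally raise an exception here.

```python
def parse_port(text):
    """Return (port, error); error is None on success, in the Go style."""
    if not (text.isascii() and text.isdigit()):
        return 0, "not a number: " + repr(text)
    port = int(text)
    if not 1 <= port <= 65535:
        return 0, "out of range: " + str(port)
    return port, None


for raw in ["8080", "http", "70000"]:
    port, err = parse_port(raw)
    if err is not None:
        # The caller must check the error explicitly after every call.
        print("error:", err)
    else:
        print("port:", port)
```

The function has a single normal return path per outcome and no hidden exits, which is the property the paragraph argues for.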

Level 7: Architecture design

You also need to master architecture design in order to design excellent software. Architecture design has some techniques:

1. Layering

A piece of software is usually divided into:

Presentation layer: the UI part

Interface layer: the communication interface of the back-end services

Service layer: the actual services

Storage layer: persistent storage, in files or a database

Layering decouples the modules, supports parallel development, and makes the software easy to modify and easy to tune for performance.

2. SOA

Modules are connected to each other through network communication and are loosely coupled. Each module can be deployed independently, and more instances can be deployed to improve performance. Each module can be developed in a different language and on a different platform, and previously developed services can be reused. Common SOA protocols include Web Services, REST, JSON-RPC, and so on.

3. Performance bottlenecks

1) Turn synchronous processing into asynchronous processing.

Implement this with an in-memory queue (Redis), a workflow engine (jBPM), and so on. In-memory queues can lose data but are fast. A workflow engine saves requests to a database.

By turning synchronous requests into asynchronous ones, basically 99.99% of performance problems can be solved.

2) Use parallel hardware within a single machine.

For example, use hardware such as GPUs and FPGAs to improve performance.

3) Use a cluster of computers.

For example, a Hadoop cluster uses many computers to process data in parallel.

In your own software stack, you can also deploy multiple copies of a module and process in parallel.

4) Use a cache to satisfy requests. Once the cache has been warmed up with popular content, the vast majority of user requests only read data from memory, and performance improves greatly.

Caching is "God's algorithm". As I recall, its performance is only slightly below the theoretical optimum, as if you were God and could foresee the future. Now that x86 CPUs have hit a clock-frequency ceiling, the main way to improve CPU performance is to add more cache.
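The effect of a warmed cache on popular content can be sketched with Python's functools.lru_cache. load_article and the read counter are hypothetical stand-ins for a slow back-end lookup: a thousand requests for the same article reach the back end only once.

```python
from functools import lru_cache

backend_reads = 0


@lru_cache(maxsize=1024)
def load_article(article_id):
    # Hypothetical slow lookup; the counter stands in for a database read.
    global backend_reads
    backend_reads += 1
    return "content of article %d" % article_id


for _ in range(1000):
    load_article(42)  # one popular article requested many times

print(backend_reads)
print(load_article.cache_info().hits)
```

A real deployment would more likely use a shared cache server so that all service instances benefit, but the principle is the same.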

4. Build large systems as small ones

Don't panic when you face a large system. Cut it into multiple modules, and have multiple small programs cooperate through SOA. This carries on the UNIX design philosophy. UNIX developed a large number of small single-purpose programs and encouraged users to combine them through pipes to meet their needs. Of course, pipes are too restrictive and not flexible enough as a means of communication. So now we can have multiple programs cooperate through URIs, in the SOA manner. Applications on Android and iOS now cooperate through URIs too. Isn't this the modern development of the UNIX design philosophy?!

5. Sharding

There is now a trend known as "de-IOE". I stands for IBM mainframes, O for Oracle databases, and E for EMC storage. Large systems used to be built on IOE: an Oracle database deployed on a mainframe, storing its data on EMC storage. IOE represents the most powerful computers, databases, and storage available today. But even they will one day be unable to withstand truly massive systems.

The Oracle database is shared-everything. It can run on a cluster of computers (no more than 16 server nodes), and the cluster shares a single storage system.

The de-IOE movement marks the bankruptcy of the shared-everything model. To scale a system without limit you must use shared-nothing.

With MySQL you can handle data of any size, provided that you shard. Divide the big system into many small systems and spread them across many inexpensive servers and storage devices. The more modern approach is to spread them across a large number of virtual machines.

Take the Ministry of Railways' 12306 website as an example. We know that every train ticket belongs to a particular train. If we partition by train, we can divide the 12306 website into thousands of modules. One virtual machine can host several modules. When certain trains become a performance bottleneck, they can be migrated to dedicated virtual machines. Even if the service for some trains ends up unavailable, the system as a whole will not be completely unavailable.

The 12306 website has only one global part: user login. This can be entrusted to a third party, for example by letting users log in with Weibo, QQ, or similar accounts.

You can also implement the user login service yourself, again by sharding, with multiple Redis servers providing the service. The Redis servers store each logged-in user's SessionID, UserID, roles, permissions, and so on. The SessionID is randomly generated, and some of its bits are used to identify which Redis server it lives on. After the user logs in, the SessionID is sent to the client. The client sends the SessionID back to the server with every request. The server sends the SessionID to the Redis server to look up the user's information and then handles the request. If the SessionID cannot be found on the Redis server, the user is asked to log in. Even if all registered users logged in at the same time, not much memory would be needed. Also, when session memory grows too large, you can delete the sessions of the users who logged in earliest, forcing them to log in again. The number of users active at the same time is never that large.
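The session-sharding scheme just described can be sketched as follows. This is an illustration under stated assumptions, not a production design: plain dictionaries stand in for the Redis servers, NUM_SHARDS and the function names are made up, and the low bits of the random SessionID select the shard.

```python
import secrets

NUM_SHARDS = 4  # assumed number of Redis servers
# Plain dicts stand in for the Redis servers in this sketch.
shards = [dict() for _ in range(NUM_SHARDS)]


def shard_for(session_id):
    # Use part of the random ID to choose a server; with NUM_SHARDS = 4
    # the remainder is simply the two lowest bits of the ID.
    return shards[int(session_id, 16) % NUM_SHARDS]


def login(user_id, roles):
    session_id = secrets.token_hex(16)  # random 128-bit ID as hex text
    shard_for(session_id)[session_id] = {"user_id": user_id, "roles": roles}
    return session_id


def lookup(session_id):
    # None means the session is unknown and the user must log in again.
    return shard_for(session_id).get(session_id)


sid = login(7, ["admin"])
print(lookup(sid))
print(lookup("00ff"))
```

Because the shard is derived from the SessionID itself, any front-end server can find the right store without a central directory.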

Domain knowledge levels

All the levels above concern programming skill itself. To put it plainly, these are basic skills, and by themselves they cannot create much value. Yet too many programmers waste too much time on those basic levels.

Some programmers especially love to dig into programming languages: whenever a new language comes out or an old one is hyped up again, they dive in. I was one of them, and wasted a lot of energy on programming languages and clever tricks.

I think C++ is a particularly big pit. At first it was developed as an object-oriented C. Later template programming was discovered, and template programming and then template metaprogramming were vigorously promoted. More recently, new standards such as C++11 and C++14 have added many new things, such as functional programming and type inference. C++ is overly complex, and its many pitfalls consume a great deal of programmers' energy. When I use C++, I use only the object-oriented and template parts and leave the other, overly esoteric features alone.

Computer science is a very broad discipline, with many areas of knowledge that need and deserve deep study before we can write valuable programs. Software has to be combined with an industry and put into practice to have value. If you only study programming skills and don't understand domain knowledge, you cannot write a valuable program.

Computer science has many fields. Here are some of them:

Storage: block devices, file systems, cluster file systems, distributed file systems, fiber SCSI, iSCSI, RAID, and so on.

Network: Ethernet, fiber-optic networks, cellular networks, Wi-Fi, VLAN, and so on.

Computer architecture: mainly CPU instruction sets, such as x86 and ARM.

USB protocol: you need to know about URB packets.

PCI and PCI-E protocols: the peripherals of modern computers all use the PCI or PCI-E protocol. Graphics cards now all connect to the computer through PCI-E. This relatively reduces the amount of knowledge you need to learn. Virtualization requires a deep mastery of the PCI protocol.

Image processing: image compression, real-time video encoding, and so on.

3D games

Relational databases

NoSQL databases

Operating systems

Distributed operating systems

Compiler principles

Machine learning: big data needs it these days!

Knowing these fields also includes knowing the existing commercial hardware, commercial software, and open source software in them. Often there is already a ready-made tool for the job you need to do. You can get the task done just by using the existing tools, with no development at all. Sometimes you only need to combine existing tools and write a few scripts.

For example, I once had to implement two-way synchronization. I found a good open source program, Unison, wrote a configuration file, and the task was done satisfactorily. I did not need to write any code.

Another time, I needed to implement high availability, and easily got it done by using Python to drive several open source programs.

Writing installers or customizing an operating system is easily done with a few lines of script if you know the operating system domain.

People without domain knowledge may end up doing a great deal of pointless development, and may even discover after a long time that the whole approach was a dead end.

In addition, solid domain knowledge greatly improves your ability to debug programs and track down errors. If you know how the compiler and the language runtime work, you can quickly fix code based on compile errors and warnings.

If you know how the operating system works underneath, you can quickly find the root cause of a run-time error. For example, I once wrote an upgrade service for Windows. It was a Windows service that had to execute a DOS script, and the script would replace the Windows service itself. I found that the script sometimes had no effect. After a whole night of investigation, I found that right after the Windows service was installed, the first execution of the script ran into a permissions problem: the log looked correct, but the script actually did nothing. Once the service had been started once, everything was fine. This must be related to the underlying security mechanisms of Windows. Because I don't know much about the Windows kernel, it took me a long time to find the problem, and its root cause is still unclear to me.

Level 0: Domain knowledge novice

You know little about the domain. You use a search engine to find introductory articles on the domain's software and hardware, and configure and use the software by following their instructions. You can barely use the existing hardware and software.

Level 1: Domain knowledge practitioner

You understand the common hardware in the domain and have a thorough grasp of how to configure and use its common software. You can build solutions with existing hardware and software and solve the problems that come up in real work.

Level 2: Domain knowledge expert

When you have not only mastered the domain's software and tools and know how to use them, but also understand the principles behind them, when you "know not only the what but also the why", you are a domain knowledge expert.

You know how network protocols work, so when the network has a problem you know where it might be: a MAC address conflict, an IP conflict, or a network loop?

You know how storage works, so you know why one storage approach is unsuitable for virtualization, another is suitable for virtualization, and yet another is suited to data backup.

You know the PCI protocol, so you know how to virtualize a hardware device.

You know the network card hardware protocol, so you can emulate a virtual network card that virtual machines can use normally.

You know video encoding formats and principles, so you know which video format consumes the least bandwidth and which consumes the least CPU.

You know the Intel VT and AMD-V instruction sets, so you understand how virtualization is implemented.

You understand that a workflow is really a state machine, so when you face a complex workflow you know how to design a workflow engine that meets the requirements.
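To show what "a workflow is a state machine" means in code, here is a minimal sketch. The leave-request workflow, its states, and its events are all invented for the example. The whole process is just a table of allowed transitions plus the current state.

```python
# Allowed transitions of a hypothetical leave-request workflow:
# (current state, event) -> next state.
TRANSITIONS = {
    ("draft", "submit"): "pending",
    ("pending", "approve"): "approved",
    ("pending", "reject"): "rejected",
    ("rejected", "revise"): "draft",
}


class Workflow:
    def __init__(self):
        self.state = "draft"

    def fire(self, event):
        # Look up the transition; anything not in the table is illegal.
        key = (self.state, event)
        if key not in TRANSITIONS:
            raise ValueError(
                "event %r not allowed in state %r" % (event, self.state)
            )
        self.state = TRANSITIONS[key]
        return self.state


request = Workflow()
print(request.fire("submit"))
print(request.fire("approve"))
```

A real workflow engine adds persistence, timers, and roles on top, but the core stays this transition table.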

Section 3-Scientists

You are a domain knowledge expert, but your knowledge comes from books, from others.

If you are content to be a domain knowledge expert, you can only imitate and never surpass. Other people may not be willing to tell you their research results. By the time someone does tell you, they may already have discovered a newer theory, and a new generation of products may be about to be released.

The scientist is the person who explores the unknown, dares to innovate, and drives the progress of human society.

Legend has it that one of Cisco's top executives once said, half jokingly: "If Cisco stops developing new technologies, a certain company will lose its way." The jibe is that the company in question is still only at the level of a domain knowledge expert: it can copy but cannot surpass. I do not know that company's actual situation, but I hope that by now it has reached the position of a leader at the forefront.

Irwin Jacobs worked on the principle of CDMA (code division multiple access), saw that it had a promising future in communications, and founded Qualcomm. Qualcomm, which earns its income mainly from royalties, employs a large number of scientists to conduct research in the field of communications. Some say Qualcomm is a patent rogue. These people do not understand the value of knowledge. In their eyes, the reasonable price of Windows should be 5 yuan, the price of a CD-ROM, and the iPhone should sell for a little over $1000, the price of the bare hardware. If Qualcomm is a patent rogue, then go and "rogue" up a CDMA or an LTE of your own and show me!

The x86 chip was not designed with virtualization in mind, so it has so-called "virtualization holes". That is, some privileged CPU instructions do not raise an exception when executed in a virtual machine environment, so control cannot switch to the host. As a result, a virtual machine could not be run directly on an x86 chip.

VMware was founded in 1998 by several scientists in the United States. They found that virtual machines could be run on x86 computers by using binary translation techniques.

The Xen virtualization software was also invented by several scientists. They found that if the kernels of the guest operating system and the host operating system are modified so that the guest calls a host function directly wherever it would otherwise execute a "virtualization hole" instruction, virtualization becomes possible and the performance of the virtual machine improves greatly.

Later, Intel added the Intel VT instruction set to its chips and AMD added the AMD-V instruction set to its chips to close the "virtualization holes". Then came the KVM virtual machine software, which uses these CPU hardware instructions directly for virtualization.

KVM executes guest CPU instructions directly on the physical CPU and is therefore highly efficient. However, the virtual peripherals of a virtual machine must be emulated in software, so the virtual machine's I/O access is slow.

Rusty Russell, an IBM scientist, created the virtio technology by drawing on Xen's research and development experience. The idea is to implement a set of virtual PCI devices and their drivers for the virtual machine, where each virtual PCI device has a region of virtual device memory. This virtual device memory is accessible to the host, and the virtual machine can access it through the virtio driver. In other words, the virtual machine and the host share a piece of memory, which solves the I/O performance problem of the virtual machine.

Here is a story about search engines:

A long time ago, I wanted to add a search function to a program. At first I implemented it with SQL queries, but I found that this was too slow. Later I found the open source Lucene project. It uses inverted index technology (called "reverse indexing" in some translations): by building an inverted index over the documents, it greatly improves search speed.
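
The reverse (inverted) index idea can be sketched in a few lines of Python. This is only a toy illustration of the data structure, not how Lucene is actually implemented; the function names `tokenize`, `build_index`, and `search` are made up for this example.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Split text into lowercase alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Build an inverted index: term -> set of ids of documents containing it.

    docs is a dict mapping a document id to its text.
    """
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index


def search(index, query):
    """Return the sorted ids of documents that contain every query term."""
    terms = tokenize(query)
    if not terms:
        return []
    # Look up each term's posting set, then intersect them.
    postings = [index.get(term, set()) for term in terms]
    return sorted(set.intersection(*postings))
```

A query only touches the posting sets of its own terms, whereas a SQL pattern match such as LIKE '%word%' typically has to scan every row. That difference is where the speed-up comes from.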

Google's two founders discovered the secret hidden in HTML links: they found that each HTML page can be given a weight based on the links that point to it. This is the PageRank algorithm. As a result, Google's automatic search engine defeated Yahoo's manually classified search directory.

So with an inverted index, PageRank, and a simple HTML crawler, we can build a search engine. However, the Internet is huge and produces a large number of new pages every day, so it is difficult to build an inverted index for the entire Internet.
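
As a rough illustration of weighting pages by their links, the sketch below runs a simplified PageRank by repeated iteration. The damping factor of 0.85 and the fixed iteration count are conventional choices assumed here, not values taken from this article, and the function name is hypothetical.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank.

    links maps each page to the list of pages it links to.
    Returns a dict mapping each page to its rank; ranks sum to about 1.
    """
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page receives a small base rank regardless of links.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its rank evenly among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its rank over all pages.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank
```

A page that many pages link to ends up with a higher rank than a page nobody links to, which is the weighting the paragraph above describes.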

Several years later, Google published three papers: Google File System (GFS), MapReduce, and Bigtable. The developer of the Lucene project created the Hadoop project based on Google's MapReduce paper. MapReduce uses a large number of computers to store data and compute on it in parallel, and finally merges the results. With Hadoop, an inverted index, and PageRank, you can build a search engine. Yahoo, Baidu, and other companies have developed their own search engines based on Hadoop.

But other companies' search engines are not as effective as Google's. We programmers know this best. I, for one, keep climbing over the wall just to use Google.

The Google Blackboard blog published some of Dr. Wu's articles, which introduce a lot of knowledge about machine learning. As you can see from them, Google actually uses machine learning to analyze the pages it collects. Google is obviously not going to disclose this formula. Even if one day Google really did publish it, you can be sure that Google would already have developed an even sharper secret technique, and copycat search engines would still be inferior to Google.

Imitation is a road that innovation must pass through. Before becoming a leader in a field, it is necessary to go through the stages of learning and imitating. But to become the leader of an industry, to become the champion, you must be brave enough to overtake on the curve, bravely set out on the road of innovation, and become a real scientist, a true master!
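
The map, group, and merge flow of MapReduce can be imitated in a single Python process. The sketch below builds a word-to-documents index that way. It only illustrates the programming model: it is not Hadoop's API, and the function names are invented for this example. In a real cluster the map and reduce calls would run on many machines.

```python
from collections import defaultdict


def map_phase(doc_id, text):
    """Map step: emit a (word, doc_id) pair for every word in one document."""
    for word in text.lower().split():
        yield word, doc_id


def shuffle(pairs):
    """Shuffle step: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(word, doc_ids):
    """Reduce step: merge one word's values into a sorted, de-duplicated list."""
    return word, sorted(set(doc_ids))


def run_job(docs):
    """Run map, shuffle, and reduce over a dict of doc_id -> text."""
    pairs = []
    for doc_id, text in docs.items():
        pairs.extend(map_phase(doc_id, text))
    grouped = shuffle(pairs)
    return dict(reduce_phase(word, ids) for word, ids in grouped.items())
```

Each map call looks at only one document and each reduce call looks at only one word, which is why the work can be spread across a large number of computers.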

Summary

Programming capability can be divided into two dimensions: one is the level of programming skills and the other is the level of domain knowledge.

Some programmers devote all their energy to improving their programming skills and know little about the domain, which in fact does great harm in their daily work. Some requirements may already have a ready-made, free, open source solution, or could be met quickly just by combining several existing pieces of software, yet these programmers spend a lot of time developing everything themselves. In addition, when something unexpected happens, a lack of domain knowledge makes it difficult to quickly locate the root cause of the problem and to fix the bug.



This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
