Preface: Thinking from the other person's perspective. Life today is fast-paced and full of busy tasks, and we slowly overlook many of the things and people around us. At the same time, more and more people have received "higher" education and have "their own way of thinking," and we gradually get used to considering problems only from our own point of view; engineering students in particular like to reason from their own perspective, and slowly forget to put themselves in others' shoes. Many friends say that people who study engineering like to go to extremes. Perhaps this is like the advice the Shaolin monks give, in Jin Yong's novels, to the two men who secretly studied Shaolin martial arts. In a busy life and tense work, find a moment to let yourself stop, think about what you are doing, and let your hurried footsteps rest for a while.
In team development we often run into problems that require communication and exchange, but the basis of communication is the ability to think from the other side's perspective. Only communication in an equal environment counts as a real exchange of ideas. In fact, there is a little trick to learning engineering: look for the rules. The established rules are the theorems and the definitions; if you can find a new rule, that is a new discovery worth writing a paper about. When we meet something new, we had better look for its shadow in our own thinking and find the rules it shares with what we already know; that way we can learn new things well. However, people who study engineering tend to think in fixed patterns, and the engineering books we usually read reinforce that, so in the long run it is easy to develop a stubborn personality. It helps to read some books that broaden one's thinking; perhaps that will reduce some of the hostility.
Text: The previous articles explained a lot of conceptual material. In fact, the two most important things in CUDA are threads and memory; once you have mastered these two, CUDA becomes very simple. Its programming language is an extension of C, so you can use it just as you use C; you mainly need to pay attention to a few special keywords. By now the thread and memory model should already be a concept in your mind, and as long as that concept is there, the purpose of my articles is achieved. The earlier article "CUDA Hardware Implementation Analysis (i)------Camp-----GPU Revolution" explained how threads actually run in CUDA. Now let's look at some of the rules in the CUDA hardware implementation. This is reasonable: an army camp must promulgate its rules and regulations, and only by understanding CUDA's rules can we really manage all the threads well and make the program run efficiently on this platform.
Here we first clarify a few rules.
One. Threads, warps, blocks
1. A warp has at most 32 threads. A warp may contain fewer than 32 threads when the total number of threads is not enough to fill it.
2. Each block has at most 16 warps, which means a block has at most 512 threads.
3. Each block is executed on a single SM, which means all the warps of the same block run on the same SM.
4. The G80 has 16 SMs.
5. So at least 16 blocks are needed to occupy all the SMs.
6. If the resources are sufficient (recall the earlier explanation that threads get their resources from the device), one SM can run the threads of more than one block at a time: 2 blocks, 3 blocks, and so on.
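As a small illustration of these rules, here is a minimal sketch of my own (the kernel name, array size, and launch configuration are placeholders, not from the original article). 256 threads per block means 256 / 32 = 8 warps per block, well under the 512-thread limit, and 16 blocks is just enough to give each of the G80's 16 SMs a block to run.

#include <cuda_runtime.h>

// Hypothetical kernel: each thread writes its own global index.
__global__ void fillIndices(int *out, int n)
{
    int idx = blockIdx.x * blockDim.x + threadIdx.x;
    if (idx < n)
        out[idx] = idx;
}

int main()
{
    const int n = 4096;
    const int threadsPerBlock = 256;                                // 8 warps per block (256 / 32), under the 512-thread limit
    const int blocks = (n + threadsPerBlock - 1) / threadsPerBlock; // 16 blocks, enough to occupy 16 SMs on a G80

    int *d_out = 0;
    cudaMalloc((void **)&d_out, n * sizeof(int));                   // the threads' data lives in device memory

    fillIndices<<<blocks, threadsPerBlock>>>(d_out, n);             // launch: <<<blocks, threads per block>>>
    cudaDeviceSynchronize();                                        // wait for the GPU to finish

    cudaFree(d_out);
    return 0;
}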
Two. Access speed
Register - HW (on chip), 1 clock cycle.
Shared memory - HW (on chip), 1 clock cycle.
Local memory - DRAM, no cache, slow.
Global memory - DRAM, no cache, slow.
Constant memory - DRAM, cached, 1 ... 10s ... 100s of cycles, depending on cache locality.
Texture memory - DRAM, cached, 1 ... 10s ... 100s of cycles, depending on cache locality.
Instruction memory (not visible to the programmer) - DRAM, cached.
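To make the speed difference concrete, here is a hedged sketch of my own (not code from the original article): each block copies its slice of slow, uncached global memory into fast shared memory once, and the reused values then come from the one-cycle shared memory. It assumes the kernel is launched with 256 threads per block and an array length that is a multiple of 256.

// Reverse each block's 256-element segment in place, staging it in shared memory.
__global__ void blockReverse(int *data)
{
    __shared__ int tile[256];                 // shared memory: roughly one clock cycle to access
    int t = threadIdx.x;
    int i = blockIdx.x * blockDim.x + t;

    tile[t] = data[i];                        // one slow read from global memory (DRAM, no cache)
    __syncthreads();                          // wait until the whole block has filled the tile

    data[i] = tile[blockDim.x - 1 - t];       // the reused values come from fast shared memory
}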
Three. CUDA program architecture, as shown in the figure below.
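In case the figure does not come through, the typical CUDA program architecture can also be sketched in code. This is a minimal example of my own (the kernel and sizes are placeholders): the host allocates device memory, copies the input over, launches the kernel, and copies the result back.

#include <stdio.h>
#include <cuda_runtime.h>

__global__ void addOne(float *a, int n)                              // device code: runs on the GPU
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        a[i] += 1.0f;
}

int main()                                                            // host code: runs on the CPU
{
    const int n = 1024;
    float h_a[1024];
    for (int i = 0; i < n; ++i)
        h_a[i] = (float)i;

    float *d_a = 0;
    cudaMalloc((void **)&d_a, n * sizeof(float));                     // allocate global memory on the device
    cudaMemcpy(d_a, h_a, n * sizeof(float), cudaMemcpyHostToDevice);  // host -> device

    addOne<<<n / 256, 256>>>(d_a, n);                                 // launch the kernel on the GPU
    cudaMemcpy(h_a, d_a, n * sizeof(float), cudaMemcpyDeviceToHost);  // device -> host

    printf("h_a[0] = %f\n", h_a[0]);                                  // expect 1.000000
    cudaFree(d_a);
    return 0;
}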
Four. Language extensions. From my experience of learning languages, I feel that you only need to learn one well, and then the others follow by analogy. When learning a new language there is a knack for getting started quickly: 1. how variables are defined; 2. how functions are defined; 3. the logic control structures (if, loops, ...). Once you have these 3 things figured out, whatever the new language, you can get started in 20 minutes. After that, how deep you go depends on your understanding of the language, but the changes never stray far from these basics. From the point of view of computer programming languages, a program just defines some data and then operates on that data, so learning a language from this angle is very simple. Some languages, such as Java or C#, are newer than C, which makes development easier for you.
So the CUDA language we are discussing here is nothing more than an extension of C that provides a specific environment so programs can conveniently run on the GPU. It defines certain qualifiers to indicate that a variable lives on the GPU, that is, to describe where memory and functions reside on the GPU. In other words, CUDA extends C's variable definitions and function definitions.
The diagram above is from the Fall 2007 course material, which clearly states the storage location and the life cycle of each kind of variable declaration. In fact, compared with ordinary C variables, these qualifiers simply specify where the variable is stored.
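The sketch below summarizes the usual CUDA C qualifiers and, in the comments, where each variable lives and how long it lives. It is my own hedged summary for illustration, not the table from the course material, and the names are placeholders.

__constant__ float coeff[16] = {1.0f};   // constant memory: visible to the whole grid, lives for the whole application, cached
__device__   float scale = 2.0f;         // global (device) memory: visible to the whole grid, lives for the whole application

__device__ float square(float x)         // __device__ function: runs on the GPU, called only from GPU code
{
    return x * x;
}

__global__ void kernel(float *out)       // __global__ kernel: runs on the GPU, launched from the CPU
{
    __shared__ float buf[128];           // shared memory: one copy per block, lives for the kernel launch
    int t = threadIdx.x;                 // plain automatic variable: typically a register, private to the thread
    buf[t] = square(scale * coeff[t % 16]);     // assumes the kernel is launched with blockDim.x <= 128
    out[blockIdx.x * blockDim.x + t] = buf[t];
}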
One constraint is that a pointer variable used inside a kernel can only point to memory allocated from global memory.
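A hedged sketch of this constraint (an example of my own, with placeholder names): the pointer a kernel dereferences must refer to memory allocated from global memory with cudaMalloc; it must not point at host memory.

#include <cuda_runtime.h>

__global__ void zeroFill(float *p, int n)         // p must point to device (global) memory
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        p[i] = 0.0f;
}

int main()
{
    const int n = 1024;
    float *d_buf = 0;
    cudaMalloc((void **)&d_buf, n * sizeof(float));   // allocated from global memory

    zeroFill<<<n / 256, 256>>>(d_buf, n);             // OK: the kernel dereferences a global-memory pointer

    float h_buf[1024];
    // zeroFill<<<n / 256, 256>>>(h_buf, n);          // wrong: h_buf is host memory, the kernel cannot use it

    cudaMemcpy(h_buf, d_buf, n * sizeof(float), cudaMemcpyDeviceToHost);
    cudaFree(d_buf);
    return 0;
}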