Ray tracing is one of the most famous techniques in computer graphics. Its first step is to emit a ray from the viewpoint (camera, eye) through the center of each pixel. This is the primary ray; secondary rays are the ones emitted from the surface of an object, such as reflection or refraction rays. This step looks relatively simple, but it still involves some details and concepts that need to be clarified.
I started learning this material systematically from the RenderMan specification. The first book I read was Advanced RenderMan, which is fairly difficult; I would recommend reading An Introduction to Ray Tracing (1989) instead.
Let's cut to the chase. First, we need to understand the FOV (field of view) angle. Advanced RenderMan also covers focal length, in great detail and depth; read it if you want to dig deeper, but I will not discuss it here (after all, I do not remember it). The three-dimensional scene the camera sees is a viewing pyramid, also called a view frustum (a truncated pyramid), such as:
At the apex there are two angles, one large and one small. The smaller one is the FOV angle, and the larger one is determined by the aspect ratio of the screen. For example, if the screen has width=640 and height=480, the small FOV angle is fov_little and the large angle is not_fov_but_big, then the following relationship holds: tan(fov_little/2) / tan(not_fov_but_big/2) = height / width.

What is the FOV angle for? It determines the opening angle of the frustum, or, put more bluntly, how much of the world we can see. That is not quite accurate, though. Take the previous example again: the screen is 640 wide and 480 high, and the eye looks straight ahead. The FOV angle determines how much of the world the eye can see vertically, while the horizontal range is determined by the width of 640. If the width becomes 1000, we naturally see more to the left and right. If that were not the case, and 1000 and 640 showed the same slice of the world, the image would inevitably be stretched, which is usually not the result we want. Conversely, if the width shrinks, the visible world shrinks with it. But once the width drops below 480, the roles swap: the FOV then determines the horizontal range the eye can see, and the height determines the vertical range.
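The relationship between the two angles can be sketched in a few lines of Python. This is only an illustration: the function name wide_angle_deg is made up for this example, and it assumes the FOV always applies to the shorter side of the screen, as described in the text.

```python
import math

def wide_angle_deg(fov_little_deg, width, height):
    """Return the larger apex angle in degrees, given the smaller FOV angle.

    Hypothetical helper based on
    tan(fov_little/2) / tan(not_fov_but_big/2) = short_side / long_side,
    assuming the FOV is measured across the shorter screen side.
    """
    short_side = min(width, height)
    long_side = max(width, height)
    half_small = math.radians(fov_little_deg) / 2.0
    # Solve the relationship for the larger half-angle.
    half_big = math.atan(math.tan(half_small) * long_side / short_side)
    return math.degrees(2.0 * half_big)

# A 90 degree FOV on a 640x480 screen gives about 106.26 degrees horizontally.
print(wide_angle_deg(90.0, 640, 480))
```

Note that the larger angle is not simply the FOV multiplied by the aspect ratio; the ratio applies to the tangents of the half-angles.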
Next, we emit rays through the pixels. The pixels live in raster space, where for example x runs from 0 to 639 and y runs from 0 to 479. But the rays have to be emitted in the three-dimensional world (the camera coordinate system), so we need to transform each pixel (px, py) into a point (wx, wy, wz) in that world. Once we have (wx, wy, wz), the ray we want is the one starting at the camera position (0, 0, 0) and passing through (wx, wy, wz).

How do we do this transformation? Intuitively, we place the 640x480 screen into the three-dimensional world so that the z-axis passes through the center of the screen (I forgot to mention: this article follows RenderMan conventions, so the camera looks along the positive z-axis), the x-axis of the screen (0~639) is parallel to the x-axis of the three-dimensional world, and the y-axis of the screen (0~479) is parallel to the y-axis of the three-dimensional world.

This introduces another coordinate system, the screen coordinate system, which is simply a normalized version of raster space: 0~479 is mapped to -1~1, and correspondingly 0~639 is mapped to -1.333~1.333. (This is only a convention chosen for convenience; the exact numbers are not mandatory.) We then place the normalized screen on the plane z = planez = cot(fov/2). Since the half-height of the screen is 1, its top and bottom edges are seen from the origin at an angle of exactly fov/2 from the z-axis, so the screen coincides exactly with a cross-section of the frustum. (I do not know the exact term for this; in any case, it fits exactly inside the frustum, and anything larger would stick out.)
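As a rough sketch of the screen placement described above, the following Python computes the half-extents of the normalized screen and the plane distance. The name screen_window is hypothetical, and the sketch assumes width >= height, as in the 640x480 example.

```python
import math

def screen_window(width, height, fov_deg):
    """Return (half_x, half_y, planez) for the normalized screen.

    Hypothetical helper; assumes width >= height so that the FOV
    spans the vertical direction, as in the 640x480 example.
    """
    half_y = 1.0                 # vertical range is -1 .. 1
    half_x = width / height      # horizontal range is -aspect .. aspect
    # cot(fov/2): distance at which a half-height of 1 subtends fov/2.
    planez = 1.0 / math.tan(math.radians(fov_deg) / 2.0)
    return half_x, half_y, planez

half_x, half_y, planez = screen_window(640, 480, 90.0)
# The top edge of the screen is seen at exactly fov/2 from the z-axis.
print(math.degrees(math.atan(half_y / planez)) * 2.0)
```

With this placement, a wider screen only extends half_x; the vertical angle stays fixed by the FOV.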
Now for the transformation itself. For any pixel (px, py), first translate it so that the z-axis passes through the center of the screen: px1 = px - width/2, py1 = py - height/2 (here I only consider the case where width and height are both divisible by 2). Next, note that we emit the ray through the center of the pixel, while px1 and py1 are the coordinates of the pixel's upper-left corner, so we add 0.5 to each: px2 = px1 + 0.5, py2 = py1 + 0.5. Then normalize by dividing px2 and py2 by height/2: px3 = px2 / (height/2), py3 = py2 / (height/2). These are the screen-space coordinates we were after.

Recall that we placed the screen on the plane z = planez = cot(fov/2), so we now have the three-dimensional point (px3, py3, planez) where the ray crosses the screen. The ray starts at (0, 0, 0) and passes through (px3, py3, planez), so by similar triangles, the point where the ray crosses the plane z = z0 is (px3/planez*z0, py3/planez*z0, z0). Connecting the ray origin (0, 0, 0) with any point on the ray gives the direction vector we need, so we can use either the point at z = planez or the point at z = z0 (for example, z0 = 1). After normalization the resulting vector is the same.
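The steps above can be put together in a short Python sketch. The function name primary_ray_direction is made up for illustration, and the code assumes width >= height and even dimensions, as in the text. It also applies no y flip, since the text does not describe one; a renderer whose raster y grows downward while camera y grows upward would negate py3.

```python
import math

def primary_ray_direction(px, py, width, height, fov_deg):
    """Return the unit direction of the primary ray through pixel (px, py).

    Illustrative sketch: camera at (0, 0, 0) looking along +z,
    fov_deg measured across the height (assumes width >= height).
    """
    # Step 1: translate so the z-axis passes through the screen center.
    px1 = px - width / 2.0
    py1 = py - height / 2.0
    # Step 2: move from the pixel corner to the pixel center.
    px2 = px1 + 0.5
    py2 = py1 + 0.5
    # Step 3: normalize by height/2 so the vertical range is about -1 .. 1.
    px3 = px2 / (height / 2.0)
    py3 = py2 / (height / 2.0)
    # The normalized screen sits on the plane z = cot(fov/2).
    planez = 1.0 / math.tan(math.radians(fov_deg) / 2.0)
    # Normalize the vector from the origin to (px3, py3, planez).
    length = math.sqrt(px3 * px3 + py3 * py3 + planez * planez)
    return (px3 / length, py3 / length, planez / length)

# Ray through the first pixel of a 640x480 image with a 90 degree FOV.
print(primary_ray_direction(0, 0, 640, 480, 90.0))
```

Using the point at z = 1 instead, that is (px3/planez, py3/planez, 1), would give the same direction after normalization, so the choice of plane is only a matter of convenience.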
If I cannot explain something clearly, I feel a strong sense of frustration, so I try to express myself precisely. At the same time, I hope the other side can ask precise questions: make clear what you understand and what you do not; where you are unsure, state your own understanding so I can check whether it is right; and where you are confused, say what you think it should be and ask why it is not so. Listening is a great virtue, but when discussing a problem you should still voice your own views. Being right or wrong is not important. What matters is that you state your own ideas, so that others can tell what is unclear to you and explain in a more targeted way. Otherwise the person answering feels lost in a forest: one moment there is question A, the next question B, with no idea which parts you have figured out and which you have not, and that is frustrating and distressing. It comes back to the same truth: whatever we do, we need motivation, whether from outside or from within. Explaining a problem is no different. The person answering hopes the person asking will give clear feedback about which sub-questions are understood and which are not. That way, on the one hand, the person answering stays motivated and does not feel frustrated; on the other hand, the answers become more targeted.
Ray tracing: emitting the primary ray