About Coordinate Transformation of 3D Models
Introduction
You often see 3D graphics on TV and in games, but how do objects that exist in 3D space, such as characters and buildings that move around, end up on a 2D screen?
In a 2D game, coordinates have only two components, X and Y, and the display is also 2D, so when you draw an object at a given X and Y position it is intuitively clear where it will appear.
In 3D it is not so simple. As the name suggests, 3D means "three dimensions", and each position has three components: X, Y, and Z. Because these coordinates do not match the 2D display, an object cannot be drawn as it is.
The solution is to convert the three-dimensional information into two-dimensional information. This is commonly called a "coordinate transformation". Keep in mind that coordinate transformations are essential for 3D programming.
Several kinds of coordinate transformation are involved in going from 3D to 2D, but the three that programmers mainly handle are the "world transformation", the "view transformation", and the "projection transformation". Here, we walk through the whole flow of coordinate transformation.
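The order of the three transformations can be sketched in a few lines of Python. This is only an illustration under stated assumptions: positions are treated as row vectors multiplied on the left of 4x4 matrices, the helper names (`mat_mul`, `transform`, `translation`, `scaling`) are made up for this example, and the "projection" here is a simple stand-in scale rather than a real projection matrix.

```python
def mat_mul(a, b):
    """Multiply two 4x4 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def transform(v, m):
    """Multiply a row vector (x, y, z, w) by a 4x4 matrix."""
    return [sum(v[k] * m[k][j] for k in range(4)) for j in range(4)]


def translation(tx, ty, tz):
    """Translation matrix for row vectors (offset stored in the last row)."""
    return [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [tx, ty, tz, 1]]


def scaling(s):
    """Uniform scale matrix."""
    return [[s, 0, 0, 0], [0, s, 0, 0], [0, 0, s, 0], [0, 0, 0, 1]]


# Stand-in matrices, chosen only to keep the arithmetic easy to follow.
world = translation(5, 0, 0)     # place the model at x = 5
view = translation(0, 0, -10)    # camera at z = +10, so the scene shifts by -10
projection = scaling(0.5)        # placeholder, NOT a real projection matrix

local_vertex = [1, 2, 3, 1]      # w = 1 marks a position

# Apply the three transformations one after another...
step_by_step = transform(
    transform(transform(local_vertex, world), view), projection)

# ...or combine them once and reuse the result for every vertex.
world_view_projection = mat_mul(mat_mul(world, view), projection)
combined = transform(local_vertex, world_view_projection)

print(step_by_step)
print(combined)
```

Both paths give the same result, which is why the three matrices are usually multiplied together once and then applied to every vertex of the model.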
Left-handed and right-handed coordinate systems
In 3D, there are two coordinate systems, the "left-handed coordinate system" and the "right-handed coordinate system", which differ in the orientation of their axes, as shown in the figure below.
Direct3D has primarily used a left-handed coordinate system, although it also provides functions for right-handed calculations. XNA, however, only provides calculation methods for right-handed coordinate systems. This seems to be in line with the fact that other applications often use right-handed coordinate systems.
All of the XNA tips on this site use the right-handed coordinate system.
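As a small illustration of the difference, the sketch below converts a point between the two systems. It assumes one common convention that is not stated in this article: both systems share the X and Y axes and differ only in the sign of Z, so "in front of the viewer" is -Z in the right-handed system and +Z in the left-handed one. The function name `flip_handedness` is hypothetical.

```python
def flip_handedness(point):
    """Convert a point between left- and right-handed systems.

    Assumes the two systems share X and Y and differ only in the
    direction of the Z axis.
    """
    x, y, z = point
    return (x, y, -z)


# A point 5 units in front of the viewer in a right-handed system (-Z forward).
in_front_rh = (0.0, 0.0, -5.0)

# The same point expressed in a left-handed system (+Z forward).
in_front_lh = flip_handedness(in_front_rh)

print(in_front_lh)
```

Applying the function twice returns the original point, so the same helper works in both directions.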
Local Coordinate System (Model Coordinate System)
Each model has its own coordinate system centered on its origin. When creating a model in modeling software, I think it is easiest to picture the model as being built around the origin.
World Coordinate System
The world coordinate system lets you place the model anywhere in the scene. If you apply no world transformation, the model sits at the world origin, exactly as in its local coordinates. Placement includes not only moving the model away from the origin, but also rotating and scaling it.
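A world transformation that scales, rotates, and then moves a model might be sketched as follows. The assumptions here are a right-handed system, row vectors multiplied on the left of the matrix, and a scale, then rotate, then translate order; the helper names are made up for this example and are not XNA calls.

```python
import math


def mat_mul(a, b):
    """Multiply two 4x4 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def transform(v, m):
    """Multiply a row vector (x, y, z, w) by a 4x4 matrix."""
    return [sum(v[k] * m[k][j] for k in range(4)) for j in range(4)]


def scaling(sx, sy, sz):
    return [[sx, 0, 0, 0], [0, sy, 0, 0], [0, 0, sz, 0], [0, 0, 0, 1]]


def rotation_y(angle):
    """Rotation about the Y axis (right-handed, row-vector convention)."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, 0, -s, 0], [0, 1, 0, 0], [s, 0, c, 0], [0, 0, 0, 1]]


def translation(tx, ty, tz):
    return [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [tx, ty, tz, 1]]


s = scaling(2, 2, 2)
r = rotation_y(math.radians(90))
t = translation(10, 0, 0)

# Scale first, then rotate around the model's own origin, then move it.
world = mat_mul(mat_mul(s, r), t)

# Reversing the order rotates and scales the already-moved model instead.
reversed_order = mat_mul(mat_mul(t, r), s)

local_point = [1, 0, 0, 1]
print(transform(local_point, world))           # close to (10, 0, -2)
print(transform(local_point, reversed_order))  # close to (0, 0, -22)
```

The two results differ, which shows why the order of scaling, rotation, and translation matters when building the world matrix.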
View Coordinate System
Once the model has been placed in world coordinates, you need to specify where you are viewing the 3D space from and what you are looking at. This is the "view transformation". The view transformation is usually represented as a camera.
The parameters required for this transformation are the "camera position", the "camera point of interest", and the "camera upward direction". These three parameters determine the orientation of the camera. The figure below shows the camera from a third-party perspective.
The figure below shows what is actually seen from the camera's point of view in the arrangement shown in the figure above (at this point the coordinates have not yet been converted to the screen, so it is only an illustration).
The explanation so far makes it sound as if the camera is positioned and then the coordinates are transformed, but in the actual calculation it is the world coordinates that are transformed according to the position and orientation of the camera. As a result, the camera position becomes the origin, as shown in the figure below.
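The idea that the world is moved so that the camera ends up at the origin can be checked with a small look-at matrix. This is a sketch of a common right-handed construction for row vectors, which is the role `Matrix.CreateLookAt` plays in XNA; the function `look_at` and the vector helpers below are written for this example and are assumptions, not the library's code.

```python
import math


def sub(a, b):
    return [a[0] - b[0], a[1] - b[1], a[2] - b[2]]


def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def normalize(v):
    length = math.sqrt(dot(v, v))
    return [v[0] / length, v[1] / length, v[2] / length]


def look_at(position, target, up):
    """Right-handed view matrix for row vectors (camera looks down -Z)."""
    z_axis = normalize(sub(position, target))  # from target back to camera
    x_axis = normalize(cross(up, z_axis))      # camera's right
    y_axis = cross(z_axis, x_axis)             # camera's true up
    return [
        [x_axis[0], y_axis[0], z_axis[0], 0],
        [x_axis[1], y_axis[1], z_axis[1], 0],
        [x_axis[2], y_axis[2], z_axis[2], 0],
        [-dot(x_axis, position), -dot(y_axis, position),
         -dot(z_axis, position), 1],
    ]


def transform(v, m):
    """Multiply a row vector (x, y, z, w) by a 4x4 matrix."""
    return [sum(v[k] * m[k][j] for k in range(4)) for j in range(4)]


camera_position = [0.0, 5.0, 10.0]
camera_target = [0.0, 0.0, 0.0]
camera_up = [0.0, 1.0, 0.0]

view = look_at(camera_position, camera_target, camera_up)

# The camera itself lands on the origin of view space...
print(transform(camera_position + [1.0], view))
# ...and the point of interest lies straight ahead on the -Z axis.
print(transform(camera_target + [1.0], view))
```

In view space the point of interest sits on the negative Z axis at a distance equal to the original distance between camera and target.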
Projection Coordinate System
Once you have decided where to view the 3D space from, the next step is to make distant objects appear small and nearby objects appear large. This is called the projection transformation. There are two methods of projection, "perspective projection" and "orthographic projection"; the commonly used perspective projection is illustrated below.
Perspective projection uses the following parameters: Viewing Angle, Aspect Ratio, Forward Clip Position, and Rear Clip Position (the last two are often called the near and far clip planes). The region labeled "frustum" in the figure above is what finally appears on the screen.
"Viewing Angle" specifies the range visible from the camera. Decreasing the angle zooms in; increasing it zooms out. The viewing angle is the vertical angle of the frustum.
Aspect Ratio determines the horizontal angle of view, whereas the viewing angle is a vertical angle. The two are related through their tangents rather than by a simple multiplication of angles: tan(horizontal angle ÷ 2) = tan(viewing angle ÷ 2) × aspect ratio. The aspect ratio is normally the "width ÷ height" of the screen you are drawing to. If you use a different value, the displayed 3D objects appear stretched horizontally or vertically.
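The relationship between the vertical viewing angle, the aspect ratio, and the horizontal angle can be checked numerically. The sketch below assumes a symmetric frustum, where the half-width of the view at a given distance is the half-height multiplied by the aspect ratio; the function name `horizontal_fov` is made up for this example.

```python
import math


def horizontal_fov(vertical_fov, aspect_ratio):
    """Horizontal field of view (radians) for a symmetric frustum.

    The half-width of the view is the half-height times the aspect
    ratio, so the tangents of the half-angles are scaled, not the
    angles themselves.
    """
    return 2.0 * math.atan(math.tan(vertical_fov / 2.0) * aspect_ratio)


vertical = math.radians(45.0)
aspect = 1280 / 720  # width / height of the drawing area

horizontal = horizontal_fov(vertical, aspect)
naive = vertical * aspect  # simply multiplying the angle overestimates

print(math.degrees(horizontal))
print(math.degrees(naive))
```

For narrow angles the two values are close, but the gap grows as the viewing angle or the aspect ratio increases.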
The Forward Clip Position and Rear Clip Position specify the nearest and farthest distances at which objects are displayed. A computer cannot draw out to infinity, so a limit has to be set. These values also affect the precision of the Z-buffer, so it is not recommended to extend the range beyond what actually needs to be displayed.
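The four parameters can be combined into a perspective projection matrix. The sketch below uses a common right-handed, row-vector layout that maps depth to the range 0 to 1, which matches my understanding of what XNA's `Matrix.CreatePerspectiveFieldOfView` produces; treat the exact layout as an assumption. The names `perspective` and `project` are made up for this example.

```python
import math


def perspective(fov_y, aspect, near, far):
    """Right-handed perspective matrix for row vectors, depth in [0, 1]."""
    y_scale = 1.0 / math.tan(fov_y / 2.0)
    x_scale = y_scale / aspect
    return [
        [x_scale, 0, 0, 0],
        [0, y_scale, 0, 0],
        [0, 0, far / (near - far), -1],
        [0, 0, near * far / (near - far), 0],
    ]


def project(point, m):
    """Transform a view-space point and apply the perspective divide."""
    x, y, z = point
    v = [x, y, z, 1.0]
    clip = [sum(v[k] * m[k][j] for k in range(4)) for j in range(4)]
    w = clip[3]  # equals -z, the distance in front of the camera
    return [clip[0] / w, clip[1] / w, clip[2] / w]


proj = perspective(math.radians(90.0), 1.0, 1.0, 100.0)

# Same X and Y offset, different distance from the camera.
near_object = project((1.0, 1.0, -2.0), proj)
far_object = project((1.0, 1.0, -20.0), proj)

print(near_object)  # larger X and Y: appears farther from the screen center
print(far_object)   # smaller X and Y: pulled toward the screen center

# Depth runs from 0 at the forward clip position to 1 at the rear one.
print(project((0.0, 0.0, -1.0), proj)[2])
print(project((0.0, 0.0, -100.0), proj)[2])
```

Note how a point only 2 units away already has a depth of about 0.5 with these settings. Most of the depth range is spent close to the forward clip position, which is one way to see why an unnecessarily wide clip range costs Z-buffer precision.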
After the perspective transformation, objects are mapped into a space like the one below. Objects that were close to the camera are enlarged, and objects that were far away are shrunk.
The diagram below illustrates this more clearly.
Seen from the camera's point of view, the result looks like the following.
The other method of projection is orthographic projection, which projects a visible region like the one below. Because the width and height are constant regardless of depth, the size of an object does not change with depth.
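For comparison, an orthographic projection can be sketched in the same style. The layout assumes a right-handed system, row vectors, and depth mapped to 0 to 1; the function name `orthographic` and its parameters are chosen for this example.

```python
def orthographic(width, height, near, far):
    """Right-handed orthographic matrix for row vectors, depth in [0, 1]."""
    return [
        [2.0 / width, 0, 0, 0],
        [0, 2.0 / height, 0, 0],
        [0, 0, 1.0 / (near - far), 0],
        [0, 0, near / (near - far), 1],
    ]


def transform(v, m):
    """Multiply a row vector (x, y, z, w) by a 4x4 matrix."""
    return [sum(v[k] * m[k][j] for k in range(4)) for j in range(4)]


ortho = orthographic(20.0, 10.0, 1.0, 100.0)

# Same X and Y, very different distance from the camera.
near_object = transform([5.0, 2.0, -2.0, 1.0], ortho)
far_object = transform([5.0, 2.0, -50.0, 1.0], ortho)

print(near_object)
print(far_object)
```

The X and Y results are identical for both points and w stays at 1, so no perspective divide takes place; only the depth value differs.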
Screen Coordinate System
After the projection transformation, coordinates are converted to the coordinates of the actual screen. Although we say "screen", the position and extent of the display depend on the viewport set on the device. In games, however, the viewport usually matches the client area of the window as it is, so I don't think you need to worry about it too much.
The screen coordinates (0, 0) correspond to the projection coordinates (-1, 1, z). Similarly, the screen coordinates (width, height) correspond to the projection coordinates (1, -1, z).
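The mapping between these corner points can be written as a short function. This is a minimal sketch that assumes a viewport described by its top-left corner and size, with the screen Y axis pointing down; the name `to_screen` and its parameters are made up for this example.

```python
def to_screen(projected, width, height, left=0, top=0):
    """Map projection coordinates (-1..1 in X and Y) to screen pixels."""
    x, y, z = projected
    screen_x = left + (x + 1.0) / 2.0 * width
    screen_y = top + (1.0 - y) / 2.0 * height  # Y is flipped: +1 is the top
    return (screen_x, screen_y, z)


width, height = 800, 600

print(to_screen((-1.0, 1.0, 0.5), width, height))   # top-left corner
print(to_screen((1.0, -1.0, 0.5), width, height))   # bottom-right corner
print(to_screen((0.0, 0.0, 0.5), width, height))    # center of the screen
```

The z value is passed through unchanged here; in a real pipeline it would be used for depth testing rather than for positioning on the screen.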