A crude (but simple) approach to camera handling
Rotating the camera
At this stage, our renderer has no explicit concept of a camera:
depth (the \(z\)-coordinate) does not influence the size or shape of objects, and is largely ignored.
We do use the \(z\)-coordinate in the \(z\)-buffer to hide occluded surfaces, but this is a minor detail.
Ignoring depth leads to orthographic projection, where objects retain their dimensions regardless of distance from the viewer.
As before, our renderer outputs two images: zbuffer.tga and framebuffer.tga.
To simulate a camera, we transform the scene rather than moving the camera itself. For example, instead of rotating the camera to the left, we rotate the scene to the right. This keeps the camera fixed conceptually, and shifts the viewpoint via model-view transformations. This method simplifies camera handling and offers intuitive control over the visible scene.
See this commit:
Rotating the Object
In this example, I apply the function vec3 rot(vec3 v) (lines 1–5) to each model vertex (lines 24–26) before projection.
This rotates each vertex by \(30^\circ\) around the \(y\)-axis to the right, giving the impression of rotating the camera to the left:
Here, a constant rotation matrix is used. Thanks to the previous homework, we now have basic vector and matrix operations available. If you're not familiar with rotation matrices, don't worry: we'll soon explore alternative model-view transformations. However, be sure to review basic vector math, as it will be essential.
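To make the idea concrete, here is a minimal sketch of such a function, with a bare-bones vec3 type assumed for self-containment (the actual code, using proper matrix classes, is in the commit linked above):

```cpp
#include <cmath>

struct vec3 { double x, y, z; };    // assumed minimal vertex type

// Rotate a vertex by 30 degrees around the y-axis. Rotating every vertex
// of the scene one way is equivalent to rotating the camera the other way.
vec3 rot(vec3 v) {
    const double a = std::acos(-1.) / 6;    // 30 degrees in radians
    return {  std::cos(a) * v.x + std::sin(a) * v.z,
              v.y,
             -std::sin(a) * v.x + std::cos(a) * v.z };
}
```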
Central projection
Orthographic projection is useful, but central (or perspective) projection offers more realism: closer objects appear larger than distant ones.
Consider projecting a point \(P = (x, y, z)\) onto the plane \(z=0\) using a camera located at \(C = (0, 0, c)\) on the \(z\)-axis:
The projection point \(P'\) lies at the intersection of line \(CP\) and the screen plane \(z=0\). Given \(P = (x, y, z)\) and camera parameter \(c\), we want to find \(P' = (x', y', 0)\).
Let's first compute \(x'\), working in the plane \(y = 0\). Using the intercept theorem (the triangles with apex \(C\) are similar):

\[\frac{x'}{x} = \frac{c}{c - z} \quad\Longrightarrow\quad x' = \frac{x}{1 - \frac{z}{c}}\]

Similarly, in the plane \(x = 0\):

\[y' = \frac{y}{1 - \frac{z}{c}}\]
As expected, both expressions are structurally similar. So, just like rotation, we implement perspective projection by transforming vertices: we replace each vertex \((x, y, z)\) with \((x, y, z) \cdot \frac{1}{1 - \frac{z}{c}}\). We then apply orthographic projection as usual. See lines 7–10 and 29–31:
Central Projection
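As a sketch, the whole perspective deformation amounts to one division per vertex; the vec3 type and the value of c below are illustrative assumptions:

```cpp
struct vec3 { double x, y, z; };    // assumed minimal vertex type

// Deform the scene for central projection with the camera at (0, 0, c):
// divide each coordinate by (1 - z/c), then project orthographically as before.
vec3 persp(vec3 v) {
    const double c = 3.;                    // camera distance, arbitrary here
    const double k = 1. / (1. - v.z / c);   // perspective factor
    return { v.x * k, v.y * k, v.z * k };
}
```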
Here is the resulting image (see commit):
Personally, I find it much more convincing than the previous image, rendered with an orthographic projection.
If this is unclear, consider the following 2D example. Suppose we have a polygon with vertices: \((2,0)\), \((0,2)\), \((-2, 2)\), \((-2,-2)\), and \((2,-2)\):
The camera is at \(C = (10, 0)\). We want to project all vertices onto the green line. Each vertex \((x, y)\) is transformed to:

\[\left(\frac{x}{1 - \frac{x}{10}},\ \frac{y}{1 - \frac{x}{10}}\right)\]

New coordinates:

\((2,0) \to (2.5,\, 0)\), \((0,2) \to (0,\, 2)\), \((-2,2) \to (-\tfrac{5}{3},\, \tfrac{5}{3})\), \((-2,-2) \to (-\tfrac{5}{3},\, -\tfrac{5}{3})\), \((2,-2) \to (2.5,\, -2.5)\)
Here is the deformed object:
We can now apply an orthographic projection to these transformed points to get the desired central projection.
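To double-check the arithmetic, here is a small standalone program (purely illustrative) that applies the 2D deformation to the five vertices and reproduces the coordinates listed above:

```cpp
#include <cstdio>

int main() {
    const double c = 10.;   // the 2D camera sits at (10, 0)
    const double pts[5][2] = { {2,0}, {0,2}, {-2,2}, {-2,-2}, {2,-2} };
    for (const auto& p : pts) {
        const double k = 1. / (1. - p[0] / c);   // perspective factor
        std::printf("(%g, %g) -> (%g, %g)\n", p[0], p[1], p[0] * k, p[1] * k);
    }
    return 0;
}
```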
Homework assignment: find the bug
Everything works well, except with the Diablo model: the hand closest to the viewer is somewhat clipped:
Your task is to understand why, and how to fix it.
Spoiler Alert!
The \(z\)-buffer is stored as an 8-bit grayscale image.
Spoiler Alert #2!
The model was assumed to fit within the cube \([-1, 1]^3\), but this doesn't hold after rotation.
Spoiler Alert #3!
The hand's depth exceeds 1, so after the viewport transformation the corresponding values exceed 255, overflowing the 8-bit integers. While storing the \(z\)-buffer as an image is handy for debugging, it is better to use a 2D array of floating-point values. To visualize it, write a few lines of code to convert the depth array into an image.
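Here is one possible sketch of the fix, keeping depth in a floating-point array and converting it to an image only for visualization. The TGAImage class is the one used throughout this course; the \([-1, 1]\) depth range and the brace-initialization of the color are assumptions:

```cpp
#include <algorithm>
#include <cstdint>
#include <limits>
#include <vector>
#include "tgaimage.h"

int main() {
    constexpr int width = 800, height = 800;

    // Full-precision depth buffer instead of an 8-bit grayscale image.
    std::vector<double> zbuffer(width * height, -std::numeric_limits<double>::max());

    // ... rasterize the scene here, updating zbuffer[x + y * width] ...

    // Convert the depth array to a grayscale image for visualization only.
    TGAImage zimage(width, height, TGAImage::GRAYSCALE);
    for (int i = 0; i < width * height; i++) {
        const double z = std::clamp(zbuffer[i], -1., 1.);  // assumed depth range
        const std::uint8_t v = static_cast<std::uint8_t>(255 * (z + 1) / 2);
        zimage.set(i % width, i / width, { v });           // assumed TGAColor init
    }
    zimage.write_tga_file("zbuffer.tga");
    return 0;
}
```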