When exploring a new domain, one of the most exciting and difficult parts is learning the new concepts and terminology and then writing code that realizes the models and algorithms of that domain. Computer graphics spans a wide range of domains, but they ultimately boil down to interpreting data and then rendering a scene as an image. While interpreting data is what all programs do, I find learning computer graphics to be a unique experience compared to other domains. In the end, there are three competing factors which I keep in mind: intentions, definitions, and implementations.

Intentions

Code is written with an intention. The code may not be bug-free or efficient or even correct, but there is always an underlying intent behind what the code should do. Unlike most domains, graphics also mixes an artistic desire into the intention. The final image could be a photorealistic representation of a scene, a cel-shaded image like those in games, animation, and comic books, or one of many other stylized representations. While most business applications have requirements based on processes or rules, graphics rendering leaves more room for creativity, which can lead to a greater variety of intentions.

Definitions

While the underlying intention is a major influence, realizing the intention begins with defining models which describe how things should work. For instance, the intention could be a photorealistic scene, but there still needs to be a definition for how concepts like light should be rendered. A model for light can be built from math and physics. Definitions for light range from simple approximations to complex physically based models. Lights, shadows, materials, and everything else which can affect a rendered scene can have various definitions depending on the needs and the level of understanding.
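To make "a definition for light" a bit more concrete, here is a minimal sketch of one of the simplest models, Lambertian diffuse shading, where brightness depends only on the angle between the surface normal and the direction to the light. This is my own illustration in Python rather than anything tied to a particular graphics API, and the names (`lambert_diffuse`, `albedo`, `light_intensity`) are made up for the example.

```python
import math


def normalize(v):
    """Return v scaled to unit length (v is a 3-tuple of floats)."""
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)


def dot(a, b):
    """Dot product of two 3-tuples."""
    return sum(x * y for x, y in zip(a, b))


def lambert_diffuse(normal, to_light, albedo, light_intensity=1.0):
    """Lambertian diffuse term: albedo * intensity * max(0, N . L).

    normal    -- surface normal at the shaded point
    to_light  -- direction from the shaded point toward the light
    albedo    -- surface color as an (r, g, b) tuple in [0, 1]
    """
    n = normalize(normal)
    l = normalize(to_light)
    # Surfaces facing away from the light receive no diffuse light.
    n_dot_l = max(0.0, dot(n, l))
    return tuple(light_intensity * a * n_dot_l for a in albedo)


if __name__ == "__main__":
    # Light directly above a surface facing up: full albedo comes back.
    print(lambert_diffuse((0, 0, 1), (0, 0, 1), (1.0, 0.5, 0.25)))
```

A physically based model would replace this single dot product with terms for roughness, reflectance, and energy conservation, which is where the definition starts to compete with what the implementation can afford.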

Implementations

Ultimately, there is the implementation of the definitions in code. Code is shaped by the available APIs and by resources such as the underlying hardware. Whether the scene is rendered in real time or offline, code must be written to realize the desired model, and the limitations of the APIs and hardware can constrain how fully the model is realized.

In the end, the rendered scene is molded by intentions, definitions, and implementations. In many cases, one factor influences the others. For example, with a given definition of how light data should be processed, the hardware may limit what can be displayed on screen at one time, and the underlying vision of the intention may then be compromised. Beyond the wide range of domains you need to understand in order to render a scene, perhaps the most interesting thing about graphics is trying to realize the original intention by figuring out ways to make definitions and implementations more efficient.

Metal is fairly specialized knowledge, partly because it is a proprietary API only available on Apple platforms. Furthermore, most Apple developers use higher-level frameworks for drawing to the screen, so there is not a huge demand for tutorials or books about Metal.

Still, there are a few resources which are available:

There are also samples and other notes in Apple’s archived documentation. Metal usually gets at least 3 WWDC sessions every year, so it is still an evolving technology that Apple is investing in.

Also, I’ve found 3 books related to Metal:

If I were to choose one, I would pick Metal by Tutorials. It gives the basic information needed to start building many game rendering engine features. The book can be a bit overwhelming because it usually gives only a brief introduction to a rendering engine concept and then shows a code example through a mini-project. If you do not understand the concept, the code can be hard to follow, so you may need to use other resources. The book does provide references for further reading, and it was recently updated for Metal advancements up to iOS 13 in 2019.

I hope to have a more robust list of math and general graphics programming books in a future blog post.

One of my relatively recent interests is GPU programming. Graphics programming has always fascinated me because every modern computer has this other part which is kind of like the CPU but has its own unique way of processing data. Whether the GPU is part of an integrated system on a chip or a discrete chip with its own memory, it can efficiently process large amounts of data in parallel. However, most programmers rarely use the GPU directly because their problem domains are not suited to the type of parallel processing which the GPU is great at.

So why do I have a recent interest in GPUs? My theory is that GPUs may play a bigger role in the next generation of devices capable of features like AR. Parallel data processing, whether it is displaying and manipulating image data on a screen or running machine learning algorithms to recognize hand gestures or your voice, will utilize the GPU or similar processors. Of course, GPU programming will still be specialized domain knowledge, and there will be many libraries and off-the-shelf tools and engines which more people will use rather than programming the GPU directly. The market for programming web services will always be orders of magnitude greater than the market for graphics programming. But sometimes it is good to work with some of the more foundational API layers, somewhat closer to the hardware. Plus, unlike most business logic programming, if you learn graphics programming, you can output something pretty versus some histogram or bar chart.

In particular, I am trying to learn Metal, Apple’s low-level API for GPU programming. Metal powers all of Apple’s platforms. For instance, most of Apple’s UI and game frameworks use Metal. I’m picking up Metal mostly for practical reasons: I have many Apple devices, and Metal is the preferred way to access the GPU on them. DirectX and Vulkan would be equally nice to learn but are unavailable or unsupported on Apple platforms.

As an aside, for browsers, WebGL is fine, but because it evolved from OpenGL ES, the API seems a couple of generations behind. If you are interested in WebGL and are new to graphics programming (or need a refresher), I recommend the WebGL Programming Guide book. The book requires only basic JavaScript knowledge, and it gives enough to jump-start simple rendering projects. I picked up the WebGL book after reading a few chapters in a couple of Metal books because the Metal programming was already becoming complicated and I wanted to see how the same things looked in WebGL for comparison. From my very brief foray, WebGL is capable but a bit awkward (e.g. setting up the shaders is cumbersome); WebGL made me appreciate Metal more. Maybe WebGPU will be better.

In the end, I expect to stumble quite a bit since this is largely a new domain for me. As I progress through some books and projects, I will try to blog a bit about what I’ve learned along the way.