Chapter 15. 3D Deep Learning with PyTorch
This is where things get a bit more complex. Handling 3D data in deep learning architectures is quite different from working with text or images. But don’t worry: we’ll tackle it step by step at a comfortable pace. One of the primary hurdles lies in data representation. 3D data can be represented in various formats, such as point clouds, 3D meshes, or voxel grids, as shown in Chapter 4.
Do you remember our earlier experiments with 3D data structures and representations? You’ll be glad to know that they are crucial when working with 3D deep learning. In fact, the choice of 3D data representation strongly shapes the architecture and paradigms of your 3D deep learning solution. At this stage, we can distinguish four data representations supported by 3D deep learning approaches: 3D point clouds, 3D voxel grids, 3D meshes, and multiview image datasets.
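To make these four representations concrete, here is a minimal sketch of how each one might look as a plain PyTorch tensor. The variable names, the sizes (1,024 points, a 16-cell grid, 12 views of 224 x 224 pixels), and the random or empty data are illustrative assumptions rather than conventions from this book or from any particular library; dedicated 3D libraries typically wrap these tensors in richer structures.

```python
import torch

torch.manual_seed(0)  # reproducible toy data

# 1. Point cloud: N unordered points, one (x, y, z) row each -> shape (N, 3).
#    Toy data: random points inside the unit cube [0, 1).
points = torch.rand(1024, 3)

# 2. Voxel grid: a regular lattice of cells -> shape (D, H, W).
#    Here a boolean occupancy grid, filled by binning the points above.
resolution = 16  # assumed grid size, chosen only for illustration
cell = (points * resolution).long().clamp(max=resolution - 1)
voxels = torch.zeros(resolution, resolution, resolution, dtype=torch.bool)
voxels[cell[:, 0], cell[:, 1], cell[:, 2]] = True

# 3. Mesh: float vertices (V, 3) plus integer faces (F, 3), where each
#    face row holds three indices into the vertex tensor. This is a tetrahedron.
vertices = torch.tensor([
    [0.0, 0.0, 0.0],
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
])
faces = torch.tensor([
    [0, 1, 2],
    [0, 1, 3],
    [0, 2, 3],
    [1, 2, 3],
])

# 4. Multiview images: a stack of 2D renderings of one object
#    -> shape (views, channels, height, width). Empty placeholders here.
views = torch.zeros(12, 3, 224, 224)

for name, tensor in [("points", points), ("voxels", voxels),
                     ("vertices", vertices), ("faces", faces),
                     ("views", views)]:
    print(f"{name:<9} shape={tuple(tensor.shape)} dtype={tensor.dtype}")
```

Notice how different the shapes and data types are: an unordered list of coordinates, a dense grid, a pair of tensors linked by indices, and a batch of ordinary images. That difference is a large part of why each representation tends to call for its own family of architectures.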
Each representation has its strengths and challenges, influencing the choice of which deep learning architecture is the right fit. Jumping into the specifics at this stage would be like throwing you into the deep end before you’ve learned to swim. You might feel overwhelmed by layers of complexity and end up completely lost. Jokes aside, let me first share some key concepts and tools that will be helpful before we dive into 3D deep learning architectures. I structured this chapter to guide you through the fundamentals of 3D deep learning with PyTorch (Figure 15-1 ...