CHAPTER 12
Data Engineering on the Cloud
In the early days of computing, computers were very large and expensive. Organizations that could afford them had mainframes housed in dedicated, temperature-controlled rooms, and users interacted with them through terminals. With this setup, every single byte of processing or storage was managed internally. As personal computers and servers became more affordable, many companies transitioned to building their own on-premises (in-house) infrastructure. This meant buying physical servers, installing them in racks, and having a dedicated IT team to manage everything from hardware maintenance to software updates.
However, this setup had several limitations. First, it required heavy capital investment because companies had to predict their future computing needs, which often changed. If they underestimated, they couldn’t handle sudden spikes in traffic. If they overestimated, they wasted money on hardware that wouldn’t be used. Second, maintaining on-premises systems was complex. IT teams had to worry about cooling, power supply, backups, disaster recovery, hardware failures, and security, all while trying to support the business’s evolving data needs.
Later on, a game-changing idea began to emerge, framed by a simple question: what if companies could access computing power over the Internet, without having to buy and maintain the hardware themselves? This idea wasn’t entirely new, because something similar was happening ...