Designing systems for data analytics is a bit of a trapeze act: you have to deliver frontend convenience and access without compromising on backend precision and speed. Whether you are upgrading current solutions or rolling out a greenfield platform, planning an analytics workload today requires making a lot of tough decisions, including:
Deciding whether to leave data “in place” and analyze it on the fly, or to build an unstructured “data lake” and copy or move data into it
Selecting a consolidated analytics and visualization frontend tool that provides ease of use without compromising on control
Picking a backend processing framework that maintains performance even while you’re analyzing mountains of data
Providing a full end-to-end solution requires not only evaluating a dozen or so technologies for each tier, but also looking at their many permutations. And at each juncture, we must remember that we’re not just building technology for technology’s sake; we’re hoping to provide analytics (often in near real time) that drive insight, action, and better decision making. Making decisions based on “data, not opinions” is the endgame, and the technologies we choose must always serve that goal. It’s a dizzying task.
Only by looking at the past, present, and future of the technologies can we have any hope of providing a realistic view of the challenges ...