High performance computing (HPC) data center. (source: Department of Energy on Flickr)

Do you really need yet another database?

Yes, in the age of big data, you do—and it’s called an analytical database. It’s the right tool for the job when it comes to transforming large amounts of raw data into actionable intelligence.

After all, the purpose of a big data initiative is to use data to make better business decisions. That requires very fast query turnaround. Yet with the volumes of data accruing at most enterprises today, such turnaround is increasingly difficult to achieve with traditional data warehouses.

Although relational databases can support data warehouses for analytics and business intelligence (BI), an analytical database offers performance and scalability advantages over traditional database software, especially as data volumes continue to grow faster than IT budgets.

Of course, businesses can go the data lake route by setting up a data repository on Hadoop that can accommodate even the heaviest floods of data. But this won’t help much with performing the analytics that are necessary to glean knowledge from all the bits and bytes.

To be fair, Hadoop is an extremely cost-effective way to store and process large volumes of structured or unstructured data, and it is designed to optimize batch jobs. But fast, it is not. When time isn’t a constraint, Hadoop can be a boon. For more urgent business analytics tasks, however, it’s not a big data panacea.

This is where analytical databases come in. Analytical databases typically sit next to the system of record (whether that’s Hadoop, Oracle, or Microsoft) to perform speedy analytics on big data. They are the right tool for fast, powerful, actionable analytics.

You should choose your analytical database based on three factors: the structure of your data, the size of your data, and the types of questions you want to ask of your data.

Criteria for choosing an analytical database

Let’s look a little more closely at each of these factors:

  • Structure: Does your data fit into a nice, clean data model? Or is it unstructured, with a schema that is unclear, dynamic, or both? Different analytical databases have different specialties when it comes to structured or unstructured data.
  • Size: Is your data “big data” or does it have the potential to grow into big data? How big is big? If your answers are “yes” and “very big,” you need an analytical database that can scale appropriately.
  • Analytics: What questions do you want to ask of the data? Will you run short-running queries, or deeper, longer-running, or predictive queries?
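
One way to make these three factors concrete is to write them down as a workload profile before evaluating products. The sketch below is only an illustration: the `Workload` class, the `requirements` helper, and the 10 TB threshold are hypothetical names and values invented here, not part of any product or standard.

```python
from __future__ import annotations

from dataclasses import dataclass
from enum import Enum


class Structure(Enum):
    STRUCTURED = "structured"
    UNSTRUCTURED = "unstructured"


class QueryStyle(Enum):
    SHORT = "short-running"
    DEEP = "deep or long-running"
    PREDICTIVE = "predictive"


# Hypothetical cutoff for "big"; pick a value that fits your own environment.
BIG_DATA_THRESHOLD_TB = 10.0


@dataclass
class Workload:
    structure: Structure
    terabytes_now: float
    expected_to_grow: bool
    query_styles: set[QueryStyle]


def requirements(w: Workload) -> list[str]:
    """Turn the three factors (structure, size, analytics) into a checklist."""
    needs = []
    if w.structure is Structure.UNSTRUCTURED:
        needs.append("handles unstructured or dynamic-schema data")
    else:
        needs.append("optimized for a clean, well-defined data model")
    if w.terabytes_now >= BIG_DATA_THRESHOLD_TB or w.expected_to_grow:
        needs.append("scales out as data volume grows")
    # Sort by label so the checklist order is stable across runs.
    for style in sorted(w.query_styles, key=lambda s: s.value):
        needs.append(f"supports {style.value} queries")
    return needs


if __name__ == "__main__":
    profile = Workload(
        structure=Structure.UNSTRUCTURED,
        terabytes_now=2.0,
        expected_to_grow=True,
        query_styles={QueryStyle.SHORT, QueryStyle.PREDICTIVE},
    )
    for line in requirements(profile):
        print("-", line)
```

The resulting checklist is a starting point for comparing candidates against your own data, not a ranking of any particular database.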

Of course, you have other considerations when choosing an analytical database. You’ll want to consider the total cost of ownership (TCO) based on the cost per terabyte. A very important point to consider is your staff’s familiarity with the database technology involved. (Hadoop is notoriously difficult and unfriendly to new users.) Finally, you’ll want to take into account the openness of the database in question and the size of the user community and ecosystem surrounding it.
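
To compare candidates on cost per terabyte, it can help to put the arithmetic in one place. The sketch below assumes TCO is the sum of yearly license, hardware, and staff costs over a fixed period; that breakdown, the three-year default, and every figure in the example are hypothetical placeholders rather than real pricing.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    license_per_year: float   # hypothetical cost categories; adjust to your own
    hardware_per_year: float
    staff_per_year: float
    usable_terabytes: float

    def total_cost(self, years: int = 3) -> float:
        """Total cost of ownership over the evaluation period."""
        yearly = self.license_per_year + self.hardware_per_year + self.staff_per_year
        return years * yearly

    def cost_per_terabyte(self, years: int = 3) -> float:
        """TCO divided by the usable capacity the system provides."""
        return self.total_cost(years) / self.usable_terabytes


if __name__ == "__main__":
    # Made-up numbers purely for illustration.
    candidates = [
        Candidate("option_a", 100_000, 50_000, 150_000, 200),
        Candidate("option_b", 0, 80_000, 250_000, 200),
    ]
    for c in sorted(candidates, key=lambda c: c.cost_per_terabyte()):
        print(f"{c.name}: {c.cost_per_terabyte():,.0f} per TB over 3 years")
```

Note how, in this made-up comparison, the option with no license fee ends up costing more per terabyte once staff costs are counted, which is why staff familiarity belongs in the calculation.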

What should drive your database investment decision

In the end, the forces that drive your database investment decision are the same ones that drive IT decisions in general. You want to:

  • Increase revenues: You do this by investing in a big data analytics solution that allows you to reach more customers, develop new product offerings, focus on customer satisfaction, and understand your customers’ buying patterns.
  • Enhance efficiency: You accomplish this by choosing a big data analytics solution that reduces software-licensing costs, performs processes more efficiently, takes advantage of new data sources effectively, and accelerates the speed at which information is turned into knowledge.
  • Improve compliance: Finally, you must choose an analytical database that helps you comply with local, state, federal, and industry regulations, and that ensures your reporting passes the robust tests that today’s regulatory mandates place on it. Plus, your database must be secure to protect the privacy of the information it contains, so that it’s not stolen or exposed to the world.

In short, the right analytics database will help drive business success. Choose wisely.

This post is a collaboration between O’Reilly and HPE Vertica. See our editorial statement of independence.
