Layers of Metrics
At Flickr, we have at least three different types of metrics collection:
Application and business-level metrics stored in a database (MySQL) at daily resolution, gathered nightly (anything finer than daily isn't needed for these purposes)
Feature-specific, application-level metrics stored in a database (MySQL) in real time (as those events happen)
Higher-resolution system- and service-level metrics stored in RRD files (with Ganglia) at 15- or 60-second intervals
High-Level Business or Feature-Specific Metrics
These metrics usually track website-specific events. In the case of Flickr, this means values for photos uploaded, user registrations, average photo size, total disk space consumption, Pro accounts sold, help cases logged and resolved, and so on. Because these are generally used for long-term trending, to forecast product or capacity planning needs, daily resolution is fine. Collecting them more than once per day wouldn't change any of the results; it would only increase the time it takes to run reports and make the data a pain to move around. Gathering these metrics once a day can be as simple as a nightly cron job working on a replicated slave database kept solely for crunching these numbers.
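As a rough illustration of such a nightly job, here is a minimal sketch of a daily rollup. It is not Flickr's actual code: the table names (`photos`, `daily_metrics`), the column names, and the function `rollup_daily_metrics` are all hypothetical, and Python's built-in `sqlite3` module stands in for MySQL so the example runs anywhere.

```python
import sqlite3
from datetime import date


def make_demo_db():
    """Build an in-memory database with a hypothetical schema and sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE photos (
            id INTEGER PRIMARY KEY,
            uploaded_at TEXT,      -- 'YYYY-MM-DD HH:MM:SS'
            size_bytes INTEGER
        );
        CREATE TABLE daily_metrics (
            day TEXT PRIMARY KEY,  -- one row per calendar day
            photos_uploaded INTEGER,
            avg_photo_bytes REAL
        );
        """
    )
    conn.executemany(
        "INSERT INTO photos (uploaded_at, size_bytes) VALUES (?, ?)",
        [
            ("2010-03-01 09:15:00", 1000),
            ("2010-03-01 18:40:00", 3000),
            ("2010-03-02 00:05:00", 5000),
        ],
    )
    conn.commit()
    return conn


def rollup_daily_metrics(conn, day):
    """Summarize one day's uploads into a single daily_metrics row."""
    day_str = day.isoformat()
    photos, avg_bytes = conn.execute(
        "SELECT COUNT(*), AVG(size_bytes) FROM photos WHERE DATE(uploaded_at) = ?",
        (day_str,),
    ).fetchone()
    avg_bytes = avg_bytes or 0.0  # AVG is NULL when no photos were uploaded
    # INSERT OR REPLACE is SQLite syntax; on MySQL the equivalent would be
    # REPLACE INTO or INSERT ... ON DUPLICATE KEY UPDATE.
    conn.execute(
        "INSERT OR REPLACE INTO daily_metrics "
        "(day, photos_uploaded, avg_photo_bytes) VALUES (?, ?, ?)",
        (day_str, photos, avg_bytes),
    )
    conn.commit()
    return photos, avg_bytes


if __name__ == "__main__":
    conn = make_demo_db()
    print(rollup_daily_metrics(conn, date(2010, 3, 1)))
```

Keying the summary table on the day makes the job safe to rerun: a second run for the same date overwrites the row rather than duplicating it. In practice a script like this would be invoked from cron shortly after midnight for the previous day.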
Because we store these metrics in a database, manipulating or correlating data across different metrics is pretty straightforward: every metric is recorded against the same date, so the date is held constant across metrics.
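To make the idea of a shared date concrete, the following sketch joins two daily metrics on their date column and derives a ratio from them. The tables `daily_uploads` and `daily_disk`, their columns, the sample figures, and the function `bytes_per_photo_by_day` are invented for illustration, and `sqlite3` again stands in for MySQL.

```python
import sqlite3


def make_metrics_db():
    """Build an in-memory database holding two hypothetical daily metrics."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE daily_uploads (day TEXT PRIMARY KEY, photos_uploaded INTEGER);
        CREATE TABLE daily_disk (day TEXT PRIMARY KEY, bytes_added INTEGER);
        """
    )
    conn.executemany(
        "INSERT INTO daily_uploads VALUES (?, ?)",
        [("2010-03-01", 100), ("2010-03-02", 200)],
    )
    conn.executemany(
        "INSERT INTO daily_disk VALUES (?, ?)",
        [("2010-03-01", 250000), ("2010-03-02", 600000)],
    )
    conn.commit()
    return conn


def bytes_per_photo_by_day(conn):
    """Join the two metrics on their shared date and derive disk bytes per photo."""
    return conn.execute(
        """
        SELECT u.day,
               u.photos_uploaded,
               d.bytes_added,
               CAST(d.bytes_added AS REAL) / u.photos_uploaded AS bytes_per_photo
        FROM daily_uploads AS u
        JOIN daily_disk AS d ON d.day = u.day
        WHERE u.photos_uploaded > 0   -- avoid dividing by zero on quiet days
        ORDER BY u.day
        """
    ).fetchall()


if __name__ == "__main__":
    for row in bytes_per_photo_by_day(make_metrics_db()):
        print(row)
```

Because both tables use the day as their key, the join needs no time-window alignment or resampling, which is the practical benefit of keeping these metrics at a single daily resolution.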
For example, it might not be a surprise that during the ...