In Proceedings of the 38th Conference on Very Large Databases (VLDB), 2012.
As the complexity of enterprise systems increases, the need for
monitoring and analyzing such systems also grows. A number of
companies have built sophisticated monitoring tools that go far beyond
simple resource utilization reports. For example, based on
instrumentation and specialized APIs, it is now possible to monitor
single method invocations and trace individual transactions across
geographically distributed systems. This high level of detail enables
more precise forms of analysis and prediction but comes at
the price of high data rates (i.e., big data). To maximize the benefit
of monitoring, the data has to be stored for an extended period
of time for later analysis. This new wave of big data analytics
imposes new challenges especially for the application performance
monitoring systems. The monitoring data has to be stored in a system
that can sustain the high data rates and at the same time enable
an up-to-date view of the underlying infrastructure. With the advent
of modern key-value stores, a variety of data storage systems
have emerged that are built with a focus on scalability and on the high data
rates predominant in this monitoring use case.
In this work, we present our experience and a comprehensive performance evaluation of six modern (open-source) data stores in the context of application performance monitoring as part of a CA Technologies initiative. We evaluated these systems with data and workloads that can be found in application performance monitoring, as well as in on-line advertisement, power monitoring, and many other use cases. We present our insights not only as performance results but also as lessons learned and our experience with the setup and configuration complexity of these data stores in an industry setting.