Distributed key-value stores have become the solution of choice for warehousing large volumes of data. However, their architecture is not suitable for real-time analytics, as batch processing is a time-intensive task. To achieve the required velocity, materialized views can be used to provide summarized data for faster access. The main challenge is the incremental, consistent maintenance of views at large scale. Thus, we introduce our View Maintenance System (VMS) to maintain views defined by SQL-like queries in a data-intensive real-time scenario. VMS can be scaled independently and at the same time provides consistency guarantees, even under high update loads. We evaluate our full-fledged implementation of VMS on top of Apache HBase using a synthetic as well as a TPC-H workload. Exploiting parallel maintenance, VMS manages thousands of views in parallel, achieves up to 1M view updates per second, and provides <5 ms access to view data.