In SIGMOD PhD Symposium, 2012.
The efficient processing of large collections of patterns (Boolean expressions, XPath expressions, or continuous SQL queries) over data streams plays a central role in major data-intensive applications, ranging from user-centric processing and personalization to real-time data analysis. On the one hand, emerging user-centric applications, including computational advertising and selective information dissemination, demand determining and presenting to an end user only the most relevant content, which must be both user-consumable and suitable for the limited screen real estate of target (mobile) devices. We meet these user-centric requirements through novel high-dimensional indexing structures and (parallel) algorithms. On the other hand, applications in real-time data analysis, including computational finance and intrusion detection, demand stringent sub-second processing with high-frequency, low-latency event processing over data streams. We meet these real-time requirements by leveraging reconfigurable hardware (FPGAs) to sustain line-rate processing, exploiting the unprecedented degrees of parallelism and potential for pipelining that are available only through custom-built, application-specific, low-level logic design. Finally, we conduct a comprehensive evaluation to demonstrate the superiority of our proposed techniques in comparison with state-of-the-art algorithms designed for event processing.