How to Conduct Distributed Incomplete Pattern Matching

By S. Liu📧, L. Kang, L. Chen, and L. M. Ni

In IEEE Transactions on Parallel and Distributed Systems, 2014, 25 (4): 982–992. https://doi.org/10.1109/TPDS.2013.128

In this paper, we first propose a very interesting and practical problem, pattern matching in a distributed mobile environment (e.g., mobile phone networks), where one person’s pattern could be separately stored in a number of different stations, and such a local pattern is incomplete compared with the global pattern. A simple solution to pattern matching over a mobile environment is to collect all the data distributed in base stations to a data center and conduct pattern matching at the data center afterwards. Clearly, such a simple solution will raise huge amount of communication traffic, which could cause the communication bottleneck brought by the limited wireless bandwidth to be even worse. Therefore, a communication efficient and search effective solution is necessary. In our work, we present a novel solution which is based on our well-designed Weighted Bloom Filter (WBF), called, Distributed Incomplete pattern matching (DI-matching), to find target patterns over a distributed mobile environment. Specifically, to save communication cost and ensure pattern matching in distributed incomplete patterns, we use WBF to encode a query pattern and disseminate the encoded data to each base station. Each base station conducts a local pattern search according to the received WBF. Only qualified IDs and corresponding weights in each base station are sent to the data center for aggregation and verification. Through non-trivial theoretical analysis and extensive empirical experiments on a real city-scale mobile networks data set, we demonstrate the effectiveness and efficiency of our proposed solutions.

Keywords: Incomplete pattern matching; Time series; Distributed mobile environment; Weighted bloom filter