arXiv:1207.0302 [math.PR]

Towards More Realistic Probabilistic Models for Data Structures: The External Path Length in Tries under the Markov Model

Kevin Leckey, Ralph Neininger, Wojciech Szpankowski

Published 2012-07-02, updated 2012-09-18 (Version 3)

Tries are among the most versatile and widely used data structures on words. They are pertinent to the (internal) structure of (stored) words and to several splitting procedures used in diverse contexts, ranging from document taxonomy to IP address lookup, from data compression (e.g., the Lempel-Ziv'77 scheme) to dynamic hashing, from partial-match queries to speech recognition, and from leader election algorithms to distributed hash tables and graph compression. While the performance of tries under a realistic probabilistic model is of significant importance, its analysis, even for the simplest memoryless sources, has proved difficult. Inherently complex parameters have rarely been analyzed rigorously (with a few notable exceptions) under more realistic models of string generation. In this paper we meet these challenges: by a novel use of the contraction method combined with analytic techniques, we prove a central limit theorem for the external path length of a trie under a general Markov source. In particular, our results apply to the Lempel-Ziv'77 code. We envision that the methods described here will have further applications to other trie parameters and data structures.
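To make the central quantity concrete, the following is a minimal illustrative sketch, not the authors' method or code. It assumes a binary alphabet, a first-order Markov chain, and strings truncated to a fixed length in place of the infinite strings of the probabilistic model. The names `markov_string` and `external_path_length` and the example transition matrix are hypothetical. The external path length is computed as the sum of the depths of all leaves, where each string sits at the depth of its shortest prefix not shared with any other stored string.

```python
import random


def markov_string(length, transition, initial, rng):
    """Draw a binary string of the given length from a first-order Markov chain.

    transition[a] holds the weights for the next symbol given current symbol a;
    initial holds the weights for the first symbol. (Hypothetical helper.)
    """
    symbols = [0, 1]
    current = rng.choices(symbols, weights=initial)[0]
    out = [current]
    for _ in range(length - 1):
        current = rng.choices(symbols, weights=transition[current])[0]
        out.append(current)
    return tuple(out)


def external_path_length(strings, depth=0):
    """Sum of leaf depths of the trie built on the given equal-length strings."""
    if len(strings) == 0:
        return 0
    if len(strings) == 1:
        # A single string is stored in a leaf at the current depth.
        return depth
    if depth == len(strings[0]):
        # Assumption: strings are truncated, so identical strings are all
        # placed at the truncation depth instead of being split further.
        return depth * len(strings)
    # Split the strings by their symbol at the current position.
    buckets = {}
    for s in strings:
        buckets.setdefault(s[depth], []).append(s)
    return sum(external_path_length(b, depth + 1) for b in buckets.values())


if __name__ == "__main__":
    rng = random.Random(0)
    transition = {0: [0.7, 0.3], 1: [0.4, 0.6]}  # illustrative values only
    sample = [markov_string(64, transition, [0.5, 0.5], rng) for _ in range(1000)]
    print(external_path_length(sample))
```

Repeating the simulation over many independent samples and plotting a histogram of the results gives an informal picture of the fluctuations that the central limit theorem describes. The truncation length of 64 is an arbitrary choice for this sketch.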

Comments: minor revision; to appear in Proceedings of ACM-SIAM Symposium on Discrete Algorithms (SODA) (2013)
Categories: math.PR, cs.DS
Subjects: 60F05, 68P05, 68Q25
Related articles:
arXiv:1505.07321 [math.PR] (Published 2015-05-27)
A Limit Theorem for Radix Sort and Tries with Markovian Input
arXiv:1107.0785 [math.PR] (Published 2011-07-05)
A Markov model of land use dynamics
arXiv:0809.1824 [math.PR] (Published 2008-09-10, updated 2010-02-19)
A Markov model for the spread of Hepatitis C