arXiv:0911.2031 [math.PR]

Closeness to the Diagonal for Longest Common Subsequences in Random Words

C. Houdré, H. Matzinger

Published 2009-11-10, updated 2016-04-20, Version 4

The nature of the alignment with gaps corresponding to a longest common subsequence (LCS) of two independent iid random sequences drawn from a finite alphabet is investigated. It is shown that such an optimal alignment typically matches pieces of similar short length. This is of importance in understanding the structure of optimal alignments of two sequences. Moreover, it is also shown that any property, common to two subsequences, typically holds in most parts of the optimal alignment whenever this same property holds, with high probability, for strings of similar short length. Our results should, in particular, prove useful for simulations since they imply that the re-scaled two-dimensional representation of an LCS gets uniformly close to the diagonal as the length of the sequences grows without bound.
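
The closeness-to-the-diagonal statement can be explored numerically. The sketch below is an illustration only and is not taken from the paper: it uses the textbook dynamic-programming table for the LCS, backtracks to recover one optimal alignment (an LCS is generally not unique, and the tie-breaking rule here is an arbitrary choice), and reports the largest rescaled distance |i - j| / n between matched positions. The function names `lcs_alignment`, `max_diagonal_deviation` and `simulate`, the default two-letter alphabet, and the seed are all assumptions made for this example.

```python
import random


def lcs_alignment(x, y):
    """Return matched index pairs (i, j) of one longest common subsequence."""
    n, m = len(x), len(y)
    # table[i][j] holds the LCS length of the prefixes x[:i] and y[:j].
    table = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            if x[i - 1] == y[j - 1]:
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                table[i][j] = max(table[i - 1][j], table[i][j - 1])

    # Backtrack from the bottom-right corner to recover one optimal alignment.
    pairs = []
    i, j = n, m
    while i > 0 and j > 0:
        if x[i - 1] == y[j - 1]:
            pairs.append((i - 1, j - 1))
            i -= 1
            j -= 1
        elif table[i - 1][j] >= table[i][j - 1]:
            i -= 1
        else:
            j -= 1
    pairs.reverse()
    return pairs


def max_diagonal_deviation(pairs, n):
    """Largest |i - j| / n over matched pairs (rescaled distance to the diagonal)."""
    if not pairs:
        return 0.0
    return max(abs(i - j) for i, j in pairs) / n


def simulate(n, alphabet="ab", seed=0):
    """Draw two iid uniform words of length n; return (LCS length, deviation)."""
    rng = random.Random(seed)
    x = [rng.choice(alphabet) for _ in range(n)]
    y = [rng.choice(alphabet) for _ in range(n)]
    pairs = lcs_alignment(x, y)
    return len(pairs), max_diagonal_deviation(pairs, n)


if __name__ == "__main__":
    # Quadratic time and memory, so keep n moderate.
    for n in (100, 400, 1600):
        length, deviation = simulate(n)
        print(n, length, round(deviation, 4))
```

Running the script for increasing n prints the LCS length and the rescaled deviation of this particular alignment; a single run per length gives only a rough picture, so averaging over several seeds would be a natural refinement.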

Comments: Final version to appear in ECP, 2016
Categories: math.PR, math.CO
Subjects: 05A05, 60C05, 60F10
Related articles:
arXiv:1001.1273 [math.PR] (Published 2010-01-08, updated 2010-11-12)
Fluctuations of the Longest Common Subsequence for Sequences of Independent Blocks
arXiv:1408.1559 [math.PR] (Published 2014-08-07, updated 2014-09-12)
A Central Limit Theorem for the Length of the Longest Common Subsequence in Random Words
arXiv:1803.03238 [math.PR] (Published 2018-03-08)
Length of the longest common subsequence between overlapping words