Item Details

Toward a Lempel-Ziv '78 Compression-Aware Algorithm for Computing the Longest Common Substring

Duric, Petar
Thesis/Dissertation; Online
Duric, Petar
Duric, Petar
Computing directly on compressed data offers opportunities for significant resource savings for enterprise applications, as well as to one day make heavily resource-intensive computation tractable for peer-to-peer applications. However, a general algorithm for computing on compressed data with sub-exponential runtime without decompression, which incurs a high memory cost, has not been found. Compression-aware methods tend to be developed in relation to specific compression schemes. It is possible that further develop on an algorithm-by-algorithm basis will one day illuminate a general method, on top of being fruitful in its own right. This thesis describes a research methodology for producing random Lempel-Ziv '78 compressions of varying compression ratios, and presents findings on the lengths of Longest Common Substrings computed between pairs of random decompressed Lempel-Ziv '78 compressions, in the hopes of illuminating patterns that could be used to devise heuristics for a Lempel-Ziv '78-aware algorithm for computing the Longest Common Substring. While such patterns were not found, this thesis presents analysis on why this method was thwarted and what further augmentations of the method could be fruitful.
University of Virginia, School of Engineering and Applied Science, BS (Bachelor of Science), 2020
Published Date
BS (Bachelor of Science)
Libra ETD Repository
Logo for Creative Commons Attribution LicenseCreative Commons Attribution License


Read Online