An Efficient High-Throughput LZ77-Based Decompressor in Reconfigurable Logic

Jian Fang; Jianyu Chen; Jinho Lee; Zaid Al-Ars; Peter Hofstee

doi:10.1007/s11265-020-01547-w

Jian Fang^*, Jianyu Chen, Jinho Lee, Zaid Al-Ars, Peter Hofstee

^*Corresponding author for this work

Computer Engineering

Research output: Contribution to journal › Article › Scientific › peer-review

Abstract

Making the best use of high-bandwidth storage and network technologies requires improving the speed at which we can decompress data. We present a “refine and recycle” method, applicable to LZ77-type decompressors, that enables efficient high-bandwidth designs, and we present an implementation in reconfigurable logic. The method refines the write commands (for literal tokens) and read commands (for copy tokens) into a set of commands that each target a single bank of block RAM. Rather than performing all the dependency calculations, it saves logic by recycling (read) commands that return with an invalid result. A single “Snappy” decompressor implemented in reconfigurable logic using this method can process multiple literal or copy tokens per cycle and achieves up to 7.2 GB/s, which can keep pace with an NVMe device. The proposed method is about an order of magnitude faster and an order of magnitude more power efficient than a state-of-the-art single-core software implementation. The logic and block RAM resources required by the decompressor are low enough that a set of these decompressors can be implemented on a single FPGA of reasonable size to keep up with the bandwidth provided by the most recent interface technologies.
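The “refine and recycle” idea in the abstract can be illustrated with a small software model. This is only a sketch under stated assumptions, not the authors' hardware design: the token tuples (`("lit", bytes)` and `("copy", offset, length)`), the bank count `NUM_BANKS`, the byte-wide banks, the per-cycle timing model, and the function names `refine` and `decompress` are all hypothetical choices made for illustration. Each token is refined into single-byte commands that target one bank, and a read whose source byte has not been written yet is put back on its queue instead of being resolved through dependency tracking.

```python
from collections import deque

NUM_BANKS = 4  # hypothetical bank count, chosen only for illustration


def refine(tokens):
    """Split tokens into single-byte commands, each queued at exactly one bank."""
    queues = [deque() for _ in range(NUM_BANKS)]
    pos = 0  # output position that the next byte will occupy
    for tok in tokens:
        if tok[0] == "lit":
            # Literal token: one write command per byte, at the destination bank.
            for byte in tok[1]:
                queues[pos % NUM_BANKS].append(("write", pos, byte))
                pos += 1
        else:
            # Copy token: one read command per byte, at the source byte's bank.
            _, offset, length = tok
            for _ in range(length):
                src = pos - offset
                queues[src % NUM_BANKS].append(("read", pos, src))
                pos += 1
    return queues, pos


def decompress(tokens):
    """Run the bank queues cycle by cycle; return (output, recycled read count)."""
    queues, size = refine(tokens)
    data = bytearray(size)
    valid = [False] * size  # one valid flag per output byte
    recycled = 0
    while any(queues):
        commits = []  # results become visible only at the end of the cycle
        for q in queues:
            if not q:
                continue
            kind, dst, arg = q.popleft()
            if kind == "write":
                commits.append((dst, arg))
            elif valid[arg]:
                commits.append((dst, data[arg]))
            else:
                # Source byte not written yet: recycle the read command.
                q.append((kind, dst, arg))
                recycled += 1
        for dst, byte in commits:
            data[dst] = byte
            valid[dst] = True
    return bytes(data), recycled
```

In this model, an overlapping copy such as `decompress([("lit", b"ab"), ("copy", 1, 3)])` still yields `b"abbbb"`, but only after some reads have been recycled, because each copied byte depends on the byte just before it. The model ignores write-port conflicts at the destination bank and assumes well-formed input (every copy offset points at an earlier output position).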

Original language	English
Pages (from-to)	931-947
Number of pages	17
Journal	Journal of Signal Processing Systems
Volume	92
Issue number	9
DOIs	https://doi.org/10.1007/s11265-020-01547-w
Publication status	Published - 2020

Keywords

Acceleration
CAPI
Decompression
FPGA
Snappy

Access to Document

10.1007/s11265-020-01547-w

Fang2020_Article_AnEfficientHigh-ThroughputLZ77: Final published version, 2.36 MB, Licence: CC BY

Cite this

@article{f234763d1e524f09a4dcae89fa57f1f5,

title = "An Efficient High-Throughput LZ77-Based Decompressor in Reconfigurable Logic",

abstract = "To best leverage high-bandwidth storage and network technologies requires an improvement in the speed at which we can decompress data. We present a “refine and recycle” method applicable to LZ77-type decompressors that enables efficient high-bandwidth designs and present an implementation in reconfigurable logic. The method refines the write commands (for literal tokens) and read commands (for copy tokens) to a set of commands that target a single bank of block ram, and rather than performing all the dependency calculations saves logic by recycling (read) commands that return with an invalid result. A single “Snappy” decompressor implemented in reconfigurable logic leveraging this method is capable of processing multiple literal or copy tokens per cycle and achieves up to 7.2GB/s, which can keep pace with an NVMe device. The proposed method is about an order of magnitude faster and an order of magnitude more power efficient than a state-of-the-art single-core software implementation. The logic and block ram resources required by the decompressor are sufficiently low so that a set of these decompressors can be implemented on a single FPGA of reasonable size to keep up with the bandwidth provided by the most recent interface technologies.",

keywords = "Acceleration, CAPI, Decompression, FPGA, Snappy",

author = "Jian Fang and Jianyu Chen and Jinho Lee and Zaid Al-Ars and Peter Hofstee",

year = "2020",

doi = "10.1007/s11265-020-01547-w",

language = "English",

volume = "92",

pages = "931--947",

journal = "Journal of Signal Processing Systems",

issn = "1939-8018",

publisher = "Springer",

number = "9",

}

TY - JOUR

T1 - An Efficient High-Throughput LZ77-Based Decompressor in Reconfigurable Logic

AU - Fang, Jian

AU - Chen, Jianyu

AU - Lee, Jinho

AU - Al-Ars, Zaid

AU - Hofstee, Peter

PY - 2020

Y1 - 2020

AB - To best leverage high-bandwidth storage and network technologies requires an improvement in the speed at which we can decompress data. We present a “refine and recycle” method applicable to LZ77-type decompressors that enables efficient high-bandwidth designs and present an implementation in reconfigurable logic. The method refines the write commands (for literal tokens) and read commands (for copy tokens) to a set of commands that target a single bank of block ram, and rather than performing all the dependency calculations saves logic by recycling (read) commands that return with an invalid result. A single “Snappy” decompressor implemented in reconfigurable logic leveraging this method is capable of processing multiple literal or copy tokens per cycle and achieves up to 7.2GB/s, which can keep pace with an NVMe device. The proposed method is about an order of magnitude faster and an order of magnitude more power efficient than a state-of-the-art single-core software implementation. The logic and block ram resources required by the decompressor are sufficiently low so that a set of these decompressors can be implemented on a single FPGA of reasonable size to keep up with the bandwidth provided by the most recent interface technologies.

KW - Acceleration

KW - CAPI

KW - Decompression

KW - FPGA

KW - Snappy

UR - http://www.scopus.com/inward/record.url?scp=85085762888&partnerID=8YFLogxK

U2 - 10.1007/s11265-020-01547-w

DO - 10.1007/s11265-020-01547-w

M3 - Article

AN - SCOPUS:85085762888

SN - 1939-8018

VL - 92

SP - 931

EP - 947

JO - Journal of Signal Processing Systems

JF - Journal of Signal Processing Systems

IS - 9

ER -
