Low-Density Parity-Check (LDPC) codes have gained popularity in communication
systems and standards due to their capacity-approaching error correction performance.
Among hard-decision-based LDPC decoders, Gallager B (GaB), owing to the
simplicity of its operations, is the most hardware-friendly algorithm and an
attractive solution for meeting the high-throughput demand of communication systems.
However, GaB suffers from poor error correction performance. In this work,
we first propose a resource-efficient GaB hardware architecture that delivers the best
throughput while using the fewest Field Programmable Gate Array (FPGA) resources
among comparable state-of-the-art LDPC decoding algorithms. We
then introduce a Probabilistic GaB (PGaB) algorithm that randomly perturbs the
decisions made during the decoding iterations with a probability value determined
through experimental studies. We achieve up to four orders of magnitude better
error correction performance than GaB with a 3.4% improvement in normalized
throughput performance. PGaB requires around 40% less energy than GaB because the
probabilistic execution reduces the average iteration count by up to
62% compared to GaB. We also show that our PGaB consistently improves the
maximum operational clock rate compared to state-of-the-art
implementations.
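To make the perturbation concrete, the following minimal Python sketch shows a single PGaB variable-node decision; the threshold-based flip rule, the function name, and the probability value p are illustrative assumptions rather than the dissertation's hardware design:

    import random

    def pgab_decision(current_bit, disagreeing_checks, threshold, p=0.01):
        # Gallager B hard-decision rule: flip the bit when at least
        # 'threshold' neighboring check nodes disagree with it.
        decision = current_bit ^ (disagreeing_checks >= threshold)
        # PGaB-style perturbation (assumed form): with probability p,
        # randomly invert the decision made during the iteration.
        if random.random() < p:
            decision ^= 1
        return decision

The value p = 0.01 above is a placeholder; as stated above, the actual probability is determined based on experimental studies.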
In this dissertation, we also present a high-throughput FPGA-based framework
to accelerate error characterization of LDPC codes. Our flexible framework allows
the end user to adjust the simulation parameters and rapidly study various LDPC
codes and decoders. We first show that the connection-intensive, bipartite-graph-based
LDPC decoder hardware architecture creates routing stress for the longer codewords
used in today's communication systems and standards. We address this problem
by partitioning each processing element (PE) in the bipartite graph in such a way
that the inputs of a PE are evenly distributed over its partitions. This allows
depopulating the Look-Up Table (LUT) resources utilized for the decoder architecture
by spreading the logic across the FPGA. We show that even though total LUT usage
increases, the critical path delay decreases with depopulation. More importantly,
with the depopulation technique an unroutable design becomes routable, which allows
longer codewords to be mapped onto the FPGA; a sketch of the partitioning idea follows.
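As a rough illustration (Python pseudocode; the actual technique operates on the hardware description of the decoder, and the round-robin grouping and wide-XOR check node below are our assumptions):

    from functools import reduce
    from operator import xor

    def partition_inputs(inputs, k):
        # Distribute a PE's inputs evenly over k partitions
        # (round-robin), so no single partition carries the
        # PE's entire fan-in.
        partitions = [[] for _ in range(k)]
        for i, value in enumerate(inputs):
            partitions[i % k].append(value)
        return partitions

    def check_node_parity(inputs, k=4):
        # Example PE: a check node's wide parity (XOR) is split into
        # k smaller XOR trees, each mappable to its own group of LUTs
        # spread across the FPGA, followed by a small combining stage.
        partials = [reduce(xor, part, 0)
                    for part in partition_inputs(inputs, k)]
        return reduce(xor, partials, 0)

Splitting one wide function into several narrower ones is what lets the placer spread the logic across the fabric, trading a modest LUT increase for a shorter critical path, as noted above.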
We then conduct two experiments on error correction performance analysis for the
GaB and PGaB algorithms, demonstrating our framework's ability to reach a resolution
level that is not attainable with general-purpose processor (GPP) based simulations;
our framework reduces the time scale of these simulations from an estimated 199
years to 24 hours. We finally conduct the first study on identifying
all possible codewords that are not correctable by GaB when a codeword contains
four errors. With our framework, we reduce the time scale of this simulation, which
requires processing 117 billion codewords, from an estimated 7,800 days on a single
GPP to 4 hours and 38 minutes.
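For scale, 117 billion is consistent with exhaustively enumerating all weight-4 error patterns of a length-1296 codeword (the code length is our assumption; it is not stated here): C(1296, 4) = (1296 · 1295 · 1294 · 1293) / 24 ≈ 1.17 × 10^11. The reported runtimes correspond to a speedup of roughly 187,200 hours / 4.63 hours ≈ 40,000x over a single GPP.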