Description
Title: Distributed processing for large-scale antenna systems
Date Created: 2022
Other Date: 2022-05 (degree)
Extent: 84 pages : illustrations
Description: In this dissertation, we present novel implementations of baseband algorithms for Large-Scale Antenna Systems (LSAS) on general-purpose, software-based architectures such as CPUs and GPUs. LSAS-based wireless systems require complex baseband algorithms to process the large volume of data passing through the antenna array. The focus is therefore on developing distributed implementations of various baseband processing blocks while addressing the processing latency and error performance issues that arise in LSAS. Since Massive MIMO systems use an LSAS at the front end, the thesis is mainly concerned with Wideband Massive MIMO systems. Massive MIMO systems help provide simultaneous access to a large number of mobile wireless devices. The LSAS, combined with the wide bandwidth requirements of next-generation systems, requires processing a huge volume of data that passes through the antenna array during transmission and reception. We therefore consider a Wideband Massive MIMO system with OFDM-based data transmission. Specifically, we consider the baseband processing at the receiver end, where a wireless transmitter with an LSAS at its front end transmits to a single receiver that also has an LSAS. We implement algorithms for various baseband processing blocks, such as OFDM-based demodulation, channel estimation, QAM detection, and LDPC decoding of baseband data. We divide the thesis into two parts: the first covers the implementation of channel estimation, and the second covers the implementation of QAM detection and the LDPC decoder. We implement the channel estimation, QAM detection, and error correction and decoding blocks on a multi-core CPU and a GPU. We then compare the implementations and show the advantages and limitations of each type of architecture.
For the first part, we consider the issues of high pilot overhead and high processing latency when performing channel estimation for LSAS. We consider the implementation of correlation-based channel estimation schemes that use PN sequences. We implement the scheme in parallel on a GPU and consider the implications of implementing it on a novel GPU micro-architecture. We also compare it with a frequency-domain Least Squares channel estimation scheme. The advantages and limitations of both schemes are shown in terms of processing latency and error performance. We then concentrate on the PN sequence channel estimation scheme and its use in decreasing pilot transmission overhead for LSAS. We implement a multiplexed pilot transmission scheme using PN sequences and show its implications for pilot overhead reduction, processing latency reduction, and error performance. For a parallel implementation of the multiplexed pilot transmission scheme, we use the Tensor Core architecture, a newly introduced micro-architecture for performing fast matrix multiplication in GPUs. We show the trade-off between latency and reliability when using this architecture as a function of PN sequence size, antenna array size, and number of multiplexed pilots.
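As a rough illustration of correlation-based channel estimation with a PN sequence, the following NumPy sketch estimates a short channel impulse response by circularly correlating the received pilot with the known sequence. It is not the dissertation's GPU or Tensor Core implementation. The function names, the 5-bit LFSR taps, the three-tap example channel, and the noise-free circular-convolution model (which assumes a repeated pilot or cyclic prefix) are all assumptions chosen for the example.

```python
import numpy as np

def pn_sequence(nbits=5, taps=(5, 2)):
    """One period of a +/-1 maximal-length sequence from a Fibonacci LFSR.

    The default taps are an assumed example that yields period 2**5 - 1 = 31.
    """
    state = [1] * nbits
    bits = []
    for _ in range(2 ** nbits - 1):
        bits.append(state[-1])              # output the oldest register bit
        feedback = 0
        for t in taps:
            feedback ^= state[t - 1]        # XOR of the tapped stages
        state = [feedback] + state[:-1]     # shift the register
    return 1.0 - 2.0 * np.array(bits)       # map bit 0 -> +1, bit 1 -> -1

def circular_channel(h, pilot):
    """Noise-free received pilot: circular convolution of h with the pilot."""
    n = len(pilot)
    return np.fft.ifft(np.fft.fft(h, n) * np.fft.fft(pilot))

def correlate_estimate(y, pilot, num_taps):
    """Estimate the first num_taps channel taps by circular correlation."""
    n = len(pilot)
    # r[k] = sum_n y[n] * conj(pilot[(n - k) mod N]), computed with FFTs
    r = np.fft.ifft(np.fft.fft(y) * np.conj(np.fft.fft(pilot)))
    return r[:num_taps] / n

pilot = pn_sequence()
h_true = np.array([1.0, 0.5j, -0.25])       # hypothetical 3-tap channel
y = circular_channel(h_true, pilot)
h_hat = correlate_estimate(y, pilot, len(h_true))
print(np.round(h_hat, 3))
```

Because the off-peak periodic autocorrelation of a maximal-length sequence is -1 rather than 0, each estimated tap carries a small bias of order 1/N from the other taps, even without noise. A longer sequence shrinks that bias but lengthens the pilot, which is one simple way to see why sequence size can trade latency against reliability.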
For the second part, we delve deeper into the QAM detection aspects of baseband processing. Algorithms such as ZF (Zero Forcing), MMSE (Minimum Mean Square Error), Sphere detection, and K-best detection have been studied from the computation and error performance perspectives. While QAM detection algorithms such as Sphere detection can provide near-ML (Maximum Likelihood) levels of reliability, their processing latency can be very large, and they require perfect channel state information for optimal performance. We show that the K-best QAM detection algorithm can provide near-ML performance with consistent processing latency. We implement K-best detection on a multi-core CPU and a GPU, and compare the merits and limitations of both architectures. We then extend the implementation from hard-output to soft-output detection, since soft-output detection can be combined with decoders such as LDPC to provide increased reliability. We propose a Goodput metric that combines the processing latency and error performance of the detection and decoder implementations. Finally, we provide a combined implementation of the soft-output K-best detection and LDPC decoder algorithms on a multi-core CPU. We use the Goodput metric for small to large-scale MIMO systems and show the effect of parallel processing on various factors such as the QAM constellation density, antenna array size, and the LDPC decoding rate.
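To make the idea of K-best detection concrete, here is a minimal hard-output sketch in NumPy. It is a generic textbook-style formulation (QR decomposition followed by a breadth-first tree search that keeps K survivors per layer), not the dissertation's multi-core CPU or GPU implementation. The function names, the square-QAM constellation helper, and the example dimensions are assumptions made for illustration.

```python
import numpy as np

def qam_constellation(order):
    """Unnormalized square QAM points, e.g. order=16 gives levels -3, -1, 1, 3."""
    m = int(np.sqrt(order))
    levels = np.arange(-(m - 1), m, 2)
    re, im = np.meshgrid(levels, levels)
    return (re + 1j * im).ravel()

def k_best_detect(H, y, constellation, K):
    """Hard-output K-best detection for the model y = H x + noise."""
    n_tx = H.shape[1]
    Q, R = np.linalg.qr(H)                  # reduced QR, R is upper triangular
    z = Q.conj().T @ y
    paths = np.zeros((1, 0), dtype=complex)  # survivors: symbols of layers i+1..n_tx-1
    metrics = np.zeros(1)                    # accumulated partial Euclidean distances
    for i in range(n_tx - 1, -1, -1):        # detect the last layer first
        interference = paths @ R[i, i + 1:]  # contribution of already chosen symbols
        resid = z[i] - interference[:, None] - R[i, i] * constellation[None, :]
        cand = metrics[:, None] + np.abs(resid) ** 2   # (survivors, symbols)
        flat = cand.ravel()
        keep = np.argsort(flat)[:K]          # keep the K smallest metrics
        parent, sym = np.unravel_index(keep, cand.shape)
        paths = np.hstack([constellation[sym][:, None], paths[parent]])
        metrics = flat[keep]
    return paths[0]                          # best surviving candidate

rng = np.random.default_rng(0)
const16 = qam_constellation(16)
H = rng.normal(size=(8, 4)) + 1j * rng.normal(size=(8, 4))
x = rng.choice(const16, size=4)
x_hat = k_best_detect(H, H @ x, const16, K=4)   # noise-free example
print(np.allclose(x_hat, x))
```

Each layer expands at most K survivors by the constellation size, so the work per received vector is fixed in advance. This is the sense in which the latency of K-best detection is consistent, whereas a sphere detector visits a number of tree nodes that depends on the channel and noise realization.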
Note: Ph.D.
Note: Includes bibliographical references
Genre: theses
Language: English
Collection: School of Graduate Studies Electronic Theses and Dissertations
Organization Name: Rutgers, The State University of New Jersey
Rights: The author owns the copyright to this work.