Processing In Memory Using Emerging Memory Technologies
Recent years have witnessed rapid growth in the amount of data being generated, owing to the emergence of the Internet of Things (IoT). Processing such huge volumes of data on traditional computing systems is highly inefficient, mainly due to limited cache capacity and memory bandwidth. Processing in-memory (PIM) is an emerging paradigm that addresses this issue: it uses memories as computing units, thereby reducing data transfers between memory and processing cores. However, the application of present PIM techniques is restricted by their limited functionality and inability to process large amounts of data efficiently. In this thesis, we propose novel techniques that exploit the analog properties of emerging memory technologies. Not only do these techniques support more complex functions such as addition, multiplication, and search, but they also manage and process large data more efficiently. We present a new blocked PIM architecture that uses inter-block interconnects to accelerate data-intensive processing. We also introduce a heterogeneous architecture with general-purpose cores and PIM-enabled memory, along with a data-dependent task allocation scheme for it. We further apply application-specific optimizations and approximation techniques to design accelerators for neural networks and database query systems: for neural networks we design multiplication-by-constant hardware, while query processing is accelerated by a novel in-memory nearest-search technique. Our neural network accelerator achieves 113.9x higher energy efficiency and 56.3x speedup compared to an AMD GPU, and the query accelerator provides a 49.3x performance speedup and 32.9x energy savings compared to a recent Intel CPU.
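The abstract does not detail the in-memory nearest-search mechanism, but its externally visible behavior can be captured with a small functional model. The sketch below assumes fixed 64-bit rows and a Hamming-distance match purely for illustration; in a PIM realization, every row comparison would happen in parallel inside the memory array rather than in a sequential loop.

```cpp
#include <bitset>
#include <cstddef>
#include <cstdint>
#include <vector>

// Functional model of an in-memory nearest search over fixed-width rows.
// In a PIM design each row's distance to the query is evaluated inside the
// memory array in parallel; this loop only illustrates the result such a
// search returns. The 64-bit row width and the Hamming metric are
// illustrative assumptions, not the design described in the abstract.
std::size_t nearest_row(const std::vector<std::uint64_t>& rows,
                        std::uint64_t query) {
    std::size_t best = 0;
    std::size_t best_dist = 65;  // larger than any possible Hamming distance
    for (std::size_t i = 0; i < rows.size(); ++i) {
        std::size_t dist = std::bitset<64>(rows[i] ^ query).count();
        if (dist < best_dist) { best_dist = dist; best = i; }
    }
    return best;
}
```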
Building Scalable Architectures Using Emerging Memory Technologies
A confluence of trends is reshaping computing today. On one side, the massive amounts of data generated by the proliferation of sensing and internet services are creating demand for better computer architectures and systems. On the other, advances in nanotechnology are unearthing new memory device technologies with the potential to replace, or be combined with, conventional memories. Given these trends, this thesis examines emerging memory device technologies that provide a unique opportunity to build computer architectures with efficient and scalable data storage and processing capabilities. The memory architectures built from these new devices promise distinctive features such as intrinsic non-volatility, highly dense memory structures, extremely low power consumption, and even embedded processing capabilities. Examples of emerging memory technologies with such features include PCM, 3D XPoint, STT-RAM, and ReRAM. A central question for memory architectures built with emerging memory technologies is whether or not the resulting systems are scalable. Toward answering this question, this thesis identifies that conventional memory-architecture scaling methods may not apply directly to emerging memory technologies. These methods were developed mostly for SRAM and DRAM and, today, do not provide the desired outcomes for emerging memory technologies. As a result, there are fundamental unsolved problems concerning scalability in building memory architectures, which means that even though emerging memory technologies provide distinctive features, they may be left largely untapped. Given these scalability concerns, the thesis advocates a scalability-first approach to building computer architectures using emerging memory technologies, while remaining aware of the limitations and opportunities associated with them. As demonstrations of the scalability-first approach, the thesis discusses several scalability problems encountered in systems using emerging memory technologies and presents potential solutions for each in the form of novel techniques and tools. For instance, it discusses the problem of, and a solution for, scaling write-order enforcement mechanisms for data persistence on large non-volatile main memory systems, followed by the problem of, and a potential solution for, scaling write bandwidth and thereby reducing memory interference on systems with dense non-volatile memory caches. Also discussed are methods for scaling system architectures with in-memory processing capability, subject to their operational complexity and other limits. The proposed scalability-first approach points to prospects and ways for better adoption of emerging memory technologies within existing systems, and the approach and solutions suggest likely transition paths to even more scalable and markedly different systems of the future.
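As background for the write-order-enforcement problem mentioned above, the conventional per-update idiom on x86 persistent-memory systems is to write the data, write back the affected cache lines, and fence. The helper below is a minimal sketch of that baseline, assuming a CLWB-capable CPU and an illustrative function name; it is not the thesis's scalable mechanism, only the ordering behavior such mechanisms must preserve at larger scale.

```cpp
#include <immintrin.h>   // _mm_clwb, _mm_sfence (x86; compile with CLWB support)
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper: copy a buffer into persistent memory and enforce that
// the data reaches the persistence domain before anything that follows the
// fence. Every cache line covering the destination is written back with CLWB,
// then SFENCE orders those write-backs against later stores.
void persist_range(void* dst, const void* src, std::size_t len) {
    std::memcpy(dst, src, len);
    constexpr std::uintptr_t kCacheLine = 64;
    std::uintptr_t start = reinterpret_cast<std::uintptr_t>(dst) & ~(kCacheLine - 1);
    std::uintptr_t end   = reinterpret_cast<std::uintptr_t>(dst) + len;
    for (std::uintptr_t line = start; line < end; line += kCacheLine)
        _mm_clwb(reinterpret_cast<const void*>(line));  // write back dirty line
    _mm_sfence();                                       // order against later stores
}
```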
Neuro-inspired Computing Using Emerging Non-Volatile Memories
Data movement between separate processing and memory units in traditional von Neumann computing systems is costly in terms of time and energy. The problem is aggravated by the recent explosive growth in data-intensive applications related to artificial intelligence. In-memory computing has been proposed as an alternative approach in which computational tasks are performed directly in memory, without shuttling data back and forth between the processing and memory units. Memory is at the heart of in-memory computing. Technology scaling of mainstream memory technologies, such as static random-access memory (SRAM) and dynamic random-access memory (DRAM), is increasingly constrained by fundamental technology limits. Recent research progress on various emerging non-volatile memory (eNVM) device technologies, such as resistive random-access memory (RRAM), phase-change memory (PCM), conductive bridging random-access memory (CBRAM), ferroelectric random-access memory (FeRAM), and spin-transfer torque magnetoresistive random-access memory (STT-MRAM), has drawn tremendous attention owing to their high speed, low cost, excellent scalability, and enhanced storage density. Moreover, an eNVM-based crossbar array can perform in-memory matrix-vector multiplication in an analog manner with high energy efficiency (a functional sketch follows this abstract) and offers potential opportunities for accelerating computation in fields such as deep learning, scientific computing, and computer vision. This dissertation presents research on a wide range of emerging memory device technologies (CBRAM, RRAM, and STT-MRAM) for implementing neuro-inspired in-memory computing in several real-world applications using a software and hardware co-design approach. Chapter 1 presents low-energy subquantum CBRAM devices and a network pruning technique that reduce network-level energy consumption by hundreds- to thousands-fold. We showed the low-energy (10×-100× lower than conventional memory technologies) and gradual switching characteristics of CBRAM synaptic devices, and developed a network pruning algorithm that can be employed during spiking neural network (SNN) training to further reduce energy by 10×. Using a 512 Kbit subquantum CBRAM array, we experimentally demonstrated high recognition accuracy on the MNIST dataset for a digital implementation of unsupervised learning. Chapter 2 presents the details of the SNN pruning algorithm used in Chapter 1. The pruning algorithm exploits features of the network weights and prunes weights during training based on the neurons' spiking characteristics, leading to significant energy savings when implemented in eNVM-based in-memory computing hardware. Chapter 3 presents a benchmarking analysis of the potential use of STT-MRAM for in-memory computing against SRAM at deeply scaled technology nodes (14 nm and 7 nm). A C++-based benchmarking platform is developed around LeNet-5, a popular convolutional neural network (CNN) model; the platform maps STT-MRAM-based in-memory computing architectures to LeNet-5 and estimates inference accuracy, energy, latency, and area for the proposed architectures at different technology nodes, compared against SRAM. Chapter 4 presents an adaptive quantization technique that compensates for the accuracy loss due to the limited conductance levels of PCM-based synaptic devices and enables high-accuracy unsupervised SNN learning with low-precision PCM devices.
The proposed adaptive quantization technique follows the software and hardware co-design approach, designing software algorithms that take into account real synaptic device characteristics and hardware limitations. Chapter 5 presents a real-world neural engineering application of in-memory computing: an interface between an eNVM-based crossbar and neural electrodes that implements a real-time, highly energy-efficient in-memory spike-sorting system. A real-time hardware demonstration is performed using a CuOx-based eNVM crossbar to sort spike data recorded from multi-electrode arrays in different brain regions in animal experiments, further extending eNVM technologies to neural engineering applications. Chapter 6 presents a real-world deep learning application of in-memory computing: a direct integration of Ag-based conductive bridging random-access memory (Ag-CBRAM) crossbar arrays with Mott-ReLU activation neurons for a scalable, energy- and area-efficient hardware implementation of deep neural networks (DNNs). Chapter 7 concludes the dissertation and discusses future directions for eNVM-based in-memory computing systems.
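The analog matrix-vector multiplication and the limited-conductance-level quantization that recur throughout this abstract can be summarized in an idealized functional model: weights are programmed as discrete cell conductances, inputs are applied as row voltages, and each column current accumulates the products by Ohm's and Kirchhoff's laws. The sketch below makes those assumptions explicit (no device noise, wire resistance, or ADC effects) and is only an illustration, not the dissertation's hardware or its adaptive quantization algorithm.

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

// Map a weight in [-w_max, w_max] onto one of a small number of uniformly
// spaced levels, mimicking a synaptic device with limited programmable states.
// (A real array represents signed weights with a device pair, G+ and G-;
// a single signed value is used here for brevity.)
double quantize_to_level(double w, double w_max, int levels) {
    double step = 2.0 * w_max / (levels - 1);
    double q = std::round((w + w_max) / step) * step - w_max;
    return std::clamp(q, -w_max, w_max);
}

// Idealized crossbar matrix-vector multiply: G[i][j] is the (quantized)
// conductance at row i, column j, and v[i] is the voltage applied to row i.
// The current collected on column j is sum_i G[i][j] * v[i], i.e. one analog
// dot product per column, computed inside the array in a single read step.
std::vector<double> crossbar_mvm(const std::vector<std::vector<double>>& G,
                                 const std::vector<double>& v) {
    const std::size_t rows = G.size();
    const std::size_t cols = rows ? G[0].size() : 0;
    std::vector<double> column_current(cols, 0.0);
    for (std::size_t i = 0; i < rows; ++i)
        for (std::size_t j = 0; j < cols; ++j)
            column_current[j] += G[i][j] * v[i];  // analog summation on the bit line
    return column_current;
}
```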