New Memory Research Teases 100x Density Jump, Merged Compute and Memory
New research at the forefront of materials engineering promises remarkable performance gains in computing devices. A research team led by Markus Hellenbrand and colleagues at the University of Cambridge believes that a new material, based on a hafnium oxide layer threaded by voltage-adjustable barium spikes, can blend the properties of memory and processing materials. This means the same device could act as data storage, offering 10-100 times the density of existing storage media, or be used as a processing unit.
Published in the journal Science Advances, the research paves the way for potentially significant improvements in computing device density, performance, and energy efficiency. A typical USB stick based on this technology (which exploits what the researchers call a "continuous range" of resistance states) could hold 10 to 100 times more information than the drives in use today.
As JEDEC points out, RAM density doubles roughly every four years; at that pace, it would take RAM manufacturers decades to reach the density levels this technology is demonstrating today.
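As a back-of-the-envelope check on that "decades" claim (my own arithmetic, not from the paper), here is what a 10-100x density jump looks like when you only get one doubling every four years:

```python
import math

# Assumption (from JEDEC's observation cited above): density doubles every 4 years.
DOUBLING_PERIOD_YEARS = 4

for density_gain in (10, 100):
    doublings = math.log2(density_gain)          # doublings needed for this gain
    years = doublings * DOUBLING_PERIOD_YEARS    # time at one doubling per 4 years
    print(f"{density_gain}x density: {doublings:.1f} doublings ≈ {years:.0f} years")

# 10x density:  3.3 doublings ≈ 13 years
# 100x density: 6.6 doublings ≈ 27 years
```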
The device is also a light at the end of the neuromorphic computing tunnel. Materials of this kind (known as resistive switching memory) are expected to serve as both storage and processing media, much like the neurons in our brains. That is simply not possible with current semiconductor technology: there is no way to integrate memory and processing transistor layouts, because the requirements for memory cells and processing cells are very different (chiefly in terms of endurance, meaning the ability to switch repeatedly without performance degradation).
The inability to merge the two means that information must flow continuously between the processing system with its various caches (think of a modern CPU) and its external memory pool (say, the best DDR5 kit on the market). In computing, this is known as the von Neumann bottleneck: a system with separate memory and processing is fundamentally limited by the bandwidth of the link between them (usually known as a bus). This is why every semiconductor design house (from Intel to AMD, Nvidia, and others) builds purpose-designed hardware to accelerate that information exchange, such as Infinity Fabric and NVLink.
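To make the bottleneck concrete, here is a deliberately simplified toy model (my own illustration; the data size and bus bandwidth are assumed, not measured): once the processor gets fast enough, total runtime is set almost entirely by how quickly data crosses the bus.

```python
# Toy model of the von Neumann bottleneck: total time = compute time + transfer time.
# All numbers below are illustrative assumptions.

DATA_BYTES = 8 * 2**30       # 8 GiB of operands to move across the bus
BUS_BANDWIDTH = 64 * 2**30   # 64 GiB/s bus (roughly DDR5-class, assumed)

def runtime(compute_seconds: float) -> float:
    transfer_seconds = DATA_BYTES / BUS_BANDWIDTH  # 0.125 s regardless of CPU speed
    return compute_seconds + transfer_seconds

# Make the processor 100x faster; the floor barely moves.
print(runtime(1.00))   # 1.125 s (compute-dominated)
print(runtime(0.01))   # 0.135 s (transfer now dominates)
```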
The problem is that this exchange of information carries an energy cost, and that cost limits the performance ceiling currently achievable. Remember that as energy moves around, there are inherent losses. The result is increased power consumption (already a severe constraint in hardware design, and an ever higher priority for the semiconductor industry) and increased heat, yet another severe limitation, one that has driven the development of increasingly exotic cooling solutions to keep Moore's Law lingering on for a while yet. And of course there is the sustainability angle: computing is expected to account for as much as 30% of the world's energy demand in the not-too-distant future.
“This explosion in energy demand is largely due to deficiencies in current computer memory technology,” said lead author Dr. Markus Hellenbrand of the Department of Materials Science and Metallurgy at the University of Cambridge. “In traditional computing, you have memory on one side, processing on the other, and data is shuffled between the two, which takes energy and time.”
As you can imagine, the benefits of consolidating memory and processing are impressive. Whereas traditional memory holds only two states (1 or 0, hence the name "binary"), resistive-switching memory devices can move their resistance through a whole range of states. That lets a cell respond to a wider variety of voltages and therefore encode more information. At a high level, this is roughly the same trick used in the NAND world, where unlocking more possible voltage states in a memory cell design corresponds to more bits per cell.
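The relationship is logarithmic: each cell stores log2 of the number of distinguishable states, so multiplying the state count grows bits per cell more slowly. A quick sketch (the NAND tiers are real; the resistive-cell state count is an assumed illustration, not a figure from the paper):

```python
import math

# Bits per cell = log2(number of distinguishable states).
for name, states in [("SLC", 2), ("MLC", 4), ("TLC", 8), ("QLC", 16),
                     ("resistive (assumed)", 1024)]:
    print(f"{name:>20}: {states:>5} states -> {math.log2(states):.0f} bits/cell")
```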
One way to distinguish processing from storage is that processing means writing and rewriting information (adding or subtracting, converting or rearranging it) as fast as the switching cycle allows, while storage means the information should sit static for a long time, perhaps because it's part of your Windows or Linux kernel.
To build these synaptic devices, as described in the paper, the research team had to solve a materials engineering bottleneck known as the uniformity problem. Since hafnium oxide (HfO2) has no structure at the atomic level, its hafnium and oxygen atoms are deposited randomly, which undermines its usefulness wherever electrons need to flow in a controlled way. A more ordered atomic structure minimizes the resistance electrons encounter, increasing speed and efficiency. The researchers found, however, that depositing barium (Ba) into thin films of unstructured hafnium oxide produced highly ordered barium bridges (or spikes), and because their atoms are more ordered, these bridges conduct electrons far more readily.
But the fun really began when the researchers discovered that they could dynamically change the height of the barium spikes to fine-tune their conductivity. They found the spikes could switch in about 20 ns, meaning a spike can change voltage states (and therefore hold different information) within that window. They measured a memory window greater than 10 and a switching endurance greater than 10^4 cycles, meaning that while the material is fast, the number of voltage-state changes it can currently withstand is on the order of 10,000. That isn't a terrible result, but it isn't a stellar one either.
That endurance is comparable to what MLC (multi-level cell) flash offers, but it necessarily limits one intended application: using the material as a processing medium, where voltage states change rapidly to hold computations and their intermediate results.
A rough calculation shows that switching in about 20 ns corresponds to an operating frequency of 50 MHz (frequency being the inverse of the cycle time). If the device were churning through states at full speed (running as a GPU or CPU, say), a barium bridge would hit its endurance limit after roughly 0.0002 seconds, and that's at a mere 50 MHz. As a processing unit, then, it has a long way to go.
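Here is that back-of-the-envelope lifetime calculation spelled out (the 20 ns switching time and the 10^4-cycle endurance come from the paper; the rest is straightforward arithmetic):

```python
SWITCH_TIME_S = 20e-9     # ~20 ns switching time reported in the paper
ENDURANCE_CYCLES = 1e4    # >10^4 switching cycles reported in the paper

frequency_hz = 1 / SWITCH_TIME_S                # 5e7 Hz = 50 MHz
lifetime_s = ENDURANCE_CYCLES * SWITCH_TIME_S   # time to exhaust endurance

print(f"Operating frequency: {frequency_hz / 1e6:.0f} MHz")    # 50 MHz
print(f"Lifetime at full switching speed: {lifetime_s:.4f} s")  # 0.0002 s
```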
But for storage? That's where those USB sticks come in, with their "10-100x denser" memory capacity. These synaptic devices can access 10 to 100 times more intermediate voltage states than even the densest NAND found in today's biggest USB sticks.
Who wouldn’t want a 10 or even 100 terabyte “USB 7” stick?
There is still work to be done on barium bridge endurance and switching speed, but the design already looks like an attractive proof of concept. Even better, the semiconductor industry already works with hafnium oxide, so there are fewer tooling and logistical nightmares to navigate.
But therein lies the potential for a particularly original product. Imagine this technology matured to the point where it could be used in AMD or Nvidia GPU designs (which operate around the 2 GHz mark these days). There is a world where a graphics card could be reconfigured to run entirely as memory (picture a 10 TB graphics card acting as a giant virtual USB stick).
Imagine a world where AMD and Nvidia offered essentially reprogrammable GPUs, with continuous-range GPU dies product-segmented by maximum storage capacity (remember, 10-100 times denser than current USB sticks). If you're an AI enthusiast looking to run your own Large Language Models (LLMs), you could program your GPU to dedicate just the right number of these synaptic devices, or neuromorphic transistors, to processing duties. As complexity grows, who knows how many trillions of parameters our models will end up with; memory only becomes more and more important.
It would be entirely up to the end user, from casual gamers to high-performance computing (HPC) installations, to decide whether the graphics card's transistors served as memory or as eye-candy amplifiers cranking the graphics settings up to 11, even if that meant the lifespan of the chip's components would be significantly reduced.
Anyway, you’re always upgrading.
But let's not get ahead of ourselves. While this is less perilous territory than AI development and its regulation, there is little to be gained from dreaming that far ahead. As with all technology, it'll be ready when it's ready. If it ever is.