Flash Drive Durability – NCSA
The original flash technology, SLC (Single-Level Cell) NAND, stores one data bit in each cell. To improve bit density and increase capacity, MLC (Multi-Level Cell) NAND was developed; it stores two bits of information in the same physical location where SLC stores one. This makes MLC roughly half the price of SLC on a per-gigabyte (GB) basis. However, MLC has other issues.
MLC has Lower Endurance
All flash devices can sustain only a finite number of writes and erasures, also called program/erase (P/E) cycles. Partly because MLC media stores twice the information in the same physical space, its maximum endurance (measured in P/E cycles) is significantly lower than that of SLC media, by roughly a factor of 10. In NAND flash storage, cost is a function of endurance: the faster a device wears out, the more often it must be replaced. This endurance issue reduces the cost advantage MLC enjoys.
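A rough way to see how lower endurance erodes the price advantage is to divide price by rated endurance. The sketch below uses only the two ratios given above (MLC at roughly half the price per GB and roughly one tenth the P/E cycles). The function name and the normalized units are illustrative assumptions, and the model covers only the extreme case in which a device is written until it wears out.

```python
def cost_per_unit_endurance(price_per_gb: float, pe_cycles: float) -> float:
    """Price paid per GB for each rated P/E cycle (lower is better)."""
    return price_per_gb / pe_cycles


# Normalized, illustrative units: SLC price = 1.0, MLC endurance = 1.0.
slc = cost_per_unit_endurance(price_per_gb=1.0, pe_cycles=10.0)
mlc = cost_per_unit_endurance(price_per_gb=0.5, pe_cycles=1.0)

mlc_to_slc_ratio = mlc / slc
print(f"MLC costs about {mlc_to_slc_ratio:.0f}x as much as SLC per unit of endurance")
```

Under these assumptions the half-price media costs more per unit of data written. For workloads that never approach the rated cycle count, the lower purchase price still dominates.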
A common way to express lifespan is Total Bytes Written (TBW), which is the product of the rated P/E cycles and the capacity of the flash device. Aside from MLC’s lower TBW, some characteristics of flash storage applications can exacerbate its endurance problem. Processes like caching and storage tiering generate more P/E cycles, hastening wear-out. Also, the shrinking lithographies inherent in the evolution of semiconductor-based products such as flash memory increase bit errors, which further reduces useful life.
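The TBW relationship can be worked through with a few lines of arithmetic. The capacity and P/E ratings below are hypothetical figures chosen only to reflect the factor-of-10 gap mentioned earlier; they are not taken from any datasheet, and the calculation ignores real-world effects such as write amplification.

```python
GB = 10**9
TB = 10**12


def total_bytes_written(pe_cycles: int, capacity_bytes: int) -> int:
    """Idealized lifetime writes: rated P/E cycles multiplied by capacity."""
    return pe_cycles * capacity_bytes


# Hypothetical ratings, for illustration only.
capacity = 200 * GB
slc_pe_cycles = 100_000  # assumed SLC rating
mlc_pe_cycles = 10_000   # assumed MLC rating, one tenth of the SLC figure

slc_tbw = total_bytes_written(slc_pe_cycles, capacity)
mlc_tbw = total_bytes_written(mlc_pe_cycles, capacity)

print(f"SLC: {slc_tbw // TB} TB written")
print(f"MLC: {mlc_tbw // TB} TB written")
```

Because TBW scales linearly with P/E cycles, a tenfold drop in rated cycles at equal capacity gives a tenfold drop in TBW.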
Better MLC Endurance Improves Economics
One way to improve the economics of flash technology is to increase the lifespan of MLC devices. What’s needed is near-SLC endurance at MLC cost. eMLC (enterprise MLC) media is designed to do this, offering a 3x increase in TBW. However, this comes at the expense of slower performance and a higher cost than standard MLC. Rather than turning to different media such as eMLC, a better solution is to use intelligent SSD controller technology to make existing MLC last longer.
NAND flash stores information by controlling the number of electrons in a region called a “floating gate”. These electrons change the conductive properties of the memory cell (the gate voltage needed to turn the cell on and off), and this voltage is used to represent one or more bits of data in the cell. This is why the floating gate’s ability to hold a charge is critical to the cell’s ability to store data reliably.
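To make the link between stored charge and stored bits concrete, the sketch below maps a cell's turn-on voltage to a bit pattern. An SLC cell needs one reference voltage to separate two levels, while an MLC cell needs three to separate four. The reference voltages and the bit assignments are assumptions made for illustration; real values are device-specific.

```python
from bisect import bisect_right

# Hypothetical read reference voltages in volts; real values vary by device.
SLC_REFERENCES = [2.0]
MLC_REFERENCES = [1.0, 2.0, 3.0]

# Bit patterns per charge level, lowest charge (erased) first. The MLC mapping
# is one possible Gray code, so neighbouring levels differ by a single bit.
SLC_BITS = ["1", "0"]
MLC_BITS = ["11", "10", "00", "01"]


def read_cell(threshold_v: float, references: list, bit_patterns: list) -> str:
    """Return the bits for a cell whose turn-on voltage is threshold_v."""
    level = bisect_right(references, threshold_v)
    return bit_patterns[level]


print(read_cell(2.6, SLC_REFERENCES, SLC_BITS))
print(read_cell(2.6, MLC_REFERENCES, MLC_BITS))
```

With four levels packed into the same voltage range, each MLC level has a narrower window, so a small loss of charge is more likely to push a cell across a reference voltage and be read as the wrong value.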
Write and Erase Processes Cause Wear
When a cell is written to and erased during the normal course of use, the oxide layer separating the floating gate from the substrate degrades, reducing the cell’s ability to hold a charge for an extended period of time. Each solid-state storage device can sustain a finite amount of degradation before it becomes unreliable, meaning it may still function but not consistently. The number of writes and erasures (P/E cycles) a NAND device can sustain while still maintaining consistent, predictable output defines its endurance.
In the analog recording world, the concept of Signal-to-Noise Ratio (SNR) is used to describe the quality or condition of a recording. SNR compares the strength of the signal that carries the information with the random artifacts (or noise) generated by the recording media and the electronic components in the system. The SNR of a recording must be at a certain level for it to be replayed with any degree of quality. When a recording degrades, as happens when the metal oxide layer of a magnetic tape gets old, its SNR decreases. Analog filters can be applied to ‘clean up’ a noisy recording, but at some point the SNR gets low enough that the recording is unusable.
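SNR is conventionally quoted in decibels, as ten times the base-10 logarithm of the ratio of signal power to noise power. The short sketch below applies that standard formula; the usability threshold is a made-up figure used only to show the idea of a cutoff.

```python
import math

# Hypothetical cutoff below which a recording is treated as unusable.
MIN_USABLE_SNR_DB = 20.0


def snr_db(signal_power: float, noise_power: float) -> float:
    """Signal-to-noise ratio in decibels, from signal and noise power."""
    return 10.0 * math.log10(signal_power / noise_power)


def is_usable(signal_power: float, noise_power: float) -> bool:
    """True when the SNR meets the assumed minimum."""
    return snr_db(signal_power, noise_power) >= MIN_USABLE_SNR_DB


fresh = snr_db(signal_power=1000.0, noise_power=1.0)
aged = snr_db(signal_power=1000.0, noise_power=50.0)
print(f"fresh: {fresh:.1f} dB, aged: {aged:.1f} dB")
```

In this example the signal power stays the same while the noise grows, which is enough to move the recording from above the assumed threshold to below it.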
To ensure that most blocks wear out at roughly the same time, most enterprise flash controllers use a process commonly referred to as “wear leveling.” This process distributes write operations evenly across all available blocks in a NAND device to reduce or eliminate concentration on a subset of blocks. But there are other ways to improve MLC endurance.
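The idea behind wear leveling can be sketched with a toy allocator that always directs the next write to the block with the fewest erases so far. This is a simplified illustration, not how any particular controller works: the class and method names are hypothetical, and real controllers also handle logical-to-physical mapping, garbage collection and relocation of rarely changed data.

```python
import heapq


class WearLeveler:
    """Toy allocator that always writes to the least-erased block."""

    def __init__(self, num_blocks: int):
        # Min-heap of (erase_count, block_id); the least-worn block is on top.
        self._heap = [(0, block_id) for block_id in range(num_blocks)]
        heapq.heapify(self._heap)

    def next_block(self) -> int:
        """Pick the least-worn block, charge it one P/E cycle, return its id."""
        erase_count, block_id = heapq.heappop(self._heap)
        heapq.heappush(self._heap, (erase_count + 1, block_id))
        return block_id

    def erase_counts(self) -> dict:
        """Map each block id to the number of P/E cycles it has absorbed."""
        return {block_id: count for count, block_id in self._heap}


leveler = WearLeveler(num_blocks=4)
for _ in range(10):
    leveler.next_block()

counts = leveler.erase_counts()
print(sorted(counts.items()))
```

After any number of writes, the erase counts of any two blocks in this model differ by at most one, which is the even distribution the process aims for.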
How to Improve MLC Endurance
Endurance in an SSD can be improved by reducing the amount of degradation that occurs to the recording media or by increasing the system’s ability to read degraded media effectively. Said another way, an SSD manufacturer can improve the life expectancy of MLC NAND flash by decreasing the damage caused by the write and erase processes or by improving the ability to prevent or correct bit-level errors caused by gate oxide degradation. An advanced SSD controller in a flash storage device can do both.
On the other side of the endurance equation, an advanced SSD controller can reduce the amount of damage done through the normal write and erase processes. Advanced SSD controller technologies, such as those developed by STEC, Inc., can control the way a cell is written and erased so as to minimize the impact of these processes on the NAND substrate. Every block of cells is different, so care must be taken to use the minimum voltage needed to program or erase each block.
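The text does not describe how any vendor's controller finds that minimum voltage. One generic way to picture it is to start low and raise the voltage in small steps until the block verifies as programmed. The sketch below follows that approach; the function name, the verify callback and all voltage figures are hypothetical.

```python
from typing import Callable, Optional


def program_with_min_voltage(
    verify: Callable[[float], bool],
    start_v: float,
    max_v: float,
    step_v: float,
) -> Optional[float]:
    """Step the voltage up from start_v and return the first value at which
    verify() reports success, or None if max_v is reached without success."""
    steps = int(round((max_v - start_v) / step_v))
    for i in range(steps + 1):
        voltage = start_v + i * step_v
        if verify(voltage):
            return voltage
    return None


# Two hypothetical blocks that need different voltages to program.
def block_a_verifies(voltage: float) -> bool:
    return voltage >= 15.3


def block_b_verifies(voltage: float) -> bool:
    return voltage >= 16.8


voltage_a = program_with_min_voltage(block_a_verifies, 14.0, 20.0, 0.5)
voltage_b = program_with_min_voltage(block_b_verifies, 14.0, 20.0, 0.5)
print(voltage_a, voltage_b)
```

Each block ends up being driven only as hard as it needs, rather than every block receiving the voltage required by the most demanding one.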
As flash cells wear, the write and erase operations described above will change as well. This degree of wear can actually be characterized to some extent by the data collected for the read processes described earlier. Together, this ability to reduce cell degradation during the write and erase steps, along with improvements to read technologies, can greatly improve MLC endurance.