The amount of data that is produced and needs to be stored continues to grow exponentially. It is estimated that we create exabytes (10^18 bytes) of data every day. Magnetic tape, though slow, offers the highest capacity of today's media, at a few terabytes (10^12 bytes) per tape as of now. But both optical discs and magnetic tape will soon hit physical limits on their storage capacities. One promising storage medium is DNA, or deoxyribonucleic acid, which is composed of nucleotides. A nucleotide consists of a phosphate group, a sugar, and a nitrogenous base: A (adenine), T (thymine), C (cytosine), or G (guanine). DNA is present in all living organisms, including humans, but it is the synthetic version that is being researched as a storage medium.
The biggest advantage of DNA is its storage density. In theory, 2 bits of information can be stored per nucleotide, which puts storage at a scale of less than a nanometer. Current chip-making processes, by comparison, are struggling at the 10-nanometer mark. Besides, DNA can store data in three dimensions, unlike optical or magnetic media. Even today, 1 gram of DNA can store 215 petabytes of data. Additionally, if stored properly, the data can last for thousands of years. Scientists have sequenced, or read, DNA from a 60,000-year-old woolly mammoth and from a 430,000-year-old Neanderthal. DNA storage also uses many orders of magnitude less power than optical or magnetic devices.
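To put these figures in perspective, here is a rough back-of-the-envelope sketch in Python. It assumes the 215 petabytes per gram and 2 bits per nucleotide quoted above, and decimal units (1 petabyte = 10^15 bytes, 1 exabyte = 10^18 bytes). The function names are made up for illustration.

```python
PETABYTE = 10**15  # bytes, decimal units
EXABYTE = 10**18   # bytes, decimal units

# Density figure quoted in the text; treated here as a given, not measured.
DENSITY_BYTES_PER_GRAM = 215 * PETABYTE

# Theoretical limit quoted in the text.
BITS_PER_NUCLEOTIDE = 2


def grams_of_dna(num_bytes):
    """Grams of DNA needed to hold num_bytes at the quoted density."""
    return num_bytes / DENSITY_BYTES_PER_GRAM


def nucleotides_needed(num_bytes):
    """Nucleotides needed at 2 bits each, ignoring any error-correction overhead."""
    return num_bytes * 8 // BITS_PER_NUCLEOTIDE


print(f"{grams_of_dna(EXABYTE):.2f} g of DNA per exabyte")
print(f"{nucleotides_needed(1)} nucleotides per byte")
```

Under these assumptions, one exabyte fits in under 5 grams of DNA, and every byte costs 4 nucleotides before any redundancy is added.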
But the use of DNA also raises many technical and commercial issues. One is that the cost of storage and retrieval is much higher than that of optical or magnetic media, although this may fall as DNA storage becomes more popular. The other is speed: just like a DNA computer, DNA storage is slow. As of today, the speed needs to improve by at least two orders of magnitude.
However, this is not easy because of the mechanism of storage and retrieval. First, data in binary format needs to be mapped to DNA's four-letter alphabet of A, G, T, and C. Then synthetic DNA needs to be created from these nucleotides, and the resulting DNA needs to be stored in a cool place. Retrieval uses a polymerase chain reaction (PCR) to repeatedly duplicate the sequence that we want to read. Having many copies reduces the chance of read errors. Finally, the sequence has to be mapped back to binary data. Research is focused on improving the efficiency of these processes and making them error free. Scientists have started showcasing results.
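The mapping step at both ends of this pipeline can be sketched in a few lines of Python. This is only an illustration: the particular assignment of bit pairs to bases (00 to A, 01 to C, 10 to G, 11 to T) and the function names are assumptions, and real encoding schemes add error correction and avoid sequences that are hard to synthesize or read.

```python
# Hypothetical 2-bits-per-nucleotide mapping, chosen only for illustration.
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}


def encode(data: bytes) -> str:
    """Map binary data to a string over the alphabet A, C, G, T."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def decode(strand: str) -> bytes:
    """Map a nucleotide string back to the original binary data."""
    bits = "".join(BASE_TO_BITS[base] for base in strand)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


strand = encode(b"Hi")
print(strand)          # CAGACGGC
print(decode(strand))  # b'Hi'
```

Each byte becomes four nucleotides, so the two-byte message "Hi" becomes an eight-letter strand. Synthesis, storage, and PCR amplification happen between the encode and decode steps and are not modeled here.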
In 2012, Harvard University researchers encoded a 52,000-word book in DNA. In 2017, scientists at Columbia University and the New York Genome Center demonstrated storage and retrieval of a full computer operating system, a film, a gift card, a computer virus, a Pioneer plaque, and more, at densities that were 100 times higher. Now Microsoft and another company, Twist Bioscience, are working with 10 million long oligonucleotides of DNA. Microsoft already plans to add DNA storage to its cloud. An Irish startup, Helixworks, has created a DNA data storage device that is available on Amazon. Micron Technology, Google, Facebook, Apple, and others are also exploring this technology.
Despite the current challenges, DNA seems to be emerging as the storage medium of choice for the future.