morningfog

(18,115 posts)
Thu Jan 24, 2013, 07:44 AM

MP3 files written as DNA with storage density of 2.2 petabytes per gram

It's easy to get excited about the idea of encoding information in single molecules, which seems to be the ultimate end of the miniaturization that has been driving the electronics industry. But it's also easy to forget that we've been beaten there—by a few billion years. The chemical information present in biomolecules was critical to the origin of life and probably dates back to whatever interesting chemical reactions preceded it.

It's only within the past few decades, however, that humans have learned to speak DNA. Even then, it took a while to develop the technology needed to synthesize and determine the sequence of large populations of molecules. But we're there now, and people have started experimenting with putting binary data in biological form. Now, a new study has confirmed the flexibility of the approach by encoding everything from an MP3 to the decoding algorithm into fragments of DNA. The cost analysis done by the authors suggests that the technology may soon be suitable for decade-scale storage, provided current trends continue.

Trinary encoding

Computer data is in binary, while each location in a DNA molecule can hold any one of four bases (A, T, C, and G). Rather than using all that extra information capacity, however, the authors used it to avoid a technical problem. Stretches of a single type of base (say, TTTTT) are often not sequenced properly by current techniques—in fact, this was the biggest source of errors in the previous DNA data storage effort. So for this new encoding, they used one of the bases to break up long runs of any of the other three.

(To explain how this works practically, let's say the A, T, and C encoded information, while G represents "more of the same." If you had a run of four A's, you could represent it as AAGA. But since the G doesn't encode for anything in particular, TTGT can be used to represent four T's. The only thing that matters is that there are no more than two identical bases in a row.)
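
For anyone who wants to play with the idea, here is a minimal Python sketch of the simplified scheme in that parenthetical, not the authors' actual code. It assumes A, T, and C carry the data and G means "same base as the one before"; the function names `break_runs` and `restore_runs` are made up for illustration.

```python
DATA_BASES = "ATC"  # bases that carry information in the article's example
REPEAT = "G"        # assumed marker meaning "same base as the one before"

def break_runs(seq):
    """Replace the third identical base in a row with the repeat marker."""
    out = []
    for base in seq:
        if base not in DATA_BASES:
            raise ValueError("unexpected base: " + repr(base))
        if len(out) >= 2 and out[-1] == base and out[-2] == base:
            out.append(REPEAT)  # two copies already written, so mark a repeat
        else:
            out.append(base)
    return "".join(out)

def restore_runs(seq):
    """Undo break_runs by expanding each marker into the preceding base."""
    out = []
    for base in seq:
        if base == REPEAT:
            if not out:
                raise ValueError("repeat marker with nothing before it")
            out.append(out[-1])
        else:
            out.append(base)
    return "".join(out)

print(break_runs("AAAA"))    # AAGA
print(restore_runs("TTGT"))  # TTTT
```

Because the marker is only written after two identical bases, the output never contains three of the same base in a row, and decoding is just a matter of copying the previous base wherever a G appears.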

That leaves three bases to encode information, so the authors converted their information into trinary. In all, they encoded a large number of works: all 154 Shakespeare sonnets, a PDF of a scientific paper, a photograph of the lab some of them work in, and an MP3 of part of Martin Luther King's "I have a dream" speech. For good measure, they also threw in the algorithm they use for converting binary data into trinary.
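
The binary-to-trinary step can be sketched the same way. This is only an illustration under my own assumptions: each byte is written as six base-3 digits (since 3**6 = 729 covers all 256 byte values) and the digits 0, 1, 2 are mapped to A, T, C. The excerpt does not say how the authors actually did the conversion, so treat the digit width, the mapping, and the function names as hypothetical.

```python
TRIT_TO_BASE = "ATC"  # assumed mapping: 0 -> A, 1 -> T, 2 -> C
TRITS_PER_BYTE = 6    # assumed width: 3**6 = 729 >= 256

def bytes_to_trits(data):
    """Write each byte as six base-3 digits, most significant first."""
    trits = []
    for byte in data:
        digits = []
        for _ in range(TRITS_PER_BYTE):
            byte, remainder = divmod(byte, 3)
            digits.append(remainder)
        trits.extend(reversed(digits))
    return trits

def trits_to_bytes(trits):
    """Inverse of bytes_to_trits."""
    if len(trits) % TRITS_PER_BYTE:
        raise ValueError("trit count is not a multiple of six")
    out = bytearray()
    for i in range(0, len(trits), TRITS_PER_BYTE):
        value = 0
        for trit in trits[i:i + TRITS_PER_BYTE]:
            value = value * 3 + trit
        if value > 255:
            raise ValueError("trit group does not fit in a byte")
        out.append(value)
    return bytes(out)

def trits_to_dna(trits):
    return "".join(TRIT_TO_BASE[t] for t in trits)

def dna_to_trits(dna):
    return [TRIT_TO_BASE.index(base) for base in dna]

dna = trits_to_dna(bytes_to_trits(b"MP3"))
print(dna, trits_to_bytes(dna_to_trits(dna)))
```

A fixed six digits per byte wastes some capacity, but it keeps the round trip easy to follow.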

http://arstechnica.com/science/2013/01/mp3-files-written-as-dna-with-storage-density-of-2-2-petabytes-per-gram/

Sorry, Mom, I didn't mean to eat your Library formercia Jan 2013 #1