h a l f b a k e r y
We got your practicality ... right here.

DNA sequencing injection

GACTATGDROPTABLESEQUENCES
  (+5)

SQL injection is an old hacker's trick. Data to be analyzed is improperly parsed by the system, and used as a command. Any competent programmer knows to sanitize their inputs against untrusted sources. But gene sequencing can be trusted, right?
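For anyone who hasn't met the trick, here is a minimal sketch of it using Python's built-in sqlite3 module. The `sequences` table, the helper names and the payload are all made up for illustration, and the payload uses quote and semicolon characters that a real sequencer would never emit, so this shows the general mechanism rather than a working DNA exploit.

```python
import sqlite3

def make_db():
    """In-memory database with a hypothetical 'sequences' table."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE sequences (sample TEXT, bases TEXT)")
    db.execute("INSERT INTO sequences VALUES ('suspect', 'GACTATG')")
    return db

def store_unsafe(db, sample, bases):
    # BAD: the data is pasted into the command text, so the parser
    # cannot tell where the data ends and the command resumes.
    db.executescript(
        f"INSERT INTO sequences VALUES ('{sample}', '{bases}');"
    )

def store_safe(db, sample, bases):
    # Placeholders keep the data out of the command text entirely.
    db.execute("INSERT INTO sequences VALUES (?, ?)", (sample, bases))

def table_exists(db):
    row = db.execute(
        "SELECT count(*) FROM sqlite_master "
        "WHERE type = 'table' AND name = 'sequences'"
    ).fetchone()
    return row[0] == 1

# A "read" whose text closes the string literal and appends a command.
payload = "GACTATG'); DROP TABLE sequences; --"
```

Passing `payload` to `store_safe` just stores an odd-looking string; passing it to `store_unsafe` drops the table.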

A crack team of computer security experts continually analyzes DNA sequencing software available on the market, searching for this specific weakness. Once found, a proper exploit is written--in DNA.

Want to expunge yourself from a police database? Or discredit the research of a fellow mad scientist? Then simply get them to analyze this custom sequence of nucleotides, and watch their computer happily do your bidding.

Aq_Bi, Jan 05 2015

Shortest Turing complete compiler http://stackoverflo...omplete-interpreter
[Voice, Jan 06 2015]

       Ingenious. Be aware that there is a wide range of DNA sequence analysis out there and in use.   

       Is it feasible that such exploits could be written?
MaxwellBuchanan, Jan 05 2015
  

       It reminds me of the "Bones" episode where the data was encoded on the surface of a bone that was being scanned by the lab. Neat concept, completely unworkable in practice for two reasons. The first isn't relevant to this idea (it was a custom computer, so unless you already had access you wouldn't know how to encode the data).   

       The second, however, is. The data simply is never handled in such a way that the given sequence would actually be executed. In the case of the "Bones" episode, the data is a list of 3D points, which are different if the bone is being scanned on the table. Here, the data is one of four characters, and each is handled independently. There is never a time when a relevant sequence would be handled as a unit (even if you could assemble the relevant code out of those four characters).
MechE, Jan 05 2015
  

       I dunno. I've written a lot of sequence analysis software (mainly for my own use; sometimes custom software for others).   

       Because I'm lazy and not a Proper Programmer, I'll often do things such as finding the start of a sequence (in a multi-sequence file) by looking for the block of text that usually acts as a header. This assumes that such text will never occur in the actual sequence. I don't see how that could be the basis of an exploit, but I'd be surprised if there's _no_ sequence analysis software that can be hacked in such a way.
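A rough sketch of that kind of header-hunting parser, to show how the assumption can fail. The marker "HDR" and the function name are invented for the example (not taken from any real package); the marker was picked because H, D and R also happen to be legal ambiguity codes in a base call. This only demonstrates a mis-parse, not an exploit.

```python
HEADER = "HDR"  # hypothetical marker; H, D and R are also valid IUPAC base codes

def split_records(text):
    """Naive parser: treat every occurrence of HEADER as the start of a record."""
    return [chunk for chunk in text.split(HEADER) if chunk]

# Two honest records, each introduced by the marker.
honest = "HDR" + "GACTATG" + "HDR" + "TTGACA"

# One record whose sequence contains the ambiguity calls H, D, R in a row.
crafted = "HDR" + "GACTHDRATG"
```

`split_records(honest)` gives the two sequences as intended, while `split_records(crafted)` reports two records where the file only contained one.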
MaxwellBuchanan, Jan 05 2015
  

       As long as the data goes through a processor that doesn't prevent execution, the data itself can simply spell out a program and be executed. How it has to be written depends on how the program works. [Max], give me the source code of your software and I'll give you a proof-of-concept hack (as long as it's not written in Forth or something).
Voice, Jan 06 2015
  

       It's written in LabView, which is probably even worse (from this perspective) than Forth...
MaxwellBuchanan, Jan 06 2015
  

       //No// wellll.....   

       There's an old lengthy (and somewhat boring) discussion in alt.folklore.computers, when a bunch of yahoos from a crypto newsgroup barged in, interrupting the usual "kids these days" rants and kumquat pie recipe exchanges, demanding that the geezers solve the problem of buffer-overflow exploits for them, making explicit threats about changing the C standards so you can't go out of bounds (thus pretty well borking the language), and implicit threats about pensions.   

       basically the 5,733 posts boiled down to...   

       Cryptos: what can we do to solve the problem ?
Auld Farts: hire competent programmers
Cr: we can't afford to
AF: use COBOL
Cr: but that's an old language
AF: ....
  

       (eventually the auld farts went back to discussing haemorrhoids and the chief security twat went and got a $2m gov't grant to study the problem)   

       ------
FlyingToaster, Jan 06 2015
  

       What we want (for such an exploit to work) is for some part of the sequencing pathway to be vulnerable to attack by the data. There are two issues with this:
1) The data is severely restricted,
2) The data is of limited length
  

       However, I think that both are potentially surmountable.   

       As previously observed, it may be possible, if difficult, to cause buffer overflows in the software with particular input.   

       In addition, though, I propose a QR-style attack.
We write a sequence which is translated by the sequence analysis software to a valid URL. It's quite likely that at some point a piece of code will automatically highlight this as a link, and we can then rely on operator curiosity to click on it.
At this point we have converted the problem into a standard browser malware attack, which is a solved problem.
  

       But how do we encode a recognisable URL?
Word, for example, will by default auto-corrupt any alphanumeric string beginning with "www." into a link. We can easily encode the letters "ACTG". It may be less obvious, but we can also encode 11 other letters, including 'W', using degenerate bases. A combination of 'A' and 'T' at a position will be coded as 'W'.
The only remaining issue is the full-stop/period '.'.
Obviously, we can't do this in the same way, except in the rare case where a position at which all four bases are possible is coded as '.' instead of 'N'.
Alternatively, it might be possible to do something with auto-correct, although this opportunity is likely rare.
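To make the degenerate-base trick concrete, here is a small sketch. The table is the standard IUPAC set of nucleotide ambiguity codes; the `consensus` function is only an assumed stand-in for whatever a given base-caller does when it sees a mixture at one position.

```python
# Standard IUPAC nucleotide ambiguity codes, keyed by the set of bases observed.
IUPAC = {
    frozenset("A"): "A", frozenset("C"): "C",
    frozenset("G"): "G", frozenset("T"): "T",
    frozenset("AG"): "R", frozenset("CT"): "Y",
    frozenset("CG"): "S", frozenset("AT"): "W",
    frozenset("GT"): "K", frozenset("AC"): "M",
    frozenset("CGT"): "B", frozenset("AGT"): "D",
    frozenset("ACT"): "H", frozenset("ACG"): "V",
    frozenset("ACGT"): "N",
}

def consensus(reads):
    """Collapse equal-length reads into one string of IUPAC codes, column by column."""
    return "".join(IUPAC[frozenset(column)] for column in zip(*reads))
```

So a sample that mixes the two templates "AAAC" and "TTTC" would, under this assumption, be reported as "WWWC".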
  

       Barring this, I think the best we can do is put a recognisable and compelling instruction into the sequence such that the operator is obliged to engage with the infection.
The letters available are "ABCDHGKMNRSTVWY".
  

       WWWCANRATTYHAVACRACKA (domain apparently available)
WWWCRYBABYCRY (domain available to buy)
WWWRAGBAGHAGANDDAD (domain apparently available)
  

       If the researcher is doing deeper analysis, one can of course arrange for an open reading frame to carry a message in the universal genetic code, which gives you 20 letters.   

       It might be easier to encode an offer of a bribe.
Loris, Jan 06 2015
  

       This can also serve as a warrant canary. If you somehow make sure the DNA sequencer will be reading your bit of code, and if you assume it can result in a click to your web site, you can make it a site no one else is likely to visit (RAGBAGHAGANDDAD RAGBAGHAGANDDAD RAGBAGHAGANDDAD .com), so that when you get a hit from a browser you know your DNA has been read. For extra giggles, include system data in the outgoing query.
Voice, Jan 06 2015
  

       //we can also encode 11 other letters// or, rather than using degeneracy codes, rely on translation to single-letter amino acid codes. Most DNA packages include translation in all possible frames. I suspect some of them even use "." to mark a stop codon, so your "www." problem is solved right there.   
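A sketch of the translation route, using the standard genetic code. Writing stop codons as "." is an assumption about the display convention of a particular package (many use "*" instead), and only the three forward frames are shown; the reverse-complement frames are left out to keep it short.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code in TCAG order; stop codons are written as "."
# here, which is an assumed display convention, not a universal one.
AMINO = ("FFLLSSSSYY..CC.W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def translate(dna, frame=0):
    """Translate one forward reading frame into single-letter amino acid codes."""
    dna = dna.upper()
    codons = (dna[i:i + 3] for i in range(frame, len(dna) - 2, 3))
    # Codons containing anything other than A, C, G, T are shown as "X".
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)

def forward_frames(dna):
    """Translations of the three forward reading frames."""
    return [translate(dna, frame) for frame in range(3)]
```

Under that convention, `translate("TGGTGGTGGTAA")` reads as three tryptophans and a stop, i.e. "WWW.".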

       EDIT - ah, just realized that [Loris] said exactly that.
MaxwellBuchanan, Jan 06 2015
  
      