halfbakery: Cogito, ergo sumthin'
Ingenious. Be aware that there is a wide range of
DNA sequence analysis software out there and in use. |
|
|
Is it feasible that such exploits could be written? |
|
|
It reminds me of the "Bones" episode where the data was
encoded on the surface of a bone that was being scanned
by the lab. Neat concept, completely unworkable in
practice for two reasons. The first isn't relevant to this
idea (it was a custom computer, so unless you already
had access you wouldn't know how to encode the data). |
|
|
The second, however, is. The data simply is never
handled in such a way that the given sequence would
actually be executed. In the case of the "Bones" episode,
the data is a list of 3D points, which differ depending on
how the bone sits on the table while it is scanned. Here, each
datum is one of four characters, and each is handled independently.
There is never a time when a relevant sequence would be
handled as a unit (even if you could assemble the
relevant code out of those four characters). |
|
|
I dunno. I've written a lot of sequence analysis
software (mainly for my own use; sometimes custom
software for others). |
|
|
Because I'm lazy and not a Proper Programmer, I'll
often do things such as finding the start of a
sequence (in a multi-sequence file) by looking for the
block of text that usually acts as a header. This
assumes that such text will never occur in the actual
sequence. I don't see how that could be the basis of
an exploit, but I'd be surprised if there's _no_
sequence analysis software that can be hacked in
such a way. |
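For illustration only, here is a minimal Python sketch of the kind of shortcut described above (the software under discussion is LabView, so this is not its code). The header marker "DATA" and the function name are hypothetical, chosen because "DATA" can be spelled with nucleotide codes. It shows the assumption failing as a parsing error, not as an exploit.

```python
# Hypothetical header marker; real file formats use their own conventions.
HEADER = "DATA"

def split_records_naive(text, header=HEADER):
    """Split a multi-sequence dump wherever the header text occurs.

    Assumes the header text never appears inside a sequence.
    """
    chunks = text.split(header)
    # Discard empty chunks and surrounding whitespace.
    return [chunk.strip() for chunk in chunks if chunk.strip()]

# Well-behaved input: two records, two sequences.
print(split_records_naive("DATA\nACGT\nDATA\nGGCC\n"))

# Crafted input: a single sequence that happens to spell the header
# (D is the degenerate code for A, G or T) is split into two records.
print(split_records_naive("DATA\nACGDATAGT\n"))
```

The second call returns two fragments for what was one sequence, which is exactly the case the shortcut assumes never happens.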
|
|
As long as the data is going through a processor that doesn't
prevent execution, the data itself can simply spell out a
program and be executed. How it has to be written depends
on how the program works. [Max], give me the source code of
your software and I'll give you a proof-of-concept hack (as
long as it's not written in Forth or something). |
|
|
It's written in LabView, which is probably even worse
(from this perspective) than Forth... |
|
|
There's an old lengthy (and somewhat boring) discussion in alt.folklore.computers, when a bunch of yahoos from a crypto newsgroup barged in, interrupting the usual "kids these days" rants and kumquat pie recipe exchanges, demanding that the geezers solve the problem of buffer-overflow exploits for them, making explicit threats about changing the C standards so you can't go out of bounds (thus pretty well borking the language), and implicit threats about pensions. |
|
|
Basically, the 5,733 posts boiled down to... |
|
|
Cryptos: what can we do to solve the problem?
Auld Farts: hire competent programmers
Cr: we can't afford to
AF: use COBOL
Cr: but that's an old language
AF: .... |
|
|
(eventually the auld farts went back to discussing haemorrhoids and the chief security twat went and got a $2m gov't grant to study the problem) |
|
|
What we want (for such an exploit to work) is for some part of the sequencing pathway to be vulnerable to attack by the data.
There are two issues with this:
1) The data is severely restricted,
2) The data is of limited length |
|
|
However, I think that both are potentially surmountable. |
|
|
As previously observed, it may be possible, if difficult, to cause buffer overflows in the software with particular input. |
|
|
In addition, though, I propose a QR-style attack.
We write a sequence which is translated by the sequence analysis software to a valid URL. It's quite likely that at some point a piece of code will automatically highlight this as a link, and we can then rely on operator curiosity to click on it.
At this point we have converted the problem into a standard browser malware attack, which is a solved problem. |
|
|
But how do we encode a recognisable URL?
Word, for example, will by default auto-corrupt any alphanumeric string beginning with "www." into a link. We can easily encode the letters "ACTG". It may be less obvious, but we can also encode 11 other letters, including 'W', using degenerate bases. A combination of 'A' and 'T' at a position will be coded as 'W'.
The only remaining issue is the full-stop/period '.'.
Obviously, we can't do this in the same way, except in the rare case where the software codes "any base" as '.' instead of 'N'.
Alternatively, it might be possible to do something with auto-correct, although this opportunity is likely rare. |
|
|
Barring this, I think the best we can do is put a recognisable and compelling instruction into the sequence such that the operator is obliged to engage with the infection.
The letters available are "ABCDHGKMNRSTVWY". |
|
|
WWWCANRATTYHAVACRACKA (domain apparently available)
WWWCRYBABYCRY (domain available to buy)
WWWRAGBAGHAGANDDAD (domain apparently available) |
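As a rough check on the letters above, the following Python sketch lists the IUPAC nucleotide codes and works out which mixture of bases each position of a word would need. It assumes, as the annotation does, that the base caller reports a mixed position as the matching degenerate code. The function names are made up for this example.

```python
# IUPAC nucleotide codes: each letter stands for a set of possible bases.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG",
    "N": "ACGT",
}

def spellable(word):
    """True if every character of the word is an IUPAC nucleotide code."""
    return all(ch in IUPAC for ch in word.upper())

def base_mixture(word):
    """Return, per position, the bases that must all be present there."""
    if not spellable(word):
        raise ValueError("word uses letters outside the IUPAC nucleotide codes")
    return [IUPAC[ch] for ch in word.upper()]

print("".join(sorted(IUPAC)))       # the 15 available letters
print(spellable("WWWCRYBABYCRY"))   # True: only uses letters from the list
print(base_mixture("WWW"))          # each W needs both A and T
```

Any word containing '.', or letters such as E, L or O, fails `spellable`, which is the full-stop problem in code form.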
|
|
If the researcher is doing deeper analysis, one can of course arrange for an open reading frame to carry a message in the universal genetic code, which gives you 20 letters. |
|
|
It might be easier to encode an offer of a bribe. |
|
|
This can also serve as a warrant canary. If you somehow make
sure the DNA sequencer will be reading your bit of code, and
if you assume it can result in a click to your web site, you can
make it a site no one is likely to visit (RAGBAGHAGANDDAD
RAGBAGHAGANDDAD RAGBAGHAGANDDAD .com), and when
you get a hit from a browser you know your DNA has been
read. For extra giggles, include system data in the outgoing
query. |
|
|
//we can also encode 11 other letters// or, rather
than using degeneracy codes, rely on translation to
single-letter amino acid codes. Most DNA packages
include translation in all possible frames. I suspect
some of them even use "." to mark a stop codon, so
your "www." problem is solved right there. |
|
|
EDIT - ah, just realized that [Loris] said exactly that. |
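The translation route can be sketched in a few lines of Python. This uses the standard genetic code and assumes, as suspected above but not confirmed for any particular package, that the software prints a stop codon as ".". The `encode` and `translate` helpers are hypothetical names for this example, and `encode` simply picks the first codon it finds for each letter.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order ("*" marks a stop).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def translate(dna, stop_symbol="."):
    """Translate frame 1 of an upper-case ACGT string to one-letter codes."""
    out = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        out.append(stop_symbol if aa == "*" else aa)
    return "".join(out)

def encode(message, stop_symbol="."):
    """Reverse-translate a message, one codon per character."""
    reverse = {}
    for codon, aa in CODON_TABLE.items():
        # Keep the first codon seen for each letter; stops map to stop_symbol.
        reverse.setdefault(stop_symbol if aa == "*" else aa, codon)
    try:
        return "".join(reverse[ch] for ch in message.upper())
    except KeyError as err:
        raise ValueError(f"cannot encode character {err.args[0]!r}") from None

dna = encode("WWW.")
print(dna)             # TGGTGGTGGTAA
print(translate(dna))  # WWW.
```

The 20 amino acid letters exclude B, J, O, U, X and Z, so some words that work with degenerate bases (CRYBABY, for one) cannot be written this way, and vice versa.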
|
| |