halfbakery: Trying to contain nuts.
My father once came home from work and announced that
the computer (presumably a '70s mainframe or mini) had
broken down because it had attempted to subtract a
double-barrelled surname.
Although I'm not sure this is true, this kind of thing can be
done. Just as hexadecimal consists of the digits 0 to 9
along with A to F, base 36 could be 0 to 9 along with A to
Z. This means that a word such as "bad" would have a
value of 14629. Therefore a word such as "half-baked"
would then be the numbers 20451 and 28869. Subtract the
second from the first and you have the "word" "-6hu",
which is shorter when written but not said. On the other
hand, if the first word in a hyphenated term is longer, the
result will be positive, an example being "mid-80s", whose
result is "ehl", which can basically be pronounced.
Therefore, my proposal is quite simple: shorten texts by
using lots of hyphenated words and saying the result of the
second subtracted from the first. It will be hard to
pronounce and often lengthen words in speech, but
otherwise it's fine.
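
For anyone who wants to try the sums without a Jupiter Ace, here is a minimal Python sketch of the scheme. The helper names `to_base36` and `hyphen_subtract` are made up for illustration; the only built-in relied on is `int(text, 36)`, which reads a string as a base-36 number.

```python
import string

DIGITS = string.digits + string.ascii_lowercase  # '0'-'9' then 'a'-'z'


def to_base36(n: int) -> str:
    """Render an integer in base 36, with a leading '-' for negatives."""
    if n == 0:
        return "0"
    sign = "-" if n < 0 else ""
    n = abs(n)
    out = []
    while n:
        n, rem = divmod(n, 36)
        out.append(DIGITS[rem])
    return sign + "".join(reversed(out))


def hyphen_subtract(term: str) -> str:
    """Read both halves of 'first-second' as base-36 numbers; return first minus second."""
    first, second = term.lower().split("-", 1)
    return to_base36(int(first, 36) - int(second, 36))


print(int("bad", 36))              # 14629
print(hyphen_subtract("mid-80s"))  # ehl
```

Python integers are unbounded, so longer words will not wrap around the way they might on a machine with a fixed word size.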
Annotation:
|
|
Yes, but what if "bad-ass" comes out the same as
"half-baked"? |
|
|
This is sort of like saying "add all the words together
and express the answer as modulo 1", only not quite
as much. |
|
|
Incidentally, your father's story is not quite so far
fetched. The IBM machines that were used in the
Manhattan project were required to be modified such
that, if a particular combination of numbers was fed
to them, small gunpowder (or maybe it was
guncotton) charges in the programming switches and
card stacks would be ignited by little coils. This was
to ensure that the machines could be decommissioned
in an emergency if there was any risk of them being
captured, leaving no clue as to the calculations they
had been doing. Some of the later German Enigma
machines had something similar. |
|
|
I feel your account could be interpreted as a pun
[MB], but I would like it to be true so I choose to
believe that it's so. According to my trusty Jupiter
Ace, bad-ass,
or rather 36 BASE C! BAD ASS - . , comes out as "hl",
so it's OK for that, though I take your point. |
|
|
Also, maximum could be interpreted as (ma) multiplied by (imum). And similarly, either/or could be (either) divided by (or). And, pushing it a little, you could consider t to be similar to +, and so a word such as motor-car could be constructed as (mo)+(or)-(car). |
|
|
//According to my trusty Jupiter Ace// |
|
|
Wowww. I have to say, I am impressed. You program
it in Forth? |
|
|
//-6hu...shorter when written but not said//
Not necessarily. I presume you are pronouncing 6hu as six-aytch-yoo. However, if you put on your fake Japanese accent and pronounce it 'sixhoo', all is resolved. |
|
|
Why are you bothering with numerics? Just use the letters for base 26. |
|
|
"hi there", being an addition, would be "thfcn". |
|
|
"e-mail" would be "-maig". |
|
|
Weirdly, this overlaps something I've been thinking
about for a while now. I am trying to connect
similar entities from two huge (separate) tables. If
I take the numeric columns and transform them
into a normalised range of values, I can plot them,
n-dimensionally (where n is the number of numeric
columns) and identify which ones are closest to
one another by calculating the hypotenuses, and
identifying nearest neighbours. The problem is
that key columns that need to be compared are
non-numeric, but still have a concept of
"closeness". My thought was to convert each of
these strings into a vector, or series of vectors,
each of which corresponds to the distance
between each key on a qwerty style keyboard.
Thus, the word "SAD" (all on the same row)
might resolve to a vector of (x,y)->(1,0) (start at S,
x-1 for A, x+2 for D), while "GOOD" would be (x,y)->(G)
(O:x+4,y+1)(O:0,0)(D:x-6,y-1)=(x,y)->(-2,0). If I also
increment the z-axis for each character, the result
is a wiggly spiral stretching off into the z-
direction. Intuitively, I imagine these
crystallisations or shapes ought to be relatively
unique, but actually there are quite a few clashes,
for words that are not that similar. It's been a
while since I ran it. Anyway, the point is, the
overlap with the idea here is the resolution of a
word down to some kind of base, or integer value
(since the wiggly word-shapes can be resolved to a
single vector in 3-d space, the length of which
might be a way of identifying that word and
plotting it in a word-space that somehow put like-
words together - IF - some method can be found
that does that neatly without silly clashes - AND -
and this is where I eventually gave up, allowed
some kind of semantic dimension to give puns and
similes a means to be identified) |
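
A rough Python sketch of the keyboard-vector idea, under some loud assumptions: the three letter rows are treated as a plain grid with no stagger, the home row is y=0, and z goes up by one per step. `ROWS`, `KEY_POS`, `steps` and `net_vector` are illustrative names, not anything from the original implementation.

```python
# Assumed layout: y=1 top row, y=0 home row, y=-1 bottom row, no row stagger.
ROWS = {1: "qwertyuiop", 0: "asdfghjkl", -1: "zxcvbnm"}
KEY_POS = {ch: (x, y) for y, row in ROWS.items() for x, ch in enumerate(row)}


def steps(word):
    """Return the (dx, dy, dz) move from each letter to the next."""
    pts = [KEY_POS[ch] for ch in word.lower()]
    return [(x2 - x1, y2 - y1, 1) for (x1, y1), (x2, y2) in zip(pts, pts[1:])]


def net_vector(word):
    """Sum the moves into one vector; a single letter gives (0, 0, 0)."""
    moves = steps(word)
    return tuple(sum(axis) for axis in zip(*moves)) if moves else (0, 0, 0)


print(steps("GOOD"))       # [(4, 1, 1), (0, 0, 1), (-6, -1, 1)]
print(net_vector("SAD"))   # (1, 0, 2)
print(net_vector("GOOD"))  # (-2, 0, 3)
```

One thing the sketch makes visible: once the steps are summed, x and y depend only on the first and last letters, and z only on the length. So "sad" and "sod" land on exactly the same point, which may account for at least some of the clashes.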
|
|
I do program it in FORTH, yes. It's fairly typical of me to back the loser
but it was the first item I ever bought with money I earnt myself. It
was just simpler than trying to work it out on the Windows calculator.
Regarding the number base, I didn't think of that but I did think of
using 1 and 0 as letter substitutes to reduce the size of the numbers
and so also the calculations involved. [Zen_tom], I'll get back to you. |
|
|
OK, [zen_tom], is that not a function rather than a value? I
may have misunderstood, but I have the impression you're
describing something which can be plotted as a series of
connected diagonal line segments in three dimensional
space. Have I got it right? |
|
|
If we follow this usage of punctuation as a
mathematical operator, some of the results would be
enormous! |
|
|
0. 1. There, I've said it all. Anything else is simply redundant. |
|
|
[nineteenthly] it depends on your definition - let's just say
that I'd like to get to a stage where I can arrive at
i) a single point in multi-dimensional space that I can use
to measure the distance to another point in multi-
dimensional space, resulting in a scalar quantity that tells
me how similar two words, or even entire strings of text,
are. |
|
|
The twisty word-shapes are kind of cute to consider and
work as handy visualisations, but it's that final numeric
value that I'm ultimately aiming towards. (Though, it
might be interesting to consider the shapes formed and
see if they provide any geometrical methods of clustering
the data formed - clockwise words, anti-clockwise, twisty
ones, straight ones, words that form into loops, or which
just wiggle along - there's bound to be a number of ways
of categorising them other than resolving them to a single
scalar magnitude) |
|
|
I tried Hamming and Levenshtein distances, but the
problem there is that each distance has to be against
something else - to find a true measure, you have to
calculate Hamming/Levenshtein distances from each string
to each other string, and the number of comparisons grows
quadratically with the number of strings - if I
can map any word into a space (and for me, yes that
would be a function) without reference to any other
word, then I can do that in much less time, and it will
scale for large volumes. The hard part is defining that
space such that like-words appear relatively close to one
another. And I think the hard part in doing that is that
there are so many different ways that two words can be
alike. One way might be typing, so "awful" and "awfuk"
are very close in that sense (assuming a particular
keyboard layout, of course). And another might be "ear"
and "shell-like", assuming some form of cockney
dimension is incorporated into the model. |
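
To put the pairwise problem in concrete terms, here is a small Python sketch. The `levenshtein` function is the standard dynamic-programming edit distance, written out by hand rather than taken from any library; the word list is just the examples from this annotation.

```python
from itertools import combinations


def levenshtein(a: str, b: str) -> int:
    """Edit distance: fewest insertions, deletions and substitutions turning a into b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


words = ["awful", "awfuk", "ear", "shell-like"]
pairs = list(combinations(words, 2))

print(len(pairs))                     # 6, i.e. n * (n - 1) / 2 for n = 4
print(levenshtein("awful", "awfuk"))  # 1
```

With n strings there are n(n-1)/2 pairs to score, whereas a function that places each word in a space on its own needs only n evaluations before any neighbour search.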
|
|
I think you need something like a chaos plot. It
works very nicely for DNA sequences (four letters),
but I don't know if there's any reason it wouldn't work
for a larger alphabet. |
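
Assuming "chaos plot" here means what is usually called a chaos game representation, a minimal Python sketch for the four-letter case looks like this. The corner assignment in `CORNERS` is one common convention and is an assumption, as is starting from the centre of the unit square.

```python
# Assumed corner layout for the unit square; other assignments are possible.
CORNERS = {"A": (0.0, 0.0), "C": (0.0, 1.0), "G": (1.0, 1.0), "T": (1.0, 0.0)}


def chaos_game(seq, corners=CORNERS, start=(0.5, 0.5)):
    """Move halfway towards the corner of each successive symbol; return every point visited."""
    x, y = start
    points = []
    for symbol in seq.upper():
        cx, cy = corners[symbol]
        x, y = (x + cx) / 2, (y + cy) / 2
        points.append((x, y))
    return points


print(chaos_game("AT"))  # [(0.25, 0.25), (0.625, 0.125)]
```

Each sequence ends at a point determined by the sequence alone, with no reference to any other sequence. A 26-letter alphabet would need some other arrangement of corners, which this sketch does not attempt.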
|
|
Yes, I've heard of similar uses/approaches used for DNA
sequencing - not heard of a chaos plot till now - that will
be something to look up tomorrow. |
|
|
The other possibly interesting thing about mapping words
onto numeric spaces, is that it's possible to calculate
things like averages, outliers etc. Even more interesting
would be if the mapping function worked both ways - i.e.
if a word goes in and translates to an n-dimensional set of
coordinates {a,b,c,d,e,f,...n} then what do you get if you
turn it around, feed in a set of coordinates? Entirely new
and alien words could be discovered this way. |
|
|
//0. 1. There, I've said it all. Anything else is simply redundant.// |
|
|
Hmm, by removing the zeroes in this fashion, with 1=0, 11=1 and so on, you could save 50% of the bandwidth. |
|
|
I have this nasty feeling that might just work with enough checksums, and a deity with a strange sense of humour. |
|
|
I am afraid [zen_tom]'s scheme glimmers with manic madness to me and I had to stop reading it. |
|
|
But I like [nineteenthly]'s idea fine. A problem I have is that it is hard to go back: information is lost in subtraction, and multiple possible word pairs could have the same result. |
|
|
I propose instead that the entire word without hyphen be converted to a number, and that number reduced to its cube root. Big languages like German could use the 4th or 5th root. Halfbakery or 2045128869 becomes 1269.33 or abfi.cc. The decimal point is pronounced as a cough. If you want the original back, easy - just cube it and there you go. Of course what comes back has no hyphen but they are pretentious anyway. |
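
The cube-root arithmetic can be checked with a few lines of Python. The digit-to-letter table is inferred from the "abfi.cc" example (1 is a, 2 is b, and so on up to 9 is i); what happens to 0 is not stated, so this sketch simply leaves it alone. `cube_root_word` is a made-up name.

```python
# Assumed mapping, inferred from the example; '0' and '.' pass through unchanged.
DIGIT_TO_LETTER = dict(zip("123456789", "abcdefghi"))


def cube_root_word(n: int, places: int = 2) -> str:
    """Take the cube root, round it, and spell the digits as letters."""
    root = round(n ** (1 / 3), places)
    return "".join(DIGIT_TO_LETTER.get(ch, ch) for ch in f"{root:.{places}f}")


print(round(2045128869 ** (1 / 3), 2))  # 1269.33
print(cube_root_word(2045128869))       # abfi.cc
```

One catch: cubing the rounded 1269.33 lands roughly 14,000 away from 2045128869, so getting the original back would need more decimal places, and therefore more coughing.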
|
|
//no hyphen but they are pretentious anyway.// |
|
|
I would quibble with that but I have to - |
|
|
//glimmers with manic madness// [bungston], that's
actually a very helpful point - in meetings when walking
through a possible course of action, especially when it's a
particularly interesting one, there has been a tendency for
the audience to gloss over - I suspect it may be at the point
where mania is detected, and I may need to consider ways
of shielding my colleagues from the more manic stuff, as it
rarely sticks - until I build it myself and they see it working
on a "don't need to know how that works under-the-covers"
basis. |
|
|
//I build it myself and they see it working// |
|
|
Absolutely, [zen_tom]. Skunkworks. Ask forgiveness, not permission. Or go to work with more intelligent people, if you can find them. |
|
|
//This means that a word such as "bad" would have a
value of 14629. Therefore a word such as "half-baked"
would then be the numbers 20451 and 28869.// |
|
|
How do you figure? I get 806883 - 18968773, or "-at9sy".
Negative at ninesy. |
|
|
BTW, the Ruby programming language can do math like
this very easily. When you convert a string to an
integer or vice versa, you can specify a base between 2
and 36. So the statement ( "half".to_i(36) -
"baked".to_i(36) ).to_s(36) will perform this half-baked
arithmetic. |
|
|
FORTH does it that way too. It has an 8-bit variable called BASE. I may
be wrong. Basically, I didn't go to the extra effort of writing a double-
length integer printing word, which would've been easy. |
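
The two sets of figures in this thread can be reconciled if the original sums were done in 16-bit single-length cells. That is an assumption, based on the remark about not writing a double-length printing word. A short Python check, with `wrap16` as a made-up helper name:

```python
def wrap16(n: int) -> int:
    """Keep only the low 16 bits, as a single-length 16-bit cell would."""
    return n & 0xFFFF


half = int("half", 36)    # 806883
baked = int("baked", 36)  # 18968773

print(wrap16(half), wrap16(baked))   # 20451 28869
print(wrap16(half) - wrap16(baked))  # -8418
print(half - baked)                  # -18161890
```

In base 36, -8418 is "-6hu" and -18161890 is "-at9sy", so both answers follow from the same words, just with different word sizes.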
|
|
FORTH brings back memories - I never got the hang of FORTH but on the other hand, I loved programming in LISP. Different people's minds work in different ways, I guess. This was all around the time I was being employed as a part-time Pascal/VMS programmer (it seems weird now to think that people actually paid me to do programming - I am not a very good programmer...). |
|
|
While we're on this programming-language tangent, I've
been playing with Python recently - and love it - one
favorite thing that it has is a full and deep set-
processing facility built in. So you can define a set
a = {1,2,3,4,5}
and another one
b = {3,4,6}
And then ask
c = a.intersection(b)
d = a.union(b)
e = a.difference(b)
And expect the results
c = {3,4}
d = {1,2,3,4,5,6}
e = {1,2,5}
And so on, only if you like, you can create sets of
characters, or documents, or objects, whatever you
like. It's a nice facility I've not seen embedded
within a programming language before, which saves lots
of time writing/porting/managing different
iterative/search routines. It's a nice way to
formulate and resolve problems sometimes - especially
when you need to explain your code to a pointy-haired
person who responds only to basic Venn-diagrams. |
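
As a small illustration of the "sets of characters" point, here is a sketch using the two halves of this site's favourite adjective. Python sets also accept the operator forms `&`, `|` and `-` in place of the method calls.

```python
half = set("half")
baked = set("baked")

print(half & baked)          # {'a'}
print(sorted(half | baked))  # ['a', 'b', 'd', 'e', 'f', 'h', 'k', 'l']
print(sorted(half - baked))  # ['f', 'h', 'l']
```

The `sorted` calls are only there to make the printed order predictable, since sets themselves are unordered.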
|
|
That's pretty common for modern programming
languages, especially (so-called) scripting languages
like Python and Ruby. Ruby's version is even terser:
a & b intersection
a | b union
a - b difference
|
|
|
Also, realize that what you think of as being
embedded within a programming language really
means it's just part of the standard library. You can
very easily add that functionality to most programming
languages by either creating a custom array object or
modifying the built-in array. You could also technically
remove it from the language, but there's really no
reason to do so. |
|
|
Hmm, it's interesting - these days, there's a library for
anything, and if you don't like how those folks did it, you
can fork, or write one yourself. I've done the same thing
myself. What's nice about having something like this deep in
the "core" packages is that other libraries are informed and
are shaped by its existence. |
|
|
what is not equal to factorial? |
|
| |