halfbakery: Not from concentrate.
I've recently released an online tool for geneticists and molecular biologists. What it does isn't important for this idea, but essentially it creates genetic sequence diagrams from annotated sequence - see link if you're interested.
So obviously I want it to appear on Google somewhere near the top when it's searched for with appropriate search terms, so people who want it can find it, ideally by searching just for its name. It doesn't yet; that's okay, because it's pretty new. It is a little disappointing that it ranks somewhere below pages of rubbish which spammers have created to game the search engine, but still.
But while investigating, I found something interesting. Google does know about the site, because it comes up if you search for two relevant search terms. This got me thinking. I'd like to know what potential searches my website comes top of. Other people might be interested in doing that sort of thing too, for various reasons.
This is the inverse of what Google usually does, because it is going from page to keyword-lists rather than from a keyword-list to pages.
How would it be implemented? I think it would be fairly straightforward: Google already accumulates a list of popular search terms - look up the page's words in that list (in whatever the engine's internal representation is), and then search using probabilistically popular combinations of those (or just feel lucky, since we only want the top hit) using the main engine. Any unusual words on the page but not on the list of popular words can be given a generic low score so they'll still be searched for in combination with the popular terms.
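A minimal sketch of that loop in Python, for anyone who wants to play with it. Everything here is an assumption for illustration: `toy_search` is a hypothetical stand-in for the real engine (it just counts word occurrences), `inverse_search` and the `popularity` numbers are made up, and candidate queries are drawn only from words on the target page. It tries combinations of terms, keeps the ones where the page is the top hit, and gives words missing from the popularity list a generic low score.

```python
from itertools import combinations


def toy_search(query_terms, pages):
    """Hypothetical stand-in for a search engine: rank by term counts."""
    def score(text):
        words = text.lower().split()
        # A page only matches if it contains every query term.
        if not all(t in words for t in query_terms):
            return 0
        return sum(words.count(t) for t in query_terms)

    scored = [(score(text), url) for url, text in pages.items()]
    scored = [(s, url) for s, url in scored if s > 0]
    # Highest score first; ties broken by URL so results are deterministic.
    return [url for s, url in sorted(scored, key=lambda p: (-p[0], p[1]))]


def inverse_search(target, pages, popularity, default=0.01, max_terms=2):
    """Return queries for which `target` is the top hit, most popular first."""
    words = set(pages[target].lower().split())
    # Unusual words not on the popularity list get a generic low score.
    weight = {w: popularity.get(w, default) for w in words}
    hits = []
    for n in range(1, max_terms + 1):
        for combo in combinations(sorted(words), n):
            results = toy_search(combo, pages)
            if results and results[0] == target:
                p = 1.0
                for term in combo:
                    p *= weight[term]
                hits.append((p, combo))
    return [combo for p, combo in sorted(hits, key=lambda h: (-h[0], h[1]))]


# Made-up pages and popularity figures, purely for illustration.
pages = {
    "genogator": "create sequence diagrams from annotated sequence files",
    "spam": "sequence sequence sequence diagrams diagrams cheap pills",
}
popularity = {"sequence": 0.5, "diagrams": 0.3, "files": 0.2}
results = inverse_search("genogator", pages, popularity)
print(results[:3])
```

A real version would need the engine's own ranking rather than a word count, and some way to prune the combinations, since the number of candidate queries grows very quickly with `max_terms`.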
Of course this would apply to any other search engine too, but Google's really the only one that counts, for the time being at least.
**update**
Basically, the point of this is that you can find plausible, but perhaps unusual searches which put a website at the top. Perhaps many of these exist, which you don't know about. Why would you want to do that? Well, for fun, of course. See link below 'an example'. They probably have no idea that they're the web's top hit for those terms.
An advantage of these searches is that if you mention one on your site, you effectively 'lock it in'. And maybe people will remember the keywords if they don't remember much else about your site.
(?) Genogator
http://www.kato.mvc.mcc.ac.uk/genogator/ A tool to create sequence diagrams from annotated sequence files [Loris, Jun 26 2009, last modified Jan 23 2010]
conceptual nearest neighbour
http://askpang.type...oogle_in_rever.html Surprisingly it doesn't look like many other people have thought of this. This link is the nearest I found in my searches. [Loris, Jun 26 2009]
(?) Google keywords
http://www.google.c...hl=en&answer=114429 Seem to be doing something like this - although for a slightly different purpose. [jutta, Jun 26 2009]
HTML Meta data generator
http://vancouver-we.../META/mk-metas.html [Dub, Jun 27 2009]
Google's PageRank Algorithm
http://en.wikipedia.org/wiki/PageRank [swimswim, Jun 27 2009]
Projects Metafile
http://projects.metafilter.com/ A handy place to announce new projects [Dub, Jun 27 2009]
Elective tree surgery
http://www.healthca...ree-surgery-center/ An example [Loris, Jul 07 2009]
|
|
Hah, I see that the autoboner has bitten in less time than it takes to actually read the idea. |
|
|
The list would have to exclude quotation marks. Otherwise top of your list would be:
"Welcome to Genogator, the easy way to create sequence diagrams from annotated sequence files." |
|
|
Interestingly "the easy way to create sequence" is a googlenope. Why's that? |
|
|
I'm not sure if quoted terms need to be dropped. If the terms are ranked by popularity, then I'd treat those as 'atoms'. If people are searching for //"the easy way"// for example, and others were searching for //annotated// then the score for that search would be the product of those two terms' popularities. |
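A small Python sketch of that scoring, assuming a popularity table keyed by atom (a single word or a whole quoted phrase). The function name `query_score`, the default value and the popularity figures are all invented for illustration.

```python
import shlex
from math import prod


def query_score(query, popularity, default=0.001):
    """Score a query as the product of its atoms' popularities."""
    # shlex.split keeps a double-quoted phrase together as one atom.
    # Note: it raises ValueError on unbalanced quotes or stray apostrophes.
    atoms = shlex.split(query.lower())
    # Atoms missing from the table fall back to a generic low popularity.
    return prod(popularity.get(atom, default) for atom in atoms)


# Made-up popularities: the quoted phrase is a single entry in the table.
popularity = {"the easy way": 0.02, "annotated": 0.005}
score = query_score('"the easy way" annotated', popularity)
print(score)
```

Treating the phrase as one atom means its popularity is looked up directly instead of being built from the popularities of "the", "easy" and "way".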
|
|
//Interestingly "the easy way to create sequence" is a googlenope. Why's that?// |
|
|
A googlenope is a term which google doesn't report any sites for? It actually does now - it's my site! I noticed recently that the big G scans the halfbakery pretty intensively - I searched for a term someone mentioned a few minutes before, and that page itself was top of the list. Below that was utter rubbish - I never did find out what he meant. |
|
|
//a term which google doesn't report any sites for?// - yes. |
|
|
//No results found for "the easy way to create sequence".// is what Google tells me. Doesn't even find it here on HB. Odd. |
|
|
Ah, I'm searching google.co.uk, you might be searching on google.com (where it's still a googlenope) or elsewhere. It might propagate over time I suppose, or maybe it's considered local interest. |
|
|
Ahh, of course. co.za probably has more googlenopes than most! |
|
|
There are many ways - including commercial web statistic monitors - to know what keywords people use to find your site. They won't necessarily tell you if you were the first result in search, but do you really care? |
|
|
I'm not sure I understand what this is about. Does this mean registering the most important keywords on a site somehow so that those words would be given more weight by a search engine? |
|
|
It's only of academic interest to me, phoenix. |
|
|
How people find the site is not necessarily the same as what search terms it comes at the top of. <later addition> I once heard of someone being surprised that their site had a hit from someone googling for //up skirt photos// - when their site was about mountaineering. 'up' was obvious enough - from phrases like 'looking up the valley you will see' or whatever, and so was 'photos'. But the 'skirt' was more of a head-scratcher. He found it was from something like 'you can skirt around the pile of scree...'. If it hadn't been for that hit, he'd never have known about it. |
|
|
I think the results would be amusing. The halfbakery, for instance, might claim to be the number-one website for ... well, I've been trying to find words which encompass the hb, and so far I've failed.
<later edit> Welcome to the halfbakery, the web's number one communal database of inventions! |
|
|
Zimmy, the idea is to be able to find what potential searches you could do on Google which would put your site at the top. Then you can claim to be the best at that, or whatever. |
|
|
#1 & 2 for "autoboner". There's got to be some combination of word(s) with custard that would do it, but I can't think of it. |
|
|
Auugh! Someone opened a business with the name Hullaballon. |
|
|
I had 'custard' in most of my attempts, too. |
|
|
Of course in some ways the hb isn't a good test - 'halfbakery' by itself feels lucky.
Croissants fishbones is mostly hb pages, but somehow I feel that's cheating.
There's not much wiggle-room for the main page, since there's not much in the way of unique text on it.
Hah! 'communal original inventions' and 'communal database inventions' both work. I like that, although it's a pity that 'communal inventions' doesn't quite work (hb is second). |
|
|
Ah well, I was thinking of obscure blogs really in any case. |
|
|
A quick look at the HTML on your site seems to show that you're not making it easy for search engines to find you - or at least to make you searchable. |
|
|
First you should be linked by something publicly linked (like HB) so bots can find you in the first place, then you should make the site bot-friendly (keywords and meta data help them know what it is your page does) - I didn't see much evidence of that. I use meta data generated by [linky], and it seems to help. |
|
|
Thanks for the search engine optimisation tips, Dub. Regarding linking to it from the hb, well... I've done that now[1]. What this means is that this page now appears in the google search (for 'Genogator'). I don't think that's how Google was supposed to work! My plan was basically to try and get sensible links to it, rather than spamming everywhere. |
|
|
One other thing I did recently was to directly tell Google about the site - there's a form for that. But it probably knew already since there's the odd other link around. |
|
|
My understanding about metadata is that this is largely disregarded by Google due to abuse. But I should try them anyway. The other thing is that there isn't much text on the main page. But the Halfbakery does seem to get by with less. At some stage I'll try to add a bit more explanation of the tool to the main page. |
|
|
[1] This benefit did occur to me, but I decided that just posting a link would be exploitative. So it wasn't until this idea developed that it seemed reasonable, as part of the explanation. I hope no one feels that this was abusive. |
|
|
Add another link to MetaFilter Projects {Linky}. |
|
|
There are sites which register the information across several search engines - I wouldn't bother - I certainly wouldn't pay to do it. Consider the following: |
|
|
Think of links to your site as recommendations.
HB and MeFi Projects are general public links which will boost a search engine's confidence, but won't necessarily help people find you - A link from an august genetics journal would be a big boost... but a link on a page linked by that journal would be a good place to start.
The more reputation the linking site has (especially in your subject-domain), the more rubs off on your site! They're saying - "This guy's got something interesting to say." You generally can't say that sort of thing yourself!... and a general list of random links (HB/MeFi) isn't really very reputable... - but it'll get you noticed, and on the bots' lists. |
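The 'reputation rubs off' point can be illustrated with a simplified PageRank-style calculation (see the PageRank link). This is a textbook sketch in Python, not a claim about how Google actually ranks pages; the page names and the link graph are invented.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank by repeated redistribution of rank along links."""
    pages = sorted(set(links) | {t for ts in links.values() for t in ts})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page gets a small baseline share regardless of links.
        new = {p: (1.0 - damping) / n for p in pages}
        for p in pages:
            targets = links.get(p, [])
            if targets:
                # A page splits its rank equally among the pages it links to.
                share = damping * rank[p] / len(targets)
                for t in targets:
                    new[t] += share
            else:
                # A page with no outgoing links spreads its rank evenly.
                for t in pages:
                    new[t] += damping * rank[p] / n
        rank = new
    return rank


# Invented graph: three readers link to a journal, which links to tool_a.
# tool_b is linked only from a list page that nobody links to.
links = {
    "reader1": ["journal"],
    "reader2": ["journal"],
    "reader3": ["journal"],
    "journal": ["tool_a"],
    "link_list": ["tool_b"],
}
rank = pagerank(links)
print(sorted(rank, key=rank.get, reverse=True))
```

Both tools have exactly one inbound link, but in this toy graph the one recommended by the well-linked journal ends up with the higher score.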
|
|
Think of links from your site as confirmatory...
{Notices some nice links to JavaScript and Perl but none to EMBL's WIKI entry! The one to Taverna's probly a good'un}
Adding links to "like" sites (e.g. a Wiki article about what it is you're doing, or at least the general subject) should help. When you've built enough reputation, new links from your site will carry more weight. |
|
|
Metadata's not abuse! Like all good things, it *can* be abused. It ought to be true data about data.
Search engines can be sceptical. They look for signs of abuse - links in teeny-tiny fonts - fonts using the same colour as the background colour! But I seriously don't think they ignore keywords! |
|
|
Use simple keywords that someone who'd never heard of your project, and therefore isn't aware of its name, might be looking for. It takes some careful consideration to get this right. You have to have two hats on - one describing what it is you do, and the other trying to guess what people might think of searching for to find you. If you have a unique name, that should appear in there too, but you also need to include words and terms which skirt around the subject... think of the keywords as leaving a trail of breadcrumbs to your site - i.e. what might someone be looking for? It's a fine art - imagine what someone who's not really sure what they're looking for, but knows some nebulous words, terms and phrases, might type - and include some specific subject terms that it's unlikely anyone outside your sphere of interest would use. |
|
|
BTW, I don't think search engines are broken... If everyone parsed their sites and were presented with a set of keywords whose weighting could be adjusted and regurgitated as HTML Metadata that'd be handy.
Easy enough to do, probably, using some kind of histogram, I suppose... (Spellcheck, throw away common glue-words, emphasise the unusual) |
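Something like the following Python sketch, perhaps. The glue-word list, the 'double weight for unusual words' rule and the function names are all assumptions for illustration, and the spellcheck step is left out.

```python
import re
from collections import Counter
from html import escape

# Assumed, deliberately tiny list of glue-words; a real list would be longer.
GLUE_WORDS = {"a", "an", "and", "for", "from", "in", "of", "the", "to", "with"}


def suggest_keywords(text, common_words=frozenset(), top_n=5):
    """Build a word histogram and rank words, favouring the unusual ones."""
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(w for w in words if w not in GLUE_WORDS)
    # Emphasise the unusual: words not on the common list count double.
    weighted = {w: c * (1 if w in common_words else 2) for w, c in counts.items()}
    # Highest weight first; alphabetical order breaks ties.
    ranked = sorted(weighted, key=lambda w: (-weighted[w], w))
    return ranked[:top_n]


def meta_tag(keywords):
    """Regurgitate the chosen keywords as an HTML keywords meta tag."""
    return '<meta name="keywords" content="%s">' % escape(", ".join(keywords))


text = "Genogator creates sequence diagrams from annotated sequence files"
keywords = suggest_keywords(text, common_words={"creates", "files"}, top_n=3)
tag = meta_tag(keywords)
print(tag)
```

The site owner would then adjust the suggested list by hand before pasting the tag into the page, which is the 'weighting could be adjusted' part.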
|
|
I had hoped this would be along the lines of 'I give Google ten sites I visited, and Google tells me what I was looking for' - but your idea seems to be to have Google lay down its arms in the war against spam-sites. Honest people would be able to promote their sites, but what about the dishonest and insane? |
|
|
The metadata in html once was intended to facilitate just the thing you want - the creators of a site could enter the words they thought described the content of their site best, and searching for some of those words would bring up those sites. |
|
|
Result was page-long metadata tags that contained every word anyone had ever searched for, including every permutation of the letters 'nopr'... |
|
|
But if your tool is good, just advertise it to some labs you know, and they can put it into their 'links' section, possibly even with a short intro. That will boost your rating fine. |
|
|
I think you misunderstand, loonquawl. All this would do is tell you what hypothetical searches your site would come at the top of. You know, for fun.
If you have target searches in mind it's much more straightforward to just google 'em and see what you get, this wouldn't help you there. |
|
|
What you get (from inverse googling) might be quite unexpected - like that anecdote I mention above in the comments.
Would you expect the halfbakery to be first on the web for 'communal database inventions'? |
|
|
Regarding the last three paragraphs, if you read the prior comments you'll see that I know perfectly well what happened with keyword metadata.
I'm already trying to get appropriate links. It isn't quite as easy as you make out though. |
|
|
by the way, put the name of your product on Halfbakery
with the search terms you want and it will show up close to
the top. I don't know why that is, but it's a fact. Just saying.
I know that has nothing to do with your idea, but has
everything to do with the beginning of your pitch. |
|
|
Say you invented Genogator - create genetic sequence
diagrams from annotated sequence files. DNA sequence to
diagram. Display genomic annotation graphically with
Genogator. Now let's see what happens when I search for
genetic sequence to diagram... (Currently Genogator is not
showing up on first two pages. Let's check back tomorrow) |
|
|
//Currently Genogator is not showing up on first two
pages.// |
|
|
Genogator has been off-line for quite some time now.
Keeping a server serving a web-service for no money or
kudos over long periods turns out not to be so easy. |
|
|
I still have the source-code, though. |
|
|
^ If you have a good service, I'm sure there will be like minds, with the equipment, to share the burden and make it a delight. It's all internetting. |
|
| |