halfbakery: A dish best served not.
How do you associate the link/address with a name?
For example, it would be easy to write a program that pulls out all text matching a pattern along the lines of *.*@*.* (that's closer to a shell glob than a real regular expression, and I'm sure there are better ways of writing it, but I can't be bothered right now), probably fetching the required webpages with something akin to 'curl'. But linking these scraped addresses to real, meaningful names (as you'd expect in an address book) is probably quite tricky.
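As a rough illustration of the scraping half of that annotation, here is a minimal Python sketch. The pattern is an assumption standing in for the glob-like `*.*@*.*`: it is a loose heuristic, not a full email grammar, and the function name `extract_emails` is made up for this example. Fetching the page (with curl or anything else) is left out; the function only looks at text you already have.

```python
import re

# A deliberately loose pattern: local part, "@", then a dotted domain.
# This is a heuristic, not a complete email address grammar.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+")

def extract_emails(page_text):
    """Return the unique email-like strings in page_text, in order of first appearance."""
    seen = []
    for match in EMAIL_RE.findall(page_text):
        if match not in seen:
            seen.append(match)
    return seen
```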
How are you going to do that bit?
(I suppose you could extract all instances of word pairs that are Both Capitalised, but this might cause problems with Double-Barrelled names, or places like Leigh on Sea.)
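The capitalised-pair idea can be sketched in a few lines of Python. The pattern and the name `candidate_names` are assumptions for illustration only. Hyphens are allowed inside a word so a double-barrelled surname counts as one word, but the heuristic still trips over sentence-initial words ("Contact Jane Smith" yields "Contact Jane") and cannot tell a person from a place.

```python
import re

# One capitalised word, optionally hyphenated ("Parker-Bowles").
WORD = r"[A-Z][a-z]+(?:-[A-Z][a-z]+)*"
# Two such words separated by whitespace.
NAME_PAIR_RE = re.compile(r"\b" + WORD + r"\s+" + WORD + r"\b")

def candidate_names(text):
    """Return adjacent capitalised word pairs; these are only name candidates."""
    return NAME_PAIR_RE.findall(text)
```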
I suspect it would be pretty complex to automate, but perhaps the idea of making a .vcf file available by link could become a standard web practice. I'll play with it at work today.
Phoenix, I think that is already being done quite a bit.
The converter is cute. I'd enjoy writing something like that. You'd have a set of rules that say how an address is usually written, perhaps locale-specific, and you'd detect the locale based on embedded text or server geography.
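A minimal sketch of such a rule-driven converter, under heavy assumptions: the rule table `ADDRESS_RULES`, the single simplified US-style rule, and the function `to_vcard` are all hypothetical, and locale detection is not attempted (the locale is simply passed in). The output follows the vCard 4.0 layout, where ADR has seven semicolon-separated components (PO box, extended address, street, locality, region, postal code, country).

```python
import re

# Hypothetical rule table: one regex per locale describing how an address
# is usually written. Only a simplified US-style rule is sketched here.
ADDRESS_RULES = {
    "en_US": re.compile(
        r"(?P<street>\d+ [^,\n]+),\s*(?P<city>[^,\n]+),\s*"
        r"(?P<region>[A-Z]{2})\s+(?P<postcode>\d{5})"
    ),
}

def escape(value):
    """Escape characters that have special meaning in vCard text values."""
    return (value.replace("\\", "\\\\")
                 .replace(";", "\\;")
                 .replace(",", "\\,")
                 .replace("\n", "\\n"))

def to_vcard(name, text, locale="en_US"):
    """Return a vCard 4.0 string for the first address found in text, or None."""
    match = ADDRESS_RULES[locale].search(text)
    if match is None:
        return None
    # ADR components: PO box; extended; street; locality; region; postcode; country
    adr = ";".join(["", "", escape(match["street"]), escape(match["city"]),
                    match["region"], match["postcode"], ""])
    lines = ["BEGIN:VCARD", "VERSION:4.0", "FN:" + escape(name),
             "ADR;TYPE=work:" + adr, "END:VCARD"]
    return "\r\n".join(lines) + "\r\n"
```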
One of the relatively benign business models could be to make the rules extensible, get really good at it, and then sell an enterprise version to people who have to detect addresses for other purposes.
The evil business model would be to store all the addresses and then send spam to them.
The default business model is to host ads on the user interface page that you paste the URL into, and have local, location-specific advertising on the result page that you get the vcf link from. (One more hop than strictly necessary.)
Right?
You could remember which .vcf file originated from which page and then monitor the page for changes and sell some sort of notification/automatic address book updating service.
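One simple way to do the monitoring part is to keep a digest per source URL and compare it on each visit. The class `PageWatcher` below is a hypothetical sketch, not a description of any existing service. It hashes whatever text it is given, so in practice you would probably feed it the extracted address rather than the whole page, since ads and timestamps change constantly.

```python
import hashlib

class PageWatcher:
    """Remember a digest per URL and report when the watched text changes."""

    def __init__(self):
        self._digests = {}  # URL -> SHA-256 hex digest of the last text seen

    def has_changed(self, url, text):
        """Record text for url; return True only if it differs from the last visit."""
        digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
        previous = self._digests.get(url)
        self._digests[url] = digest
        # The first visit has nothing to compare against, so it is not a change.
        return previous is not None and previous != digest
```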
In retrospect, many employment web sites will scrape your uploaded resume for phone and address information. It's not always perfect but it's pretty good, so the code already exists, at least in part.
It will be more difficult if, say, the address is formatted using tables and the various elements are in separate TD tags. A single page with multiple addresses would also be difficult to automate.
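For the table case, one possible workaround is to flatten each table row back into a single line of text before applying any address rules. The sketch below uses Python's standard `html.parser`; the class and function names are invented for this example, and it assumes each address sits in its own row, which real pages will often violate. Returning one string per row also gives a crude way to keep multiple addresses on a page apart.

```python
from html.parser import HTMLParser

class RowTextExtractor(HTMLParser):
    """Collect the text of each table row, joining its cells with ', '."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._cells = None      # cells of the row being read, or None outside a row
        self._in_cell = False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._cells = []
            self._in_cell = False
        elif tag in ("td", "th") and self._cells is not None:
            self._cells.append("")
            self._in_cell = True

    def handle_endtag(self, tag):
        if tag in ("td", "th"):
            self._in_cell = False
        elif tag == "tr" and self._cells is not None:
            cells = [c.strip() for c in self._cells if c.strip()]
            if cells:
                self.rows.append(", ".join(cells))
            self._cells = None
            self._in_cell = False

    def handle_data(self, data):
        if self._in_cell:
            self._cells[-1] += data

def table_rows_as_text(html):
    """Return one flattened text line per non-empty table row in html."""
    parser = RowTextExtractor()
    parser.feed(html)
    parser.close()
    return parser.rows
```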
[jutta] You may be right - that's not the sort of thing I traditionally look for on a web site.
The standard business model wouldn't just post ads local to the scraped address, it would try to find similar businesses as well. If I try to scrape a page for the address of an auto parts store, show me ads for its competitors as well... maybe at a premium cost to the competitor.