MozillaZine

Proposals for Incorporating Machine Learning in Mozilla Firefox

Friday June 18th, 2004

Blake Ross writes: "I will be doing research this summer at Stanford with Professor Andrew Ng about how we can incorporate machine learning into Firefox. We're looking for ideas that will make Firefox 2.0 blow every other browser out of the water. People who come up with the best 3-5 ideas win Gmail accounts, and if we implement your idea you'll be acknowledged in both our paper and in Firefox credits. Your idea will also be appreciated by the millions of people who use Firefox :-). We'll also entertain Thunderbird proposals."


#95 Re: URL guessing

by Ashmodai <ashmodai@mushroom-cloud.com>

Friday June 25th, 2004 6:00 AM

I think that'd be a good improvement for Firefox's error pages. A bit like the corrections Google suggests:

"The website <http://www.mozillazine.ogr> could not be found, did you mean <http://www.mozillazine.org>?"

An improvement would be to check automatically whether the alternatives exist and then display only those that actually do (if you enter xyz.ogr and xyz.org doesn't exist, it shouldn't be suggested -- maybe offer xyz.com or xyz.net as alternatives?).
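A rough sketch of that filtering step, in Python for brevity. The names `resolves` and `existing_alternatives` are made up for illustration, and using a plain DNS lookup as the existence check is only an assumption; the check is passed in as a parameter so any other lookup could be swapped in.

```python
import socket
from typing import Callable, Iterable, List


def resolves(hostname: str) -> bool:
    """Assumed existence check: True if the system resolver finds the name."""
    try:
        socket.getaddrinfo(hostname, None)
        return True
    except socket.gaierror:
        return False


def existing_alternatives(
    hostname: str,
    tlds: Iterable[str],
    exists: Callable[[str], bool] = resolves,
) -> List[str]:
    """Swap the TLD of `hostname` for each candidate and keep those that exist."""
    name, dot, _old_tld = hostname.rpartition(".")
    if not dot:
        # No dot means no TLD to replace, so there is nothing to suggest.
        return []
    candidates = [f"{name}.{tld}" for tld in tlds]
    return [candidate for candidate in candidates if exists(candidate)]
```

With a check that only knows xyz.com and xyz.net, `existing_alternatives("xyz.ogr", ["org", "com", "net"], exists=...)` would drop xyz.org and keep the other two, in the order the candidates were given.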

Based on the user's location and language preferences, a lookup for relevant alpha-2 country code TLDs could be made too: en-US would trigger a .us search, en-CA would trigger a .ca search, DE would trigger a .de (as would de-DE), .at (as would de-AT) and maybe .ch (Switzerland, as would de-CH) search, EN would trigger a search for all English-speaking countries, and so on. So, if the user specifies en-CA as well as EN as language preferences, only English-speaking country codes would be searched; if she only specified en-CA, only the Canadian alpha-2 code would be searched. I have no idea how to cover multinational country codes tho -- although .eu seems to be the only extension which is multinational right now. If no language preference is set, alpha-2 country code TLDs would not be searched at all (except for the next situation -- read on).
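Something like the following could do the mapping. It is only a sketch: the tables are hypothetical and deliberately incomplete, and `cctlds_for_preferences` is a made-up name. One wrinkle worth noting is that the United Kingdom's alpha-2 code is GB while its TLD is .uk, so a small override table is assumed for cases like that.

```python
from typing import Iterable, List

# Hypothetical language -> ccTLD table, incomplete on purpose.
LANGUAGE_TLDS = {
    "en": ["us", "uk", "ca", "au", "nz", "ie"],
    "de": ["de", "at", "ch"],
}

# Region codes whose ccTLD differs from the alpha-2 country code.
REGION_OVERRIDES = {"gb": "uk"}

# Stand-in for a real list of existing ccTLDs.
KNOWN_CCTLDS = {"us", "uk", "ca", "au", "nz", "ie", "de", "at", "ch"}


def cctlds_for_preferences(prefs: Iterable[str]) -> List[str]:
    """Turn language preferences such as 'en-CA' or 'de' into ccTLDs to try."""
    result: List[str] = []
    for tag in prefs:
        lang, _, region = tag.lower().partition("-")
        if region:
            # A language-region tag only triggers that one country's TLD.
            candidates = [REGION_OVERRIDES.get(region, region)]
        else:
            # A bare language triggers every country listed for it.
            candidates = LANGUAGE_TLDS.get(lang, [])
        for tld in candidates:
            # Unknown codes are skipped silently; duplicates are dropped.
            if tld in KNOWN_CCTLDS and tld not in result:
                result.append(tld)
    return result
```

Here `["en-CA"]` gives just `["ca"]`, `["de"]` gives `["de", "at", "ch"]`, and an empty preference list gives no country TLDs at all.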

A Levenshtein check could work wonders as well. If the specified TLD extension is non-existent (ogr, for example), the most similar existing extensions (i.e. those with the lowest Levenshtein distance) could be suggested ("xyz.nx was not found. Did you mean xyz.nu, xyz.nz, [..] ?").
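As a minimal sketch of that check: the TLD list below is a tiny hypothetical sample (a real one would come from IANA), and `suggest_tlds` with its `max_distance` cutoff is an illustrative name and parameter, not anything Firefox provides.

```python
from typing import List, Sequence


def levenshtein(a: str, b: str) -> int:
    """Edit distance counting insertions, deletions and substitutions."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute (or match)
            ))
        previous = current
    return previous[-1]


# Hypothetical, deliberately tiny sample of existing TLDs.
KNOWN_TLDS = ["com", "net", "org", "nu", "nz", "de", "us"]


def suggest_tlds(tld: str, known: Sequence[str] = KNOWN_TLDS,
                 max_distance: int = 2) -> List[str]:
    """Return the known TLDs closest to an unknown one, if close enough."""
    tld = tld.lower()
    if not known or tld in known:
        return []
    scored = sorted((levenshtein(tld, k), k) for k in known)
    best = scored[0][0]
    if best > max_distance:
        return []
    # Keep only the candidates tied for the lowest distance.
    return [k for distance, k in scored if distance == best]
```

With this sample list, "nx" yields nu and nz (distance 1 each) and "ogr" yields org. Note that plain Levenshtein counts a swap of two adjacent letters as two edits, so ogr/org has distance 2; that is why the cutoff in this sketch defaults to 2 rather than 1.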

All that would be needed would be a list of existing TLD extensions along with a language->countries lookup functionality. If a language, country or non-country TLD extension is not recognised, it should be ignored, but not trigger an error, as new valid extensions might be defined by IANA in the future.

A lookup using something like WHOIS to find out whether the site exists at all would probably work better than trying to connect to a domain. The error page would otherwise take too long to load, I guess.

The user should be able to choose whether he wants to use this feature at all and whether he wants it to check country TLDs as well, only do Levenshtein checks, etc. Power users might want to disable it to speed up their browsing.