MozillaZine

Proposals for Incorporating Machine Learning in Mozilla Firefox

Friday June 18th, 2004

Blake Ross writes: "I will be doing research this summer at Stanford with Professor Andrew Ng about how we can incorporate machine learning into Firefox. We're looking for ideas that will make Firefox 2.0 blow every other browser out of the water. People who come up with the best 3-5 ideas win Gmail accounts, and if we implement your idea you'll be acknowledged in both our paper and in Firefox credits. Your idea will also be appreciated by the millions of people who use Firefox :-). We'll also entertain Thunderbird proposals."


#94 Community learning

by pa_bryan

Thursday June 24th, 2004 6:02 PM


My idea is for a community-based learning system. It's more of a framework than anything else, though. Servers are set up to store browsing habits. You configure your browser (Firefox, of course!) to use a particular server (e.g. community.mozilla.org). Then, as you browse, it updates the server with your browsing habits. For example: while looking at pages about apples, I also looked at pages about oranges.

If a lot of people use this system, it will learn from a whole variety of users with different tastes. That's all well and good, but aren't you then at the mercy of everyone else's browsing habits? Not so! The system could also store your habits locally, and only refer to the community site when needed. For example, one day I browse around looking for sites about bananas. I've never done this before. It just so happens that other people have already done this, and the community site has browsing habits based on bananas. Hey wow, look at that, it tells me that a nasty fungus that affects frogs is being spread all around Australia, because infected frogs are hopping into banana boxes that are subsequently shipped across the country.

Now, imagine another day, I'm browsing pages about apples. Remember, I also like to look at pages about oranges at the same time. The community site, however, relates apples to a computer company instead (based on other people's browsing habits). That's okay, my local system knows to show me information about blood oranges rather than the new G5s. Maybe sites about using Macs to build computer models of the properties of citric acid would be ranked higher ;-)
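To make the apples/oranges and bananas cases concrete, here's a rough sketch in Python of the "local first, community as fallback" lookup. Everything in it is my own assumption for illustration: the HabitStore class, the suggest function, and the idea of counting topics that show up in the same browsing session. It is not a real Firefox API.

```python
from collections import Counter, defaultdict
from itertools import permutations


class HabitStore:
    """Hypothetical store: counts topics browsed in the same session."""

    def __init__(self):
        self.related = defaultdict(Counter)

    def record_session(self, topics):
        # Every ordered pair of distinct topics in a session counts once.
        for a, b in permutations(set(topics), 2):
            self.related[a][b] += 1

    def suggestions(self, topic, limit=3):
        counts = self.related.get(topic, Counter())
        # Highest count first; ties broken alphabetically for stable output.
        ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
        return [name for name, _ in ranked[:limit]]


def suggest(topic, local, community, limit=3):
    """Prefer the user's own habits; fill any gaps from the community."""
    picks = local.suggestions(topic, limit)
    for name in community.suggestions(topic, limit):
        if len(picks) >= limit:
            break
        if name not in picks:
            picks.append(name)
    return picks


local = HabitStore()
local.record_session(["apples", "oranges"])

community = HabitStore()
community.record_session(["apples", "macs"])
community.record_session(["bananas", "frog fungus"])

print(suggest("apples", local, community))   # ['oranges', 'macs']
print(suggest("bananas", local, community))  # ['frog fungus']
```

My own habits come first for apples, and the community only fills in for bananas, where I have no history of my own.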

Like I said, my idea is more about the framework. The actual mechanics of how the system learns could incorporate many of the suggestions already posted (e.g. Bayesian-based learning). Perhaps a sidebar could be used that lists sites relevant to what you've been looking at recently. It could update as you browse, and tune itself as you follow (or don't follow) its suggestions. Maybe it does Google searches on your behalf and shows the results in the sidebar.
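One simple way the sidebar could tune itself, sketched in Python: score each suggested site by how often you actually followed it, smoothed so that a site you've never been shown starts in the middle. This is just one possible Bayesian-flavoured choice (the posterior mean of a follow rate under a uniform prior); the SidebarRanker name and the example sites are made up for illustration.

```python
from collections import Counter


class SidebarRanker:
    """Hypothetical ranker: learns from followed vs. ignored suggestions."""

    def __init__(self):
        self.shown = Counter()
        self.followed = Counter()

    def record(self, site, followed):
        # Call once each time a suggestion is displayed in the sidebar.
        self.shown[site] += 1
        if followed:
            self.followed[site] += 1

    def score(self, site):
        # Smoothed follow rate: (follows + 1) / (times shown + 2).
        # A site with no history scores 0.5.
        return (self.followed[site] + 1) / (self.shown[site] + 2)

    def rank(self, sites):
        # Best score first; ties broken alphabetically.
        return sorted(sites, key=lambda s: (-self.score(s), s))


ranker = SidebarRanker()
for _ in range(3):
    ranker.record("oranges.example", followed=True)
    ranker.record("g5.example", followed=False)

print(ranker.rank(["g5.example", "new.example", "oranges.example"]))
# ['oranges.example', 'new.example', 'g5.example']
```

Sites you keep ignoring sink below ones the sidebar hasn't tried yet, so new suggestions still get a chance.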

Privacy is a concern with my idea. I certainly wouldn't be using a server that's run by certain large companies. All habits stored would have to be anonymous. Perhaps "open" sites would be trustworthy. By open, I mean ones that let anyone see the information stored at any time, have a clear privacy policy, etc. All data should be sent in clear text so you can monitor the traffic your browser is sending. Perhaps XML would be a good choice for the transfer of data between clients and the server.
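As a rough idea of what such a readable, anonymous XML message might look like, here's a small Python sketch using the standard library's xml.etree.ElementTree. The element and attribute names (habits, pair, topic, related, count) are invented for this example, not a proposed standard. Note the message carries no user name, ID, or URL history, only topic pairs and counts.

```python
import xml.etree.ElementTree as ET


def habits_to_xml(pairs):
    """Serialize {(topic, related): count} as a plain-text XML message."""
    root = ET.Element("habits")  # hypothetical element name
    for (topic, related), count in sorted(pairs.items()):
        ET.SubElement(root, "pair",
                      topic=topic, related=related, count=str(count))
    return ET.tostring(root, encoding="unicode")


def xml_to_habits(text):
    """Parse the message back, as the server (or a curious user) would."""
    root = ET.fromstring(text)
    return {(p.get("topic"), p.get("related")): int(p.get("count"))
            for p in root.findall("pair")}


message = habits_to_xml({("apples", "oranges"): 2})
print(message)
print(xml_to_habits(message))
```

Because the message is plain text, anyone watching their own traffic can read exactly what is being sent.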

There are also bandwidth concerns, and possibly storage concerns. Bandwidth has been discussed already (see the pre-fetching posts, for example), and storage costs are low enough that it's probably not a problem to store the habits of millions of people - after all, it's only meta-data... how much could there be?