November 13, 2007
This week I saw a presentation given by a member of the Yahoo!/Berkeley research team.
At the talk, Dr. Naaman demoed this unassuming tool that his group has been working on:
I am really glad I went to the talk, since the demo helped me understand how sophisticated this tool really is. I had a definite ah-ha moment learning about all the new flavors of semantic information soon to be mined from the massive amounts of memories we are collectively recording.
During the talk I was reminded of this recent essay on Evolution and the Wisdom of the Crowds, which explains how counter-intuitive these emergent properties are to our everyday experience. But this seemingly teleological construction of semantic knowledge naturally emerges from a rich enough system, as the Flickr research demonstrates.
To clarify what you are looking at here, no humans tuned or trained the system to teach it which are the significant landmarks in these regions. The representation is computed using the aggregate processing of many, many tags. These tags are starting to provide enough information to disambiguate different senses of a word (based on the adjacent tags that are also present). Patterns are also discernible from the spatial-temporal information on these photos, and yearly events (e.g. BYOBW) have been detected and recognized by the system. Formerly unanswerable questions, like “What are the boundaries of the Lower East Side?”, now have a fuzzy answer of a sort, in the form of collective voting.
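To make the "no humans tuned it" point a little more concrete, here is a toy sketch of how a representative tag for a region might fall out of nothing but aggregate tag counts. This is my own illustration, not the research team's actual algorithm: the grid bucketing, the TF-IDF-style scoring, the function names, and the sample photos are all assumptions made up for the example.

```python
import math
from collections import Counter, defaultdict


def grid_cell(lat, lon, size=0.01):
    """Snap a coordinate to a coarse grid cell (size in degrees). Hypothetical helper."""
    return (math.floor(lat / size), math.floor(lon / size))


def representative_tags(photos, size=0.01, top_n=1):
    """Rank tags per grid cell with a TF-IDF-style score.

    A tag scores high when many photos in a cell carry it (term frequency)
    and few other cells use it (inverse cell frequency). This is an assumed
    stand-in for whatever scoring the real system uses.
    """
    cell_tags = defaultdict(Counter)
    for lat, lon, tags in photos:
        # set() so a tag repeated on one photo only counts once
        cell_tags[grid_cell(lat, lon, size)].update(set(tags))

    n_cells = len(cell_tags)
    cells_with_tag = Counter()
    for counts in cell_tags.values():
        cells_with_tag.update(counts.keys())

    result = {}
    for cell, counts in cell_tags.items():
        scored = {
            tag: tf * math.log((1 + n_cells) / cells_with_tag[tag])
            for tag, tf in counts.items()
        }
        # Highest score first; ties broken alphabetically for stable output
        ranked = sorted(scored, key=lambda t: (-scored[t], t))
        result[cell] = ranked[:top_n]
    return result


# Made-up sample data: (latitude, longitude, tags)
photos = [
    (40.7061, -73.9969, ["nyc", "brooklynbridge"]),
    (40.7058, -73.9962, ["brooklynbridge", "bridge"]),
    (40.7064, -73.9971, ["nyc", "brooklynbridge", "sunset"]),
    (40.7484, -73.9857, ["nyc", "empirestate"]),
    (40.7486, -73.9853, ["empirestate", "nyc", "skyline"]),
]

tops = representative_tags(photos)
for cell, tags in sorted(tops.items()):
    print(cell, tags)
```

Notice that the generic "nyc" tag loses out in both cells because it shows up everywhere, while the landmark names win locally. Nobody told the code which tags name landmarks; the ranking comes entirely from where the tags cluster.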
While the UI work here is neat, it pales in comparison to this jaw-dropping Photosynth demo presented at TED this year (though it does beat the pants off the current UI of pink dots on a map, which forces you to paginate over all the matching pictures in batches of 20). The widget is even available as a web service which you can feed your own data into.
But the real work here is going on behind the scenes. It’s being published and presented in CS contexts, just in case anyone thought this “social media” stuff was just for kids.
There is certainly lots to digest here. It’s one thing for an algorithm to decide on the most representative photographs of the Brooklyn Bridge essentially based on popularity (though it’s a shame that avant-garde art photos will be automatically marginalized through this technique), but it’s quite another to imagine other important areas of discourse being regressed to the mean. It’s an odd sort of leveling effect that is likely another manifestation of Jaron Lanier’s Digital Maoism.
The presenter did note that social media designers do need to anticipate feedback effects, as when they launch a new tool and users adjust to the new conditions and modify their behavior accordingly (or begin to “game” the system to take advantage of it).
We are a long way from 1960s AI and its conviction that the world is best modeled and represented as a series of explicit propositions.