Flickr’s Interestingness search is becoming worse each day. More and more, attempting to use keywords to find photos results in a frustrating number of false positives. Keyword pollution is becoming a serious problem.
What is “Keyword Pollution”?
Dan Heller defines keyword pollution as “…a term I use to describe an image whose keywords are such that the number of false positives causes the searcher to get exasperated and quit.” He has written an extensive dissertation on keywording and how it relates to the future of stock photography. I think Dan’s definition is pretty accurate and I really like some of his in-depth observations.
You can think of it like the videos you see on YouTube tagged “south park” or “stephen colbert” that have nothing to do with either. People do that to make their videos show up in the search results for popular topics. It was a common tactic among internet spam sites back in the day.
I find it especially frustrating when I’m trying to find photos of camera equipment for this blog. People tend to tag their photos with the camera they used, despite the fact that it’s already listed in the EXIF data. This means it’s hard to find pictures of a Canon digital camera instead of ones taken with a Canon digital camera.
Popular blogger Darren Barefoot touched on Flickr keyword pollution recently as well, directing his wrath at the photos people take of what’s in their bags. They make it hard for him to find pictures of individual objects instead of big piles of them. Another example came up when I was searching for CC-licensed photos of weddings. Of the 24 photos on the first page for “wedding”, 8 were of wedding cakes, all from the same account. On the second page, the numbers went up: 14 of the 24 photos were cakes. I find it frustrating to waste my time sifting through unrelated photos in order to find what I actually want.
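To put rough numbers on that wedding search, here is a quick back-of-the-envelope calculation. The counts are the ones I saw on those two result pages; the function name is just something I made up for illustration.

```python
def off_target_share(off_target: int, total: int) -> float:
    """Fraction of results on a page that were not what I was looking for."""
    return off_target / total

# (cake photos, total photos) for each page of the "wedding" search
pages = {"page 1": (8, 24), "page 2": (14, 24)}

for name, (cakes, total) in pages.items():
    print(f"{name}: {off_target_share(cakes, total):.0%} cakes")

# Both pages combined
all_cakes = sum(cakes for cakes, _ in pages.values())
all_photos = sum(total for _, total in pages.values())
print(f"overall: {off_target_share(all_cakes, all_photos):.0%} cakes")
# prints:
# page 1: 33% cakes
# page 2: 58% cakes
# overall: 46% cakes
```

So nearly half of the first 48 results were cakes from a single account.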
Why Does Keyword Pollution Happen on Flickr?
The short version is that Flickr’s “interestingness” algorithm is easy to cheat. The formula is based on activity: how many people view, comment on, add tags to, and favourite a photo is often what pushes it up in the search. People will often artificially inflate this by submitting their photos to communities like Delete Me and Score Me, which generate a lot of activity even if the photo is terrible.
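Flickr hasn’t published the real formula, so treat the following as a toy sketch of the general idea and not as how Flickr actually works. The `Photo` fields, the weights, and the two example photos are all invented for illustration. It just shows why any score built purely on activity can be gamed.

```python
from dataclasses import dataclass

@dataclass
class Photo:
    title: str
    views: int
    comments: int
    tags_added: int
    favourites: int

# Hypothetical weights: NOT Flickr's real values, which are not public.
WEIGHTS = {"views": 1, "comments": 10, "tags_added": 5, "favourites": 20}

def activity_score(photo: Photo) -> int:
    """Toy activity-only score: a weighted sum of interactions."""
    return (
        WEIGHTS["views"] * photo.views
        + WEIGHTS["comments"] * photo.comments
        + WEIGHTS["tags_added"] * photo.tags_added
        + WEIGHTS["favourites"] * photo.favourites
    )

def rank(photos: list[Photo]) -> list[Photo]:
    """Order photos from highest to lowest activity score."""
    return sorted(photos, key=activity_score, reverse=True)

# A relevant photo with modest, organic activity.
relevant = Photo("Canon camera on a desk", views=300, comments=4,
                 tags_added=1, favourites=6)

# An off-topic photo pushed through critique groups to rack up comments.
gamed = Photo("Blurry cake tagged 'canon'", views=250, comments=60,
              tags_added=15, favourites=10)

for photo in rank([relevant, gamed]):
    print(activity_score(photo), photo.title)
```

With these made-up numbers the off-topic photo scores 1125 against 465 for the relevant one, so it lands first. Nothing in the score looks at whether the tags actually describe the picture.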
Unfortunately, there’s nothing that can be done to prevent this. The downside of anyone being able to edit and/or tag stuff is that there’s no governing organizational system. The practice of inputting incorrect keywords isn’t evil or anything; it just makes life frustrating for those of us trying to find legitimate content.
I’m predicting that one of the next big things across the board in Web 2.0 will be content moderators to help streamline the user experience by weeding out these false positives. Netscape has already implemented this by taking user-submitted news and having Netscape Anchors, like Muhammad Saleem, help moderate the content.
Is all of this a big deal? No. It’s a minor annoyance that I don’t enjoy dealing with, like flyers in the mail or bureaucratic red tape. But without a cohesive system to govern tagging, it gradually becomes more and more frustrating to find what you’re looking for.
Now if only Google had bought Flickr…