Is the internet really polluted by blogs?
Mega Man X recently brought up a very valid question about blogs in the LinuxQuestions.org forums. Here's what he has to say on the subject:
But what is annoying me the most today, are indeed the blogs. I was just searching the web for some Ubuntu backports for Breezy and I hit a blog with a guy talking about Chuck Norris. The whole thing is a mess (not that the Internet has ever been organized either)
I think somehow soon we will need an advanced search option on google to bypass useless blogs, if that's not possible already. Does blogs annoy anyone else? Or even worse... do you have a blog? If so, why?. Is it stuff worth reading or is it just like a diary? Because I believe peoples browsing the web could care less that your dog or cat is sick, what you've ate at school and what you did last summer...
Source: LinuxQuestions.org forums
I must say that the concerns raised are quite valid, although I beg to differ on the severity of the problem. The vast network of interlinked blogs has certainly gained a degree of visibility on many search engines. And I have to accept that a large number of people are increasingly annoyed by the growing number of online diaries and journals that are sprouting up like wild mushrooms by the day. The sheer number of blogs renders most search engines helpless in filtering them out completely. It's not merely a question of blocking all blogspot.com addresses, for instance, because the problem goes way beyond that.
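To make that last point concrete, here is a minimal sketch of what "blocking all blogspot.com addresses" would amount to if you post-filtered a list of result URLs yourself. The URLs and the function names (`is_blocked`, `filter_results`) are made up for illustration; this is not how any search engine actually works.

```python
from urllib.parse import urlparse

def is_blocked(url, blocked_domains):
    """Return True if the URL's host is a blocked domain or a subdomain of one."""
    host = (urlparse(url).hostname or "").lower()
    return any(host == d or host.endswith("." + d) for d in blocked_domains)

def filter_results(urls, blocked_domains=("blogspot.com",)):
    """Drop every result hosted on one of the blocked domains."""
    return [u for u in urls if not is_blocked(u, blocked_domains)]

# Hypothetical search results, invented for this example.
results = [
    "http://someblog.blogspot.com/2005/07/my-cat-is-sick.html",
    "http://example.org/ubuntu-backports",
    "http://my-personal-diary.example.net/chuck-norris",
]

for url in filter_results(results):
    print(url)
```

The blogspot.com entry is removed, but the personal diary hosted on its own domain passes straight through. A domain blocklist only catches blogs that happen to live on a known blogging host.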
As end users of the vast ocean called the internet, we cannot change anything - that's for certain. Moaning and groaning about it will not help. Rather, we should adapt to this situation and explore techniques to separate the wheat from the chaff. And believe me, while useless personal blogs have grown, the number of quality websites providing a rich wealth of information and education has grown as well. Ultimately it's a question of perception and how well one is tuned to take the good and filter out the rest. Crap is crap, whether you find it on a blog or on a corporate or business website; whether you find it in a personal diary or in a regular, mainstream newspaper.
Let's admit it. Searching the web is inherently limited because while we can input keywords to search for occurrences of words, we cannot input ideas to search for relevant content. Let me take an example: today I want to read an essay which talks about the issue of "quality over quantity." It is extremely hard to find a generic one on this particular subject simply by entering quality over quantity in Google, because my search has more to do with the idea than with the actual keywords. Google obviously doesn't recognize that fact and hence provides less than satisfactory results. It throws up topical pages on other issues which happen to contain the words "quality over quantity," not an essay dedicated to the topic as such.
Another factor is that search engines don't necessarily index every single website out there and that search engine (SE) ratings can sometimes be seriously flawed. Search engines can only look at quantitative factors: the number of links pointing to a site and the number of occurrences of keywords, but not necessarily how those occurrences are relevant to the search at hand. In other words, the search engine cannot rate the quality of the sites to which it gives a higher rank. We try to cut down this discrepancy by refining searches, but ultimately if a site is not indexed by Google, that site will not appear in Google results, no matter how hard we try. Many times, I've given up on searching because the quality of the results has simply not justified the time spent on it. Do a broad search and you're swamped with irrelevant results. Do a more refined one and you get only two hits, both of which have almost nothing to do with what you wanted to find in the first place. This has been my experience more often than not.
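A toy illustration of the point about quantitative factors, with the caveat that I am no expert and real search engines are far more elaborate than this. The sketch below ranks pages by nothing but keyword occurrences plus inbound-link count. The two pages, their link counts, and the `link_weight` parameter are all invented for the example.

```python
import re

def keyword_score(text, query):
    """Count how many times each query word occurs in the text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return sum(words.count(term) for term in query.lower().split())

def rank(pages, query, link_weight=1):
    """Sort pages by keyword occurrences plus weighted inbound-link count."""
    def score(page):
        return keyword_score(page["text"], query) + link_weight * page["inbound_links"]
    return sorted(pages, key=score, reverse=True)

# Two hypothetical pages: an essay about the idea, and a page that
# merely repeats the phrase.
pages = [
    {"name": "essay",
     "text": "Why fewer, better things matter more than many mediocre ones.",
     "inbound_links": 2},
    {"name": "ad",
     "text": "Quality over quantity! We choose quality over quantity every day.",
     "inbound_links": 5},
]

for page in rank(pages, "quality over quantity"):
    print(page["name"])
```

The page that repeats the phrase comes out on top, while the essay that is actually about the idea never uses the words and scores zero on keywords. Counting words and links says nothing about whether the content is what the reader was looking for.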
I am certainly no expert on search engine technology, but I believe the problem has more to do with skewed methodology than with the content. I also think it's a by-product of search engines not being able to keep up with the current growth of the world wide web. But blogs are but drops in the ocean. I don't think they are so important that they get greater weight in SE ratings just because they are linked to a dozen similar blogs. On the contrary, my observation is that blogs certainly do not "dominate" search results, although they might admittedly have more visibility in searches these days. And you certainly get irrelevant results from other websites as much as you do from personal blogs.
I think singling out blogs is unfair. There are certainly useless blogs out there, but there are worse kinds of nonsense going on on the internet, and in much higher volume than the inane personal ramblings or diary entries of a bored person. Generalized observations such as "blogs are the crap of the internet" miss this perspective. While I admit that search engine results need to keep improving over time and blogs probably have to be filtered out where irrelevant, it's certainly not such an important issue when we think about the other kinds of trash littered all over cyberspace. If the world wide web is polluted, blogs certainly are nothing more than minute specks of dust in a room full of rotting, stinking garbage.
13 comments
...in a blog!
http://organizedlife.blogspot.com/2005/07/choosing-quality-over-quantity.html
I think Megamanx is just scapegoating blogs. There's junk everywhere, and searching for stuff is always hard. Junk can be blogs. Junk can be other pages, too.
Brad, yes - good points. BTW, what I find more annoying is the fact that many of the "real" websites are so out-dated that much of the information they contain is several years old. On the other hand, blogs with frequently updated content sometimes give us better results than old regular websites.
sorry- websites don't always have to be made by mega corps with paid staff- but the individual isn't going to be able to pull the hard yards of keeping the website updated. A blog, on the other hand, with the blogging software pulling the hard yards for you, is much easier to keep up to date.
However, I believe that Google (no links - I read it "somewhere" on the 'net) is aiming to come up with a search which takes context into account. So if you were looking at an electrical store site and searched for the word "television", you would have a list of searches returned that focused on tech and price comparisons and excluded "the history of television" and anything similar.
It will be slow; the problem has only really come up as the size of the internet has grown to its present (and future) size.
While not blaming the people who get frustrated by blogs, I do think they over-generalize and exaggerate the issue and miss the perspective.