Information Retrieval
This morning I spent some time looking for information on how to deploy a legacy webservice on Apache Karaf.
This post isn’t about Karaf and OSGi and webservices though, but a reflection on how to efficiently find information on the Internet that leads towards a solution to a specific technical problem. For quite some time now I have felt that this task has become much more complicated over the past 2-3 years than it used to be.
We are now in 2011 and, at least in my perception, Google is no longer a simple, efficient or even effective way of finding information regarding specific problems. Which is a euphemism for “Google really sucks these days”. Type in a query, click on a link, search again for something rather different and you’ll see one of the results you clicked earlier appear in the results of your new query (which is related to the previous query by maybe only one keyword, two at most). What’s worse is that in many cases, the pages that Google lists as results don’t even contain one of the keywords entered in the query, but have something to do with sites I browsed in the past. Seriously Google, why should I care in a new query about things I searched for in the past?
And yet, especially when it comes to software development, finding sound information about a problem is critical. Way too often I find myself rushing to Google and searching for a solution, then finding some random piece of information and trying to get something out of it - i.e. starting to work on a solution on shaky ground, even though I know perfectly well that I’m on shaky ground (intuition is very helpful there).
So there’s this one issue of finding high-quality information. There’s another, much bigger problem though: understanding the problem to look for the right solution. Oftentimes I rush into looking for a solution when I haven’t yet understood the problem (or think I have, but really am just headed in the plain wrong direction).
At this point, I shall point to The Five Orders of Ignorance by Phillip G. Armour. If you haven’t read it and are in the business of writing software, do so. Now.
For the sake of the completeness of this post, I’ll list the Five Orders anyhow. They go like this:
- 0th Order - Lack of Ignorance: “I have 0OI when I provably know something”
- 1st Order - Lack of Knowledge: “I have 1OI when I don’t know something”
- 2nd Order - Lack of Awareness: “I have 2OI when I don’t know that I don’t know something”
- 3rd Order - Lack of a Suitably Efficient Process: “I have 3OI when I don’t have a suitably efficient way to find out that I don’t know that I don’t know something”
- 4th Order - Meta Ignorance: “I have 4OI when I don’t know about the Five Orders of Ignorance”
So how does that relate to the information search issue from before? Well, as I pointed out, oftentimes a problem when looking for information is the lack of clear understanding of the problem I try to solve. So this is - as far as I understood the 5 Orders - a 2OI: I lack the awareness that I do not understand (know) the problem for which I am looking for a solution.
So how to get from 2OI to 1OI? Of course this depends on the situation, but I would say that things are much easier when you have some domain knowledge. In software development, there’s a class of problems which I call “configuration problems” (although they could probably also be called “integration problems”) that go like this:
- how do I deploy…
- how can I configure…
- I need to integrate…
and involve one or more technologies of the many, many technologies that are available in the open-source space.
Whenever I get to one of these questions (which appear to be 1OI-type questions), I try to step back and ask myself: does that even make sense? Isn’t there a deeper problem I actually want or need to solve? I say that I “try to” step back because in some cases I either forget and rush into finding a solution to the wrong problem, or I know or feel that this is not the right question, but have a personal itch to get things to work anyway (“I really want Spring DM 2 to work on Karaf! It just has to work!”).
Now let’s go back to the initial issue described in this post (Google sucks): how to find answers to technical questions on the Internet these days?
When I want to avoid simply Googling, I now try the following:
- if the problem relates to a specific technology, I try to find a mailing-list and its archive and look there first. It may just be a POMO (Plain Old Mailing-List) but I have found that oftentimes, you can get good answers just by looking at the archives (and by archives I do not mean Googling for the archives but browsing through them or searching only the archive site with Google’s “site:” operator)
- search and ask on Stack Overflow
- subscribe to a technology mailing-list and send the question there, to the developers and experienced users. I used to avoid this step for some time, wanting to solve things all by myself…but really it is not worth the trouble.
- check out the damn code. There is a lot in the code, much more than you’d think. And looking into the internals can sometimes be much more efficient than searching for a sample or documentation - let’s be honest, not too many projects always have up-to-date documentation and samples.
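As a small illustration of the first item in the list above, here is a minimal sketch of how a search restricted to one archive site could be built. The function name `site_search_url` and the host `lists.example.org` are made up for this example; the only things assumed about Google are its public search URL and the `site:` operator.

```python
from urllib.parse import urlencode


def site_search_url(query: str, site: str) -> str:
    """Build a Google search URL limited to a single site via the `site:` operator."""
    # Prefixing the query with "site:<host>" asks Google to return
    # only pages from that host, e.g. a mailing-list archive.
    restricted_query = f"site:{site} {query}"
    # urlencode takes care of escaping spaces and the colon.
    return "https://www.google.com/search?" + urlencode({"q": restricted_query})


# Hypothetical archive host, used only for illustration.
print(site_search_url("deploy legacy webservice", "lists.example.org"))
```

Typing `site:lists.example.org deploy legacy webservice` straight into the search box has the same effect; the point is only that the search stays inside the archive.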
All right, that’s it. Let’s hope Google fixes their search engine, or offers a paid high-quality service for search (I would pay for that).