August 23, 2007

Search --> Discover --> ??

0.0) Since its inception, the web has moved ahead by leaps and bounds. Its genesis has created nerve-wracking problems and also wonderfully creative solutions. Nothing apart from human evolution compares to the intriguing development (both good and bad) of the web. The difficulties that cast doubt on the evolution of an entity are the very problems that push it to mutate, compromise, or find a solution that ends the problem. The growth of the internet has been no exception and has led to some amazing and intelligent answers during the course of its development.

1.0) First there was the problem of searching for information and resources on the net. Google and other search engines have solved it well, and more than efficiently. One can search for possibly everything there is on the web by keying in pertinent keywords and reach a potentially exhaustive list of resources relevant to those keywords. A peculiarly inconclusive and futile search for the letter 'a' on Google would return somewhere around 8 billion resources. Of course, an incalculable percentage of them is redundant, because 'a' is one of the most ubiquitous search arguments that can be passed to any search engine. The reason for quoting that example is to give an idea of the thoroughness of the searches performed over the net. As an indirect consequence, people have found ways to derive fun and challenges from such systems, for example by trying to find a combination of two meaningful keywords that yields a single result. However, their discoveries last only seconds after being published as the new GoogleWhack!

2.0) With an almost incessant influx of data and an ever-increasing amount of information being put online came the problem of discovering information. StumbleUpon (SU) and the like (Digg etc.) provided a tool to do that. This was a paradigm shift from the way search engines work: unlike the dedicated crawling spiders/bots that gather the content for search engines, here the users themselves are the source of the discovered content. Any content deemed fit by a particular user gets added to StumbleUpon's repository with the proper topics/tags/subjects/contexts and thereafter serves as a discovery result for another user interested in those tags/subjects/contexts. Web surfers can keep stumbling around their pre-defined set of subjects and, in the process, rate the resources based on their relevance. Thus, in essence, systems such as SU run a self-sustaining cycle: they generate their own content and also maintain its quality and reliability in an almost perpetual process of information retrieval.

2.1) However, there is also a fundamental similarity in user interaction between these two kinds of utilities, services, information retrieval systems or whatever you wish to call them: in either case the user has to explicitly reach out for the information, whether intentionally or unintentionally. In the case of search the user intentionally seeks the information; in the other case he or she unintentionally discovers it, based on what others have qualified as a relevant resource for a particular subject. In the language of business processes we can therefore classify these activities as outbound or on-demand from a user's perspective, because the user has to actually 'reach out' for the information. On-demand is a term more often used in the context of enterprise services, where a particular service or functionality is made available to customers on an as-and-when-required basis, and it clearly represents the manner in which search and discovery work on the net.

2.2) Another interesting trait common to these two kinds of information retrieval systems is that both are reactive from the user's point of view. Accessing information from these systems is reactive (on a need-to-know basis) rather than pro-active (on an ought-to-know basis).

3.0) The question therefore is: what next? Well, to avoid belaboring the obvious any further, the next thing that demands focus is inbound information availability. Not that nothing of that sort exists, but there is a greatly felt need for an inbound Information Radiation System (IRS). Feeds, alerts, newsletter subscriptions and spam (yes, spam and junk mail) are all examples of inbound information radiation systems, because the information flows to the user automatically, without him or her having to do anything continuously. The idea is not at all new in its original form: TV advertising, newsprint and radio are all quintessential legacies of inbound information radiating systems, which we use to this very day. These systems perhaps became inbound more because of the limitations of technology than by design; one cannot imagine interactive television having been invented first.

3.1) An ideal inbound IRS should have the following characteristics.

a.) It should be a two-way communication system between the user and the radiator (radiation and acknowledgment).
b.) It should be symmetric, i.e. the roles of user and radiator should be interchangeable.
c.) It should be adaptive, i.e. the system should adapt itself to radiate information which has utility for its recipient.
d.) It should be pro-active in nature and not reactive, as the search and discovery systems are.
e.) It should be non-invasive, in the sense that the information should be acknowledged by its users and should not in any way be treated as unwelcome.
f.) It should be non-binding, i.e. the expectation of any returns from the members of this system should ideally be zero. (way to be 'leechers' !!)
g.) It should be non-redundant in a liberal sense, i.e. a particular piece of information should not be radiated repetitively for a very long time.
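
The characteristics above can be loosely sketched in code. What follows is only an illustrative toy, not a design for a real system: the class name Participant, its methods radiate, receive and acknowledge, and the max_repeats parameter are all hypothetical names made up for this sketch, and a topic weight stands in very crudely for "adaptive".

```python
from collections import defaultdict


class Participant:
    """Hypothetical member of an inbound IRS.

    Every participant can both radiate and receive, so the roles of
    user and radiator are interchangeable (characteristic b).
    """

    def __init__(self, name, max_repeats=2):
        self.name = name
        self.max_repeats = max_repeats      # (g) cap on repeats of one item
        self.interest = defaultdict(float)  # (c) learned weight per topic
        self.seen = defaultdict(int)        # times each item was accepted
        self.inbox = []                     # items that arrived unasked

    def radiate(self, recipient, topic, item):
        """(d) Push an item to a recipient without being asked for it."""
        return recipient.receive(self, topic, item)

    def receive(self, sender, topic, item):
        """Accept or silently drop an incoming item."""
        # (g) non-redundant: drop an item once it has been accepted too often
        if self.seen[item] >= self.max_repeats:
            return False
        # (e) non-invasive: drop topics this recipient has marked unwelcome
        if self.interest[topic] < 0:
            return False
        self.seen[item] += 1
        self.inbox.append((sender.name, topic, item))
        return True

    def acknowledge(self, topic, welcome):
        """(a) Optional feedback channel; (c) it shapes what gets through later.

        (f) Nothing forces a recipient to call this: the system is non-binding.
        """
        self.interest[topic] += 1.0 if welcome else -1.0
```

For instance, if alice radiates the same item to bob three times with max_repeats left at 2, the third attempt is dropped; and once bob acknowledges a topic as unwelcome, further items on that topic no longer reach his inbox.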

3.2) Before going forward, there are some classic examples of systems which act as both inbound and outbound: telephones and snail mail/e-mail. Telephones, however, are not an ideal inbound IRS (according to the definition above), as they tend to violate characteristic e.), being non-invasive (I would not believe anyone who thinks otherwise).

3.3) The basic motivating factor behind any such inbound system is probabilistic filtering of the relevant information on the web over a period of time. An almost infinitely vast knowledge base exists on the net (both verifiable and true, and unverifiable and false) in the form of wikis and their derivatives. However, reaching the information that is of any utility or interest still remains a far cry for any given user. The probability of finding a relevant piece of information on a reactive basis keeps decreasing in a system where information comes from a multitude of sources. No wonder people talk about information overload, which is nothing but an indication of the inability, and the associated stress, of finding useful and valuable information.

3.4) Coming back to the examples of inbound IRS, the problem per se is that these examples are not pro-active in nature, which is an essential requirement for an ideal inbound IRS. Pro-active systems are ones which require no input, or only very limited input, from their users in order to radiate information. For example, the user always has to first search for or discover a feed URL and then subscribe to it. Similarly, a user creates an alert for the events about which he/she wants to be notified. The other examples follow a similar pattern. Spam and junk mail, however, behave ambiguously: they are closer to an inbound IRS than their counterparts (because a user never explicitly does anything in order to receive them), but because by definition they are unwanted and not welcomed by their receivers, they do not qualify as an ideal inbound IRS.

4.0) Instigation: Why should such a system work? Or become popular? Or, more critically, is something like that even required? There are two answers to support the argument that there is indeed a place for an inbound IRS. First, it broadens the horizons and usability of the information on the web. In the democratic setup of the internet, anything which is not acceptable eventually wears off and is either replaced by something more efficient or reverts to the enduring system that preceded it. Hence the only litmus test for the feasibility of such a system is to first bring it into existence.

Second, an inbound IRS is in no way mutually exclusive with the existing outbound information retrieval systems, nor even close to becoming a perceived threat to them. In fact, both kinds of system would in time create a symbiotic ecology of their own, which would not only benefit consumers but also bring more tangibility to the concept of the Information Superhighway that we hear about in day-to-day talk. Granting the feasibility of anything like that would certainly add value to the way we treat information (sometimes ruthlessly and casually) and help change our outlook for our own good.


Vital Stats for this article. (acknowledgment: Google Docs)

Flesch Reading Ease: 52.65
Flesch-Kincaid Grade Level: 10.00
Automated Readability Index: 9.00
