It seems timely to revisit this blog post, which discusses the importance of the library and information professions and their role in uncovering the deep web for the benefit of end users.
The original webinar was headed up by John DiGilio, Senior Director of Research & Intelligence at LibSource. He was joined by his colleague Katherine Henderson, a LibSource Research Analyst.
Quite aptly, John used the analogy of an iceberg to describe the different layers of the web: surface, deep and dark. It’s estimated that 80% of all web content actually sits below the surface, meaning that what we see through traditional search engines such as Google and Bing is, quite literally, just the tip of the iceberg.
As a society, we’re throwing around the term ‘information overload’ like it’s going out of fashion, but this statistic really puts that into perspective. As John points out, when people talk about ‘information overload’, they haven’t even experienced a quarter of what’s really available on the web. In fact, the issue of overload is far more severe than one might initially realise.
How’s that for a reality check?
When we talk about the surface web, we are referring to what one would naturally find through a surface-level search, i.e. via our most popular search engines. This makes up around 20% of the content out there.
“There is more to the internet than meets the eye, with its three distinct layers of depth. The Surface Web, occupying 10% of the internet, contains those websites with visible contents resulting from search engine indexing. These searchable, publicly available pages can be accessed from a standard web browser and connect to other pages using hyperlinks. However, information is being overlooked that was never intended to be hidden. This information, invisible to regular search engines, requires persistence and specialized search tools to locate. Beyond the Surface Web exist the Deep Web and the Dark Web.” Library application of Deep Web and Dark Web technologies (May 2020)
The deep web, on the other hand, goes a step further to cover the parts of the web whose content isn’t indexed by the Googles and Bings of the world. This may be because content sits behind a paywall (such as video-on-demand services like Netflix), or because it lives in online banking, webmail or many other such systems. John cited the computer scientist Mike Bergman as the man who coined the term back in 2000, and we have seen its usage and relevance grow ever since.
I found it fascinating to note that no single search engine will index more than approximately 16% of the surface web. Yet, of course, your typical student - or indeed your typical professional conducting their own research, I might add - is unlikely to use more than one search engine when searching for information and resources. According to BrightPlanet, the deep web is 5,000 times larger than the surface web and contains 1,000-2,000 times better quality information. What’s more, John cites 95% of deep web information as being publicly available. So, why aren’t we making better use of it?
Of course, some information is inaccessible due to paywalls and other barriers but, for the most part, benefiting from the deep web simply requires specialist searching. You can use the same standard web browser as usual, be it Firefox, Chrome, Internet Explorer or otherwise; the focus instead is on leveraging specialist knowledge. You may need to head to a particular deep web search engine, run a search behind a paywall or know exactly which resource to start with.
Clearly, leveraging the deep web requires significantly more effort than your standard surface search. Plus, there’s even more information overload to contend with. What’s the point?
John answered this beautifully:
“Ultimately, the question is the answer” John DiGilio (2017)
Everyone is using the same set of standard search engines to search through the same set of information. Whether it’s your child studying for a school research project or an attorney fact-checking for a case, they are all doing one thing - sticking to the status quo.
“The status quo, or the average, is not what we strive for in this industry. We cannot afford to.” John DiGilio (2017)
In a knowledge-based industry, sticking to the status quo could be a very expensive mistake, and John offered several reasons why.
Just a simple glance at the challenges awaiting deep web explorers makes the need for an expert guide clearer than ever.
We’ve written before on this blog about the power of Librarians, and that need is more prominent now than ever before. The information industry is continuously growing and evolving, and so too are the needs and requirements of the people working within it. There is great accolade and advantage to be won by those who rise to its challenges.
Librarians are information experts who are able to master search whilst also leveraging technology to their advantage. As John rightly points out, not only do you need to know how to find material and how to use it, but you also need to let people know what it is you do, how you do it and that you are there to serve as a resource.
In fact, John went on to hypothesise that we are at a tipping point where information overload is beginning to outstrip the effectiveness of do-it-yourself research. He suggested that:
“[We] need the guidance of people trained in library and information science to help make sense of it all, to facilitate skilled information retrieval and to collaborate with information requesters for evaluation and further report.” John DiGilio (2017)
In 2014, the Australian Library and Information Association found there to be a return of $5.43 for every dollar invested in Librarians and library services. With our ever-growing wealth of information to sort through, and new technologies waiting to be utilised at our fingertips, this figure may well become significantly higher.