Page 152 - DLIS408_INFORMATION_TECHNOLOGY-APPLICATIONSL SCIENCES
Unit 13: Internet Based Resources and Service Browsers
13.3 Web Search Engine
Figure 13.4: Search engine market share in the US, as of 2008.
US Share of Searches: April 2008 (Source: Hitwise, for SearchEngineLand.com)
Google: 67.9%
Yahoo: 20.3%
Microsoft: 6.3%
Ask: 4.2%
Others: 1.4%
A web search engine is designed to search for information on the World Wide Web and FTP
servers. The search results are generally presented as a list, and the individual results are
often called hits. The information may consist of web pages, images and other types of files.
Some search engines also mine data available in databases or open directories. Unlike web
directories, which are maintained by human editors, search engines operate algorithmically or
by a mixture of algorithmic and human input.
How web search engines work
Figure 13.5: High-level architecture of a standard Web crawler
[Figure: the components shown are the World Wide Web, a multi-threaded downloader, a scheduler, a queue of URLs, and storage. The labelled flows are Web pages, URLs, and Text and metadata.]
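The components named in Figure 13.5 (scheduler, queue of URLs, multi-threaded downloader, storage) can be illustrated with a minimal Python sketch. This is not a real crawler: the dictionary `FAKE_WEB`, the `example` URLs, and the function names `download` and `crawl` are all assumptions made for illustration, and `download` reads from memory instead of the network.

```python
from collections import deque
from concurrent.futures import ThreadPoolExecutor

# Hypothetical in-memory "web": URL -> (page text, outgoing links).
FAKE_WEB = {
    "http://a.example/": ("Home page about search engines",
                          ["http://a.example/b", "http://a.example/c"]),
    "http://a.example/b": ("Page about web crawling", ["http://a.example/c"]),
    "http://a.example/c": ("Page about indexing", ["http://a.example/"]),
}


def download(url):
    """Stand-in for an HTTP fetch; returns (url, text, links)."""
    text, links = FAKE_WEB.get(url, ("", []))
    return url, text, links


def crawl(seed_urls, max_pages=100, workers=4):
    queue = deque(seed_urls)   # queue of URLs waiting to be fetched
    seen = set(seed_urls)      # URLs already scheduled, to avoid repeats
    storage = {}               # storage: URL -> page text
    with ThreadPoolExecutor(max_workers=workers) as pool:
        while queue and len(storage) < max_pages:
            # Scheduler: pass a batch of queued URLs to the downloader threads,
            # never more than the remaining page budget.
            room = max_pages - len(storage)
            size = min(workers, len(queue), room)
            batch = [queue.popleft() for _ in range(size)]
            for url, text, links in pool.map(download, batch):
                storage[url] = text
                # Newly discovered URLs go back into the queue.
                for link in links:
                    if link not in seen:
                        seen.add(link)
                        queue.append(link)
    return storage


pages = crawl(["http://a.example/"])
```

In this sketch the `seen` set is what stops the crawler from looping forever on pages that link back to each other.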
A search engine operates in the following order:
1. Web crawling
2. Indexing
3. Searching.
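The three steps above can be sketched in a few lines of Python. This is a toy illustration, not how any particular search engine is implemented: the `pages` dictionary stands in for the output of the crawling step, and the names `build_index` and `search` are hypothetical. The index used here is a simple inverted index (word to set of URLs), and a query returns only pages containing every query word.

```python
from collections import defaultdict

# Step 1 (web crawling) is represented by pages that were already fetched.
pages = {
    "http://example.org/a": "web search engines crawl the web",
    "http://example.org/b": "an index maps words to pages",
    "http://example.org/c": "search engines answer queries from the index",
}


def build_index(pages):
    """Step 2 (indexing): map each word to the set of URLs containing it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word].add(url)
    return index


def search(index, query):
    """Step 3 (searching): return URLs that contain every word of the query."""
    words = query.lower().split()
    if not words:
        return []
    hits = set(index.get(words[0], set()))
    for word in words[1:]:
        hits &= index.get(word, set())
    return sorted(hits)


index = build_index(pages)
print(search(index, "search engines"))
```

Each URL returned by `search` corresponds to one hit in the results list.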
Notes: Web search engines work by storing information about many web pages, which
they retrieve from the HTML of the pages themselves.
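As a rough illustration of retrieving information from a page's HTML, the sketch below uses Python's standard `html.parser` module to pull out the title, the link targets, and the visible words. The class name `PageInfo`, the function `extract`, and the `sample` page are assumptions made for this example; real search engines store far more than this.

```python
from html.parser import HTMLParser


class PageInfo(HTMLParser):
    """Collects the title, link targets, and visible words of one page."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.links = []
        self.words = []
        self._in_title = False
        self._skip = 0  # greater than 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag in ("script", "style"):
            self._skip += 1
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False
        elif tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if self._in_title:
            self.title += data
        elif not self._skip:
            self.words.extend(data.lower().split())


def extract(html):
    parser = PageInfo()
    parser.feed(html)
    parser.close()
    return {"title": parser.title.strip(),
            "links": parser.links,
            "words": parser.words}


sample = ('<html><head><title>Search Engines</title><style>p{}</style></head>'
          '<body><p>Web crawling and <a href="/indexing">indexing</a></p>'
          '</body></html>')
info = extract(sample)
```

The extracted words could then be fed into an index, and the extracted links into a crawler's queue of URLs.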
LOVELY PROFESSIONAL UNIVERSITY 147