Page 152 - DLIS408_INFORMATION_TECHNOLOGY-APPLICATIONSL SCIENCES

Unit 13: Internet Based Resources and Service Browsers

            13.3  Web Search Engine

            Figure 13.4: Search engine market share in the US, April 2008
            (Source: Hitwise, for SearchEngineLand.com)

                 Search engine     Share of US searches
                 Google            67.9%
                 Yahoo             20.3%
                 Microsoft          6.3%
                 Ask                4.2%
                 Others             1.4%

            A web search engine is designed to search for information on the World Wide Web and FTP
            servers. The search results are generally presented as a list and are often called hits. The
            results may consist of web pages, images and other types of files. Some search engines also
            mine data available in databases or open directories. Unlike web directories, which are
            maintained by human editors, search engines operate algorithmically or by a mixture of
            algorithmic and human input.

            How web search engines work

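            Figure 13.5: High-level architecture of a standard Web crawler

                 [Diagram: the scheduler passes URLs to a multi-threaded downloader,
                 which fetches web pages from the World Wide Web. The downloader
                 sends text and metadata to storage and adds newly found URLs to a
                 queue, which feeds the scheduler.]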
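
            The crawler architecture in Figure 13.5 can be illustrated with a short Python sketch.
            It is only an approximation under stated assumptions: the "web" is a small in-memory
            dictionary (FAKE_WEB, hypothetical) rather than real network access, and the names
            download and crawl are invented for this example. A thread pool stands in for the
            multi-threaded downloader, a deque for the queue, and a dictionary for storage.

```python
from collections import deque
from concurrent.futures import ThreadPoolExecutor

# Hypothetical in-memory "web": URL -> (page text, outgoing links).
FAKE_WEB = {
    "http://a.example/": ("home page", ["http://a.example/b", "http://a.example/c"]),
    "http://a.example/b": ("page b", ["http://a.example/c"]),
    "http://a.example/c": ("page c", ["http://a.example/"]),
}


def download(url):
    """Stand-in for the downloader: return (url, text, links) for one page."""
    text, links = FAKE_WEB.get(url, ("", []))
    return url, text, links


def crawl(seed_urls, max_workers=4):
    queue = deque(seed_urls)   # URLs waiting to be scheduled
    seen = set(seed_urls)      # URLs already queued, so no page is fetched twice
    storage = {}               # URL -> page text
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        while queue:
            # Scheduler: hand the currently queued URLs to the downloader threads.
            batch = list(queue)
            queue.clear()
            for url, text, links in pool.map(download, batch):
                storage[url] = text
                # Newly discovered URLs go back into the queue.
                for link in links:
                    if link not in seen:
                        seen.add(link)
                        queue.append(link)
    return storage


pages = crawl(["http://a.example/"])
```

            Starting from a single seed URL, the sketch reaches all three pages of the toy web.
            A real crawler would also need politeness delays, robots.txt handling and error
            recovery, none of which are shown here.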




            A search engine operates in the following order:
               1. Web crawling
               2. Indexing
               3. Searching.
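
            The indexing and searching steps listed above can be sketched in a few lines of Python.
            This is a minimal illustration, not how any particular search engine is implemented:
            the crawled dictionary is made-up sample data standing in for the output of the
            crawling step, and build_index and search are hypothetical names. The index used
            here is a simple inverted index that maps each word to the set of pages containing it.

```python
from collections import defaultdict

# Hypothetical output of the crawling step: URL -> page text.
crawled = {
    "http://a.example/": "web search engines crawl the web",
    "http://a.example/b": "an index maps words to pages",
    "http://a.example/c": "search engines answer queries from the index",
}


def build_index(pages):
    """Indexing: map every word to the set of URLs whose text contains it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word].add(url)
    return index


def search(index, query):
    """Searching: return the URLs (hits) that contain every word of the query."""
    words = query.lower().split()
    if not words:
        return []
    hits = set.intersection(*(index.get(word, set()) for word in words))
    return sorted(hits)


index = build_index(crawled)
results = search(index, "search engines")
```

            Here results holds the two sample pages that contain both query words. Real engines
            additionally rank the hits; this sketch only returns them in alphabetical order.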




              Notes  Web search engines work by storing information about many web pages, which
                    they retrieve from the HTML of the pages themselves.
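
            As a rough illustration of retrieving information from a page's HTML, the sketch below
            uses Python's standard html.parser module to pull out the title, the named meta tags
            and the visible words of a page. The class name PageInfoParser and the sample page
            SAMPLE_HTML are assumptions made for this example; a production system would also
            need to skip scripts and styles and cope with malformed markup.

```python
from html.parser import HTMLParser


class PageInfoParser(HTMLParser):
    """Collect the title, named <meta> tags and body words of one HTML page."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.meta = {}
        self.words = []
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and attrs.get("name"):
            # Store metadata such as <meta name="description" content="...">.
            self.meta[attrs["name"].lower()] = attrs.get("content") or ""

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data
        else:
            self.words.extend(data.split())


# Hypothetical sample page.
SAMPLE_HTML = """<html><head><title>Search engines</title>
<meta name="description" content="How search engines work">
</head><body><h1>Crawling</h1><p>Spiders follow links.</p></body></html>"""

parser = PageInfoParser()
parser.feed(SAMPLE_HTML)
parser.close()
title = parser.title.strip()
```

            The extracted title, metadata and words are the kind of information a search engine
            could then store and index.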

                                  LOVELY PROFESSIONAL UNIVERSITY                                              147