Page 175 - DLIS405_INFORMATION_STORAGE_AND_RETRIEVAL

Information Storage and Retrieval



                                for refining or expanding a free text query (either interactively or automatically). Alternatively, a
                                thesaurus can be used in both searching and indexing of datasets indexed with a controlled vocabulary.
                                This latter use is the immediate application of our current work, although we also see the
                                techniques as useful for free text searching.
                                In retrieval, thesaurus relationships are conventionally used to expand a query with synonyms and
                                sometimes with narrower terms, but the FACET system also performs more general semantic term
                                expansion (to broader and to related concepts). Reasoning over the semantic relationships in the
                                thesaurus permits imprecise matching between query and index terms. This makes it possible to rank
                                matching items in a result list, or to offer a ‘More like this’ option for items that are indexed
                                similarly but not necessarily identically.
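The idea of expanding a query term over thesaurus relationships can be illustrated with a minimal sketch. It is not the FACET implementation: the toy thesaurus, the relationship costs in `COST`, the `budget` parameter and the linear closeness formula are all assumptions chosen for illustration. Each relationship type (synonym, narrower, broader, related) is given a traversal cost, and terms reachable within the budget receive a closeness value that falls as the cost rises.

```python
import heapq

# Hypothetical traversal costs per relationship type (illustrative values only).
COST = {"SYN": 0.0, "NT": 1.0, "BT": 2.0, "RT": 3.0}

# Toy thesaurus fragment: term -> list of (relationship, target term).
THESAURUS = {
    "chair": [("SYN", "seat"), ("NT", "armchair"), ("BT", "furniture"), ("RT", "table")],
    "armchair": [("BT", "chair")],
    "furniture": [("NT", "chair"), ("NT", "table")],
    "table": [("BT", "furniture"), ("RT", "chair")],
    "seat": [("SYN", "chair")],
}


def expand(term, budget=3.0):
    """Return {term: closeness} for every term reachable within `budget`.

    The cheapest path to each term is found with Dijkstra's algorithm;
    closeness is 1.0 for the query term itself and decreases linearly
    with the accumulated traversal cost.
    """
    best = {term: 0.0}
    queue = [(0.0, term)]
    while queue:
        cost, current = heapq.heappop(queue)
        if cost > best.get(current, float("inf")):
            continue  # stale queue entry
        for rel, neighbour in THESAURUS.get(current, []):
            new_cost = cost + COST[rel]
            if new_cost <= budget and new_cost < best.get(neighbour, float("inf")):
                best[neighbour] = new_cost
                heapq.heappush(queue, (new_cost, neighbour))
    return {t: 1.0 - c / (budget + 1.0) for t, c in best.items()}


expanded = expand("chair")
```

Under these assumed costs, items indexed with "armchair" would rank above items indexed with "furniture", which in turn would rank above items indexed with "table". The ordering depends entirely on the chosen costs.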
                                Faceted systems are based on a primary division of terminology into fundamental, high-level
                                categories, or facets. A knowledge system can be considered as enumerative, when all possible
                                simple and compound terms are explicitly listed in their hierarchical position, or as synthetic. Faceted
                                systems are normally synthetic; they do not attempt to include the vast number of possible multi-
                                concept headings or descriptors in a domain, but combine terms from a limited number of
                                fundamental facets, as needed when indexing or querying. This flexibility allows highly specific,
                                nuanced metadata descriptions (or annotations). Matching such compound descriptors poses
                                significant challenges when searching and the full potential for retrieval has remained untapped.
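A minimal sketch may help show why compound descriptors are harder to match than single terms. It assumes that a synthetic descriptor can be modelled as a mapping from facet name to term, and that a pairwise term closeness is available. The facet names, the items, the `CLOSENESS` table and the averaging rule are all hypothetical and are not taken from the FACET system.

```python
# Hypothetical pairwise closeness values in [0, 1]; in practice these
# would come from a thesaurus-based measure.
CLOSENESS = {
    ("armchair", "chair"): 0.75,
    ("oak", "wood"): 0.5,
}


def closeness(a, b):
    """Closeness of two terms: 1.0 if identical, else a table lookup (0.0 if unknown)."""
    if a == b:
        return 1.0
    return CLOSENESS.get((a, b), CLOSENESS.get((b, a), 0.0))


def match(query, item):
    """Average per-facet closeness; a query facet missing from the item scores 0."""
    if not query:
        return 0.0
    total = sum(
        closeness(term, item[facet]) if facet in item else 0.0
        for facet, term in query.items()
    )
    return total / len(query)


# Descriptors are synthesised by combining one term from each relevant facet.
query = {"object": "chair", "material": "oak"}
items = {
    "item-1": {"object": "armchair", "material": "oak", "style": "Georgian"},
    "item-2": {"object": "chair", "material": "wood"},
    "item-3": {"object": "table", "material": "oak"},
}

ranked = sorted(items, key=lambda name: match(query, items[name]), reverse=True)
```

Comparing terms only within the same facet uses the context that facets provide: "oak" is compared with other materials and never with object types. Averaging is just one possible way to combine the per-facet scores.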

                                Objectives
                                The overarching objectives of the research were to:
                                 •  Develop and evaluate retrieval tools based on a matching function incorporating thesaurus
                                   semantic closeness measures.
                                 •  Derive heuristics to guide automatic and interactive expansion/refinement of strings of
                                   thesaurus terms, taking advantage of the context provided by facets.
                                 •  Experiment with techniques for creating complex queries using a query editor with knowledge
                                   of the semantic roles of thesaurus facets. This will draw on previous work in the cultural
                                   heritage domain.
                                 •  Design and implement semantic closeness measures based on thesaurus relationships.

                                Beneficiaries of the research

                                The research is directly relevant to cultural heritage organisations and the users of their digital
                                collections, as well as to collection management vendors and commercial image providers. Thesauri are
                                one of the most common Knowledge Organisation Systems and frequently underpin higher level
                                schemas and ontologies. Initiatives to update international thesaurus standards are currently
                                underway and various groups are working on XML/RDF representations for thesauri. Thesauri
                                and faceted approaches have been applied to website architecture and hierarchical browsing
                                interfaces to web databases.

                                FACET Architecture and Interfaces

                                The final FACET system comprises a tiered, component-based architecture (Fig. 14.1) that accesses a
                                SQL Server relational database. Queries and their associated results are stored persistently
                                as XML.








          170                              LOVELY PROFESSIONAL UNIVERSITY