Page 126 - DLIS402_INFORMATION_ANALYSIS_AND_REPACKAGING

Unit 6: Information Retrieval Model and Search Strategies






                    The “Smart Contents Acquisition Framework” (SCAF) allows crawling of dedicated
                    information sources and the management of their data.

            The framework supports access to different types of sources and transformation of the original data
            into a predefined set of types. Afterwards, the extracted contents can be validated and relationships
            between the different contents can be identified. A further automation of the crawling is achieved
            with the “Smart Spider”. The spidering process performs, in addition to traditional contents analysis,
            a visual analysis of the web pages. Stable visual or textual structures are identified and classifiers
            are trained to learn their mapping onto the predefined content types.

            Smart Information Retrieval
            The Smart Information Retrieval Cluster addresses several issues in the areas of Information
            Retrieval and Artificial Intelligence. The most important are the efficient use of the semantic
            information encapsulated in the built indices, the optimization of large search spaces so that
            filtering algorithms can be applied, and the reduction of response time so that complex filtering
            chains run with sufficient performance.
            Most of these goals are achieved through the intelligent application of different machine learning
            techniques that together provide high-quality results by taking semantics into account, reduce
            response time by efficiently reducing the search space, and therefore guarantee good scalability
            together with high user satisfaction.

            User Modelling and Personalization
            The User Modelling and Personalization Cluster focuses on the development of tools for User
            Modelling, which collect, manage and maintain the data users explicitly input and implicitly create
            while using applications. Based upon the user model, methods of Artificial Intelligence are applied
            for data mining to generate knowledge about the users. The model forms the knowledge base for
            affiliated applications to understand the user’s usage context and to generate adaptation decisions
            such as recommendations.

            6.5 Information Retrieval Manual

            Legal information retrieval is the science of information retrieval applied to legal text, including
            legislation, case law, and scholarly works. Accurate legal information retrieval is important to provide
            access to the law for laypeople and legal professionals. Its importance has increased because of the
            vast and quickly increasing number of legal documents available through electronic means.

            Synopsis

            In a legal setting, it is frequently important to retrieve all information related to a specific query.
            However, commonly used Boolean search methods (exact matches of specified terms) on full-text
            legal documents have been shown to have an average recall rate as low as 20 percent, meaning that
            only 1 in 5 relevant documents is actually retrieved. In the case where this rate was measured, the
            researchers believed that they had retrieved over 75% of the relevant documents. Such low recall may
            result in failing to retrieve important or precedential cases. In some jurisdictions this may be
            especially problematic, as legal professionals are ethically obligated to be reasonably informed as to
            relevant legal documents.
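
            The effect of exact-match Boolean searching on recall can be illustrated with a minimal sketch.
            Everything in it is an assumption made for illustration: the six one-line "documents", the
            relevance judgments, and the helper names boolean_and_search, recall and precision are
            hypothetical and do not come from any study or retrieval system. Recall is computed as the share
            of relevant documents that were retrieved, and precision as the share of retrieved documents
            that are relevant.

```python
def boolean_and_search(docs, terms):
    """Return the ids of documents that contain every query term (exact word match)."""
    wanted = {t.lower() for t in terms}
    return {doc_id for doc_id, text in docs.items()
            if wanted <= set(text.lower().split())}


def recall(retrieved, relevant):
    """Fraction of the relevant documents that were retrieved."""
    return len(retrieved & relevant) / len(relevant) if relevant else 0.0


def precision(retrieved, relevant):
    """Fraction of the retrieved documents that are relevant."""
    return len(retrieved & relevant) / len(retrieved) if retrieved else 0.0


# Hypothetical mini-collection: several texts describe an eviction
# without using the exact words "tenant" and "eviction".
docs = {
    1: "tenant eviction notice served by landlord",
    2: "lessee removed from premises by lessor",
    3: "tenant sues insurer over eviction coverage",
    4: "occupant ousted after lease ended",
    5: "renter eviction upheld on appeal",
    6: "eviction of occupier ruled unlawful",
}
# Hypothetical relevance judgments: documents that are about evicting an occupant.
relevant = {1, 2, 4, 5, 6}

retrieved = boolean_and_search(docs, ["tenant", "eviction"])
print(sorted(retrieved))             # [1, 3]
print(recall(retrieved, relevant))     # 0.2
print(precision(retrieved, relevant))  # 0.5
```

            In this toy collection the query finds only one of the five relevant documents, because the
            others use synonyms such as "lessee" or "renter", and it also returns one irrelevant document
            that happens to contain both terms.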
            Legal Information Retrieval attempts to increase the effectiveness of legal searches by increasing
            the number of relevant documents retrieved (providing a high recall rate) and reducing the number
            of irrelevant documents retrieved (a high precision rate). This is a difficult task, as the legal field is prone to




                                             LOVELY PROFESSIONAL UNIVERSITY                                   121