Page 126 - DLIS402_INFORMATION_ANALYSIS_AND_REPACKAGING
Unit 6: Information Retrieval Model and Search Strategies
The “Smart Contents Acquisition Framework” (SCAF) allows crawling of dedicated
information sources and the management of their data.
The framework supports access to different types of sources and transformation of the original data
into a predefined set of types. Afterwards, the extracted contents can be validated and relationships
between the different contents can be identified. A further automation of the crawling is achieved
with the “Smart Spider”. In addition to traditional content analysis, the spidering process performs
a visual analysis of the web pages. Stable visual or textual structures are identified, and classifiers
are trained to learn their mapping onto the predefined content types.
Smart Information Retrieval
The Smart Information Retrieval Cluster addresses several important issues in the areas of Information
Retrieval and Artificial Intelligence. The most important are the efficient use of the semantic
information encapsulated in the built indices, the optimization of large search spaces so that
filtering algorithms can be applied, and the reduction of response time so that complex filtering
chains can run with sufficient performance.
Most of these goals are achieved through the intelligent application of different machine learning
techniques. Together, these techniques provide high-quality results by taking semantics into account,
reduce response time by efficiently reducing the search space, and therefore guarantee good
scalability together with high user satisfaction.
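The idea of reducing the search space before running a filtering chain can be sketched as follows. This is only an illustrative sketch, not the cluster's actual implementation: the toy corpus, the inverted index, and the names `build_index`, `candidates` and `apply_filters` are all assumptions introduced for this example. No machine learning is involved here; the sketch shows only why a cheap index lookup makes the later filters affordable.

```python
from collections import defaultdict

# Toy corpus; the documents and their fields are invented for illustration.
DOCS = {
    1: {"text": "machine learning for information retrieval", "year": 2009},
    2: {"text": "cooking with seasonal vegetables", "year": 2011},
    3: {"text": "semantic indexing in retrieval systems", "year": 2005},
    4: {"text": "learning to rank retrieval results", "year": 2012},
}

def build_index(docs):
    """Map each term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, doc in docs.items():
        for term in doc["text"].split():
            index[term].add(doc_id)
    return index

def candidates(index, query_terms):
    """Cheap step: keep only documents that contain every query term."""
    postings = [index.get(term, set()) for term in query_terms]
    return set.intersection(*postings) if postings else set()

def apply_filters(docs, doc_ids, filters):
    """Costlier step: run each filter in turn over the reduced candidate set."""
    for keep in filters:
        doc_ids = {d for d in doc_ids if keep(docs[d])}
    return sorted(doc_ids)

INDEX = build_index(DOCS)
reduced = candidates(INDEX, ["retrieval", "learning"])
results = apply_filters(DOCS, reduced, [lambda doc: doc["year"] >= 2010])
print(reduced, results)
```

Only the documents that survive the index lookup are passed to the filters, so the cost of the filtering chain grows with the size of the candidate set rather than with the size of the whole collection.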
User Modelling and Personalization
The User Modelling and Personalization Cluster focuses on the development of tools for User
Modelling, which collect, manage and maintain the data that users explicitly enter and implicitly
create while using applications. Based on the user model, Artificial Intelligence methods are applied
for data mining to generate knowledge about the users. The model forms the knowledge base that
affiliated applications use to understand the user’s usage context and to generate adaptation
decisions such as recommendations.
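A user model that combines explicit input with implicitly created usage data might look like the sketch below. Everything in it is hypothetical: the `UserModel` class, the `explicit_weight` parameter, the scoring rule and the sample catalogue are assumptions made for illustration, and simple counting stands in for the data mining methods the text refers to.

```python
from collections import Counter

class UserModel:
    """Holds data the user states explicitly and data logged implicitly."""

    def __init__(self, stated_interests):
        self.explicit = set(stated_interests)  # entered by the user
        self.implicit = Counter()              # counted from usage

    def record_view(self, topic):
        """Log an implicit signal: the user viewed an item on this topic."""
        self.implicit[topic] += 1

    def topic_scores(self, explicit_weight=3):
        """Combine both kinds of evidence into one score per topic."""
        scores = Counter(self.implicit)
        for topic in self.explicit:
            scores[topic] += explicit_weight
        return scores

def recommend(model, catalogue, k=2):
    """Rank catalogue items by the score of their topic; return the top k titles."""
    scores = model.topic_scores()
    ranked = sorted(catalogue, key=lambda item: (-scores[item["topic"]], item["title"]))
    return [item["title"] for item in ranked[:k]]

model = UserModel(["law"])
for topic in ["indexing", "indexing", "law"]:
    model.record_view(topic)

catalogue = [
    {"title": "Garden Notes", "topic": "gardening"},
    {"title": "Index Design", "topic": "indexing"},
    {"title": "Case Law Digest", "topic": "law"},
]
print(recommend(model, catalogue))
```

Here the application never inspects the raw usage log; it asks the model for scores and derives its adaptation decision, in this case a recommendation list, from them.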
6.5 Information Retrieval Manual
Legal information retrieval is the science of information retrieval applied to legal text, including
legislation, case law, and scholarly works. Accurate legal information retrieval is important for
providing access to the law for laymen and legal professionals. Its importance has increased because
of the vast and quickly increasing number of legal documents available through electronic means.
Synopsis
In a legal setting, it is frequently important to retrieve all information related to a specific query.
However, commonly used Boolean search methods (exact matches of specified terms) on full-text
legal documents have been shown to have an average recall rate as low as 20 percent, meaning that
only 1 in 5 relevant documents is actually retrieved. In that case, the researchers believed that they
had retrieved over 75% of the relevant documents. Such low recall may result in failing to retrieve
important or precedential cases. In some jurisdictions this may be especially problematic, as legal
professionals are ethically obligated to be reasonably informed as to relevant legal documents.
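The recall figure above can be made concrete with a small worked example. The sketch below is illustrative only: the six-document collection, the relevance judgments and the function names `boolean_and` and `recall` are invented for this example. It shows how an exact-match Boolean query misses relevant documents that describe the same situation in different words.

```python
# Toy collection; the texts and relevance judgments are invented for illustration.
DOCS = {
    "d1": "tenant eviction notice served by landlord",
    "d2": "landlord terminated the lease without notice",
    "d3": "lessor removed the occupant from the premises",
    "d4": "court ordered removal of the renter",
    "d5": "occupant was ejected after rent default",
    "d6": "annual report on library acquisitions",
}
# Documents a legal expert would judge relevant to the question (assumed).
RELEVANT = {"d1", "d2", "d3", "d4", "d5"}

def boolean_and(docs, terms):
    """Return ids of documents that contain every query term as an exact word."""
    return {doc_id for doc_id, text in docs.items()
            if all(term in text.split() for term in terms)}

def recall(retrieved, relevant):
    """Share of the relevant documents that were actually retrieved."""
    return len(retrieved & relevant) / len(relevant)

retrieved = boolean_and(DOCS, ["tenant", "eviction"])
print(sorted(retrieved), recall(retrieved, RELEVANT))
```

With the query terms `tenant` and `eviction`, only `d1` matches, so 1 of the 5 relevant documents is retrieved and recall is 0.2, the same 20 percent rate mentioned above. The other four relevant documents use synonyms such as "lessor", "occupant" and "renter", which exact matching cannot reach.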
Legal Information Retrieval attempts to increase the effectiveness of legal searches by increasing
the number of relevant documents retrieved (providing a high recall rate) and reducing the number
of irrelevant documents retrieved (a high precision rate). This is a difficult task, as the legal field is prone to
LOVELY PROFESSIONAL UNIVERSITY 121