CONTECSI - International Conference on Information Systems and Technology Management - ISSN 2448-1041, 9th CONTECSI - International Conference on Information Systems and Technology Management

SYSTEM OF INFORMATION RETRIEVAL BY SEARCHING COMPARED, USING AS DESCRIPTORS MULTIWORDS EXPRESSIONS OBTAINED USING A TECHNIQUE THAT EVALUATES THE STRUCTURE OF THE DOCUMENT
Edson Marchetti da Silva, Renato Rocha Souza

Last modified: 2015-02-04

Abstract


This paper proposes an alternative method for retrieving documents, in which Multiword Expressions (MWEs) extracted from a document base are used as descriptors in the searches of an Information Retrieval System (IRS). Unlike methods that treat the text as a bag of words, the proposed method takes the physical structure of the document into account when extracting MWEs. The set of terms extracted by the proposed exhaustive algorithmic technique is compared with the results obtained from thirteen different statistical association measures generated by the Ngram Statistics Package (NSP) software. The experiment was set up with a corpus of documents in digital format.
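For readers unfamiliar with statistical association measures, the following minimal sketch scores adjacent word pairs with pointwise mutual information (PMI), one measure of the kind that packages such as NSP compute. It is not the authors' structure-based method, nor a reproduction of NSP itself; the function name `bigram_pmi`, the `min_count` threshold, and the toy text are assumptions made purely for illustration.

```python
import math
from collections import Counter


def bigram_pmi(tokens, min_count=2):
    """Rank adjacent word pairs by pointwise mutual information (PMI).

    Illustrative only: a hypothetical helper, not part of NSP or of the
    method proposed in the paper.
    """
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    n_unigrams = len(tokens)
    n_bigrams = max(len(tokens) - 1, 1)

    scores = {}
    for (w1, w2), count in bigrams.items():
        # Rare pairs give unstable PMI estimates, so skip them.
        if count < min_count:
            continue
        p_xy = count / n_bigrams
        p_x = unigrams[w1] / n_unigrams
        p_y = unigrams[w2] / n_unigrams
        scores[(w1, w2)] = math.log2(p_xy / (p_x * p_y))

    # Highest-scoring pairs are the strongest MWE candidates.
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Toy corpus (assumed for the example, not taken from the paper).
text = (
    "information retrieval system ranks documents . "
    "the information retrieval system uses multiword expressions . "
    "multiword expressions improve the system"
)
ranked = bigram_pmi(text.split())
for pair, score in ranked:
    print(pair, round(score, 2))
```

A purely statistical score like this sees only word co-occurrence counts; the method described in the abstract differs in that it also uses the physical structure of the document during extraction.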

Keywords


Extraction of Multiword Expressions; Statistical Association Measures; Compared Search; Information Retrieval System; Document Structure

Full text: PDF