At present, search systems use a relevance model that estimates how adequately an examined document matches a search query. In practice this model copes poorly with recognizing and searching for homonyms (grammatical and especially lexical), synonyms, and words with multiple meanings. The reason is that the relevance model rests on a linguistic approach and a number of synthetic evaluation criteria (such as position on the page), whereas the language phenomena listed above cannot be recognized without understanding the meaning of the search query.
This limitation of the relevance model already reduces the efficiency of the search mechanism and closes off further improvement in search quality. To overcome it, one must move to direct evaluation of the correspondence in meaning (pertinence) between the search query and the examined document.
From the standpoint of the theory of semantic space (notion space), evaluating the informational correspondence between one document (the search query) and another (the examined document) amounts to projecting the space of the first document onto the space of the second. The larger this projection, the more closely the meaning of the examined document corresponds to the meaning of the search query.
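The projection idea can be illustrated with a minimal sketch. It assumes, purely for illustration, that each document is reduced to a vector of weights over named notions and that the "size of the projection" is measured as the scalar projection of the query vector onto the document vector, normalized by the query length (which equals cosine similarity). The text does not specify this measure; the function name `projection_score` and the notion labels are hypothetical.

```python
import math

def projection_score(query, document):
    """Normalized scalar projection of the query vector onto the document vector.

    Both arguments are dicts mapping a notion label to a weight.
    Assumption: this cosine-style measure stands in for the 'projection'
    described in the text, which does not define a formula.
    """
    shared = set(query) & set(document)
    dot = sum(query[n] * document[n] for n in shared)
    q_norm = math.sqrt(sum(w * w for w in query.values()))
    d_norm = math.sqrt(sum(w * w for w in document.values()))
    if q_norm == 0 or d_norm == 0:
        return 0.0  # an empty space has no projection
    return dot / (q_norm * d_norm)

# Hypothetical notion weights: the homonym "bank" is split into two notions.
query = {"bank_finance": 1.0, "loan": 0.5}
doc_a = {"bank_finance": 0.8, "loan": 0.6}
doc_b = {"bank_river": 0.9, "fishing": 0.4}

print(projection_score(query, doc_a))  # large projection: meanings overlap
print(projection_score(query, doc_b))  # no shared notions: zero projection
```

Because the two senses of "bank" are separate notions here, the document about rivers receives no score even though it shares a word with the query, which is the behavior a purely word-based relevance model cannot provide.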
Let the term “descriptor” in this context be equivalent to the term “notion”. This renaming is adopted to match common linguistic terminology.
In the terminology of classifiers, a descriptor is one or several words of a given language (synonyms) that characterize a notion. A descriptor language is intended for coordinate indexing of documents and information queries by means of descriptors and/or keywords.
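As a rough illustration of coordinate indexing through descriptors, the sketch below maps each descriptor to a set of synonyms and indexes a text by the descriptors whose synonyms occur in it. The thesaurus contents and the names `DESCRIPTORS` and `index_document` are hypothetical, and only single-word synonyms are handled.

```python
import re

# Hypothetical descriptor thesaurus: descriptor -> words that express it.
DESCRIPTORS = {
    "automobile": {"car", "automobile", "auto"},
    "physician": {"doctor", "physician"},
}

def index_document(text, descriptors=DESCRIPTORS):
    """Return the set of descriptors whose synonyms appear in the text."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    return {name for name, synonyms in descriptors.items() if words & synonyms}

# Two texts with different wording receive the same descriptor index.
print(index_document("The doctor bought a new car."))
print(index_document("The physician sold her auto."))
```

Word matching of this kind handles synonyms but not homonyms: deciding which notion an ambiguous word expresses still requires the meaning of the surrounding text.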
From the standpoint of the theory of semantic space, a descriptor (notion) is an area in semantic space, $d = \{x_1, x_2, \dots, x_n\}$, where each $x_i$ is the average distance, $x_i = (x_{\max,i} + x_{\min,i})/2$, to the corresponding coordinate hub of the notion space. In other words, these are weights that pull the given descriptor toward one or another section of the catalogue of subject areas being searched.
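The midpoint formula can be shown with a small sketch. It assumes that a descriptor's area is given as a list of (minimum, maximum) extents, one per coordinate, and that each coordinate is labeled with a subject area; the function name `descriptor_center`, the subject-area labels, and the numeric extents are all hypothetical.

```python
def descriptor_center(extents):
    """Return the midpoints x_i = (x_max_i + x_min_i) / 2, one per coordinate."""
    return [(x_min + x_max) / 2 for x_min, x_max in extents]

# Hypothetical coordinates of the notion space (sections of a subject catalogue).
SUBJECT_AREAS = ["finance", "geography", "medicine"]

# Hypothetical extents of the descriptor "bank (financial institution)".
bank_finance_extents = [(0.5, 1.0), (0.0, 0.25), (0.0, 0.0)]

# Pair each midpoint with its subject area to read the values as weights.
weights = dict(zip(SUBJECT_AREAS, descriptor_center(bank_finance_extents)))
print(weights)
```

Read as weights, the largest value shows which catalogue section the descriptor is drawn to most strongly, here the hypothetical "finance" section.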