Building an Electronic Dictionary

of Computer Terminology

 

Farida AOUGHLIS (Université Mouloud Mammeri,

Tizi Ouzou, Algérie) fariyamo@yahoo.fr

 

 

Keywords: Terminology, terminology extraction,

electronic dictionary, compound words.

 

Abstract

 

An automatic text analysis system cannot recognize a word lexically unless the word is already listed in its electronic dictionary. At the LADL, manually built electronic dictionaries exist for several natural languages (French, English…), but for technical or specialized languages the work remains to be done. Our work targets the INTEX system. Its immediate objective is to build an electronic dictionary of French computer science terms (compound nouns) for the automatic analysis and indexing of technical texts. The linguistic aspects of the terminology are retained.

To build the dictionary of terms, the terms must be identified, counted and acquired. This requires finding corpora and locating the terms by deciding whether or not a word group constitutes a dictionary entry. When a compound noun is a possible lexical entry, we encode it and add it to the dictionary. More than 3000 terms were extracted and listed manually. Most of them follow the patterns NA, NDN and NN. The resulting dictionary will be added to INTEX and will make it possible to analyse computer science texts. Various applications are possible, such as information retrieval, automatic text analysis, text generation and machine translation. The extraction method is manual; a semi-automatic extraction method is currently being programmed in C++ Builder and tested. It uses ATNs.
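
As an illustration of how candidate compound nouns of the three dominant patterns might be located in a text, here is a minimal sketch. It is not the ATN-based C++ Builder program mentioned above: it uses a plain sliding-window pattern match, written in Python for brevity. It assumes that each token already carries a coarse part-of-speech tag, and it reads NA as noun + adjective, NDN as noun + "de" + noun, and NN as noun + noun. The names Token, PATTERNS, tag_of and extract_candidates, as well as the sample sentence and its tags, are hypothetical. The output is only a list of candidates; deciding whether each one really constitutes a dictionary entry remains a human judgment.

```python
from typing import List, Tuple

# A token is (word form, coarse tag). The tags are assumed to be supplied
# by hand or by an upstream tagger; they are not produced here.
Token = Tuple[str, str]

# Hypothetical encoding of the three patterns named in the abstract.
PATTERNS = {
    "NA": ["N", "A"],          # noun + adjective, e.g. "mémoire vive"
    "NDN": ["N", "DE", "N"],   # noun + "de" + noun, e.g. "base de données"
    "NN": ["N", "N"],          # noun + noun, e.g. "mot clé"
}


def tag_of(token: Token) -> str:
    """Map the preposition 'de' to its own symbol; keep other tags as given."""
    word, tag = token
    return "DE" if word.lower() in {"de", "d'"} else tag


def extract_candidates(tokens: List[Token]) -> List[Tuple[str, str]]:
    """Return (pattern name, word sequence) for every window matching a pattern."""
    tags = [tag_of(t) for t in tokens]
    found = []
    for i in range(len(tokens)):
        for name, pattern in PATTERNS.items():
            n = len(pattern)
            if tags[i:i + n] == pattern:
                words = " ".join(word for word, _ in tokens[i:i + n])
                found.append((name, words))
    return found


# Hand-tagged sample sentence (illustrative only).
sentence = [
    ("la", "DET"), ("base", "N"), ("de", "PREP"), ("données", "N"),
    ("utilise", "V"), ("la", "DET"), ("mémoire", "N"), ("vive", "A"),
]
candidates = extract_candidates(sentence)
for name, words in candidates:
    print(name, words)
```

On the sample sentence, the sketch proposes "base de données" as an NDN candidate and "mémoire vive" as an NA candidate. A fixed window match of this kind over-generates on free word combinations, which is why each candidate would still have to be validated before being encoded and added to the dictionary.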