Abstract—Many methods can be used to classify documents;
some of these methods depend on the discipline and others
depend on human orientation. Nonetheless, all of them involve
some degree and type of difficulty. This work was
designed to introduce a simple idea that can be used to classify
any text document. The hypothesis rests on the observation that
every science, discipline, or field of knowledge has its own
terminology. Accordingly, an algorithm (software system) was
developed that employs set-theoretic operations to extract
the terminology of a given discipline. This terminology
was then used to classify text documents according to whether
they are related to that discipline. The algorithm written to
carry out all necessary operations was implemented using
MATLAB. The developed system was tested, and the results
obtained were accurate. It is anticipated that this system can be
used to facilitate e-translation of documents and to produce more
meaningful translations.
Index Terms—Classification, intersection, set, terminology
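The set-based idea summarized in the abstract can be illustrated with a minimal sketch. It is written in Python rather than Matlab and is not the author's implementation: the function names, the sample texts, the use of a general-text corpus to filter out everyday words, and the 0.5 threshold are all assumptions made for illustration. Words shared by every sample document of a discipline are taken as candidate terminology, and a new document is judged related when enough of those terms occur in it.

```python
import re


def word_set(text):
    """Return the set of lowercase alphabetic words in a text."""
    return set(re.findall(r"[a-z]+", text.lower()))


def extract_terminology(discipline_docs, general_docs):
    """Words found in every discipline document but in no general document.

    Assumption: subtracting general-text words is one plausible way to drop
    everyday words such as "the"; the paper may filter them differently.
    """
    if not discipline_docs:
        return set()
    shared = set.intersection(*[word_set(d) for d in discipline_docs])
    everyday = set().union(*[word_set(d) for d in general_docs])
    return shared - everyday


def is_related(document, terminology, threshold=0.5):
    """Hypothetical rule: related if at least `threshold` of the terms occur."""
    if not terminology:
        return False
    hits = len(word_set(document) & terminology)
    return hits / len(terminology) >= threshold


# Illustrative sample texts (invented for this sketch).
discipline_docs = [
    "the algorithm sorts the array and the compiler checks the code",
    "the compiler translates the code and the algorithm runs on the array",
]
general_docs = ["the weather is nice and the sun runs on time"]

terms = extract_terminology(discipline_docs, general_docs)
print(sorted(terms))
print(is_related("this compiler turns code into an algorithm", terms))
print(is_related("the sun is nice", terms))
```

Requiring a term to appear in every sample document is a strict criterion; with a larger sample set, a looser rule (for example, presence in most documents) might be needed, which is again an assumption beyond what the abstract states.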
Mohamed AlShaari is with the Department of Computer Science,
Benghazi University, Benghazi, Libya (e-mail: Mohamed.shaari@gmail.com).
Cite: Mohamed Ali AlShaari, "Text Documents Classification Using Word Intersections," International Journal of Engineering and Technology, vol. 6, no. 2, pp. 119-122, 2014.