CETEMPúblico

CETEMPúblico (Corpus de Extractos de Textos Electrónicos MCT/Público) is a corpus of text from the Portuguese daily newspaper PÚBLICO, compiled for research and development in natural language processing (NLP) by the project Computational Processing of Portuguese, under an agreement signed by PÚBLICO and the Portuguese Ministry of Science and Technology (MCT).

CETEMPúblico can be used for research and development purposes. Its direct commercial exploitation (such as sale of the material included in the corpus or the corpus itself, or providing paid access to the corpus) is forbidden.

PÚBLICO must always be mentioned as the source of the material in every public presentation of work in which the corpus has been used, such as papers, theses, and conference reports.

PÚBLICO should receive, free of charge, any product resulting from a project in which CETEMPúblico was used.

More detailed information is available from http://www.linguateca.pt/cetempublico/.