The OpenWordnet-PT (abbreviated as OpenWN-PT or simply OWN-PT) is an open-access wordnet for Portuguese, originally developed by Valeria de Paiva, Alexandre Rademaker and Gerard de Melo as a syntactic projection of the Universal WordNet (UWN) of de Melo and Weikum. Like the maintainers of many other open wordnets, we believe that lexical resources need to be open to be useful.
How to use
The easiest way to use the OpenWN-PT data is through Python's NLTK library, although the copy distributed with NLTK might not always be up to date. The original OpenWN-PT data is available in RDF/OWL format, following and, when necessary, extending the mappings from the original Princeton WordNet. Both the data and the RDF schema definitions (classes and properties) of the OpenWN-PT are freely available for download here. Besides being downloadable, the data can be queried through the SPARQL endpoint, and it can be browsed and compared with other wordnets in the generic interface provided by the Open Multilingual Wordnet project.
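As a rough illustration of the NLTK route, the sketch below looks up the synsets for a Portuguese word and lists their Portuguese lemmas. It assumes that the Open Multilingual Wordnet package shipped with NLTK (`omw-1.4`) carries the Portuguese data under the language code `por`. The helper names `portuguese_lemmas` and `demo` and the sample word are illustrative only and are not part of OpenWN-PT or NLTK.

```python
from typing import Dict, List


def portuguese_lemmas(wordnet_reader, word: str, lang: str = "por") -> Dict[str, List[str]]:
    """Map each synset name found for `word` to that synset's lemmas in `lang`.

    `wordnet_reader` is expected to behave like `nltk.corpus.wordnet`:
    it offers `synsets(word, lang=...)`, and each synset offers
    `name()` and `lemma_names(lang)`.
    """
    result: Dict[str, List[str]] = {}
    for synset in wordnet_reader.synsets(word, lang=lang):
        result[synset.name()] = list(synset.lemma_names(lang))
    return result


def demo(word: str = "cachorro") -> None:
    """Print the synsets and Portuguese lemmas for `word` using NLTK.

    Requires `pip install nltk` and network access for the first download.
    Assumption: the `omw-1.4` package exposes Portuguese as `por`.
    """
    import nltk
    from nltk.corpus import wordnet as wn

    nltk.download("wordnet")  # Princeton WordNet synsets
    nltk.download("omw-1.4")  # Open Multilingual Wordnet lemmas

    for name, lemmas in portuguese_lemmas(wn, word).items():
        print(name, lemmas)
```

Calling `demo()` in an environment with NLTK installed prints one line per synset. Because the lookup only relies on the Princeton synset identifiers, the same helper works for any other language code available in the Open Multilingual Wordnet data.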
- Alexandre Rademaker
- Valeria de Paiva
- Bruno Cuconato
- Henrique Muniz
- Claudia Freitas
- Livy Real
- Fabricio Chalub
The process of building the OpenWN-PT used machine learning to construct relationships between graphs representing information coming from several versions of Wikipedia, as well as from open dictionaries. Starting as a projection at the level of the Portuguese lemmas and their relationships, the OpenWN-PT has been continually improved through linguistically motivated additions, either manual or semi-automatic, making use of large corpora. This kind of construction, started automatically but curated and improved manually, is well exemplified by our sister project, NomLex-PT, an open-access, wide-coverage lexicon of nominalizations in Portuguese, described in this paper.
The OpenWN-PT has been under development since 2010, with the main objective of serving as a lexicon for a proposed NLP system focused on logical reasoning (whether developed using description logics or first-order logic) and based on knowledge representation derived from language, for example using the SUMO ontology. The philosophy of OpenWN-PT is to maintain a close alignment with the original Princeton WordNet while removing the biggest mistakes introduced by automated methods, using linguistic tools and skills. One consequence of this close connection with the Princeton WordNet is the ability to minimize the impact of lexicographical decisions on the separation or grouping of senses in a synset.
The OpenWN-PT was chosen by the organizers of the FreeLing, Open Multilingual Wordnet (Francis Bond) and BabelNet projects, as well as by Google Translate, as the open Portuguese wordnet used in each of these projects.