Affective lexica and other resources for Italian
An affective lexicon is a database of words (or word senses, phrases, or other kinds of lexical items) in which each item is classified according to properties such as subjectivity, polarity (positive or negative), and the capability of evoking specific emotions. Such resources are used to build automatic systems that analyze natural language (for example, from websites or social media) and “read” the sentiment expressed in the text. This activity is called Sentiment Analysis (or Opinion Mining), and it is attracting growing attention from both the scientific community and industry, because it can answer questions like “are customers happy with product X?” or “what type of people approve of policy Y?”.
Italian is a somewhat poorly represented language in the panorama of language resources. This is true for affective lexica too, but thanks to a vibrant community, things are rapidly changing.
We conducted a quick survey, asking the members of AILC about affective lexica for Italian. The results of the survey are summarized in the list below. Some entries are lexica; others are different kinds of resources and methods, either in the Italian language or otherwise linked to the Italian NLP community.
- Affective lexicon, automatically built by aligning MultiWordNet, WordNet and SentiWordNet.
Each sense is given scores for positive polarity, negative polarity and intensity.
Available at http://valeriobasile.github.io/twita/downloads.html.
Publication: V. Basile and M. Nissim (WASSA 2013).
- Lexicon created semi-automatically for participation in the EVALITA 2014 shared task SENTIPOLC.
Described in Di Gennaro, Rossi and Tamburini (EVALITA 2014).
- Sentiment lexicon developed semi-automatically for the Opener project.
It contains 24,293 lexical entries labeled with positive, neutral or negative polarity.
Available at https://dspace-clarin-it.ilc.cnr.it/repository/xmlui/handle/20.500.11752/ILC-73.
- Proprietary sentiment lexicon containing single words, multiword expressions and idiomatic expressions, annotated with polarity, intensity, emotions and domain. It is distributed by CELI under a commercial licence.
Described in A. Bolioli, F. Salamino, V. Porzionato (ESSEM 2013).
- Polarized word embeddings can be created with the technique described in G. Attardi (IIR 2015) and implemented in DeepNL.
- Database of affective norms for Italian developed for the INCREASE project.
Available at https://sites.google.com/view/mariamontefinese/norms-data?authuser=0 (other affective and semantic resources are available on the same Web page).
Described in Montefinese, M., Ambrosini, E., Fairfield, B. et al. Behav Res (2014).
- Automatic method to build multilingual opinionated lexicons based on distant supervision.
Used for participation in the EVALITA 2016 shared task SENTIPOLC.
Dictionaries in English and Italian are available at http://sag.art.uniroma2.it/demo-software/distributional-polarity-lexicon/.
Described in G. Castellucci, D. Croce, R. Basili (2016) and G. Castellucci, D. Croce, R. Basili (2015).
- High-coverage resource containing roughly 155,000 English words, each associated with a sentiment score between -1 and 1.
Available at http://hlt-nlp.fbk.eu/technologies/sentiwords.
Described in Gatti L., Guerini M. & Turchi M. (2015).
- SentIta and Doxa
Italian databases and tools for sentiment analysis.
Described in S. Pelosi (CLiC-it 2015), A. Maisto and S. Pelosi (NOOJ 2014), and Elia et al. (FSMNLP 2015).
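As a rough illustration of how a polarity lexicon of the kind listed above can be used to "read" the sentiment of a text, here is a minimal Python sketch. It assumes a lexicon that maps each word to a single score between -1 and 1; the entries in `TOY_LEXICON`, their scores, the function names and the threshold are all invented for this example and are not taken from any of the resources above, whose file formats and annotation schemes differ.

```python
import re

# Hypothetical toy lexicon: word -> polarity score in [-1, 1].
# Entries and scores are invented for illustration only; they do not
# come from any of the resources listed above.
TOY_LEXICON = {
    "ottimo": 0.8,
    "bello": 0.6,
    "felice": 0.7,
    "pessimo": -0.8,
    "brutto": -0.6,
    "triste": -0.7,
}


def tokenize(text):
    """Lowercase the text and split it into runs of letters."""
    # [^\W\d_]+ matches Unicode letters, so accented characters are kept.
    return re.findall(r"[^\W\d_]+", text.lower())


def polarity(text, lexicon=TOY_LEXICON):
    """Mean score of the tokens found in the lexicon (0.0 if none are found)."""
    scores = [lexicon[tok] for tok in tokenize(text) if tok in lexicon]
    return sum(scores) / len(scores) if scores else 0.0


def label(text, lexicon=TOY_LEXICON, threshold=0.1):
    """Map the mean score to a coarse label; the threshold is an assumption."""
    score = polarity(text, lexicon)
    if score > threshold:
        return "positive"
    if score < -threshold:
        return "negative"
    return "neutral"


if __name__ == "__main__":
    for sentence in ["Il prodotto è ottimo", "Un film brutto e triste"]:
        print(sentence, "->", label(sentence))
```

Averaging word scores is only the simplest possible baseline: it ignores negation, intensifiers, word sense and multiword expressions, which is part of why several of the resources above annotate senses, intensity or idiomatic expressions rather than bare word forms.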
This list is open to updates and additions. If you know of other resources that would fit the list above, please contact AILC and let us know.