Journal of Modern Foreign Psychology
2022. Vol. 11, no. 1, 104–115
doi:10.17759/jmfp.2022110110
ISSN: 2304-4977 (online)
Methods of Computational Linguistics and Natural Language Processing: Opportunities and Limitations for Personality Psychology Tasks
General Information
Keywords: computational linguistics, natural language processing, personality psychology, textual data analysis
Journal rubric: General Psychology
Article type: review article
Funding. This research is supported by the Faculty of Social Sciences, HSE University.
For citation: Kuzmina A.A., Lifshits M.A., Kostenko V.Y. Methods of Computational Linguistics and Natural Language Processing: Opportunities and Limitations for Personality Psychology Tasks [Electronic resource]. Sovremennaia zarubezhnaia psikhologiia = Journal of Modern Foreign Psychology, 2022. Vol. 11, no. 1, pp. 104–115. DOI: 10.17759/jmfp.2022110110. (In Russ., abstr. in Engl.)