Events

May 15
15/05/2023 10:00

Science & Society

Thesis defense: Alaa ALHAMZEH

Language Reasoning by means of Argument Mining and Argument Quality

Doctoral candidate: Alaa ALHAMZEH

INSA laboratory: LIRIS

Doctoral school: ED512 Infomaths

Understanding financial data has always been of interest to market participants seeking to make better-informed decisions. Recently, various cutting-edge technologies have been applied in the Financial Technology (FinTech) domain, including data analysis, opinion mining and financial document processing. In this thesis, we analyze the arguments of financial experts with the goal of supporting investment decisions. Although various business studies confirm the crucial role of argumentation in financial communications, no work has addressed this problem as a computational argumentation task, that is, as the automatic analysis of arguments. Focusing on this issue, this thesis presents contributions in the three essential dimensions of theory, data, and evaluation. First, we propose a method for annotating the structure of the arguments stated by company representatives during earnings conference calls. The proposed scheme is derived from argumentation theory at the micro-structure level of discourse. We conducted the corresponding annotation study and published the first financial dataset annotated with arguments: FinArg. We further investigate how to evaluate the quality of arguments in this genre of text. To tackle this challenge, we suggest using two levels of quality metrics, drawing on both the Natural Language Processing (NLP) literature on argument quality and the peculiarities of the financial domain. We have also enriched the FinArg data with our quality dimensions to produce the FinArgQuality dataset. In terms of evaluation, we validate the principle of ensemble learning on the argument identification and argument unit classification tasks. We show that combining a traditional machine learning model with a deep learning one, via an integration model (stacking), improves overall performance, especially in small-dataset settings. Although argument mining is mainly a domain-dependent task, the number of studies that tackle the generalization of argument mining models is still relatively small. Therefore, using our stacking approach and comparing it with the DistilBERT transfer learning model, we address and analyze three real-world scenarios concerning model robustness on unseen domains and varying topics. In addition, with the aim of automatically assessing argument strength, we have investigated and compared different (refined) versions of BERT-based models. As a result, we outperformed the baseline model by 13 ± 2% in terms of F1-score by integrating BERT with encoded categorical features. Beyond our theoretical and methodological proposals, the dimensions of argument quality assessment, annotated corpora, and evaluation approaches are publicly available and can serve as strong baselines for future work in both the FinNLP and computational argumentation domains. Building directly on this thesis, we described and proposed to the community, within the framework of the NTCIR-17 conference, a new task/challenge, the FinArg-1 task, on the analysis of financial arguments. We also used our proposals to respond to the Touché challenge at the CLEF 2021 conference. Our contribution was selected among the "Best of Labs".

Additional information

  • Creativity room 202 - (Bibliothèque Marie Curie) - Villeurbanne