Please use this identifier to cite or link to this item:
http://hdl.handle.net/10174/41452
| Title: | A Galician-Portuguese Generative Model |
| Authors: | Gamallo, Pablo; Rodríguez, Pablo; Sotelo, Susana; Miquelina, Nuno; Paniagua, Silvia; Schmidt, Daniela; de-Dios-Flores, Iria; Quaresma, Paulo; Bardanca, Daniel; Pichel, José Ramom; Nogueira, Vítor; Barro, Senén |
| Keywords: | Large Language Models; Generative Models; Galician; Portuguese; Continual Pretraining |
| Issue Date: | 16-Nov-2024 |
| Publisher: | Springer |
| Citation: | Gamallo, P. et al. (2025). A Galician-Portuguese Generative Model. In: Santos, M.F., Machado, J., Novais, P., Cortez, P., Moreira, P.M. (eds) Progress in Artificial Intelligence. EPIA 2024. Lecture Notes in Computer Science, vol 14969. Springer, Cham. https://doi.org/10.1007/978-3-031-73503-5_24 |
| Abstract: | Large language models (LLMs) have revolutionized natural language processing, but their predominant focus on English has resulted in biases and performance differences across various languages. This situation is maintained in generative multilingual models, where English continues to be the predominant language. In these models, the presence of European Portuguese is marginal and that of the Galician variety is almost residual. In this work, we describe an open-source Galician-Portuguese generative model, Carvalho_pt-gl, focused precisely on these two language variants, which are very close lexically and syntactically. The model was trained using a GPT architecture with 1.3 billion parameters on more than 6B words, balanced between the two varieties. The strategy of continual pretraining was used to adapt a pre-existing LLM that was trained on a trilingual dataset with related languages, thereby overcoming the data limitations that would be faced if the training was started from scratch. Evaluation results involving task-based datasets from standardized benchmarks indicate a promising performance. These findings highlight the critical importance of supporting linguistic diversity in generative models. |
| URI: | https://link.springer.com/chapter/10.1007/978-3-031-73503-5_24 http://hdl.handle.net/10174/41452 |
| Type: | article |
| Appears in Collections: | VISTALab - Artigos em Livros de Actas/Proceedings |