Attention: The Celtic Language Technology Workshop has been moved to a virtual event. Please see more details below.
Stay tuned for information about the programme.
The CLTW community and workshop – inaugurated at COLING (Dublin) in 2014 – has become a critical focus and forum for researchers working in natural language processing (NLP) and language technologies for Celtic languages. In particular, it has galvanised and catalysed research by facilitating communication and collaboration internationally. Our community is interested in language technology for both contemporary and historical stages of the Celtic languages.
In Classical times, Celtic languages were found across a wide swathe of modern Eurasia. Today, they are spoken in regions of the UK and Ireland, as well as in Brittany, France. The modern languages are: Irish, Breton, Manx, Welsh, Cornish and Scottish Gaelic. Although their hereditary communities are small compared to those of most other European languages, they continue to have a vibrant presence in their traditional areas as well as in urban centres. While Irish is the only Celtic language that has official EU language status (since 2007), Welsh, Scottish Gaelic and Manx have co-official status. Breton and Cornish also have some limited status in their home regions. That said, all Celtic languages face the same issue: they lack the NLP resources needed to ensure continued technology support in the digital era.
While the Celtic languages share certain aspects of their sociolinguistic situation with other minority languages, their common linguistic features (e.g. VSO word order, initial mutations and reasonably complex morphology) also present unique challenges for the development of robust NLP tools. By gathering researchers from all of the Celtic languages, CLTW aims to share best practice in overcoming these difficulties.
The fifth edition in the Celtic Language Technology Workshop series will be a virtual event co-located with COLING 2025. The workshop was moved online to allow for higher attendance. We apologise for any inconvenience caused, and hope to see many of you online on the 20th of January.
All times listed are in UTC+0.
09:00–09:10 Welcome
09:10–09:50 Keynote Speech 1
Delivered by Dr. Alham Fikri Aji, Mohamed bin Zayed University of Artificial Intelligence
09:50–10:15 An Assessment of Word Separation Practices in Old Irish Text Resources and a Universal Method for Tokenising Old Irish Text
Adrian Doyle and John P. McCrae
10:15–10:35 Break
10:35–11:00 Synthesising a Corpus of Gaelic Traditional Narrative with Cross-Lingual Text Expansion
William Lamb, Dongge Han, Ondrej Klejch, Beatrice Alex and Peter Bell
11:00–11:25 A Pragmatic Approach to Using Artificial Intelligence and Virtual Reality in Digital Game-Based Language Learning
Monica Ward, Liang Xu and Elaine Uí Dhonnchadha
11:25–11:45 Break
11:45–12:25 Keynote Speech 2
Delivered by Linda Heimisdóttir, CEO of Miðeind
12:25–12:50 Fotheidil: an Automatic Transcription System for the Irish Language
Liam Lonergan, Ibon Saratxaga, John Sloan, Oscar Maharg Bravo, Mengjie Qian, Neasa Ní Chiaráin, Christer Gobl and Ailbhe Ní Chasaide
12:50–13:15 Gaeilge Bhriste ó Shamhlacha Cliste: How Clever Are LLMs When Translating Irish Text?
Teresa Clifford, Abigail Walsh, Brian Davis and Mícheál J. Ó Meachair
13:15–13:25 Concluding Remarks
CLTW Committee
The CLTW series has seen four successful previous editions:
We invite submissions of original contributions on resources, theories, systems, applications, and methods in Natural Language Processing for any of the Celtic languages. Topics of interest include, but are not limited to, the following:
We invite authors to submit unpublished work representing original research on the topics mentioned above, or on related topics.
Submissions may be of two types:
Papers may include unlimited appendices and references. Papers should be written in English, and should follow the guidelines for submissions to COLING 2025. All submissions, including the main paper and its supplementary materials, should be fully anonymised.
All papers will be double-blind peer-reviewed. Authors of accepted papers will present their work in either the Oral or Poster session. All accepted papers will appear in the workshop proceedings, which will be published in the ACL Anthology.
Papers should be submitted to the SoftConf page: https://softconf.com/coling2025/CLT25/