Multimodality and AI
In the transformative landscape of Artificial Intelligence, a comprehensive understanding of language and communication is crucial. Society faces the challenge of having meaningful interactions with AI systems and of finding responsible ways of reproducing communication and knowledge by and with machines.
However, the language(s) and communicative means used in these interactions and reproductions are never characterized by purely verbal means alone. Especially in digital contexts and in relation to digital technologies, spoken and written languages are deeply intertwined with and enriched by other modes of expression such as static and moving images, emojis, infographics, sounds, music, gestures, and body movements. It is in fact now commonly assumed that language and communication are inherently multimodal – and artificial intelligence has yet to process and replicate the depth and complexity inherent in human expression. Systems and tools already partly include and combine multiple types of data and modes to facilitate human-like reasoning and decision-making (e.g. written text and speech recognition, image recognition, etc.), but their accuracy and effectiveness still leave substantial room for improvement.
Multiple modes
A relevant addition to the theme 'Language and AI' is therefore the expertise and knowledge gained from research on the multimodality of language and communication – a field that has gained increasing importance over the last three decades. Besides extensive and systematic knowledge about the characteristics of specific modes and expressive forms, the field has established particular expertise on the meaningful interplay and integration of multiple modes, which is a key component of future artificial intelligence systems.
The Multimodality and AI group provides and further develops this expertise by studying the multimodal intricacies of human communication on a multitude of levels (both qualitative and quantitative, with theoretical and empirical output). Its aim is to enrich research on artificial intelligence with a particularly broad yet nuanced understanding of the complex communication processes and problems in our society. From enhancing natural language understanding through the integration of visual and contextual cues, to enabling more intuitive human-computer interactions through gesture and facial expression recognition, to supporting multimodal data processing in health, corporate, or environmental communication, to smart assistive technologies and tools for the identification of misinformation, fake news, and hate speech – the implications can and will be far-reaching.

Contact
Contact the group for chats, collaborations, and discussions via Janina Wildfeuer, j.wildfeuer@rug.nl
Last modified: 14 February 2025, 4:02 p.m.