Department of Computer and Information Science


Code-switching and multi-lingualism in social media

When two individuals who are bi- or multi-lingual in an overlapping set of languages communicate, they tend to switch seamlessly and effortlessly between the languages (codes) they share. Such code-switching is most prominent in spoken conversation, but it also occurs frequently in social media texts, which are fairly informal and conversational in nature. The aim of this project is to apply various machine learning methods to such code-switched texts from Twitter, Facebook or WhatsApp in order to, for example, identify the language of each word or annotate the texts with part-of-speech tags or utterance boundaries.
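To make the word-level language identification task concrete, here is a minimal sketch of one possible baseline: a character-trigram Naive Bayes classifier with add-one smoothing, written in plain Python. It is only an illustration and not a method prescribed by the project. The class name `WordLanguageIdentifier`, the helper `char_ngrams`, the language pair (English/Spanish) and the tiny word list `TOY_DATA` are all assumptions made up for this example; a real system would be trained on an annotated code-switched corpus.

```python
import math
from collections import Counter, defaultdict


def char_ngrams(word, n=3):
    """Return the character n-grams of a word, padded with boundary markers."""
    padded = f"^{word.lower()}$"
    return [padded[i:i + n] for i in range(max(1, len(padded) - n + 1))]


class WordLanguageIdentifier:
    """Hypothetical baseline: Naive Bayes over character n-grams."""

    def __init__(self, n=3):
        self.n = n
        self.counts = defaultdict(Counter)  # language -> n-gram counts
        self.totals = Counter()             # language -> total n-gram count
        self.vocab = set()                  # all n-grams seen in training

    def train(self, labelled_words):
        """labelled_words is an iterable of (word, language) pairs."""
        for word, lang in labelled_words:
            for gram in char_ngrams(word, self.n):
                self.counts[lang][gram] += 1
                self.totals[lang] += 1
                self.vocab.add(gram)

    def predict(self, word):
        """Return the language with the highest smoothed log-likelihood."""
        vocab_size = len(self.vocab)

        def score(lang):
            return sum(
                math.log((self.counts[lang][g] + 1)
                         / (self.totals[lang] + vocab_size))
                for g in char_ngrams(word, self.n)
            )

        return max(self.counts, key=score)

    def tag(self, text):
        """Label every whitespace-separated token with a language."""
        return [(token, self.predict(token)) for token in text.split()]


# Toy training data, invented for illustration only.
TOY_DATA = [
    ("the", "en"), ("and", "en"), ("you", "en"), ("with", "en"),
    ("this", "en"), ("what", "en"), ("going", "en"), ("today", "en"),
    ("que", "es"), ("los", "es"), ("para", "es"), ("como", "es"),
    ("pero", "es"), ("vamos", "es"), ("hoy", "es"), ("está", "es"),
]

model = WordLanguageIdentifier()
model.train(TOY_DATA)
tags = model.tag("vamos with this")
```

Here `tags` holds one `(token, language)` pair per token of the input. Classifying each word in isolation ignores context, so a sequence model that also looks at neighbouring words would be a natural next step; part-of-speech tagging and utterance boundary detection can be framed as token-labelling tasks in a similar way.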


Björn Gambäck
315 IT-bygget
735 93354 