Program outline
– Introduction to text mining: At the intersection of big data and linguistics
– Natural Language Processing: Syntax, semantics and data preprocessing
– Dictionary approaches: Use of existing dictionaries and development of customized ones
– Supervised learning: Classification of documents into predefined categories
– Unsupervised approaches: Text clustering and topic extraction
– Application exercises with KNIME Analytics Platform and R