Practical Skills for Working with Linguistic Data
M.A. course, University of Cologne, 2023
Academic Year: 2023-2024
This course equips students with the skills and techniques needed to handle linguistic data throughout the data lifecycle. Through hands-on training in R, students practice preprocessing, working with data, and postprocessing.
Course Overview
The curriculum covers preprocessing methods for cleaning and preparing primarily text-based linguistic data so that it can be annotated and analyzed. Students also explore techniques for data annotation, metadata management, and the representation of linguistic information. The final part of the course turns to postprocessing, where students learn how to analyze and visualize linguistic data.
No prior programming experience is required. The course is accessible to all students interested in managing and analyzing linguistic data.
Session Materials
- Session 1: Introductory session
- Session 2: Importance of Practical Skills in Linguistics & R basics (1)
- Session 3: R basics (2) and R markdown
- Session 4: R markdown & Tidyverse
- Session 5: Review & File Operations in R
- Session 6: Data Cleaning and Preprocessing
- Session 7: Data Cleaning and Preprocessing (2)
- Session 8: Pivoting, grouping, string operations
- Session 9: String operations and regular expressions
- Session 10: Web scraping in R
- Session 11: Automatic Annotation
- Session 12: Data Visualization
- Session 13: Data Visualization (2)
- Session 14: Session materials
Sessions 6-14 are available as dedicated pages with rendered HTML tutorials and downloadable materials.
Student Feedback Highlights
- 4.8/5.0 average rating in official faculty evaluation
- Top 25% ranking among all linguistics courses offered that semester
