CV

🎓 Education

Ph.D. in Linguistics (2017-2022)

🏛️ University of Cologne, Germany
SFB1252 "Prominence in Language"

📄 Thesis: Referring Expression Generation in Context: Combining Linguistic and Computational Approaches
👥 Supervisors: Prof. Dr. Nikolaus P. Himmelmann, Prof. Dr. Kees van Deemter


Research Master's Programme in Linguistics (2012-2014)

🏛️ Utrecht University, the Netherlands

📄 Thesis: The Vector-based Semantics of Distal PP Modification
👥 Supervisors: Prof. Dr. Yoad Winter, Dr. Choonkyu Lee


Exchange Student in Cognitive Science Department / Intern in Mercator Research Group (2013-2014)

🏛️ Ruhr University Bochum, Germany


M.A. in General Linguistics (2009-2011)

🏛️ Allameh Tabatabai University, Iran

📄 Thesis: The Morpho-semantic Analysis of Bahuvrihi Compounds in Persian
👥 Supervisor: Prof. Dr. Koorosh Safavi


B.A. in English Language and Literature (2005-2009)

🏛️ University of Isfahan, Iran


Diploma in Mathematics and Physics Discipline (2001-2005)

🏛️ Edalat High School, Iran

💼 Work Experience

Data Scientist (2024 - Present)

🏢 Trivago N.V.
Düsseldorf, Germany

💼 Core Responsibilities:

  • Build rule-based, ML, and LLM-based models for tasks such as matching, deduplication, and data enrichment
  • Develop and optimise BigQuery SQL and Python pipelines at scale
  • Perform EDA, anomaly detection, data transformation, and visualisation
  • Collaborate with backend engineers to bring scalable models into production
  • Work closely with stakeholders to deliver data-driven solutions

Research Associate (2017 - 2024)

🏛️ University of Cologne
Germany

🔬 Collaborative Research Center "Prominence in Language" (SFB1252)
📋 Project: "INF – Infrastructure: Data, Design and Sustainability"
👥 PIs: Prof. Dr. Nikolaus P. Himmelmann, Prof. Dr. Nils Reiter

💼 Core Responsibilities:

  • Ensuring sustainable data management for ~20 research projects in compliance with German Research Foundation (DFG) regulations, including long-term data archiving, metadata handling, and fostering open science practices
  • Assisting projects with data workflow tasks, including data preprocessing, automatic annotation, data wrangling, query matching, string manipulation, and data visualisation
  • Consulting individual projects on experimental design, corpus design, survey design, and data annotation practices

Research Associate (2015 - 2017)

🏛️ University of Cologne
Germany

🔬 DFG Project: The relation between grammar and usage: null subjects and subject position in Spanish and Persian
👥 PI: Prof. Dr. Aria Adli

💼 Core Responsibilities:

  • Multi-layer annotation of Persian spontaneous speech
  • Data preparation and transformation
  • Developing hierarchical data representation strategies aligned with the ISO frameworks MAF and SynAF and with the TEI guidelines

🛠️ Skills & Expertise

💻 Programming & Scripting

Python: pandas, NumPy, scikit-learn, TensorFlow
R: Statistical analysis and data manipulation
SQL: Database querying and management

☁️ Cloud & Big Data Ecosystem (GCP)

BigQuery: Data warehousing and analytics
BigQuery ML: Machine learning in the cloud
Cloud Storage: Scalable object storage
Looker: Business intelligence and data visualisation

🤖 Machine Learning

Classical Methods: Linear/logistic regression, tree-based and ensemble models
Deep Learning: CNNs, RNNs/LSTMs
Transformer-based: Large Language Models (LLMs)

📊 Survey & Crowdsourcing Platforms

Amazon Mechanical Turk: Human intelligence tasks
Prolific: Academic research participant recruitment
Qualtrics: Survey design and data collection
Google Forms: Simple survey creation

🔤 Natural Language Processing

Toolkits: spaCy, NLTK, UDPipe, Quanteda, Hugging Face Transformers
Annotation Tools: INCEpTION, WebAnno, MMAX2, ELAN
Corpus Analysis: Sketch Engine (CQL)

📝 Markup & Document Preparation

Typesetting: LaTeX, Markdown
Data Formats: XML, XPath
Text Processing: Regular Expressions (Regex)

🔄 Version Control & Collaboration

Git: Source code management and collaboration

🎓 Teaching

📚 Publications

🎤 Talks