The main research goals are to:

1. Analyze the spatial patterns of linguistic variation in tweets;

2. Analyze how regions formed and developed as a consequence of migration;

3. Understand how migration patterns are linked to linguistic variation;

4. Develop and share benchmark data, methods, and software for broader research.


Big data in the form of family trees and tweets, both contributed by the public, present an exciting new opportunity. For migration analysis, a large collection of family trees with individual locations (e.g., birthplace and residence) contains rich information about migration over space, time, and generations, which is not available elsewhere. For linguistic analysis, tweets collected over a time span (e.g., one year) can be analyzed to extract linguistic characteristics of both individuals and geographic places.