
Feature-Based Spatio-Temporal Modeling

Wang, Xiaofeng
Thesis/Dissertation; Online
Wang, Xiaofeng
Brown, Donald
The dimensionality of data is expanding. A growing volume of spatio-temporal data is available with numerous features, including ordinary numerical and categorical features as well as unstructured features such as text. Although such high-dimensional data can help improve predictions, efficient methods for processing spatio-temporal data with many different types of features are limited. This dissertation formalized an important class of problems related to spatio-temporal data. In the dissertation, an effective mathematical model, the local spatio-temporal generalized additive model (LSTGAM), was developed to predict and classify spatio-temporal data. This model can make full use of many different types of data, such as spatial and temporal data, geographic data, demographic data, and textual data. The model can be easily estimated with available algorithms and has good interpretability. To assist in building LSTGAM, a randomized least angle regression (RLAR) method was used to select features for non-linear regression models. Tests with simulated and real data showed that RLAR performed well. In addition, a new method, the semantic role labeling-based latent Dirichlet allocation (SRL-LDA) model, was developed to extract key information from text. This method is based on automatic semantic analysis and understanding of natural language, combined with dimensionality reduction via latent Dirichlet allocation. The two models, LSTGAM and SRL-LDA, can be applied together in applications where unstructured textual data contains indicators relevant to the spatio-temporal properties of events. The newly developed models have been applied to four real problems, including prediction of criminal incidents and analysis of train accidents. Results showed that LSTGAM outperformed several previous models, such as spatial generalized linear models and hot spot models, in evaluations on the spatio-temporal classification problem. The results also showed that SRL-LDA can effectively extract useful information from unstructured textual data such as Twitter posts, and that information extracted by SRL-LDA can improve prediction performance in different cases. These applications also revealed an interesting source of data for criminal prediction: social media services such as Twitter. As discussed at the end of the dissertation, a large-scale text analysis system built on the modeling techniques developed here can provide solutions in many areas where predictions are important.
University of Virginia, Department of Systems Engineering, PHD (Doctor of Philosophy), 2012
Libra ETD Repository
In Copyright

