11:00 - 12:20 | Mon 14 Dec | Scotland A | MbGS-L
Mining the large volume of textual data produced by microblogging services has attracted much attention in recent years. An important preprocessing step in microblog text mining is to convert natural language texts into suitable numerical representations. Because microblog texts are short, finding suitable representations for them is nontrivial. In this paper, we propose deep network-based models that learn low-dimensional representations of microblog texts. The proposed models take advantage of the semantic relatedness derived from two types of microblog-specific information, namely retweet relationships and hashtags. Experimental results show that the deep models perform better than traditional dimensionality reduction methods such as latent semantic analysis and the latent Dirichlet allocation topic model, and that the use of microblog-specific information helps to learn better representations.