November Data Science Seminar Series: Xing Qiu
November 11, 2015
2015-2016 Goergen Institute for Data Science Seminar Series
Presented by the Rochester Center of Excellence in Data Science
How many clusters are there on a circle? A new information criterion for clustering circular data with application to time course genomic data
Xing Qiu, Ph.D.
Associate Professor of Biostatistics & Computational Biology, University of Rochester
Common pre-processing procedures for time course microarray analysis, such as standardization and gene filtering based on the functional F-test, often result in directional data that lie on a sphere. While there have been some efforts to design spherical clustering algorithms, few researchers have developed methods for selecting the number of clusters in spherical cluster analysis. In this talk, I will present a novel information-based criterion, ICCC (information criterion for circular clustering), to determine the number of clusters when clustering circular data. ICCC is based on a finite mixture model of Langevin distributions and is derived from the asymptotic properties of the maximum likelihood of the Langevin mixture distribution. Through the study of both simulated data and a large set of time course microarray data, we demonstrate that ICCC provides better estimates of the number of clusters than existing methods such as AIC, BIC, the Gap criterion, and the Maitra-Ramler criterion.
Xing Qiu is a computational biologist with a strong background in mathematical statistics. His research focuses on developing statistical and computational methodology for multiple hypothesis testing in the analysis of large-scale data with complex correlation structures, such as *omics and biomedical imaging data. In recent years, he has published 39 peer-reviewed papers covering topics in statistics and bioinformatics, including microarray pre-processing, nonparametric statistical analysis, empirical Bayes methodology, controlling the false discovery rate under correlation, and efficient computational methods for multiple hypothesis testing based on correlation scores. Qiu has also published work on novel analysis methods for diffusion tensor imaging, cluster analysis of time course data, and the use of ordinary differential equations to infer large-scale gene regulatory networks.
Wednesday, November 11, 2015
5:00 – 5:30 Reception | 5:30 – 6:30 Talk
Goergen 101 (River Campus)
(Food will be provided thanks to the support of the NYS CoE)