Academic Seminars


Seminar: Fully Bayesian Classification with Heavy-tailed Priors for Selection in High-dimensional Features with Grouping Structure

Posted: June 12, 2017. Posted by: mathky

Speaker: Associate Professor Longhai Li

Department of Mathematics and Statistics, University of Saskatchewan, Canada

 

Title: Fully Bayesian Classification with Heavy-tailed Priors for Selection in High-dimensional Features with Grouping Structure

Time: June 21, 2017, 15:30

Venue: Haiyun Laboratory Building (海韵实验楼), Room 105

Abstract: Feature selection is needed in many modern scientific research problems that use high-dimensional data. A typical example is finding the genes most relevant to a certain disease (e.g., cancer) from high-dimensional gene expression data. Gene expressions have grouping structures; for example, a group of co-regulated genes with similar biological functions tends to have similar expressions. Many statistical methods have been proposed to take the grouping structure into account in feature selection, including group LASSO, supervised group LASSO, and regression on group representatives. In this paper, we propose a fully Bayesian Robit regression method with heavy-tailed (sparsity) priors (abbreviated as FBRHT) for selecting features with grouping structure. The main features of FBRHT are that it discards unrelated features more aggressively than LASSO, and that it can perform feature selection within groups automatically without a pre-specified grouping structure. We use simulated and real datasets to demonstrate that the predictive power of the sparse feature subsets selected by FBRHT is comparable with that of the much larger feature subsets selected by LASSO, group LASSO, supervised group LASSO, penalized logistic regression, and random forest, and that the succinct feature subsets selected by FBRHT have significantly better predictive power than feature subsets of the same size taken from the top features selected by the aforementioned methods.

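About the speaker: Dr. Longhai Li is an associate professor in the Department of Mathematics and Statistics at the University of Saskatchewan, Canada. Li's research area is the analysis of big data and complex-structured data in machine learning, including feature selection and predictive analysis for high-dimensional data, model selection, spatial data analysis, and cluster analysis. Application areas are bioinformatics, public health, and finance.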

Li's research has been published in leading international statistics journals, including JASA, Computational Statistics & Data Analysis, Bayesian Analysis, and the Scandinavian Journal of Statistics.

School contact: Professor Jichun Liu (刘继春)

All faculty and students are welcome to attend!