Time: Wednesday, January 16, 2019, 3:00-5:30 PM
Venue: Room 07122, Yanshan Campus
Speaker: Professor Mingyao Ai (艾明要)
Title: Optimal Subsampling Algorithm for Big Data Generalized Linear Models
Abstract:
To approximate the maximum likelihood estimator (MLE) quickly when the data are massive, this paper studies the optimal subsampling method under the A-optimality criterion for generalized linear models (GLMs). The consistency and asymptotic normality of the estimator from a general subsampling algorithm are established, and optimal subsampling probabilities under the A- and L-optimality criteria are derived. Furthermore, using a Frobenius-norm matrix concentration inequality, finite-sample properties of the subsample estimator based on the optimal subsampling probabilities are derived. Because the optimal subsampling probabilities depend on the full-data estimate, an adaptive two-step algorithm is developed, and the asymptotic normality and optimality of the resulting estimator are established. The proposed methods are illustrated and evaluated through numerical experiments on simulated and real datasets.
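For readers unfamiliar with the idea, the two-step scheme described in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the speaker's algorithm: it uses logistic regression as the GLM, and it assumes second-step probabilities proportional to |y_i - p_i| * ||x_i||, a form commonly associated with the L-optimality criterion for logistic regression. The function names, the subsample sizes r0 and r, and the simulated data are all hypothetical.

```python
import numpy as np

def weighted_logistic_mle(X, y, w, n_iter=50, tol=1e-8):
    """Maximize the weighted logistic log-likelihood by Newton's method."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(X @ beta)))
        grad = X.T @ (w * (y - p))
        hess = (X * (w * p * (1.0 - p))[:, None]).T @ X
        step = np.linalg.solve(hess, grad)
        beta = beta + step
        if np.linalg.norm(step) < tol:
            break
    return beta

def two_step_subsample(X, y, r0, r, rng):
    """Illustrative two-step subsampling estimator (assumed form, see lead-in)."""
    n = X.shape[0]
    # Step 1: uniform pilot subsample gives a rough estimate of beta.
    idx0 = rng.choice(n, size=r0, replace=True)
    beta0 = weighted_logistic_mle(X[idx0], y[idx0], np.ones(r0))
    # Step 2: plug the pilot estimate into the assumed probabilities
    # pi_i proportional to |y_i - p_i| * ||x_i||, then sample r more points.
    p = 1.0 / (1.0 + np.exp(-(X @ beta0)))
    score = np.abs(y - p) * np.linalg.norm(X, axis=1)
    pi = score / score.sum()
    idx1 = rng.choice(n, size=r, replace=True, p=pi)
    # Combine both subsamples with inverse-probability weights
    # (pilot points were drawn with probability 1/n each).
    idx = np.concatenate([idx0, idx1])
    w = np.concatenate([np.full(r0, float(n)), 1.0 / pi[idx1]])
    w = w / w.mean()  # rescaling does not change the maximizer
    return weighted_logistic_mle(X[idx], y[idx], w)

# Simulated data (hypothetical): intercept plus two Gaussian covariates.
rng = np.random.default_rng(0)
n = 20000
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
beta_true = np.array([0.5, -1.0, 1.0])
y = (rng.random(n) < 1.0 / (1.0 + np.exp(-(X @ beta_true)))).astype(float)

beta_full = weighted_logistic_mle(X, y, np.ones(n))
beta_sub = two_step_subsample(X, y, r0=500, r=2000, rng=rng)
print("full-data MLE:     ", beta_full)
print("subsample estimate:", beta_sub)
```

The subsample estimate uses 2,500 of the 20,000 observations; comparing it with the full-data MLE gives a rough sense of the accuracy traded for computation.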
About Professor Mingyao Ai:
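Mingyao Ai, male, received his Ph.D. from Nankai University in 2003 and has worked at the School of Mathematical Sciences, Peking University, ever since. From August 2007 to January 2009 he was a visiting scholar in the Department of Industrial and Systems Engineering at the Georgia Institute of Technology, USA. He is currently Professor, doctoral supervisor, and Director of the Statistics Teaching and Research Section at the School of Mathematical Sciences, Peking University. He also serves as Secretary-General of the Chinese Society of Probability and Statistics, Executive Council Member of the Chinese Association for Applied Statistics, President of the Experimental Design Branch, and Vice President of the High-Dimensional Data Statistics Branch, among other posts. He is an Associate Editor of the international statistics journals Statistica Sinica, Journal of Statistical Planning and Inference, Statistics and Probability Letters, and STAT, and an editorial board member of the Chinese core journal Journal of Systems Science and Mathematical Sciences (《系統科學與數學》) and of the Science Press "Statistics and Data Science Series".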
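His teaching and research focus on the design and analysis of experiments, computer experiments, big data analysis, and applied statistics. He has published more than sixty papers in leading domestic and international journals, including Ann Statist, JASA, Biometrika, Technometrics, and Statist Sinica. He has led and completed five General Program projects and one sub-project of a Key Program project of the National Natural Science Foundation of China, and has participated in two completed 973 Program projects of the Ministry of Science and Technology.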