Biostatistics Events

Biostatistics Departmental Calendar

Thu 9/12/2019 3:30PM - 4:30PM
Mingyao Li, University of Pennsylvania
Public Health Lecture Hall (A115)

Mingyao Li, PhD, Department of Biostatistics, Epidemiology and Informatics, University of Pennsylvania.

Biostatistics
Seminar Series
Thu 9/19/2019 3:30PM - 4:30PM
Novartis Information Session
Public Health Lecture Hall (A115)

Biostatistics
Seminar Series
Thu 10/10/2019 3:30PM - 4:30PM
Peter Mueller, University of Texas at Austin
Public Health Lecture Hall (A115)

Peter Mueller, PhD, Department of Mathematics, Department of Statistics and Data Sciences, University of Texas at Austin

Biostatistics
Seminar Series
Thu 10/24/2019 3:30PM - 4:30PM
Lu Mao, University of Wisconsin-Madison
Public Health Lecture Hall (A115)

Lu Mao, PhD, Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison

Biostatistics
Seminar Series
Thu 11/7/2019 3:30PM - 4:30PM
Snehalata Huzurbazar, West Virginia University
Public Health Lecture Hall (A115)

Snehalata Huzurbazar, PhD, Department of Biostatistics, West Virginia University

Biostatistics
Seminar Series
Sun 3/22/2020 to Wed 3/25/2020
ENAR 2020 Spring Meeting of the International Biometric Society -- JW Marriott Nashville

Meetings of the Eastern North American Region of the International Biometric Society (a.k.a. "ENAR meetings") are held in late March or early April each year and reflect the broad interests of the Society, including both quantitative techniques and application areas. Faculty and student presenters from the Department of Biostatistics regularly participate, giving invited talks, contributed talks, and poster presentations.

Biostatistics
Conference
Sat 8/1/2020 to Thu 8/6/2020
Joint Statistical Meetings -- JSM 2020, Philadelphia, PA

The Joint Statistical Meetings, known simply as "JSM", is the largest gathering of statisticians held annually in North America. Faculty and student presenters from the Department of Biostatistics regularly participate, giving invited talks, contributed talks, and poster presentations. Our students often receive top awards and participate in the affiliated career marketplace at the event.

Biostatistics
Conference
Sun 3/14/2021 to Wed 3/17/2021
ENAR 2021 Spring Meeting of the International Biometric Society -- Baltimore

Meetings of the Eastern North American Region of the International Biometric Society (a.k.a. "ENAR meetings") are held in late March or early April each year and reflect the broad interests of the Society, including both quantitative techniques and application areas. Faculty and student presenters from the Department of Biostatistics regularly participate, giving invited talks, contributed talks, and poster presentations.

Biostatistics
Conference
Sat 8/7/2021 to Thu 8/12/2021
Joint Statistical Meetings -- JSM 2021, Seattle, WA

The Joint Statistical Meetings, known simply as "JSM", is the largest gathering of statisticians held annually in North America. Faculty and student presenters from the Department of Biostatistics regularly participate, giving invited talks, contributed talks, and poster presentations. Our students often receive top awards and participate in the affiliated career marketplace at the event.

Biostatistics
Conference
Sun 3/27/2022 to Wed 3/30/2022
ENAR 2022 Spring Meeting of the International Biometric Society -- Houston

Meetings of the Eastern North American Region of the International Biometric Society (a.k.a. "ENAR meetings") are held in late March or early April each year and reflect the broad interests of the Society, including both quantitative techniques and application areas. Faculty and student presenters from the Department of Biostatistics regularly participate, giving invited talks, contributed talks, and poster presentations.

Biostatistics
Conference
Sat 8/6/2022 to Thu 8/11/2022
Joint Statistical Meetings -- JSM 2022, Washington, DC

The Joint Statistical Meetings, known simply as "JSM", is the largest gathering of statisticians held annually in North America. Faculty and student presenters from the Department of Biostatistics regularly participate, giving invited talks, contributed talks, and poster presentations. Our students often receive top awards and participate in the affiliated career marketplace at the event.

Biostatistics
Conference

Recent Events

Biostatistics Dissertation Defense

Md Tanbin Rahman - Classification and Clustering for RNA-seq Data with Variable Selection

Friday 6/7 9:00AM - 11:00AM
7139 Public Health, Peterson Seminar Room

Md Tanbin Rahman of the Department of Biostatistics defends his dissertation on "Classification and Clustering for RNA-seq Data with Variable Selection". 

Committee Chairperson: George Tseng, ScD, Department of Biostatistics

Committee Members:

Abdus Wahed, PhD, Department of Biostatistics

Ying Ding, PhD, Department of Biostatistics

Hyun Jung Park, PhD, Department of Human Genetics

Graduate faculty of the University and all other interested parties are invited to attend.


ABSTRACT:

Machine learning plays an important role in genomics because genomic data are high-dimensional. Classification and clustering are the two main branches of machine learning, and both have led to important discoveries in the field. Clustering is crucial for identifying previously unknown sub-types of complex diseases, while classification is widely used in medicine to build predictive models for screening and for predicting the chance of developing certain diseases. In transcriptomic data, where expression levels are measured on tens of thousands of genes but only a small number of subjects, the use of machine learning is often necessary. The number of genes is often much larger than the number of samples, leading to the small-n-large-p problem. Variable selection is then required to identify the genes that distinguish between the different sub-types of a medical condition. In recent years, lower cost and high accuracy have made RNA-seq widely popular, a trend that is expected to continue over the next few years. One important feature of RNA-seq data is its count structure. While there is a great deal of literature on both clustering and classification methods, most of these methods are either heuristic or suited to continuous data and do not directly generalize to count data.

In Chapter 2, we propose a classifier for the count structure of RNA-seq data with variable selection and covariate adjustment. Supervised machine learning methods have been increasingly used in biomedical research and in clinical practice. In transcriptomic applications, RNA-seq has become dominant and has gradually replaced traditional microarrays owing to its reduced background noise and increased digital precision. Most existing machine learning methods are, however, designed for the continuous intensities of microarrays and are not suitable for RNA-seq count data. In this paper, we develop a negative binomial model within the generalized linear model framework, with double regularization for gene and covariate sparsity, to accommodate three key elements: adequate modeling of count data with overdispersion, gene selection, and adjustment for covariate effects. The proposed sparse negative binomial classifier (snbClass) is evaluated in simulations and in two real applications, using cervical tumor miRNA-seq data and schizophrenia post-mortem brain tissue RNA-seq data, to demonstrate its superior performance in prediction accuracy and feature selection.
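
As a rough illustration of the kind of objective Chapter 2 describes, the sketch below evaluates a negative binomial log-likelihood with a single lasso penalty on per-gene class effects. It is not the snbClass implementation: the function names, the mean/dispersion parameterization, the two-class log-linear mean, and the use of one penalty (the abstract describes double regularization for genes and covariates) are all assumptions made for illustration.

```python
import math


def nb_logpmf(y, mu, phi):
    """Log-probability of count y under a negative binomial with mean mu
    and dispersion phi, so that variance = mu + phi * mu**2 (assumed form)."""
    r = 1.0 / phi  # "size" parameter
    return (math.lgamma(y + r) - math.lgamma(r) - math.lgamma(y + 1)
            + r * math.log(r / (r + mu))
            + y * math.log(mu / (r + mu)))


def penalized_neg_loglik(counts, labels, intercept, gene_effects, phi, lam):
    """Negative log-likelihood plus an L1 penalty on gene effects.

    counts:       one list of gene counts per sample
    labels:       class label (0 or 1) per sample
    intercept:    per-gene baseline log-mean (hypothetical parameter)
    gene_effects: per-gene log fold change for class 1 (hypothetical parameter)
    phi:          per-gene dispersion
    lam:          lasso tuning parameter
    """
    nll = 0.0
    for y_row, k in zip(counts, labels):
        for g, y in enumerate(y_row):
            # Log link: log(mu) = intercept + effect * class indicator
            mu = math.exp(intercept[g] + gene_effects[g] * k)
            nll -= nb_logpmf(y, mu, phi[g])
    # Genes whose effect is shrunk to zero do not separate the classes
    penalty = lam * sum(abs(b) for b in gene_effects)
    return nll + penalty
```

Minimizing such an objective over the gene effects would drive many of them to exactly zero as `lam` grows, which is how an L1 penalty performs gene selection; the optimizer itself is omitted here.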

In Chapter 3, we discuss a model-based clustering method that uses the count structure of the data without transformation and therefore without loss of information. Clustering with variable selection is a challenging but critical task for modern small-n-large-p data. Existing methods based on Gaussian mixture models or sparse K-means provide a solution for continuous data. Given the prevalence of RNA-seq technology and the lack of count data modeling for clustering, the current practice is to normalize count expression data into continuous measures and apply existing models with a Gaussian assumption. In this paper, we develop a negative binomial mixture model with gene regularization to cluster samples (small n) with high-dimensional gene features (large p). The EM algorithm and the Bayesian information criterion are used for inference and for determining tuning parameters. The method is compared with the sparse Gaussian mixture model and sparse K-means using extensive simulations and two real transcriptomic applications in breast cancer and rat brain studies. The results show superior performance of the proposed count data model in clustering accuracy, feature selection, and biological interpretation by pathway enrichment analysis.

Contribution to public health:

Transcriptomic data play an important role in identifying genes that are differentially expressed under various external conditions and diseases. RNA-seq is now the most popular method for measuring expression levels in transcriptomic studies. This thesis deals with two important aspects of machine learning for count data, namely classification and clustering. The methods proposed in this thesis are tailor-made for the count structure of RNA-seq data.

Last Updated On Tuesday, May 28, 2019 by Valenti, Renee Nerozzi
Created On Tuesday, May 28, 2019

