Open Access. Powered by Scholars. Published by Universities.®

Life Sciences Commons

Articles 1 - 17 of 17

Full-Text Articles in Life Sciences

James-Stein Estimation And The Benjamini-Hochberg Procedure, Debashis Ghosh Jan 2012

Debashis Ghosh

For the problem of multiple testing, the Benjamini-Hochberg (B-H) procedure has become a very popular method in applications. Based on a spacings theory representation of the B-H procedure, we are able to motivate the use of shrinkage estimators for modifying the B-H procedure. Several generalizations in the paper are discussed, and the methodology is applied to real and simulated datasets.
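For readers unfamiliar with the procedure this abstract builds on, the following is a minimal sketch of the standard B-H step-up rule only, not of the shrinkage modification proposed in the paper. The function name `benjamini_hochberg` and the default level `alpha=0.05` are illustrative assumptions.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans marking the hypotheses rejected by
    the standard Benjamini-Hochberg step-up rule at level alpha.

    Illustrative sketch; the name and defaults are assumptions.
    """
    m = len(p_values)
    # Indices of the p-values in ascending order.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest 1-based rank k with p_(k) <= k * alpha / m.
    k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * alpha / m:
            k = rank
    # Reject the k smallest p-values, even those above their own threshold.
    rejected = [False] * m
    for idx in order[:k]:
        rejected[idx] = True
    return rejected
```

Because the rule is step-up, a p-value that exceeds its own threshold is still rejected whenever a larger p-value meets its threshold.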


Shrinkage In Adaptive Procedures For False Discovery Rate Estimation In Multiple Testing: Structure And Synthesis, Debashis Ghosh Jan 2012

Debashis Ghosh

There has been much interest in the study of adaptive estimation procedures for controlling the false discovery rate (FDR). In this article, we take the direct approach to estimation of FDR of Storey (2002) and show how it can be reexpressed as a particular type of shrinkage estimator. This representation leads to natural conditions on finite-sample FDR control for a general class of shrinkage estimators. In addition, many previous proposals from the literature can be unified under this framework for which finite-sample FDR results can be developed. Some asymptotic results are also provided.
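As background, the direct approach of Storey (2002) estimates the FDR for a fixed rejection threshold rather than fixing the FDR level in advance. The sketch below shows only that basic point estimate, not the shrinkage representation developed in the article. The function name `storey_fdr`, the argument names, and the default tuning value `lam=0.5` are assumptions made for illustration.

```python
def storey_fdr(p_values, t, lam=0.5):
    """Point estimate of the FDR when rejecting all p-values <= t.

    Illustrative sketch; names and the default lam are assumptions.
    lam must lie in [0, 1) and p_values must be non-empty.
    """
    m = len(p_values)
    # Estimated proportion of true nulls: p-values above lam are
    # treated as coming mostly from null hypotheses.
    pi0 = sum(p > lam for p in p_values) / (m * (1.0 - lam))
    # Number of rejections at threshold t, floored at 1 to avoid
    # division by zero when nothing is rejected.
    n_rejected = max(sum(p <= t for p in p_values), 1)
    return pi0 * m * t / n_rejected
```

For example, with p-values 0.01, 0.02, 0.6 and 0.8, threshold `t=0.05` and `lam=0.5`, the null proportion is estimated as 1 and the estimated FDR is 4 × 0.05 / 2 = 0.1.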


Generalized Benjamini-Hochberg Procedures Using Spacings, Debashis Ghosh Jan 2011

Debashis Ghosh

For the problem of multiple testing, the Benjamini-Hochberg (B-H) procedure has become a very popular method in applications. We show how the B-H procedure can be interpreted as a test based on the spacings corresponding to the p-value distributions. Using this equivalence, we develop a class of generalized B-H procedures that maintain control of the false discovery rate in finite samples. We also consider the effect of correlation on the procedure; simulation studies are used to illustrate the methodology.


Software For Assumption Weighting For Meta-Analysis Of Genomic Data, Debashis Ghosh, Yihan Li Jan 2011

Debashis Ghosh

This is the software that accompanies Li and Ghosh, "Assumption weighting for incorporating heterogeneity into meta-analysis of genomic data."


A Causal Framework For Surrogate Endpoints With Semi-Competing Risks Data, Debashis Ghosh Jan 2011

Debashis Ghosh

In this note, we address the problem of surrogacy using a causal modelling framework that differs substantially from the potential outcomes model that pervades the biostatistical literature. The framework comes from econometrics and conceptualizes direct effects of the surrogate endpoint on the true endpoint. While this framework can incorporate the so-called semi-competing risks data structure, we also derive a fundamental non-identifiability result. Relationships to existing causal modelling frameworks are also discussed.


Propensity Score Modelling In Observational Studies Using Dimension Reduction Methods, Debashis Ghosh Jan 2011

Debashis Ghosh

Conditional independence assumptions are very important in causal inference modelling as well as in dimension reduction methodologies. These are two strikingly different statistical literatures, and we study links between the two in this article. The concept of covariate sufficiency plays an important role, and we provide theoretical justification for when dimension reduction and partial least squares methods will allow for valid causal inference to be performed. The methods are illustrated with application to a medical study and to simulated data.


Discrete Nonparametric Algorithms For Outlier Detection With Genomic Data, Debashis Ghosh Jan 2010

Debashis Ghosh

In high-throughput studies involving genetic data such as from gene expression microarrays, differential expression analysis between two or more experimental conditions has been a very common analytical task. Much of the resulting literature on multiple comparisons has paid relatively little attention to the choice of test statistic. In this article, we focus on the issue of choice of test statistic based on a special pattern of differential expression. The approach here is based on recasting multiple comparisons procedures for assessing outlying expression values. A major complication is that the resulting p-values are discrete; some theoretical properties of sequential testing ...


Detecting Outlier Genes From High-Dimensional Data: A Fuzzy Approach, Debashis Ghosh Jan 2010

Debashis Ghosh

A recent finding in cancer research has been the characterization of previously undiscovered chromosomal abnormalities in several types of solid tumors. This was found based on analyses of high-throughput data from gene expression microarrays and motivated the development of so-called 'outlier' tests for differential expression. One statistical issue was the potential discreteness of the test statistics. Using ideas from fuzzy set theory, we develop fuzzy outlier detection algorithms that have links to ideas in multiple comparisons. Two- and K-sample extensions are considered. The methodology is illustrated by application to two microarray studies.


Links Between Analysis Of Surrogate Endpoints And Endogeneity, Debashis Ghosh, Jeremy M. Taylor, Michael R. Elliott Jan 2010

Debashis Ghosh

There has been substantive interest in the assessment of surrogate endpoints in medical research. These are measures which could potentially replace "true" endpoints in clinical trials and lead to studies that require less follow-up. Recent research in the area has focused on assessments using causal inference frameworks. Beginning with a simple model for associating the surrogate and true endpoints in the population, we approach the problem as one of endogenous covariates. An instrumental variables estimator and a general two-stage algorithm are proposed. Existing surrogacy frameworks are then evaluated in the context of the model. A numerical example is used to illustrate ...


Meta-Analysis For Surrogacy: Accelerated Failure Time Models And Semicompeting Risks Modelling, Debashis Ghosh, Jeremy M. Taylor, Daniel J. Sargent Jan 2010

Debashis Ghosh

There has been great recent interest in the medical and statistical literature in the assessment and validation of surrogate endpoints as proxies for clinical endpoints in medical studies. More recently, authors have focused on using meta-analytical methods for quantification of surrogacy. In this article, we extend existing procedures for analysis based on the accelerated failure time model to this setting. An advantage of this approach relative to the proportional hazards model is that it allows for analysis in the semi-competing risks setting, where we constrain the surrogate endpoint to occur before the true endpoint. A novel principal components procedure is ...


Spline-Based Models For Predictiveness Curves, Debashis Ghosh, Michael Sabel Jan 2010

Debashis Ghosh

A biomarker is defined to be a biological characteristic that is objectively measured and evaluated as an indicator of normal biologic processes, pathogenic processes, or pharmacologic responses to a therapeutic intervention. The use of biomarkers in cancer has been advocated for a variety of purposes, which include use as surrogate endpoints, early detection of disease, proxies for environmental exposure and risk prediction. We deal with the latter issue in this paper. Several authors have proposed use of the predictiveness curve for assessing the capacity of a biomarker for risk prediction. For most situations, it is reasonable to assume monotonicity of ...


Combining Multiple Models With Survival Data: The Phase Algorithm, Debashis Ghosh, Zheng Yuan Jan 2010

Debashis Ghosh

In many scientific studies, one common goal is to develop good prediction rules based on a set of available measurements. This paper proposes a model averaging methodology using proportional hazards regression models to construct new estimators of predicted survival probabilities. A screening step based on an adaptive searching algorithm is used to handle large numbers of covariates. The finite-sample properties of the proposed methodology are assessed using simulation studies. Application of the method to a cancer biomarker study is also given.


Hierarchical Hidden Markov Model With Application To Joint Analysis Of ChIP-Chip And ChIP-Seq Data, Hyungwon Choi, Debashis Ghosh, Zhaohui S. Qin Jan 2009

Debashis Ghosh

Motivation: Identification of transcription factor binding sites (TFBS) is a fundamental problem in understanding the mechanism of gene regulation. The ChIP-chip technology has accelerated this effort by providing a simultaneous genome-wide map of TFBS in a high-throughput fashion. Recently, a sequencing-based ChIP-seq has appeared as a promising alternative that can identify targets with an improved sensitivity/specificity in high resolution. However, studies have suggested that distinct experimental platforms can be complementary in TFBS identification. The availability of data obtained from multiple platforms motivates a meta-analysis for improved identification of candidate motifs.

Results: In this work, we propose a hierarchical hidden ...


A Double-Layered Mixture Model For The Joint Analysis Of DNA Copy Number And Gene Expression Data, Debashis Ghosh Jan 2009

Debashis Ghosh

Copy number aberration is a common form of genomic instability in cancer. Gene expression is closely tied to cytogenetic events by the central dogma of molecular biology, and serves as a mediator of copy number changes in disease phenotypes. Accordingly, it is of interest to develop proper statistical methods for jointly analyzing copy number and gene expression data. This work describes a novel Bayesian inferential approach for a double-layered mixture model (DLMM) which directly models the stochastic nature of copy number data and identifies abnormally expressed genes due to aberrant copy number. Simulation studies were conducted to illustrate the robustness ...


Discrete Nonparametric Algorithms For Outlier Detection With Genomic Data, Debashis Ghosh Jan 2009

Debashis Ghosh

In high-throughput studies involving genetic data such as from gene expression microarrays, differential expression analysis between two or more experimental conditions has been a very common analytical task. Much of the resulting literature on multiple comparisons has paid relatively little attention to the choice of test statistic. In this article, we focus on the issue of choice of test statistic based on a special pattern of differential expression. The approach here is based on recasting multiple comparisons procedures for assessing outlying expression values. A major complication is that the resulting p-values are discrete; some theoretical properties of sequential testing procedures ...


A Double-Layered Mixture Model For The Joint Analysis Of DNA Copy Number And Gene Expression Data, Debashis Ghosh Jan 2009

Debashis Ghosh

Copy number aberration is a common form of genomic instability in cancer. Gene expression is closely tied to cytogenetic events by the central dogma of molecular biology, and serves as a mediator of copy number changes in disease phenotypes. Accordingly, it is of interest to develop proper statistical methods for jointly analyzing copy number and gene expression data. This work describes a novel Bayesian inferential approach for a double-layered mixture model (DLMM) which directly models the stochastic nature of copy number data and identifies abnormally expressed genes due to aberrant copy number. Simulation studies were conducted to illustrate the robustness ...


Discrete Nonparametric Algorithms For Outlier Detection With Genomic Data, Debashis Ghosh Jan 2009

Debashis Ghosh

In high-throughput studies involving genetic data such as from gene expression microarrays, differential expression analysis between two or more experimental conditions has been a very common analytical task. Much of the resulting literature on multiple comparisons has paid relatively little attention to the choice of test statistic. In this article, we focus on the issue of choice of test statistic based on a special pattern of differential expression. The approach here is based on recasting multiple comparisons procedures for assessing outlying expression values. A major complication is that the resulting p-values are discrete; some theoretical properties of sequential testing procedures ...