<?xml version="1.0" encoding="iso-8859-1" ?>
<rss version="2.0">
<channel>
<title>The International Journal of Biostatistics</title>
<copyright>Copyright (c) 2009 Berkeley Electronic Press All rights reserved.</copyright>
<link>http://www.bepress.com/ijb</link>
<description>Recent documents in The International Journal of Biostatistics</description>
<language>en-us</language>
<lastBuildDate>Wed, 11 Nov 2009 23:22:16 PST</lastBuildDate>
<ttl>3600</ttl>
<item>
<title>Modeling Cumulative Incidences of Dementia and Dementia-Free Death Using a Novel Three-Parameter Logistic Function</title>
<link>http://www.bepress.com/ijb/vol5/iss1/29</link>
<guid isPermaLink="true">http://www.bepress.com/ijb/vol5/iss1/29</guid>
<pubDate>Tue, 10 Nov 2009 11:39:08 PST</pubDate>
<description>Parametric modeling of univariate cumulative incidence functions and logistic models have been studied extensively. However, to the best of our knowledge, there is no study using logistic models to characterize cumulative incidence functions.  In this paper, we propose a novel parametric model which is an extension of a widely-used four-parameter logistic function for dose-response curves.  The modified model can accommodate various shapes of cumulative incidence functions and be easily implemented using standard statistical software.   The simulation studies demonstrate that the proposed model is as efficient as or more efficient than its nonparametric counterpart when it is correctly specified, and outperforms the existing Gompertz model when the underlying cumulative incidence function is sigmoidal.  The practical utility of the modified three-parameter logistic model is illustrated using the data from the Cache County Study of dementia.</description>

<author>Yu Cheng</author>


<category>Survival Analysis</category>

</item>


<item>
<title>Mixed-Effects Models for Conditional Quantiles with Longitudinal Data</title>
<link>http://www.bepress.com/ijb/vol5/iss1/28</link>
<guid isPermaLink="true">http://www.bepress.com/ijb/vol5/iss1/28</guid>
<pubDate>Sat, 07 Nov 2009 16:14:34 PST</pubDate>
<description>We propose a regression method for the estimation of conditional quantiles of a continuous response variable given a set of covariates when the data are dependent. Along with fixed regression coefficients, we introduce random coefficients which we assume to follow a form of multivariate Laplace distribution. In a simulation study, the proposed quantile mixed-effects regression is shown to model the dependence among longitudinal data correctly and estimate the fixed effects efficiently. It performs similarly to the linear mixed model at the central location when the regression errors are symmetrically distributed, but provides more efficient estimates when the errors are over-dispersed. At the same time, it allows the estimation at different locations of conditional distribution, which conveys a comprehensive understanding of data. We illustrate an application to clinical data where the outcome variable of interest is bounded within a closed interval.</description>

<author>Yuan Liu</author>


<category>Longitudinal Data Analysis and Time Series</category>

<category>Statistical Models</category>

<category>Statistical Theory and Methods</category>

</item>


<item>
<title>Measures to Summarize and Compare the Predictive Capacity of Markers</title>
<link>http://www.bepress.com/ijb/vol5/iss1/27</link>
<guid isPermaLink="true">http://www.bepress.com/ijb/vol5/iss1/27</guid>
<pubDate>Thu, 01 Oct 2009 16:58:55 PDT</pubDate>
<description>The predictive capacity of a marker in a population can be described using the population distribution of risk (Huang et al. 2007; Pepe et al. 2008a; Stern 2008). Virtually all standard statistical summaries of predictability and discrimination can be derived from it (Gail and Pfeiffer 2005). The goal of this paper is to develop methods for making inference about risk prediction markers using summary measures derived from the risk distribution. We describe some new clinically motivated summary measures and give new interpretations to some existing statistical measures. Methods for estimating these summary measures are described along with distribution theory that facilitates construction of confidence intervals from data. We show how markers and, more generally, how risk prediction models, can be compared using clinically relevant measures of predictability. The methods are illustrated by application to markers of lung function and nutritional status for predicting subsequent onset of major pulmonary infection in children suffering from cystic fibrosis. Simulation studies show that methods for inference are valid for use in practice.</description>

<author>Wen Gu</author>


<category>Clinical Epidemiology</category>

<category>Statistical Models</category>

</item>


<item>
<title>Using Generalized Additive Models to Detect and Estimate Threshold Associations</title>
<link>http://www.bepress.com/ijb/vol5/iss1/26</link>
<guid isPermaLink="true">http://www.bepress.com/ijb/vol5/iss1/26</guid>
<pubDate>Wed, 16 Sep 2009 15:12:31 PDT</pubDate>
<description>In a variety of research settings, investigators may wish to detect and estimate a threshold in the association between continuous variables. A threshold model implies a non-linear relationship, with the slope changing at an unknown location. Generalized additive models (GAMs) (Hastie and Tibshirani, 1990) estimate the shape of the non-linear relationship directly from the data and, thus, may be useful in this endeavour. We propose a method based on GAMs to detect and estimate thresholds in the association between a continuous covariate and a continuous dependent variable. Using simulations, we compare it with the maximum likelihood estimation procedure proposed by Hudson (1966). We search for potential thresholds in a neighbourhood of points whose mean numerical second derivative (a measure of local curvature) of the estimated GAM curve was more than one standard deviation away from 0 across the entire range of the predictor values. A threshold association is declared if an F-test indicates that the threshold model fit significantly better than the linear model. For each method, type I error for testing the existence of a threshold against the null hypothesis of a linear association was estimated. We also investigated the impact of the position of the true threshold on power, and precision and bias of the estimated threshold. Finally, we illustrate the methods by considering whether a threshold exists in the association between systolic blood pressure (SBP) and body mass index (BMI) in two data sets.</description>

<author>Andrea Benedetti</author>


<category>General Biostatistics</category>

</item>


<item>
<title>On the Use of K-Fold Cross-Validation to Choose Cutoff Values and Assess the Performance of Predictive Models in Stepwise Regression</title>
<link>http://www.bepress.com/ijb/vol5/iss1/25</link>
<guid isPermaLink="true">http://www.bepress.com/ijb/vol5/iss1/25</guid>
<pubDate>Mon, 27 Jul 2009 11:09:49 PDT</pubDate>
<description>This paper addresses a methodological technique of leave-many-out cross-validation for choosing cutoff values in stepwise regression methods for simplifying the final regression model. A practical approach to choose cutoff values through cross-validation is to compute the minimum Predicted Residual Sum of Squares (PRESS). Leave-one-out cross-validation may overestimate the predictive model capabilities; see, for example, Shao (1993) and So et al. (2000). Shao proves with asymptotic results and simulation that the model with the minimum value for the leave-one-out cross-validation estimate of predictor errors is often overspecified. That is, too many insignificant variables are contained in set &#946;i of the regression model. He recommended using a method that leaves out a subset of observations, called K-fold cross-validation. Leave-many-out procedures can be more adequate in order to obtain significant and optimal results. We describe various investigations for the assessment of performance of predictive regression models, including different values of K in K-fold cross-validation and selecting the best possible cutoff values for automated model selection methods. We propose a resampling procedure by introducing alternative estimates of boosted cross-validated PRESS values for deciding the number of observations (l) to be omitted and number of folds/subsets (K) subsequently in K-fold cross-validation. Salahuddin and Hawkes (1991) used leave-one-out cross-validation to select equal cutoff values in stepwise regression which minimizes PRESS. We concentrate on applying K-fold cross-validation to choose unequal cutoff values, that is, F-to-enter and F-to-remove values, which are then used for determining predictor variables in a regression model from the full data set. Our computer program for K-fold cross-validation can be efficiently used for choosing both equal and unequal cutoff values for automated model selection methods. Some previously analyzed data and Monte Carlo simulation are used to evaluate the proposed method against alternatives through a design experiment approach.</description>

<author>Zafar Mahmood</author>


<category>Statistical Models</category>

</item>


<item>
<title>Inference in Epidemic Models without Likelihoods</title>
<link>http://www.bepress.com/ijb/vol5/iss1/24</link>
<guid isPermaLink="true">http://www.bepress.com/ijb/vol5/iss1/24</guid>
<pubDate>Mon, 20 Jul 2009 13:08:37 PDT</pubDate>
<description>Likelihood-based inference for epidemic models can be challenging, in part due to difficulties in evaluating the likelihood. The problem is particularly acute in models of large-scale outbreaks, and unobserved or partially observed data further complicate this process. Here we investigate the performance of Markov Chain Monte Carlo and Sequential Monte Carlo algorithms for parameter inference, where the routines are based on approximate likelihoods generated from model simulations. We compare our results to a gold-standard data-augmented MCMC for both complete and incomplete data. We illustrate our techniques using simulated epidemics as well as data from a recent outbreak of Ebola Haemorrhagic Fever in the Democratic Republic of Congo and discuss situations in which we think simulation-based inference may be preferable to likelihood-based inference.</description>

<author>Trevelyan McKinley</author>


<category>Disease Modeling</category>

</item>


<item>
<title>Predicting Potential Placebo Effect in Drug Treated Subjects</title>
<link>http://www.bepress.com/ijb/vol5/iss1/23</link>
<guid isPermaLink="true">http://www.bepress.com/ijb/vol5/iss1/23</guid>
<pubDate>Mon, 06 Jul 2009 15:32:19 PDT</pubDate>
<description>Non-specific responses to treatment (commonly known as placebo response) are pervasive when treating mental illness. Subjects treated with an active drug may respond in part due to non-specific aspects of the treatment, i.e., those not related to the chemical effect of the drug. To determine the extent to which a subject responds due to the chemical effect of a drug, one must disentangle the specific drug effect from the non-specific placebo effect. This paper presents a unique statistical model that allows for the separate prediction of a specific effect and non-specific effects in drug treated subjects. Data from a clinical trial comparing fluoxetine to a placebo for treating depression is used to illustrate this methodology.</description>

<author>Eva Petkova</author>


<category>Clinical Trials</category>

<category>General Biostatistics</category>

<category>Longitudinal Data Analysis and Time Series</category>

<category>Statistical Models</category>

</item>


<item>
<title>Semiparametrically Efficient Estimation of Conditional Instrumental Variables Parameters</title>
<link>http://www.bepress.com/ijb/vol5/iss1/22</link>
<guid isPermaLink="true">http://www.bepress.com/ijb/vol5/iss1/22</guid>
<pubDate>Tue, 30 Jun 2009 19:24:51 PDT</pubDate>
<description>In this paper, I propose a set of parameters designed to identify the slope of structural relationships based on a combination of conditioning on covariates and the use of an exogenous instrument. After giving structural interpretations to these parameters in the context of specific semiparametric models, I derive their efficient influence curves in a fully nonparametric context as well as under imposition of restrictions on the instrument. These influence curves give the semiparametric efficiency bounds for regular asymptotically linear estimators of the parameters and allow the construction of asymptotically efficient estimators. Monte Carlo experiments finally demonstrate the good finite sample performance of such estimators.</description>

<author>Maximilian Kasy</author>


<category>Statistical Theory and Methods</category>

</item>


<item>
<title>Mixed-Effects Poisson Regression Models for Meta-Analysis of Follow-Up Studies with Constant or Varying Durations</title>
<link>http://www.bepress.com/ijb/vol5/iss1/21</link>
<guid isPermaLink="true">http://www.bepress.com/ijb/vol5/iss1/21</guid>
<pubDate>Fri, 26 Jun 2009 11:52:46 PDT</pubDate>
<description>We present a framework for meta-analysis of follow-up studies with constant or varying duration using the binary nature of the data directly. We use a generalized linear mixed model framework with the Poisson likelihood and the log link function. We fit models with fixed and random study effects using Stata for performing meta-analysis of follow-up studies with constant or varying duration. The methods that we present are capable of estimating all the effect measures that are widely used in such studies, such as the Risk Ratio, the Risk Difference (in case of studies with constant duration), as well as the Incidence Rate Ratio and the Incidence Rate Difference (for studies of varying duration). The methodology presented here naturally extends previously published methods for meta-analysis of binary data in a generalized linear mixed model framework using the Poisson likelihood. Simulation results suggest that the method is uniformly more powerful compared to summary-based methods, in particular when the event rate is low and the number of studies is small. The methods were applied in several already published meta-analyses with very encouraging results. The methods are also directly applicable to individual patients' data, offering advanced options for modeling heterogeneity and confounders. Extensions of the models for more complex situations, such as competing risks models or recurrent events, are also discussed. The methods can be implemented in standard statistical software and illustrative code in Stata is given in the appendix.</description>

<author>Pantelis G. Bagos</author>


<category>Clinical Epidemiology</category>

<category>Clinical Trials</category>

<category>Epidemiology</category>

<category>General Biostatistics</category>

<category>Multivariate Analysis</category>

<category>Statistical Models</category>

</item>


<item>
<title>Optimal Sufficient Statistics for Parametric and Non-Parametric Multiple Simultaneous Hypothesis Testing</title>
<link>http://www.bepress.com/ijb/vol5/iss1/20</link>
<guid isPermaLink="true">http://www.bepress.com/ijb/vol5/iss1/20</guid>
<pubDate>Tue, 23 Jun 2009 13:45:27 PDT</pubDate>
<description>In multiple simultaneous hypothesis testing (MSHT), a significance thresholding function as a scalar statistic can be designed in an adaptive manner by sharing information among many tests performed simultaneously. By using such an adapted statistic, MSHT has greater detection power than tests using simple individual statistics. To systematically obtain an optimal thresholding function that maximizes the detection power in MSHT, Storey (2007) proposed a theoretical framework called the optimal discovery procedure (ODP). He also proposed an empirical estimation of the ODP thresholding function for a parametric MSHT that presupposes parametric forms of the null and alternative likelihood functions. Empirical Bayesian testing (Efron et al. 2001), which is based on a non-parametric treatment of arbitrary test statistics, has sometimes exhibited comparable power to the ODP. These two MSHT frameworks appear to be closely related but, because of differences in their approach (frequentist vs. Bayesian), the relationship is not well understood. We present the new concept of an optimal sufficient statistic that links the ODP and empirical Bayesian frameworks, and we show that the local false discovery rate based on the empirical Bayes can be an optimal thresholding function if a certain condition holds. We lay out exhaustive sets of presumptions to achieve optimal thresholding functions and show that, if an optimal thresholding function is derived for a parametric MSHT problem, it is still optimal for a more general and broader range of MSHT problems defined in a non- or semi-parametric way. A guide to designing optimal thresholding functions for general MSHT problems is thus provided by our study.</description>

<author>Shigeyuki Oba</author>


<category>Microarrays</category>

<category>Statistical Theory and Methods</category>

</item>



</channel>
</rss>
