<?xml version="1.0" encoding="iso-8859-1" ?>
<rss version="2.0">
<channel>
<title>The International Journal of Biostatistics</title>
<copyright>Copyright (c) 2010 Berkeley Electronic Press All rights reserved.</copyright>
<link>http://www.bepress.com/ijb</link>
<description>Recent documents in The International Journal of Biostatistics</description>
<language>en-us</language>
<lastBuildDate>Thu, 28 Jan 2010 03:26:13 PST</lastBuildDate>
<ttl>3600</ttl>


	
		
	







<item>
<title>A Comparison of Variable Selection Approaches for Dynamic Treatment Regimes</title>
<link>http://www.bepress.com/ijb/vol6/iss1/6</link>
<guid isPermaLink="true">http://www.bepress.com/ijb/vol6/iss1/6</guid>
<pubDate>Tue, 26 Jan 2010 09:44:53 PST</pubDate>
<description>In estimating optimal adaptive treatment strategies, the tailoring variables used for patient profiles are typically hand-picked by experts. However, these variables may not yield an estimated optimal dynamic regime that is close to the optimal regime which uses all variables. The question of selecting tailoring variables has not yet been answered satisfactorily, though promising new approaches have been proposed. We compare the use of reducts--a variable selection tool from computer science--to the S-score criterion proposed by Gunter and colleagues in 2007 for suggesting collections of useful variables for treatment regime tailoring. Although the reducts-based approach promised several advantages, such as the ability to account for correlation among tailoring variables, it proved to have several undesirable properties. The S-score performed better, though it too exhibited some disappointing qualities.</description>

<author>Peter Biernot</author>


<category>General Biostatistics</category>

</item>






<item>
<title>Comment: Analyzing Propensity Score Matched Count Data</title>
<link>http://www.bepress.com/ijb/vol6/iss1/5</link>
<guid isPermaLink="true">http://www.bepress.com/ijb/vol6/iss1/5</guid>
<pubDate>Fri, 15 Jan 2010 16:26:24 PST</pubDate>
<description>We offer an explanation for the simulation result of Austin (2009) regarding rate ratios, and argue that unmatched analysis of propensity score matched count data results in conservative statistical inferences on the rate ratios.</description>

<author>Liang Li</author>


<category>General Biostatistics</category>

</item>






<item>
<title>Lack of Fit in Self Modeling Regression: Application to Pulse Waveforms</title>
<link>http://www.bepress.com/ijb/vol6/iss1/4</link>
<guid isPermaLink="true">http://www.bepress.com/ijb/vol6/iss1/4</guid>
<pubDate>Thu, 14 Jan 2010 11:06:42 PST</pubDate>
<description>Self modeling regression (SEMOR) is an approach for modeling sets of observed curves that have a common shape (or sequence of features) but have variability in the amplitude (y-axis) and/or timing (x-axis) of the features across curves. SEMOR assumes the x and y axes for each observed curve can be separately transformed in a parametric manner so that the features across curves are aligned with the common shape, usually represented by a non-parametric function. We show that when the common shape is modeled with a regression spline and the transformational parameters are modeled as random with the traditional distribution (normal with mean zero), the SEMOR model may surprisingly suffer from lack of fit and the variance components may be over-estimated. A random effects distribution that restricts the predicted random transformational parameters to have mean zero or the inclusion of a fixed transformational parameter improves estimation. Our work is motivated by arterial pulse pressure waveform data where one of the variance components is a novel measure of short-term variability in blood pressure.</description>

<author>Lyndia C. Brumback</author>


<category>Clinical Epidemiology</category>

<category>Longitudinal Data Analysis and Time Series</category>

<category>Multivariate Analysis</category>

</item>






<item>
<title>Estimation of Modified Concordance Ratio in Sib-Pairs: Effect of Consanguinity on the Risk of Congenital Heart Diseases</title>
<link>http://www.bepress.com/ijb/vol6/iss1/3</link>
<guid isPermaLink="true">http://www.bepress.com/ijb/vol6/iss1/3</guid>
<pubDate>Wed, 06 Jan 2010 15:10:17 PST</pubDate>
<description>Family studies are widely used for research into genetic and environmental influences on human traits. In this paper, we establish statistical methodology for the estimation of a new measure of sib similarity with respect to dichotomous traits measured on each member of a within-family sib-pair. We call this parameter &quot;excess risk.&quot; For inference problems involving a single sample, we construct a large sample confidence interval on the concerned parameter. It has long been suspected that consanguinity is a risk factor for many genetic defects. Therefore, we establish a procedure to test the significance of the difference between excess risk parameters in a sample of consanguineous marriages and another sample of non-consanguineous marriages. We apply the methodology to data from a hospital-based congenital heart defects registry in Saudi Arabia, a population in which consanguinity is quite common.</description>

<author>Mohamed M. Shoukri</author>


<category>Epidemiology</category>

<category>General Biostatistics</category>

</item>






<item>
<title>Comparing Mortality in Renal Patients on Hemodialysis versus Peritoneal Dialysis Using a Marginal Structural Model</title>
<link>http://www.bepress.com/ijb/vol6/iss1/2</link>
<guid isPermaLink="true">http://www.bepress.com/ijb/vol6/iss1/2</guid>
<pubDate>Wed, 06 Jan 2010 15:10:13 PST</pubDate>
<description>When comparing the causal effect of peritoneal dialysis (PD) and hemodialysis (HD) treatment on lowering mortality in renal patients, using observational data, it is necessary to adjust for different forms of confounding and informative censoring. Both the type of dialysis treatment that is started with and mortality are affected by baseline covariates. Longitudinal and baseline variables can affect both the probability of switching from one type of dialysis to the other, and mortality. Longitudinal and baseline variables can also affect the probability of receiving a kidney transplant, possibly causing informative censoring. Adjusting for longitudinal variables by including them as covariates in a regression model potentially causes bias, for instance by losing a possible indirect effect of dialysis on mortality via these longitudinal variables. Instead, we fitted a marginal structural model (MSM) to estimate the causal effect of dialysis type, adjusted for confounding and informative censoring. We used the MSM to compare the hazard of death as well as cumulative survival between the potential treatment trajectories &quot;always PD&quot; and &quot;always HD&quot; over time, conditional on age and diabetes mellitus status. We used inverse probability weighting (IPW) to fit the MSM.</description>

<author>Willem M. van der Wal</author>


<category>Clinical Epidemiology</category>

<category>General Biostatistics</category>

<category>Longitudinal Data Analysis and Time Series</category>

<category>Multivariate Analysis</category>

</item>






<item>
<title>Comparison of Estimators for Measures of Linkage Disequilibrium</title>
<link>http://www.bepress.com/ijb/vol6/iss1/1</link>
<guid isPermaLink="true">http://www.bepress.com/ijb/vol6/iss1/1</guid>
<pubDate>Wed, 06 Jan 2010 15:10:11 PST</pubDate>
<description>The measurement of biallelic pair-wise association called linkage disequilibrium (LD) is an important issue in order to understand the genomic architecture. A plethora of measures of association in two by two tables have been proposed in the literature. Besides the problem of choosing an appropriate measure, the problem of their estimation has been neglected in the literature. It needs to be emphasized that the definition of a measure and the choice of an estimator function for it are conceptually unrelated tasks.

In this paper, we compare the performance of various estimators for the three popular LD measures D', r and Y in a simulation study for small to moderate sample sizes (N&#60;=500). The usual frequency-plug-in estimators can lead to unreliable or undefined estimates. Estimators based on the computationally expensive volume measures have been proposed recently as a remedy to this well-known problem. We confirm that volume estimators have better expected mean square error than the naive plug-in estimators. But they are outperformed by estimators plugging easy-to-calculate non-informative Bayesian probability estimates into the theoretical formulae for the measures. Fully Bayesian estimators with non-informative Dirichlet priors have comparable accuracy but are computationally more expensive.

We recommend the use of non-informative Bayesian plug-in estimators based on Jeffreys' prior, in particular when dealing with SNP array data where the occurrence of small table entries and table margins is likely.</description>

<author>Markus Scholz</author>


<category>Genetics</category>

<category>Microarrays</category>

<category>Statistical Theory and Methods</category>

</item>






<item>
<title>Modeling Cumulative Incidences of Dementia and Dementia-Free Death Using a Novel Three-Parameter Logistic Function</title>
<link>http://www.bepress.com/ijb/vol5/iss1/29</link>
<guid isPermaLink="true">http://www.bepress.com/ijb/vol5/iss1/29</guid>
<pubDate>Tue, 10 Nov 2009 11:39:08 PST</pubDate>
<description>Parametric modeling of univariate cumulative incidence functions and logistic models have been studied extensively. However, to the best of our knowledge, there is no study using logistic models to characterize cumulative incidence functions.  In this paper, we propose a novel parametric model which is an extension of a widely-used four-parameter logistic function for dose-response curves.  The modified model can accommodate various shapes of cumulative incidence functions and be easily implemented using standard statistical software.   The simulation studies demonstrate that the proposed model is as efficient as or more efficient than its nonparametric counterpart when it is correctly specified, and outperforms the existing Gompertz model when the underlying cumulative incidence function is sigmoidal.  The practical utility of the modified three-parameter logistic model is illustrated using the data from the Cache County Study of dementia.</description>

<author>Yu Cheng</author>


<category>Survival Analysis</category>

</item>






<item>
<title>Mixed-Effects Models for Conditional Quantiles with Longitudinal Data</title>
<link>http://www.bepress.com/ijb/vol5/iss1/28</link>
<guid isPermaLink="true">http://www.bepress.com/ijb/vol5/iss1/28</guid>
<pubDate>Sat, 07 Nov 2009 16:14:34 PST</pubDate>
<description>We propose a regression method for the estimation of conditional quantiles of a continuous response variable given a set of covariates when the data are dependent. Along with fixed regression coefficients, we introduce random coefficients which we assume to follow a form of multivariate Laplace distribution. In a simulation study, the proposed quantile mixed-effects regression is shown to model the dependence among longitudinal data correctly and estimate the fixed effects efficiently. It performs similarly to the linear mixed model at the central location when the regression errors are symmetrically distributed, but provides more efficient estimates when the errors are over-dispersed. At the same time, it allows the estimation at different locations of conditional distribution, which conveys a comprehensive understanding of data. We illustrate an application to clinical data where the outcome variable of interest is bounded within a closed interval.</description>

<author>Yuan Liu</author>


<category>Longitudinal Data Analysis and Time Series</category>

<category>Statistical Models</category>

<category>Statistical Theory and Methods</category>

</item>






<item>
<title>Measures to Summarize and Compare the Predictive Capacity of Markers</title>
<link>http://www.bepress.com/ijb/vol5/iss1/27</link>
<guid isPermaLink="true">http://www.bepress.com/ijb/vol5/iss1/27</guid>
<pubDate>Thu, 01 Oct 2009 16:58:55 PDT</pubDate>
<description>The predictive capacity of a marker in a population can be described using the population distribution of risk (Huang et al. 2007; Pepe et al. 2008a; Stern 2008). Virtually all standard statistical summaries of predictability and discrimination can be derived from it (Gail and Pfeiffer 2005). The goal of this paper is to develop methods for making inference about risk prediction markers using summary measures derived from the risk distribution. We describe some new clinically motivated summary measures and give new interpretations to some existing statistical measures. Methods for estimating these summary measures are described along with distribution theory that facilitates construction of confidence intervals from data. We show how markers and, more generally, how risk prediction models, can be compared using clinically relevant measures of predictability. The methods are illustrated by application to markers of lung function and nutritional status for predicting subsequent onset of major pulmonary infection in children suffering from cystic fibrosis. Simulation studies show that methods for inference are valid for use in practice.</description>

<author>Wen Gu</author>


<category>Clinical Epidemiology</category>

<category>Statistical Models</category>

</item>






<item>
<title>Using Generalized Additive Models to Detect and Estimate Threshold Associations</title>
<link>http://www.bepress.com/ijb/vol5/iss1/26</link>
<guid isPermaLink="true">http://www.bepress.com/ijb/vol5/iss1/26</guid>
<pubDate>Wed, 16 Sep 2009 15:12:31 PDT</pubDate>
<description>In a variety of research settings, investigators may wish to detect and estimate a threshold in the association between continuous variables. A threshold model implies a non-linear relationship, with the slope changing at an unknown location. Generalized additive models (GAMs) (Hastie and Tibshirani, 1990) estimate the shape of the non-linear relationship directly from the data and, thus, may be useful in this endeavour. We propose a method based on GAMs to detect and estimate thresholds in the association between a continuous covariate and a continuous dependent variable. Using simulations, we compare it with the maximum likelihood estimation procedure proposed by Hudson (1966). We search for potential thresholds in a neighbourhood of points whose mean numerical second derivative (a measure of local curvature) of the estimated GAM curve was more than one standard deviation away from 0 across the entire range of the predictor values. A threshold association is declared if an F-test indicates that the threshold model fit significantly better than the linear model. For each method, type I error for testing the existence of a threshold against the null hypothesis of a linear association was estimated. We also investigated the impact of the position of the true threshold on power, and precision and bias of the estimated threshold. Finally, we illustrate the methods by considering whether a threshold exists in the association between systolic blood pressure (SBP) and body mass index (BMI) in two data sets.</description>

<author>Andrea Benedetti</author>


<category>General Biostatistics</category>

</item>






<item>
<title>On the Use of K-Fold Cross-Validation to Choose Cutoff Values and Assess the Performance of Predictive Models in Stepwise Regression</title>
<link>http://www.bepress.com/ijb/vol5/iss1/25</link>
<guid isPermaLink="true">http://www.bepress.com/ijb/vol5/iss1/25</guid>
<pubDate>Mon, 27 Jul 2009 11:09:49 PDT</pubDate>
<description>This paper addresses a methodological technique of leave-many-out cross-validation for choosing cutoff values in stepwise regression methods for simplifying the final regression model. A practical approach to choose cutoff values through cross-validation is to compute the minimum Predicted Residual Sum of Squares (PRESS). A leave-one-out cross-validation may overestimate the predictive model capabilities; for example, see Shao (1993) and So et al. (2000). Shao proves with asymptotic results and simulation that the model with the minimum value for the leave-one-out cross-validation estimate of predictor errors is often overspecified. That is, too many insignificant variables are contained in set &#946;i of the regression model. He recommended using a method that leaves out a subset of observations, called K-fold cross-validation. Leave-many-out procedures can be more adequate in order to obtain significant and optimal results. We describe various investigations for the assessment of performance of predictive regression models, including different values of K in K-fold cross-validation and selecting the best possible cutoff values for automated model selection methods. We propose a resampling procedure by introducing alternative estimates of boosted cross-validated PRESS values for deciding the number of observations (l) to be omitted and number of folds/subsets (K) subsequently in K-fold cross-validation. Salahuddin and Hawkes (1991) used leave-one-out cross-validation to select equal cutoff values in stepwise regression which minimizes PRESS. We concentrate on applying K-fold cross-validation to choose unequal cutoff values, that is, F-to-enter and F-to-remove values, which are then used for determining predictor variables in a regression model from the full data set. Our computer program for K-fold cross-validation can be efficiently used for choosing both equal and unequal cutoff values for automated model selection methods.
Some previously analyzed data and Monte Carlo simulation are used to evaluate the proposed method against alternatives through a design experiment approach.</description>

<author>Zafar Mahmood</author>


<category>Statistical Models</category>

</item>






<item>
<title>Inference in Epidemic Models without Likelihoods</title>
<link>http://www.bepress.com/ijb/vol5/iss1/24</link>
<guid isPermaLink="true">http://www.bepress.com/ijb/vol5/iss1/24</guid>
<pubDate>Mon, 20 Jul 2009 13:08:37 PDT</pubDate>
<description>Likelihood-based inference for epidemic models can be challenging, in part due to difficulties in evaluating the likelihood. The problem is particularly acute in models of large-scale outbreaks, and unobserved or partially observed data further complicate this process. Here we investigate the performance of Markov Chain Monte Carlo and Sequential Monte Carlo algorithms for parameter inference, where the routines are based on approximate likelihoods generated from model simulations. We compare our results to a gold-standard data-augmented MCMC for both complete and incomplete data. We illustrate our techniques using simulated epidemics as well as data from a recent outbreak of Ebola Haemorrhagic Fever in the Democratic Republic of Congo and discuss situations in which we think simulation-based inference may be preferable to likelihood-based inference.</description>

<author>Trevelyan McKinley</author>


<category>Disease Modeling</category>

</item>






<item>
<title>Predicting Potential Placebo Effect in Drug Treated Subjects</title>
<link>http://www.bepress.com/ijb/vol5/iss1/23</link>
<guid isPermaLink="true">http://www.bepress.com/ijb/vol5/iss1/23</guid>
<pubDate>Mon, 06 Jul 2009 15:32:19 PDT</pubDate>
<description>Non-specific responses to treatment (commonly known as placebo response) are pervasive when treating mental illness. Subjects treated with an active drug may respond in part due to non-specific aspects of the treatment, i.e., those not related to the chemical effect of the drug. To determine the extent to which a subject responds due to the chemical effect of a drug, one must disentangle the specific drug effect from the non-specific placebo effect. This paper presents a unique statistical model that allows for the separate prediction of a specific effect and non-specific effects in drug treated subjects. Data from a clinical trial comparing fluoxetine to a placebo for treating depression are used to illustrate this methodology.</description>

<author>Eva Petkova</author>


<category>Clinical Trials</category>

<category>General Biostatistics</category>

<category>Longitudinal Data Analysis and Time Series</category>

<category>Statistical Models</category>

</item>






<item>
<title>Semiparametrically Efficient Estimation of Conditional Instrumental Variables Parameters</title>
<link>http://www.bepress.com/ijb/vol5/iss1/22</link>
<guid isPermaLink="true">http://www.bepress.com/ijb/vol5/iss1/22</guid>
<pubDate>Tue, 30 Jun 2009 19:24:51 PDT</pubDate>
<description>In this paper, I propose a set of parameters designed to identify the slope of structural relationships based on a combination of conditioning on covariates and the use of an exogenous instrument. After giving structural interpretations to these parameters in the context of specific semiparametric models, I derive their efficient influence curves in a fully nonparametric context as well as under imposition of restrictions on the instrument. These influence curves give the semiparametric efficiency bounds for regular asymptotically linear estimators of the parameters and allow the construction of asymptotically efficient estimators. Monte Carlo experiments finally demonstrate the good finite sample performance of such estimators.</description>

<author>Maximilian Kasy</author>


<category>Statistical Theory and Methods</category>

</item>






<item>
<title>Mixed-Effects Poisson Regression Models for Meta-Analysis of Follow-Up Studies with Constant or Varying Durations</title>
<link>http://www.bepress.com/ijb/vol5/iss1/21</link>
<guid isPermaLink="true">http://www.bepress.com/ijb/vol5/iss1/21</guid>
<pubDate>Fri, 26 Jun 2009 11:52:46 PDT</pubDate>
<description>We present a framework for meta-analysis of follow-up studies with constant or varying duration using the binary nature of the data directly. We use a generalized linear mixed model framework with the Poisson likelihood and the log link function. We fit models with fixed and random study effects using Stata for performing meta-analysis of follow-up studies with constant or varying duration. The methods that we present are capable of estimating all the effect measures that are widely used in such studies such as the Risk Ratio, the Risk Difference (in case of studies with constant duration), as well as the Incidence Rate Ratio and the Incidence Rate Difference (for studies of varying duration). The methodology presented here naturally extends previously published methods for meta-analysis of binary data in a generalized linear mixed model framework using the Poisson likelihood. Simulation results suggest that the method is uniformly more powerful compared to summary based methods, in particular when the event rate is low and the number of studies is small. The methods were applied in several already published meta-analyses with very encouraging results. The methods are also directly applicable to individual patients' data offering advanced options for modeling heterogeneity and confounders. Extensions of the models for more complex situations, such as competing risks models or recurrent events are also discussed. The methods can be implemented in standard statistical software and illustrative code in Stata is given in the appendix.</description>

<author>Pantelis G. Bagos</author>


<category>Clinical Epidemiology</category>

<category>Clinical Trials</category>

<category>Epidemiology</category>

<category>General Biostatistics</category>

<category>Multivariate Analysis</category>

<category>Statistical Models</category>

</item>






<item>
<title>Optimal Sufficient Statistics for Parametric and Non-Parametric Multiple Simultaneous Hypothesis Testing</title>
<link>http://www.bepress.com/ijb/vol5/iss1/20</link>
<guid isPermaLink="true">http://www.bepress.com/ijb/vol5/iss1/20</guid>
<pubDate>Tue, 23 Jun 2009 13:45:27 PDT</pubDate>
<description>In multiple simultaneous hypothesis testing (MSHT), a significance thresholding function as a scalar statistic can be designed in an adaptive manner by sharing information among many tests performed simultaneously. By using such an adapted statistic, MSHT has greater detection power than tests using simple individual statistics. To systematically obtain an optimal thresholding function that maximizes the detection power in MSHT, Storey (2007) proposed a theoretical framework called the optimal discovery procedure (ODP). He also proposed an empirical estimation of the ODP thresholding function for a parametric MSHT that presupposes parametric forms of the null and alternative likelihood functions. Empirical Bayesian testing (Efron et al. 2001), which is based on a non-parametric treatment of arbitrary test statistics, has sometimes exhibited comparable power to the ODP. These two MSHT frameworks appear to be closely related but, because of differences in their approach (frequentist vs. Bayesian), the relationship is not well understood. We present the new concept of an optimal sufficient statistic that links the ODP and empirical Bayesian frameworks, and we show that the local false discovery rate based on the empirical Bayes can be an optimal thresholding function if a certain condition holds. We lay out exhaustive sets of presumptions to achieve optimal thresholding functions and show that, if an optimal thresholding function is derived for a parametric MSHT problem, it is still optimal for a more general and broader range of MSHT problems defined in a non- or semi-parametric way. A guide to designing optimal thresholding functions for general MSHT problems is thus provided by our study.</description>

<author>Shigeyuki Oba</author>


<category>Microarrays</category>

<category>Statistical Theory and Methods</category>

</item>






<item>
<title>A Simulation Study of the Validity and Efficiency of Design-Adaptive Allocation to Two Groups in the Regression Situation</title>
<link>http://www.bepress.com/ijb/vol5/iss1/19</link>
<guid isPermaLink="true">http://www.bepress.com/ijb/vol5/iss1/19</guid>
<pubDate>Fri, 29 May 2009 12:08:48 PDT</pubDate>
<description>Dynamic allocation of participants to treatments in a clinical trial has been an alternative to randomization for nearly 35 years. Design-adaptive allocation is a particularly flexible kind of dynamic allocation. Every investigation of dynamic allocation methods has shown that they improve balance of prognostic factors across treatment groups, but there have been lingering doubts about their influence on the validity of statistical inferences. Here we report the results of a simulation study focused on this and similar issues. Overall, it is found that there are no statistical reasons, in the situations studied, to prefer randomization to design-adaptive allocation. Specifically, there is no evidence of bias, the number of participants wasted by randomization in small studies is not trivial, and when the aim is to place bounds on the prediction of population benefits, randomization is quite substantially less efficient than design-adaptive allocation. A new, adjusted permutation estimate of the standard deviation of the regression estimator under design-adaptive allocation is shown to be an unbiased estimate of the true sampling standard deviation, resolving a long-standing problem with dynamic allocations. These results are shown in situations with varying numbers of balancing factors, different treatment and covariate effects, different covariate distributions, and in the presence of a small number of outliers.</description>

<author>Mikel Aickin</author>


<category>Clinical Trials</category>

</item>






<item>
<title>Likelihood Estimation of Conjugacy Relationships in Linear Models with Applications to High-Throughput Genomics</title>
<link>http://www.bepress.com/ijb/vol5/iss1/18</link>
<guid isPermaLink="true">http://www.bepress.com/ijb/vol5/iss1/18</guid>
<pubDate>Fri, 29 May 2009 12:08:42 PDT</pubDate>
<description>In the simultaneous estimation of a large number of related quantities, multilevel models provide a formal mechanism for efficiently making use of the ensemble of information for deriving individual estimates. In this article we investigate the ability of the likelihood to identify the relationship between signal and noise in multilevel linear mixed models. Specifically, we consider the ability of the likelihood to diagnose conjugacy or independence between the signals and noises. Our work was motivated by the analysis of data from high-throughput experiments in genomics. The proposed model leads to a more flexible family. However, we further demonstrate that adequately capitalizing on the benefits of a well-fitting fully-specified likelihood in terms of gene ranking is difficult.</description>

<author>Brian S. Caffo</author>


<category>Genetics</category>

</item>






<item>
<title>Measuring Agreement about Ranked Decision Choices for a Single Subject</title>
<link>http://www.bepress.com/ijb/vol5/iss1/17</link>
<guid isPermaLink="true">http://www.bepress.com/ijb/vol5/iss1/17</guid>
<pubDate>Thu, 28 May 2009 10:49:50 PDT</pubDate>
<description>Introduction. When faced with a medical classification, clinicians often rank-order the likelihood of potential diagnoses, treatment choices, or prognoses as a way to focus on likely occurrences without dropping rarer ones from consideration. To know how well clinicians agree on such rankings might help extend the realm of clinical judgment farther into the purview of evidence-based medicine. If rankings by different clinicians agree better than chance, the order of assignments and their relative likelihoods may justifiably contribute to medical decisions. If the agreement is no better than chance, the ranking should not influence the medical decision. 
Background. Available rank-order methods measure agreement over a set of decision choices by two rankers or by a set of rankers over two choices (rank correlation methods), or an overall agreement over a set of choices by a set of rankers (Kendall's W), but will not measure agreement about a single decision choice across a set of rankers. Rating methods (e.g. kappa) assign multiple subjects to nominal categories rather than ranking possible choices about a single subject and will not measure agreement about a single decision choice across a set of rankers.
Method. In this article, we pose an agreement coefficient A for measuring agreement among a set of clinicians about a single decision choice and compare several potential forms of A. A takes on the value 0 when agreement is random and 1 when agreement is perfect. It is shown that A = 1 - observed disagreement/maximum disagreement. A particular form of A is recommended and tables of 5% and 10% significant values of A are generated for common numbers of ranks and rankers. 
Examples. In the selection of potential treatment assignments by a Tumor Board to a patient with a neck mass, there is no significant agreement about any treatment. Another example involves ranking decisions about a proposed medical research protocol by an Institutional Review Board (IRB). The decision to pass a protocol with minor revisions shows agreement at the 5% significance level, adequate for a consistent decision.</description>

<author>Robert H. Riffenburgh</author>


<category>Statistical Theory and Methods</category>

</item>






<item>
<title>Modelling and Assessing Differential Gene Expression Using the Alpha Stable Distribution</title>
<link>http://www.bepress.com/ijb/vol5/iss1/16</link>
<guid isPermaLink="true">http://www.bepress.com/ijb/vol5/iss1/16</guid>
<pubDate>Wed, 13 May 2009 12:00:46 PDT</pubDate>
<description>After normalization, the distributions of gene expressions for very different organisms have a similar shape, usually exhibit heavier tails than a Gaussian distribution, and have a certain degree of asymmetry. Therefore, this distribution has been modeled in the literature using different parametric families of distributions, such as the Asymmetric Laplace or the Cauchy distribution. Moreover, it is known that the tails of spot-intensity distributions are described by a power law and the variance of a given array increases with the number of genes. These features of the distribution of gene expression strongly suggest that the alpha-stable distribution is suitable to model it. In this work, we model the error distribution for gene expression data using the alpha-stable distribution. This distribution is tested successfully for four different datasets. The Kullback-Leibler, Chi-square and Hellinger tests are performed to compare how alpha-stable, Asymmetric Laplace and Gaussian fit the spot intensity distribution. The alpha-stable is proved to perform much better for every array in every dataset considered. Furthermore, using an alpha-stable mixture model, a Bayesian log-posterior odds is calculated allowing us to decide whether a gene is differentially expressed or not. This statistic is based on the Scale Mixture of Normals and other well known properties of the alpha-stable distribution. The proposed methodology is illustrated using simulated data and the results are compared with the other existing statistical approach.</description>

<author>Diego Salas-Gonzalez</author>


<category>Microarrays</category>

<category>Statistical Models</category>

</item>





</channel>
</rss>
