CensoredDistribution

CensoredDistribution[{xmin,xmax},dist]

represents the distribution of values that come from dist and are censored to be between xmin and xmax.

CensoredDistribution[{{xmin,xmax},{ymin,ymax},…},dist]

represents the distribution of values that come from the multivariate distribution dist and are censored to be between xmin and xmax, ymin and ymax, etc.

Details

  • CensoredDistribution[{xmin,xmax},dist] is equivalent to TransformedDistribution[f,x\[Distributed]dist], where f is given by Piecewise[{{xmin,x<=xmin},{x,xmin<x<xmax},{xmax,x>=xmax}}].
  • Common cases for {xmin,xmax} include:
    {-∞,xmax}        censoring from above, right-censoring
    {xmin,∞}         censoring from below, left-censoring
    {xmin,xmax}      doubly censored, interval-censoring
    {-∞,∞}, None     no censoring, uncensored
  • CensoredDistribution can be used with such functions as Mean, CDF, RandomVariate, etc.
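The equivalence and the interval forms listed above can be sketched as follows. The standard normal base distribution and the endpoints -1 and 2 are illustrative assumptions, not values taken from this page.

```wolfram
(* Illustrative base distribution and censoring interval (assumed values) *)
base = NormalDistribution[0, 1];
censored = CensoredDistribution[{-1, 2}, base];

(* The same censoring written as an explicit transformation of x *)
f = Piecewise[{{-1, x <= -1}, {x, -1 < x < 2}, {2, x >= 2}}];
transformed = TransformedDistribution[f, x \[Distributed] base];

(* Samples from either form should stay inside the censoring interval *)
MinMax[RandomVariate[censored, 1000]]
MinMax[RandomVariate[transformed, 1000]]

(* One-sided forms use an infinite endpoint *)
rightCensored = CensoredDistribution[{-Infinity, 2}, base];
leftCensored = CensoredDistribution[{-1, Infinity}, base];
```

The symbol x is assumed to have no global value when f is defined.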

Background & Context

  • CensoredDistribution[{xmin,xmax},dist] represents a statistical distribution modeling data that is known to be taken from the univariate distribution dist for all x in the interval xmin<x<xmax and that is assumed to be constantly equal to xmin (respectively, constantly equal to xmax) for x<=xmin (respectively, for x>=xmax). The terms uncensored, left-censored, right-censored, and doubly censored describe univariate censorings for which {xmin,xmax} has the form {-∞,∞}, {xmin,∞}, {-∞,xmax}, and {xmin,xmax}, respectively. The univariate dist may be either continuous (e.g. NormalDistribution, GammaDistribution, or BetaDistribution) or discrete (e.g. PoissonDistribution, BinomialDistribution, or BernoulliDistribution) and may be defined in terms of transformations, censoring, or truncations (by way of TransformedDistribution, CensoredDistribution, and TruncatedDistribution, respectively) of known distributions.
  • The multivariate CensoredDistribution[{{x1min,x1max},{x2min,x2max},…,{xnmin,xnmax}},dist] is defined analogously and thus represents the distribution of vectors taken from the multivariate distribution dist whose i^(th) component is censored to be in the interval from ximin to ximax. As in the univariate case, multivariate dist may again be either continuous (e.g. MultinormalDistribution) or discrete (e.g. MultivariateHypergeometricDistribution), and may also be defined as a copula or product (using CopulaDistribution and ProductDistribution, respectively) of known distributions.
  • Censored distributions arise when modeling data for which the values are only partially known (i.e. those datasets containing only partially observed or accuracy-constrained data), and the analysis of datasets containing censored values dates back to the eighteenth-century smallpox investigations of Daniel Bernoulli. The existence of such data is relatively common in fields such as medicine and physiology, as well as in reliability and manufacturing, where failure predictions must sometimes be made without having observed actual failure. Censored distributions are also commonly utilized tools in survival analysis, and a variety of specialized statistical tools (e.g. censored regression) exists to analyze such datasets.
  • By definition, CensoredDistribution[{xmin,xmax},dist] is equivalent to TransformedDistribution[f,x\[Distributed]dist], where f is given by Piecewise[{{xmin,x<=xmin},{x,xmin<x<xmax},{xmax,x>=xmax}}]. CensoredDistribution is often confused with TruncatedDistribution, though the two are fundamentally different: censoring moves the probability from outside the interval onto the endpoints of the censoring interval, while truncation discards the values outside the interval and redistributes their probability over the truncation interval.
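The difference between censoring and truncation can be made concrete with a small sketch. The Poisson base distribution with mean 5 and the interval {2,7} are assumptions chosen for illustration.

```wolfram
(* Assumed illustrative base distribution and interval *)
base = PoissonDistribution[5];
cens = CensoredDistribution[{2, 7}, base];
trunc = TruncatedDistribution[{2, 7}, base];

(* Side-by-side point probabilities: censored vs. truncated *)
Table[{k, PDF[cens, k], PDF[trunc, k]}, {k, 1, 8}] // N // TableForm

(* Under censoring, each endpoint collects the tail probability of the base distribution *)
{PDF[cens, 2], CDF[base, 2]} // N                (* mass at 2 vs. P(X <= 2) *)
{PDF[cens, 7], SurvivalFunction[base, 6]} // N   (* mass at 7 vs. P(X >= 7) *)
```

Each of the last two pairs should contain two equal numbers, since the censored point masses are exactly the tail probabilities of the base distribution.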

Examples


Basic Examples  (2)

Define a left-censored discrete distribution:

Probability density function:
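A minimal sketch of these two steps, assuming a Poisson base distribution with mean 4 and a lower censoring point of 3 (neither value comes from the original page):

```wolfram
(* Left-censoring: values below 3 are recorded as 3 *)
dist = CensoredDistribution[{3, Infinity}, PoissonDistribution[4]];

(* Symbolic probability function in the integer variable k *)
PDF[dist, k]

(* Plot the point probabilities; the mass at k = 3 includes everything at or below 3 *)
DiscretePlot[PDF[dist, k], {k, 0, 12}]
```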

Define a right-censored continuous distribution:

Cumulative distribution function:
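A comparable sketch for the continuous case, assuming an exponential base distribution with rate 1 and an upper censoring point of 2 (both illustrative):

```wolfram
(* Right-censoring: values above 2 are recorded as 2 *)
dist = CensoredDistribution[{-Infinity, 2}, ExponentialDistribution[1]];

(* Symbolic CDF in x *)
CDF[dist, x]

(* The CDF jumps to 1 at the censoring point *)
Plot[CDF[dist, x], {x, -1, 4}, Exclusions -> None, PlotRange -> {0, 1.05}]
```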

Scope  (26)

Basic Uses  (9)

Define different types of censoring for a univariate discrete distribution:

Define different types of censoring for a univariate continuous distribution:

Define a right-censored discrete distribution:

Compare probability density functions:

Find probability at 9 for the censored distribution:

Compare to the probability of obtaining a value of at least 9 for the original distribution:
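The four steps above might look like the following sketch. The Poisson mean of 6 is an assumption; the censoring point 9 matches the captions.

```wolfram
(* Assumed base distribution; censor from above at 9 *)
base = PoissonDistribution[6];
cens = CensoredDistribution[{-Infinity, 9}, base];

(* Compare the point probabilities of the original and censored distributions *)
DiscretePlot[{PDF[base, k], PDF[cens, k]}, {k, 0, 15},
 PlotLegends -> {"original", "censored"}]

(* Probability at 9 for the censored distribution *)
N[Probability[x == 9, x \[Distributed] cens]]

(* Probability of at least 9 for the original distribution *)
N[Probability[x >= 9, x \[Distributed] base]]
```

The last two values are expected to agree, because censoring collects the whole upper tail at the censoring point.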

Censor a continuous distribution:

Use a histogram and plot the original density function to visualize the point probability:
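One way to sketch this visualization, assuming a standard normal base distribution censored from above at 1 (illustrative values):

```wolfram
(* Assumed base distribution and censoring point *)
base = NormalDistribution[0, 1];
cens = CensoredDistribution[{-Infinity, 1}, base];

(* Histogram of a censored sample (bin width 0.1) with the original density overlaid *)
Show[
 Histogram[RandomVariate[cens, 10^4], {0.1}, "PDF"],
 Plot[PDF[base, x], {x, -4, 2}, PlotStyle -> Thick],
 PlotRange -> All]
```

The tall bin at the censoring point reflects the point probability that censoring places there.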

A distribution censored to one point:

Censoring for a multivariate continuous distribution:

Compute the expectation of an expression for this distribution:
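A hedged sketch of the multivariate case follows. The binormal base distribution, the intervals, and the expression x + y are all assumptions, and the expectation is approximated from a random sample rather than computed symbolically as the original example may have done.

```wolfram
(* Assumed bivariate base distribution and one censoring interval per component *)
cens = CensoredDistribution[{{-1, 1}, {0, 2}}, BinormalDistribution[0.5]];

(* Draw a sample; each row is a pair {x, y} *)
sample = RandomVariate[cens, 5000];

(* Each component should stay inside its own interval *)
MinMax /@ Transpose[sample]

(* Sample-based estimate of the expectation of x + y *)
Mean[Total /@ sample]
```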

Censoring for a multivariate discrete distribution:

Compute the mean for the distribution:

Compare with the result obtained using a random sample drawn from the distribution:

Define a doubly censored distribution:

Cumulative distribution function:

The mean and variance of the censored distribution:
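These steps can be sketched as below, assuming a standard normal base distribution censored to the interval from -1 to 2 (illustrative values). The sample mean is included only as a rough cross-check.

```wolfram
(* Assumed base distribution, censored on both sides *)
cens = CensoredDistribution[{-1, 2}, NormalDistribution[0, 1]];

(* Symbolic CDF in x *)
CDF[cens, x]

(* Numerical mean and variance *)
N[{Mean[cens], Variance[cens]}]

(* Rough cross-check of the mean against a random sample *)
Mean[RandomVariate[cens, 10^5]]
```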

Moment has closed form for symbolic order:

Estimate the censoring interval:

Parametric Distributions  (5)

Define a right-censored continuous distribution:

Cumulative distribution function:

Plot a histogram for a random sample. The spike corresponds to the DiracDelta part of PDF:

Define a censored GeometricDistribution:

Compare probability density functions:

The values of the PDF at the censoring points are equal to the following probabilities:

Define a doubly censored GammaDistribution:

Find generating functions:

Define a right-censored PoissonDistribution:

Compare HazardFunction:

Define a two-dimensional censored DirichletDistribution:

Compare CDFs:

Mean and variance for the censored distribution:

Compute probabilities and expectations:

Nonparametric Distributions  (3)

Define a censored EmpiricalDistribution:

Compare cumulative distribution functions:

Define a censored HistogramDistribution:

Compare CDFs:

Define a censored SmoothKernelDistribution:

Compare cumulative distribution functions:

Derived Distributions  (9)

Define a censored ParameterMixtureDistribution:

Cumulative distribution function:

Define a censored MixtureDistribution:

Cumulative distribution function:

Define a censored OrderDistribution:

Probability density function:

Compare means:

Define a censored CensoredDistribution:

Probability density function:

Compare to doubly censored Poisson distribution:
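A sketch of the nested construction, assuming a Poisson base distribution with mean 5 and censoring points 2 and 8 (illustrative values):

```wolfram
(* Assumed base distribution *)
base = PoissonDistribution[5];

(* Censor from below first, then censor the result from above *)
nested = CensoredDistribution[{-Infinity, 8},
   CensoredDistribution[{2, Infinity}, base]];

(* The same two limits applied in a single step *)
double = CensoredDistribution[{2, 8}, base];

(* Differences of the point probabilities; Chop removes numerical noise *)
Chop[Table[N[PDF[nested, k] - PDF[double, k]], {k, 0, 10}]]
```

If the two constructions describe the same distribution, every entry of the final list is zero.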

Define a censored TruncatedDistribution:

Compare cumulative distribution functions:

Define a censored TransformedDistribution:

Cumulative distribution function:

Define a censored MarginalDistribution:

Probability density function:

Compare with the PDF of the marginal:

Define a censored ProductDistribution:

Visualize the density function using a random sample:

Censoring of a QuantityDistribution evaluates to QuantityDistribution:

Compute the mean velocity:

Applications  (4)

An insurance company buys reinsurance at a fixed retention level. Assuming claims follow a lognormal distribution, find the moments of the insurer's payout random variate:

Find moments of reinsurer's payout random variate:

The lifetime of a component follows a RayleighDistribution. The components are tested for failures for a fixed number of hours, and if a component has not failed by the end of the test, it is assumed to have a lifetime of exactly that many hours. Find the length of the test so that at most 5% of the tested components have a lifetime longer than the test:

Find the test lifetime distribution for the chosen parameter values:

Compare the censored distribution with the actual lifetime distribution:

Compare the average lifetimes:

The number of shots a beginner golfer needs to complete a par-4 hole follows a PoissonDistribution with an average of 9 shots. Assuming that he picks up the ball after the tenth shot, find the distribution of the number of shots on a par-4 hole:

Probability density function:

The average number of shots per par-4 hole on the golf course:

Find the probability that he needs more than 4 shots to sink the ball:
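The golf example could be worked through along these lines. Modeling "picks up the ball after the tenth shot" as censoring from above at 10 is an assumption about how the original example was set up.

```wolfram
(* Shots per hole: Poisson with mean 9, recorded as at most 10 *)
shots = CensoredDistribution[{-Infinity, 10}, PoissonDistribution[9]];

(* Point probabilities of the recorded number of shots *)
DiscretePlot[PDF[shots, k], {k, 0, 12}]

(* Average recorded number of shots per hole *)
N[Mean[shots]]

(* Probability of needing more than 4 shots *)
N[Probability[x > 4, x \[Distributed] shots]]
```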

The body weight of adult males in the US follows a normal distribution with a mean of 191 lbs and a standard deviation of 70 lbs. Assuming that each bathroom scale has an upper limit of 300 lbs, find the weight distribution when the measurements are done with a generic bathroom scale:

Cumulative distribution function:

Visualize the density function with a random sample:

Find the average weight:

Find the probability of weighing at least 200 lbs:

Find the probability of weight at or above the scale limit:

Compare to the uncensored distribution:
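A sketch of the bathroom-scale example, using the mean, standard deviation, and scale limit stated above. Censoring only from above (with no lower limit at zero) is an assumption about the setup.

```wolfram
(* True weight distribution and the scale-limited measurement *)
weight = NormalDistribution[191, 70];
scale = CensoredDistribution[{-Infinity, 300}, weight];

(* Symbolic CDF of the measured weight *)
CDF[scale, x]

(* Average measured weight *)
N[Mean[scale]]

(* Probability of a reading of at least 200 lbs *)
N[Probability[x >= 200, x \[Distributed] scale]]

(* Probability of a reading at the scale limit, compared with the uncensored distribution *)
N[Probability[x >= 300, x \[Distributed] scale]]
N[Probability[x >= 300, x \[Distributed] weight]]
```

The last two probabilities are expected to match, since every weight above the limit is recorded as exactly 300 lbs.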

Properties & Relations  (4)

CensoredDistribution is a special case of TransformedDistribution:

Compare censoring with truncating for a discrete distribution:

While censoring, the weight from outside is placed at the ends of the censoring interval:

While truncating, the weight from outside is redistributed proportionally over the truncation interval:

Compare censoring and truncating of a continuous distribution:

While censoring, the probability is put at the end of the censoring interval:

While truncating, the probability is distributed over the truncation interval:

Censoring of a continuous distribution may result in a mixed distribution, which is neither continuous nor discrete:

The CDF for the mixed type censored distribution is discontinuous at censoring points:

Visualize the cumulative distribution function:

The probability density function for a censored distribution is not defined, and PDF returns unevaluated:

Differentiation of the CDF results in a function that does not integrate to one:

The sample probability density estimators do not converge as the sample size increases:

Compare this with histograms for the underlying continuous distribution, where estimators do converge:

Possible Issues  (1)

Censoring of a continuous distribution may result in a mixed distribution, which is neither continuous nor discrete:

The PDF for the mixed-type censored distribution is not defined:

Computations with mixed type distributions are fully supported. Compute special moments:

Estimate a censored distribution:

A mixed-type distribution can be interpreted as a mixture of a continuous and a discrete distribution:

Introduced in 2010 (8.0) | Updated in 2016 (10.4)