Kaplan-Meier estimator
The Kaplan-Meier estimator (also known as the product limit estimator) estimates the survival function from lifetime data. In medical research, it might be used to measure the fraction of patients living for a certain amount of time after treatment. An economist might measure the length of time people remain unemployed after a job loss. An engineer might measure the time until failure of machine parts.
A plot of the Kaplan-Meier estimate of the survival function is a series of horizontal steps of declining magnitude which, when a large enough sample is taken, approaches the true survival function for that population. The value of the survival function between successive distinct sampled observations ("clicks") is assumed to be constant.
An example of a Kaplan-Meier plot for two conditions associated with patient survival.
An important advantage of the Kaplan-Meier curve is that the method can take into account "censored" data, that is, losses from the sample before the final outcome is observed (for instance, if a patient withdraws from a study). On the plot, small vertical tick marks indicate losses, where patient data has been censored. When no truncation or censoring occurs, the Kaplan-Meier curve is equivalent to the empirical distribution.
In medical statistics, a typical application might involve grouping patients into categories, for instance, those with a Gene A profile and those with a Gene B profile. In the graph, patients with Gene B die much more quickly than those with Gene A. After two years, about 80% of the Gene A patients still survive, but less than half of the patients with Gene B.
Formulation
Let S(t) be the probability that an item from a given population will have a lifetime exceeding t. For a sample of size N from this population, let the observed times until death of the N sample members be
$$t_1 \le t_2 \le t_3 \le \cdots \le t_N.$$
Corresponding to each t_i is n_i, the number "at risk" just prior to time t_i, and d_i, the number of deaths at time t_i.
Note that the intervals between each time typically will not be uniform. For example, a small data set might begin with 10 cases, have a death at day 3, a loss (censored case) at day 9, and another death at day 11. Then we have (t_1 = 3, t_2 = 11), (n_1 = 10, n_2 = 8), and (d_1 = 1, d_2 = 1).
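The bookkeeping in this example can be sketched in a few lines of Python. The function name `risk_table` and the follow-up times of the seven remaining cases (all taken to be censored at day 20) are assumptions made for illustration; the text does not say what happens to those cases. The sketch also assumes the usual convention that a case lost at the same time as a death still counts as at risk at that time.

```python
from collections import Counter


def risk_table(times, observed):
    """Return one (t_i, n_i, d_i) row per distinct death time.

    times    -- follow-up time of each case
    observed -- True if the case died at that time, False if it was censored
    """
    deaths = Counter(t for t, died in zip(times, observed) if died)
    rows = []
    for t in sorted(deaths):
        # n_i: cases still under observation just prior to t
        n_i = sum(1 for u in times if u >= t)
        rows.append((t, n_i, deaths[t]))
    return rows


# Death at day 3, loss at day 9, death at day 11.
# ASSUMPTION: the other seven cases are censored at day 20.
times = [3, 9, 11] + [20] * 7
observed = [True, False, True] + [False] * 7

for t_i, n_i, d_i in risk_table(times, observed):
    print(f"t = {t_i}, n = {n_i}, d = {d_i}")
```

The two rows produced are (3, 10, 1) and (11, 8, 1): the loss at day 9 creates no row of its own, but it lowers the number at risk at day 11 from 9 to 8.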
The Kaplan-Meier estimator is the nonparametric maximum likelihood estimate of S(t). It is a product of the form
$$\hat{S}(t) = \prod_{t_i < t} \frac{n_i - d_i}{n_i}.$$
When there is no censoring, n_i is just the number of survivors just prior to time t_i. With censoring, n_i is the number of survivors less the number of losses (censored cases). It is only those surviving cases that are still being observed (have not yet been censored) that are "at risk" of an (observed) death.
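As a rough illustration, the product can be accumulated one death time at a time. The sketch below uses the small data set from the text (t = 3, 11; n = 10, 8; d = 1, 1); the function name `kaplan_meier` is a hypothetical choice, not taken from any library.

```python
def kaplan_meier(n, d):
    """Return the running product of (n_i - d_i) / n_i, one value per death time."""
    estimates = []
    s = 1.0
    for n_i, d_i in zip(n, d):
        s *= (n_i - d_i) / n_i  # fraction of the at-risk cases surviving t_i
        estimates.append(s)
    return estimates


t = [3, 11]   # death times
n = [10, 8]   # number at risk just prior to each death time
d = [1, 1]    # deaths at each death time

steps = kaplan_meier(n, d)
for t_i, s_i in zip(t, steps):
    print(f"after day {t_i}: estimate = {s_i:.4f}")
# after day 3: estimate = 0.9000
# after day 11: estimate = 0.7875
```

Without the loss at day 9 the second factor would be 8/9 rather than 7/8, so censoring changes the estimate only through the number at risk.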
There is an alternate definition that is sometimes used, namely
$$\hat{S}(t) = \prod_{t_i \le t} \frac{n_i - d_i}{n_i}.$$
The two definitions differ only at the observed event times. The latter definition is right-continuous whereas the former definition is left-continuous.
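A minimal sketch of the difference, again using the small data set from the text: the only change between the two definitions is whether a death time equal to t is included in the product. The function `km_at` and its `right_continuous` flag are names invented for this example.

```python
def km_at(t, event_times, n, d, right_continuous=True):
    """Evaluate the product-limit estimate at time t.

    right_continuous=True  -> product over t_i <= t
    right_continuous=False -> product over t_i <  t  (left-continuous)
    """
    s = 1.0
    for t_i, n_i, d_i in zip(event_times, n, d):
        included = (t_i <= t) if right_continuous else (t_i < t)
        if included:
            s *= (n_i - d_i) / n_i
    return s


event_times, n, d = [3, 11], [10, 8], [1, 1]

for t in (3, 5):
    left = km_at(t, event_times, n, d, right_continuous=False)
    right = km_at(t, event_times, n, d, right_continuous=True)
    print(f"t = {t}: left-continuous = {left:.4f}, right-continuous = {right:.4f}")
# t = 3: left-continuous = 1.0000, right-continuous = 0.9000
# t = 5: left-continuous = 0.9000, right-continuous = 0.9000
```

At day 3, an observed death time, the two values disagree; at day 5, between death times, they coincide.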
Let T be the random variable that measures the time of failure and let F(t) be its cumulative distribution function. Note that
$$S(t) = P[T > t] = 1 - P[T \le t] = 1 - F(t).$$
Consequently, the right-continuous definition of the estimator may be preferred in order to make the estimate compatible with a right-continuous estimate of F(t).
Statistical considerations
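The Kaplan-Meier estimator is a statistic, and several estimators are used to approximate its variance. One of the most common such estimators is Greenwood's formula, written here for the right-continuous definition:
$$\widehat{\operatorname{Var}}\bigl(\hat{S}(t)\bigr) = \hat{S}(t)^2 \sum_{t_i \le t} \frac{d_i}{n_i (n_i - d_i)}.$$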
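Greenwood's formula, in its usual form (the squared estimate times the sum of d_i / (n_i (n_i - d_i)) over death times up to t), can be sketched as follows. The function name `greenwood_variance` is hypothetical, the right-continuous convention (t_i <= t) is assumed, and the sketch assumes d_i < n_i at every death time, since the sum is undefined once the estimate reaches zero.

```python
import math


def greenwood_variance(t, event_times, n, d):
    """Return (estimate, variance) at time t using Greenwood's formula.

    ASSUMPTION: d_i < n_i for every death time up to t.
    """
    s = 1.0
    total = 0.0
    for t_i, n_i, d_i in zip(event_times, n, d):
        if t_i <= t:
            s *= (n_i - d_i) / n_i
            total += d_i / (n_i * (n_i - d_i))
    return s, s * s * total


# Small data set from the Formulation section.
s_hat, var_hat = greenwood_variance(11, [3, 11], [10, 8], [1, 1])
print(f"estimate = {s_hat:.4f}, standard error = {math.sqrt(var_hat):.4f}")
```

For this data set the sum is 1/90 + 1/56, so the variance at day 11 is 0.7875^2 (1/90 + 1/56), roughly 0.018.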
In some cases, one may wish to compare different Kaplan-Meier curves. This may be done by several methods including:
- The log-rank test
- The Cox proportional hazards test
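As an illustration of the first of these, the following is a minimal sketch of the standard two-sample log-rank test: at each death time the observed deaths in one group are compared with the number expected if both groups shared the same survival function, and the squared total difference is divided by its hypergeometric variance. The function name `logrank_test` and both data sets are hypothetical, and the p-value relies on the usual chi-square approximation with one degree of freedom, which may be poor for samples this small.

```python
import math


def logrank_test(times_a, events_a, times_b, events_b):
    """Two-sample log-rank test; returns (chi_square, p_value)."""
    pooled = list(zip(times_a, events_a)) + list(zip(times_b, events_b))
    death_times = sorted({t for t, died in pooled if died})

    obs_minus_exp = 0.0  # sum over death times of (observed - expected) in group A
    variance = 0.0
    for t in death_times:
        n_a = sum(1 for u in times_a if u >= t)  # at risk in group A
        n_b = sum(1 for u in times_b if u >= t)  # at risk in group B
        d_a = sum(1 for u, e in zip(times_a, events_a) if e and u == t)
        d_b = sum(1 for u, e in zip(times_b, events_b) if e and u == t)
        n, d = n_a + n_b, d_a + d_b
        obs_minus_exp += d_a - n_a * d / n
        if n > 1:
            variance += n_a * n_b * d * (n - d) / (n * n * (n - 1))

    if variance == 0.0:
        raise ValueError("not enough information to compare the groups")
    chi_square = obs_minus_exp ** 2 / variance
    # Upper tail of a chi-square distribution with one degree of freedom.
    p_value = math.erfc(math.sqrt(chi_square / 2.0))
    return chi_square, p_value


# HYPOTHETICAL data: follow-up times, with True = death and False = censored.
group_a_times = [10, 12, 15, 20, 24]
group_a_events = [True, False, True, True, False]
group_b_times = [2, 3, 5, 6, 8]
group_b_events = [True, True, True, False, True]

chi_square, p_value = logrank_test(
    group_a_times, group_a_events, group_b_times, group_b_events
)
print(f"chi-square = {chi_square:.3f}, p = {p_value:.4f}")
```

When both groups contain identical data, observed and expected deaths agree at every death time, so the statistic is 0 and the p-value is 1.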