Survival Analysis

Biostatistical Methods


BioXpedia is proud to offer data analysis using survival analysis.

This data analysis uses survival analysis to determine whether two groups of patients differ in survival.

The data analysis includes the following components:

  • Detailed PDF report.
  • Data handling.
  • Fitting of Cox regression models.
  • Visualization of survival using Kaplan-Meier curves, with groups compared by logrank tests.

Read below for more information on survival analysis: 

Survival analysis can, as the name suggests, be used to answer questions related to survival. What proportion of patients suffering from lung cancer will survive past five years? Does survival differ between two groups? More generally, survival analysis estimates the time until an event occurs. This event could be death, but it could also be recovery or infection (George et al., 2014). For this reason, survival analysis is also widely used outside the medical and biological fields.

Cox regression is a so-called multivariable analysis (often loosely called multivariate), which means that it can be used to investigate the effects of multiple variables at once on the risk of experiencing an event. Cox regression can handle both categorical variables (e.g. male or female, treatment or placebo) and quantitative variables (e.g. age or height). In the survival analysis literature, variables are often referred to as covariates (Gozal et al., 2016).

Cox regression models the hazard rate: the instantaneous risk of suffering an event at a specific time, given that the patient has not yet experienced the event at that time. The model assumes that the hazards of the groups being compared stay proportional to each other over time, which is why Cox regression is also known as proportional hazards regression.
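For readers who prefer notation, the hazard rate and the Cox model are conventionally written as below. The symbols are assumptions introduced here for illustration and do not come from the text above: T is the time to the event, h_0(t) is an unspecified baseline hazard, x_1, ..., x_p are the covariates and \beta_1, ..., \beta_p are the regression coefficients.

```latex
% Hazard rate: instantaneous event rate at time t among patients still at risk
h(t) = \lim_{\Delta t \to 0} \frac{P(t \le T < t + \Delta t \mid T \ge t)}{\Delta t}

% Cox model: baseline hazard scaled by an exponential function of the covariates
h(t \mid x_1, \dots, x_p) = h_0(t) \exp(\beta_1 x_1 + \dots + \beta_p x_p)

% Hazard ratio for a one-unit increase in covariate x_j, all other covariates held fixed
\mathrm{HR}_j = \exp(\beta_j)
```

Because the baseline hazard cancels when two patients are compared, the ratio of their hazards does not depend on time, which is the proportional hazards assumption.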

When comparing the hazard of two groups, the hazard ratio is often used. It is interpreted in much the same way as an odds ratio or a relative risk. A simple estimate takes the ratio of observed to expected events in group 1 and divides it by the same ratio in group 2, where the expected counts are the numbers of events anticipated if both groups had the same survival. A value of 1 indicates that the risk of an event is equal in the two groups, while a value larger than 1, e.g. 3.5, means that at any given time the risk of an event is 3.5 times higher in group 1 than in group 2.
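The observed-over-expected calculation can be sketched in a few lines of Python. The function name and the event counts are hypothetical and chosen only to reproduce the value 3.5 used in the text. A Cox regression would normally estimate the hazard ratio from the fitted coefficients instead.

```python
def hazard_ratio(observed_1, expected_1, observed_2, expected_2):
    """Simple hazard ratio estimate: (O1 / E1) divided by (O2 / E2)."""
    return (observed_1 / expected_1) / (observed_2 / expected_2)


# Hypothetical counts: group 1 had 14 events where 8 were expected,
# group 2 had 6 events where 12 were expected.
hr = hazard_ratio(14, 8.0, 6, 12.0)
print(hr)  # 3.5, i.e. the risk of an event is 3.5 times higher in group 1
```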

Kaplan-Meier plots are a very common way of visualizing survival data, and they are often shown alongside the results of a Cox regression. A Kaplan-Meier plot shows time on the x-axis and the estimated probability of surviving on the y-axis. When two groups are compared, the plot has two curves that start at 1.0 on the y-axis and decrease in steps as the x-values increase. Each vertical drop in a curve represents one or more events, e.g. a patient dying. If the difference in survival between the two groups is large, then the difference in x-values for the same y-value will be large. Kaplan-Meier plots often show confidence intervals as dashed lines or shaded areas around the curves (Dudley et al., 2016).
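As a rough illustration of how the steps of such a curve arise, the sketch below computes Kaplan-Meier survival probabilities for one group in plain Python. The function name and the follow-up data are hypothetical. The sketch also assumes that some patients are censored, meaning that they leave the study before an event is observed; this is the usual situation in survival data, although it is not discussed above.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Return [(time, survival_probability)] at each distinct event time.

    times  -- follow-up time for each patient
    events -- 1 if the event (e.g. death) was observed, 0 if censored
    """
    event_counts = Counter(t for t, e in zip(times, events) if e == 1)
    leaving = Counter(times)  # patients leave the risk set at their own time
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(leaving):
        d = event_counts.get(t, 0)
        if d > 0:
            # Multiply by the fraction of at-risk patients who survive time t.
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        at_risk -= leaving[t]
    return curve


# Hypothetical follow-up times in months; 0 marks a censored patient.
times = [2, 3, 3, 5, 8, 8, 12]
events = [1, 1, 0, 1, 1, 0, 0]
km_curve = kaplan_meier(times, events)
for t, s in km_curve:
    print(f"month {t}: estimated survival {s:.3f}")
```

Each tuple in `km_curve` corresponds to one vertical drop in the plotted curve. Censored patients produce no drop, but they reduce the number of patients at risk for later time points.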

The logrank test is a non-parametric test used to compare the survival of two or more groups. Non-parametric means that the test makes no assumptions about the shape of the distribution of survival times.

The null hypothesis of the logrank test is that there is no difference in survival between the groups. The p-value is then obtained by calculating how likely a difference at least as large as the observed one would be, given that the underlying populations have the same survival distribution (Schober and Vetter, 2018).
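The two-group case of the logrank test can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not a substitute for a statistics package: the function name and the data are hypothetical, an event indicator of 0 marks a patient who left the study without an observed event, and the p-value comes from the chi-square distribution with one degree of freedom.

```python
import math


def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group logrank test; returns (chi_square, p_value)."""
    pooled = list(zip(times_a, events_a)) + list(zip(times_b, events_b))
    event_times = sorted({t for t, e in pooled if e == 1})
    observed_a = expected_a = variance = 0.0
    for t in event_times:
        # Patients still at risk just before time t in each group.
        n_a = sum(1 for x in times_a if x >= t)
        n_b = sum(1 for x in times_b if x >= t)
        # Events occurring exactly at time t in each group.
        d_a = sum(1 for x, e in zip(times_a, events_a) if x == t and e == 1)
        d_b = sum(1 for x, e in zip(times_b, events_b) if x == t and e == 1)
        n, d = n_a + n_b, d_a + d_b
        observed_a += d_a
        expected_a += d * n_a / n  # expected events if survival is equal
        if n > 1:
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    chi_square = (observed_a - expected_a) ** 2 / variance
    # Upper tail of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi_square / 2.0))
    return chi_square, p_value


# Hypothetical follow-up times in months for two groups of six patients.
group_a_times = [6, 7, 10, 15, 19, 25]
group_a_events = [1, 0, 1, 1, 0, 1]
group_b_times = [1, 2, 3, 5, 8, 11]
group_b_events = [1, 1, 1, 1, 1, 1]

chi_square, p_value = logrank_test(
    group_a_times, group_a_events, group_b_times, group_b_events
)
print(f"chi-square = {chi_square:.3f}, p = {p_value:.4f}")
```

At every event time the test compares the number of events observed in one group with the number expected if both groups shared the same survival distribution, so a small p-value indicates that the observed difference would be unlikely under the null hypothesis.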


  1. Dudley, William N et al. “An Introduction to Survival Statistics: Kaplan-Meier” Journal of the advanced practitioner in oncology vol. 7,1 (2016): 91-100.

  2. George, Brandon et al. “Survival analysis and regression models.” Journal of nuclear cardiology: official publication of the American Society of Nuclear Cardiology vol. 21,4 (2014): 686-94.

  3. Gozal, David et al. “Sleep Apnea and Cancer: Analysis of a Nationwide Population Sample.” Sleep vol. 39,8 (2016): 1493-500.

  4. Schober, Patrick, and Thomas R Vetter. “Survival Analysis and Interpretation of Time-to-Event Data: The Tortoise and the Hare.” Anesthesia and analgesia vol. 127,3 (2018): 792-798.