Power in Prediction – Part 2: A need for data and analytics in the future

A need for data

Predictive analytics, in all its forms, relies on data.1 Data is the means by which mathematical models are ‘taught’, using outcomes from past experience to drive predictions for events that are yet to happen. This need for large data sets is echoed by the law of large numbers: as the number of observations (data points) increases, the sample mean approaches the theoretical mean.2 Put simply, the more data that is available, the more accurate our predictions can be.2
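
The law of large numbers can be illustrated with a short simulation. The sketch below is not drawn from the cited sources; it assumes a fair six-sided die (theoretical mean 3.5), and the function name `mean_abs_error` is a hypothetical helper introduced purely for illustration. It measures how far the sample mean sits from the theoretical mean, on average, as the number of rolls grows.

```python
import random
import statistics

THEORETICAL_MEAN = 3.5  # expected value of one fair six-sided die roll


def mean_abs_error(n_rolls, trials=200, seed=0):
    """Average |sample mean - 3.5| over several simulated experiments.

    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    errors = []
    for _ in range(trials):
        rolls = [rng.randint(1, 6) for _ in range(n_rolls)]
        errors.append(abs(statistics.fmean(rolls) - THEORETICAL_MEAN))
    return statistics.fmean(errors)


if __name__ == "__main__":
    # The gap to the theoretical mean should shrink as the sample grows.
    for n in (10, 100, 1000, 10000):
        print(f"{n:>6} rolls: mean error {mean_abs_error(n, trials=50):.4f}")
```

With more rolls per experiment, the typical gap between the sample mean and 3.5 narrows, which is the behaviour the law of large numbers describes.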

Large data sets also allow us to draw on experience from a large number of cases in order to explore parameters with which we may, as yet, be unfamiliar. Large quantities of data may enable a computer algorithm to find patterns in intricate relationships between data points that humans would not be able to discover alone.3 The larger the data set, therefore, the greater the wealth of information a machine has from which to make these connections.

Where limited data sets may historically have been a barrier to effective predictions, we now generate more data than ever before: an estimated 2.5 quintillion bytes every day.4 This seemingly limitless supply of data has driven another wave of improvement in our predictive capabilities, and the ability to fine-tune predictions is theoretically greater than ever. Understanding how best to utilize such vast quantities of data, however, remains an important challenge.

While the size of data sets is important, the quality of data must also be considered. Data bias and variance are important challenges with any large data set, with good computational models aiming to minimize both.3 Large data sets help to reduce these factors; however, we must consider the implications of training computational models with incomplete data sets, where 'incomplete' refers not necessarily to size but to data quality. Data scientists have fallen foul of this when training algorithms with data that does not accurately reflect the diversity of a given patient population. In some instances, this has resulted in substantial bias when identifying patients for prioritized treatment to optimize discharge lists,5 or when developing tools to explore genomic data from only a limited population sample.6 Data inputs may mirror the unequal healthcare access that we experience today.5 We cannot expect predictive algorithms to function in a balanced manner if we do not expose them to appropriately balanced data. Overall, data bias is an important hurdle to address when developing healthcare algorithms and, indeed, all predictive algorithms.
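
The effect of unbalanced training data can be sketched with a deliberately simple toy model. Everything below is an assumption made for illustration: the two groups, their outcome values and the helper names `fit_mean_model` and `error_by_group` are hypothetical and do not come from the cited studies. The 'model' simply predicts the average of whatever it was trained on, which is enough to show how under-representation of one group shifts the error onto that group.

```python
from statistics import fmean

# Hypothetical "true" outcome for two patient groups (illustrative numbers only).
TRUE_VALUE = {"group_a": 10.0, "group_b": 20.0}


def fit_mean_model(n_a, n_b):
    """'Train' the simplest possible model: predict the mean of the training data."""
    training = [TRUE_VALUE["group_a"]] * n_a + [TRUE_VALUE["group_b"]] * n_b
    return fmean(training)


def error_by_group(prediction):
    """Absolute prediction error for each group."""
    return {group: abs(prediction - value) for group, value in TRUE_VALUE.items()}


# Training data dominated by group A (95% versus 5%).
skewed = error_by_group(fit_mean_model(n_a=95, n_b=5))
# Training data with equal representation.
balanced = error_by_group(fit_mean_model(n_a=50, n_b=50))

print(skewed)    # {'group_a': 0.5, 'group_b': 9.5}
print(balanced)  # {'group_a': 5.0, 'group_b': 5.0}
```

In the skewed case the model looks accurate for the well-represented group while performing far worse for the under-represented one; with balanced data the error is shared equally. Real predictive models are far more complex, but the same mechanism applies.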

Predictive analytics in healthcare

With advances in predictive mathematics across so many different sectors, it is no surprise that the healthcare industry is also exploring the ways patient care can be optimized through the use of predictive analytics. The potential application of these analytics is vast. They may be used to improve differential diagnoses, to alert users to potential health risks, to prevent disease exacerbation, to deliver healthcare solutions and to provide treatment recommendations.7

Predictive analytics could be used to support ophthalmologists in the diagnosis of diabetic retinopathy and dermatologists in the identification of skin lesions in different skin cancers.8,9 Analytics may also be used to predict disease exacerbations, proactively identifying when patients may be susceptible to events such as asthma attacks, septic shock, organ failure or migraine; this potential has already been evaluated in studies with promising results.10–15

Predictive analytics may even be used as a treatment option itself, as demonstrated by a team in the USA who developed AI to control upper limb prostheses.16 Moreover, they may support patient care by optimizing administrative tasks within healthcare services to offer much-needed cost-saving initiatives.7 Triage algorithms may also be beneficial where waiting times are currently unsatisfactory.17 Finally, rural communities with poor medical resources may benefit from the application of data analysis. In rural areas of China, for example, a portable all-in-one diagnostic station has been trialled. The machine can run tests automatically and send the results for online data analysis in order to generate a diagnosis.18

New technologies are now also available to passively assess treatment use and disease control in asthma and COPD patients and are able to alert users to disease worsening up to 10 days before they are at risk of hospitalization.14 By alerting patients and healthcare professionals early, steps can be taken to avoid hospitalization in some of these subjects.

At its extreme, the advent of analytics and predictive analytics, and their increasing implementation within our daily lives, may lead critics to ask whether this technology will eventually replace humans completely in several capacities. With such a broad scope of application, predictive analytics are, at first glance, able (or at least potentially able) to manage many of the tasks traditionally assigned to humans. However, others have argued that a blend of human experience and digital augmentation is, and will always be, necessary. Each type of intelligence brings something different to the treatment pathway, ensuring a complementary balance of human-led empathy and computer-derived experience and recommendations.

While much is still to be done, these new technologies, techniques and insights have in many ways built on the foundations laid by our early forefathers in their approach to predictions. An exciting future lies ahead, no doubt with challenges and extensive, necessary debate, and with a growing role for analytics and predictive analytics in all our futures.

Want to learn more? See our article ‘Power in Prediction – Part 1: A look through history’, explore respiratory_care v2.0, and read our whitepapers on small steps towards big data, redefining the value of healthcare, technology and healthcare collaboration, and a need for accessibility.


  1. ‘Prediction by the numbers’ transcript. A NOVA (PBS) documentary. 2018. Available at: https://www.pbs.org/wgbh/nova/video/prediction-by-the-numbers. [Accessed January 2020].
  2. Routledge R. Law of large numbers. Available at: https://www.britannica.com/science/law-of-large-numbers. [Accessed January 2020].
  3. Maheswari JP. Breaking the curse of small datasets in machine learning: part 1. Available at: https://towardsdatascience.com/breaking-the-curse-of-small-datasets-in-machine-learning-part-1-36f28b0c044d. [Accessed February 2020].
  4. Marr B. How much data do we create every day? The mind-blowing stats everyone should read. Forbes. 2018. Available at: https://www.forbes.com/sites/bernardmarr/2018/05/21/how-much-data-do-we-create-every-day-the-mind-blowing-stats-everyone-should-read/#4ebb608c60ba. [Accessed January 2020].
  5. Nordling L. A fairer way forward for AI in healthcare. 2019. Available at: https://www.nature.com/articles/d41586-019-02872-2. [Accessed November 2019].
  6. The medical futurist. AI bias in healthcare. 2019. Available at: https://medicalfuturist.com/a-i-bias-in-healthcare/. [Accessed December 2019].
  7. Business Insider. AI and machine learning are changing our approach to medicine and the future of healthcare. 2019. Available at: https://www.businessinsider.com/artificial-intelligence-healthcare?r=US&IR=T. [Accessed January 2020].
  8. Ives J. Study shows how AI can improve physician’s diagnostic accuracy. Available at: https://www.news-medical.net/news/20190319/Study-shows-how-AI-can-improve-physicians-diagnostic-accuracy.aspx. [Accessed January 2020].
  9. Esteva A, et al. Nature 2017; 542(7639): 115–118.
  10. Ghosh S, et al. J Biomed Inform 2017; 66: 19–31.
  11. Arnold R, et al. Crit Care 2012; 16(Suppl 1): 37.
  12. Siirtola P, et al. Sensors (Basel) 2018; 18(5): 1374.
  13. Honkoop PJ, et al. BMJ Open 2017; 7(1): e013935.
  14. Safioti et al. Poster (P693) presented at the ATS Annual Conference, Dallas, TX, USA. 17–22 May 2019.
  15. Tattersfield AE, et al. Am J Respir Crit Care Med 1999; 160: 594–599.
  16. Bouton CE, et al. Nature 2016; 533: 247–250.
  17. Buch VH, et al. Br J Gen Pract 2018; 68(668): 143–144.
  18. Guo J and Li B. Health Equity 2018; 2: doi:10.1089/heq.2018.0037.

March 2020 RESP-42108
