Biomarker Discovery: Function and Advancement

Guest author

Biomarker Discovery & Validation

Biomarker discovery is the process by which biological markers, or “biomarkers” for short, are identified, measured, and evaluated. It begins by defining the biological process of interest and then analyzing samples for different types of biomarkers. These can be diagnostic, prognostic, or predictive, and they can also be used to determine treatment efficacy or monitor patient safety. Taking a broader view, the World Health Organization defines a biomarker as “almost any measurement reflecting an interaction between a biological system and a potential hazard, which may be chemical, physical, or biological.” Biomarkers are also an integral part of drug development, as they help determine drug safety and efficacy accurately. In contemporary medicine, many clinical decisions are based on laboratory test results. According to the Centers for Disease Control and Prevention, these tests inform about 70% of clinical decisions.

Several large-scale approaches, including proteomics, metabolomics, and genomics, are used to discover biomarkers.

  1. Mass Spectrometry: An analytical technique that can measure any biological molecule susceptible to ionization; it is widely used in toxicological testing. It is also the main tool of proteomics, the large-scale analysis of the proteins in an organism.

  1. Metabolomics: The comprehensive analysis of metabolites (the small molecules produced or consumed by metabolism) present in a sample or organism. A familiar example of measuring a single metabolite is the blood-glucose test strip used in diabetes testing.

  1. Gene Expression: The process by which the genetic information in a gene is transcribed into RNA and, for protein-coding genes, translated into a protein molecule. These proteins shape an organism's observable traits, or phenotype.

Analyzing thousands of individual molecules in this way produces large, complex data sets. Before a new biomarker can be accepted, extensive research, including clinical trials that evaluate the efficacy of a proposed intervention, must be conducted. However, these studies can take upwards of 7 to 8 years to conclude and cost tens of millions of dollars. This burden is precisely why policymakers have worked with clinical decision-makers over the past few years to improve the speed of biomarker discovery.

The validation of a biomarker is closely linked to its clinical significance and its measured performance. A biomarker is considered valid only once the range of conditions under which its results are replicable and accurate has been determined. Measures of validity are a necessary component of clinical research, confirming that a biomarker is effective in clinical applications.
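One common way to put numbers on "measurement of performance" is to compute sensitivity and specificity at a chosen cutoff. The sketch below is only an illustration under stated assumptions: the biomarker levels, disease labels, and cutoffs are made up, and the function name `sensitivity_specificity` is hypothetical rather than taken from any particular validation guideline.

```python
def sensitivity_specificity(values, has_disease, cutoff):
    """Return (sensitivity, specificity) when values >= cutoff are called positive."""
    tp = fn = tn = fp = 0
    for value, diseased in zip(values, has_disease):
        positive = value >= cutoff
        if diseased and positive:
            tp += 1          # diseased, correctly flagged
        elif diseased:
            fn += 1          # diseased, missed
        elif positive:
            fp += 1          # healthy, wrongly flagged
        else:
            tn += 1          # healthy, correctly cleared
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Made-up biomarker levels (arbitrary units) and disease labels, for illustration only.
levels = [2.1, 3.5, 5.8, 6.4, 7.2, 1.9, 4.9, 6.9]
labels = [False, False, True, True, True, False, False, True]

for cutoff in (3.0, 5.0, 7.0):
    sens, spec = sensitivity_specificity(levels, labels, cutoff)
    print(f"cutoff={cutoff}: sensitivity={sens:.2f}, specificity={spec:.2f}")
```

Moving the cutoff trades one measure against the other, which is why a validated biomarker is reported together with the range and threshold at which it was assessed.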

Four Main Types of Biomarkers

Biomarkers are found within every system of the human body (for example, the digestive, respiratory, and cardiovascular systems). They can also be measured in tissue and in most bodily samples, such as urine, blood, and stool. For example, a patient with a breast tumor may undergo a biopsy to obtain a tissue sample. The sample is then analyzed for three breast cancer biomarkers: the estrogen receptor (ER), progesterone receptor (PR), and human epidermal growth factor receptor 2 (HER2). Detecting diagnostic biomarkers in this way is key to the early detection of disease.

These biomarkers can be broken down into four main types: molecular, histologic, radiographic, and physiologic.

  1. Molecular Biomarkers: Have biophysical properties that can be measured in a biological sample (for example, blood, plasma, serum, or another bodily fluid). A molecular marker can indicate how well the body responds to a treatment for a disease or condition. To understand the clinical significance of a molecular biomarker, it helps to know the underlying biology. DNA, or deoxyribonucleic acid, holds the genetic material in our cells and carries this information to the next generation. RNA, or ribonucleic acid, is known as the messenger of the cell: messenger RNA carries genetic information out of the nucleus to the ribosomes, where it is decoded into a sequence of amino acids. Through this process, called "translation," the amino acids are joined into proteins, which carry out most of the work of the human body. The most prominent molecular biomarkers, typically proteins and genes, are used in cancer. They can predict the likelihood of metastasis (the spread of cancer to other parts of the body) and identify the interventions to which a cancer is most likely to respond.

  1. Histologic Biomarkers: Normally found in tissue samples, histologic biomarkers reflect biochemical or molecular changes in cells and tissues, and are often used to study the toxicological effects of chemicals across the body's systems. These biomarkers are usually defined by distinct cell types and tissue features, revealed through special stains or tissue microarrays. Histologic examination is considered the "gold standard" for most cancer diagnoses because of its ability to detect molecular alterations. For example, lesions containing intraepithelial neoplasia are considered predictive biomarkers for carcinomas.

  1. Radiographic Biomarkers: Commonly referred to as imaging biomarkers, these are features of an image or sequence of images used to diagnose conditions. MRI, X-ray, and CT scans are all examples of radiographic procedures used to evaluate imaging biomarkers. Imaging biomarkers can be further divided into anatomical (like the diameter of a tumor) and functional (like oxygen levels and other physiological features). With these images readily available, clinicians can reach a more accurate diagnosis and begin treatment before the disease progresses. Clinical trials also benefit from imaging biomarkers, as visualizing the pharmacodynamic effects of drugs decreases the time and cost of drug development.

  1. Physiologic Biomarkers: Broadly defined as measures of processes throughout the body, including vital signs monitoring and electrocardiograms. Because they are simple to measure, physiologic biomarkers are the least expensive to detect, and the category is broad enough to cover all the bodily systems. Many of these biomarkers are measured during a standard doctor's visit as part of routine diagnostic exams.

Incorporation of Machine Learning

Recently, machine learning has been adopted as a transformative approach to biomarker discovery. Machine learning applies mathematical models that learn patterns from data, such as gene expression measurements, instead of following hand-written rules. An easy way to think of this is teaching a child to pick the blue marbles out of a large pile of red marbles. The same concept applies to machine learning and gene expression: an algorithm is trained to look for the characteristics of biomarkers among a large sample of genes and to isolate the genes that show them. Compared with traditional techniques (like MRI, CT scans, etc.), gene expression analysis can give a more complete picture of the mechanism of disease progression. Such approaches have been used to predict response to cancer treatment and patient survival.
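The marble-sorting idea can be sketched as a simple filter that ranks genes by how strongly their expression differs between disease and control samples. This is a minimal illustration rather than the method of any specific study: the gene names, expression values, and the function `rank_genes` are all hypothetical, and real pipelines typically add normalization, multiple-testing correction, and validation on independent data.

```python
from statistics import mean, stdev


def rank_genes(expression, labels):
    """Rank genes by absolute standardized mean difference between two groups.

    expression: dict mapping gene name -> list of expression values, one per sample
    labels: list of booleans, True for disease samples, False for controls
    """
    scores = {}
    for gene, values in expression.items():
        cases = [v for v, is_case in zip(values, labels) if is_case]
        controls = [v for v, is_case in zip(values, labels) if not is_case]
        # Average of the two group spreads; a small constant avoids division by zero.
        spread = (stdev(cases) + stdev(controls)) / 2 + 1e-9
        scores[gene] = abs(mean(cases) - mean(controls)) / spread
    # Highest-scoring genes first: these are the candidate biomarkers.
    return sorted(scores, key=scores.get, reverse=True)


# Made-up expression values for three hypothetical genes across six samples.
labels = [True, True, True, False, False, False]
expression = {
    "GENE_A": [9.1, 8.7, 9.4, 3.2, 2.9, 3.5],  # strongly different between groups
    "GENE_B": [5.0, 5.2, 4.9, 5.1, 4.8, 5.3],  # essentially unchanged
    "GENE_C": [6.0, 7.5, 6.8, 5.5, 6.1, 4.9],  # weakly different
}

print(rank_genes(expression, labels))
```

A ranking like this only nominates candidates; the top genes would still need the clinical validation described earlier in the article.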


Advances in biomarker discovery are being applied to improve patient outcomes and guide early therapeutic decisions. These advances should continue to raise the likelihood of diagnosing diseases at an earlier stage, helping to prevent progression to late-stage disease. However, people cannot analyze and interpret such large amounts of data unaided, whereas machine learning algorithms can. Traditional statistical analyses and tools are also becoming outdated as the volume of molecular data grows beyond what they can handle computationally. As a result, advances in biomarker discovery have led to a more extensive and more diverse classification of diseases. The development of biomarkers is evidence of progress in clinical diagnosis, prognosis, and disease monitoring. [Further reading: Integrating AI/ML Models for Patient Stratification Leveraging Omics Dataset and Clinical Biomarkers from COVID-19 Patients: A Promising Approach to Personalized Medicine]