After you’ve collected the data from an investigator-initiated trial, the next step is to run statistical analyses to extract meaningful insights. Statistical analysis can seem daunting if you’re not sure where or how to begin, so in this article we’ll outline how you can get started using descriptive statistics.
Descriptive Statistics
The aim of descriptive statistics, as the name implies, is to summarize and organize the data exported from a database. More specifically, they describe the characteristics of the measured variables and their relationships. The three main types of descriptive statistics are distribution, central tendency, and variability.
Distribution
Distribution analysis is used to determine the frequency of each value, that is, how often it appears in the data. For example, suppose you ran a clinical study that evaluated a new viscous eye drop solution for treating dry eyes. The trial revealed several side effects for this product, as shown in the frequency distribution table below. Based on the distribution, you can see that blurry vision is by far the most frequently reported symptom and should therefore be evaluated further in future studies.
Table 1: The side effects of using a novel eye drop
Symptom | Number of times reported
Foreign body sensation | 10
Burning | 10
Redness | 5
Itchiness | 5
Blurry vision | 50
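As a minimal sketch, the counts in Table 1 can be tallied and ranked in Python with the standard library. The variable names are illustrative, and the relative share of each symptom is an added convenience that does not appear in the table itself.

```python
from collections import Counter

# Counts copied from Table 1 (number of times each symptom was reported).
side_effect_counts = Counter({
    "Foreign body sensation": 10,
    "Burning": 10,
    "Redness": 5,
    "Itchiness": 5,
    "Blurry vision": 50,
})

total_reports = sum(side_effect_counts.values())

# most_common() orders the symptoms from most to least frequently reported.
for symptom, count in side_effect_counts.most_common():
    share = count / total_reports
    print(f"{symptom:<24}{count:>4}{share:>8.1%}")

most_frequent_symptom, most_frequent_count = side_effect_counts.most_common(1)[0]
```

Here blurry vision accounts for 50 of the 80 reports, which is why it stands out as the symptom to follow up on.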
Central Tendency
Central tendency is used to describe the center of the data set and can be described in one of three ways:
- Mean: the average of all the values
- Median: the value that is in the middle of the dataset
- Mode: the most frequent value that appears
Using the previous example, let’s assume that the participants were asked to fill out a survey giving a comfort rating from 1 to 10 (1 for least comfortable and 10 for most comfortable) after receiving the eye drop. The scores for the participants, as well as the mean, median, and mode, are shown in Table 2 below. It’s important to keep in mind that while the mean, median, and mode often give similar values, they can differ drastically depending on the data set.
Table 2: The mean, median, and mode for comfort rating score from 9 participants after their eye drop instillation
Data set | 1, 1, 1, 1, 8, 9, 9, 9, 10
Mean | 5.4
Median | 8
Mode | 1
If only one of the mean, median, or mode were used to describe the central tendency of this data set, you would draw very different conclusions. For instance, the mean (5.4) would suggest that the eye drop was only moderately comfortable. The median (8) would suggest that it was very comfortable, while the mode (1) would suggest the complete opposite. Therefore, it is important to consider all three values in your statistical analysis to describe the data accurately.
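The values in Table 2 can be reproduced with Python’s built-in `statistics` module. This is only a sketch; the variable names are illustrative.

```python
import statistics

# Comfort ratings from the 9 participants in Table 2.
comfort_scores = [1, 1, 1, 1, 8, 9, 9, 9, 10]

mean_score = statistics.mean(comfort_scores)      # sum of the scores divided by 9
median_score = statistics.median(comfort_scores)  # middle value of the sorted scores
mode_score = statistics.mode(comfort_scores)      # most frequently occurring score

print(f"Mean:   {mean_score:.1f}")
print(f"Median: {median_score}")
print(f"Mode:   {mode_score}")
```

Printing all three side by side makes it easy to spot a data set like this one, where the scores cluster at both ends of the scale and no single number tells the whole story.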
Variability
Variability is another important measure that describes the spread of the values within the data set. It can be described using the range, standard deviation, and variance.
- Range: the difference between the maximum and minimum values of the dataset
- Standard deviation: a measure of how far the values typically lie from the mean. Results are commonly reported as mean ± standard deviation. A high standard deviation indicates high variability within the data set.
- Variance: the average of the squared deviations from the mean, which equals the square of the standard deviation
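To make these three measures concrete, here is a short sketch that computes them for the comfort scores in Table 2. It assumes the nine scores are a sample from a larger population, so it uses the sample formulas (`statistics.stdev` and `statistics.variance`, which divide by n − 1). If the scores were the entire population, `statistics.pstdev` and `statistics.pvariance` would be the appropriate calls.

```python
import statistics

# Comfort ratings from Table 2.
comfort_scores = [1, 1, 1, 1, 8, 9, 9, 9, 10]

# Range: difference between the largest and smallest values.
score_range = max(comfort_scores) - min(comfort_scores)

# Assumption: the scores are a sample, so the n - 1 formulas are used.
sample_sd = statistics.stdev(comfort_scores)
sample_variance = statistics.variance(comfort_scores)

mean_score = statistics.mean(comfort_scores)

print(f"Range:     {score_range}")
print(f"Mean ± SD: {mean_score:.1f} ± {sample_sd:.1f}")
print(f"Variance:  {sample_variance:.1f}")
```

The standard deviation here is large relative to the 1 to 10 scale, which reflects how the scores are split between the two ends of the scale.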
The variability of the data set matters when you need to judge whether two measurements truly differ. For instance, suppose you’re interested in whether the new eye drop formulation is superior to a commercial eye drop in terms of comfort. In one scenario, the comfort scores were 7 ± 0.5 for the new eye drop and 6 ± 0.5 for the commercial drop. Because the variability is low relative to the difference between the means, you can be more confident that the new eye drop is indeed superior. However, in another scenario, if the comfort scores were 7 ± 3 and 6 ± 3, it becomes much more difficult to discern a difference between these eye drops without further statistical analysis.
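One rough way to see why the two scenarios differ is to express the difference between the means in units of the standard deviation. The sketch below does this for both scenarios. The helper `difference_in_sd_units` is a hypothetical name introduced for illustration, it assumes both groups share the same standard deviation, and it is not a significance test.

```python
def difference_in_sd_units(mean_a: float, mean_b: float, sd: float) -> float:
    """Difference between two means, expressed in standard deviations.

    Illustrative helper only: assumes both groups have the same SD.
    """
    return (mean_a - mean_b) / sd


# Scenario 1: 7 ± 0.5 vs 6 ± 0.5 (low variability).
low_spread = difference_in_sd_units(7, 6, 0.5)

# Scenario 2: 7 ± 3 vs 6 ± 3 (high variability).
high_spread = difference_in_sd_units(7, 6, 3)

print(f"Low variability:  means differ by {low_spread:.2f} SD")
print(f"High variability: means differ by {high_spread:.2f} SD")
```

In the first scenario the means are two standard deviations apart, while in the second they are only a third of a standard deviation apart, so the same one-point difference is far harder to distinguish from ordinary scatter.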
We hope that this article gave you a non-daunting starting point for statistical analysis. If you need more help with your analysis or with any other aspect of your clinical study, please contact Sengi. Stay tuned for our next article, where we’ll cover inferential statistics.