This is the first edition of my new weekly series ‘Simple Sunday’. The idea for this series was initially formed from a Carl Sagan quote I happened to stumble across:
“I think I’m able to explain things because understanding wasn’t entirely easy for me. Some things that the most brilliant students were able to see instantly I had to work to understand. I can remember what I had to do to figure it out. The very brilliant ones figure it out so fast they never see the mechanics of understanding.”
This weekly series will really try to encompass this quote by taking difficult concepts and breaking them down to their base ideas to try and help you guys understand them. This is also great because for me to even try and simplify concepts I have to learn the ins and outs too!
And as always rampant commenting is strongly encouraged…
So, without further ado, let’s kick this series off. For this week’s edition I asked a few of my friends which science concept or thing they found most difficult to understand, with the general consensus being: statistics.
In a nutshell, it is number crunching. But it is possibly one of the scientist’s greatest allies in describing data, analysing data and informing conclusions. This post will therefore describe the process of managing and presenting results and explain some basic statistics that can be done.
So let’s start with the simple descriptive statistics you can use to draw initial conclusions from your data (and to make it look better when you present it). Descriptive statistics can be a picture (graph/chart) or numerical.
Once you have completed your experiment and obtained your data the first step is to summarise it allowing you to gain an idea of ‘what’s going on’. For data that is categorical (where individuals are placed into groups), you will typically summarise your results by using the number in each group (frequency) or the percentage (relative frequency). So if I were to stand out on the street and count the number of different car makes that drove past we would end up with something like this:
Strange how they added up exactly to 100, it’s almost as if this data were made up… But anyway, this can be shown as a bar chart. This simply and quickly shows us that, out of the 100 cars to drive past, Ford and Vauxhall makes were the most common, with Land Rover the least common.
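If you’d rather let a computer do the counting, here is a minimal Python sketch of frequency and relative frequency. The tally in `observed` is completely hypothetical (ten made-up cars, not the numbers from my street count), and the variable names are just ones I picked for illustration.

```python
from collections import Counter

# Hypothetical tally of car makes (made-up data for illustration only).
observed = ["Ford"] * 5 + ["Vauxhall"] * 4 + ["Land Rover"] * 1

# Frequency: how many cars fall into each category.
frequency = Counter(observed)
total = sum(frequency.values())

# Relative frequency: each count as a percentage of the total.
relative = {make: 100 * n / total for make, n in frequency.items()}

for make, n in frequency.most_common():
    print(f"{make}: {n} cars ({relative[make]:.0f}%)")
```

The same counts are what you would feed into a bar chart, one bar per make.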
So that is the first descriptive statistic you can use to present data. But if we have numerical data with actual numbers that have meaning (not just frequencies), then we will use more descriptive statistics. Let’s imagine I’ve completed a hypothetical experiment on the effect of a drug on heart rate. In this experiment I’ve taken 30 individuals, measured their normal heart rate, and then subjected them to the drug and measured their heart rate again.
Here we have the results (completely fabricated), but at first look it does seem that the drug is increasing heart rate…
Now, the first step into trying to see the ‘bigger picture’ here would be to take averages of each result. After taking the averages we have already started to make our data better to report/present.
‘….the average heart rate (beats per minute) of the 30 individuals before testing was 70.1, this increased to 86.0 after drug administration with an average increase of 15.9….’
But let’s improve this further… by taking the standard deviation of the results. Standard deviation is a measure of how spread out the results are.
- Find the mean of the results
- For each number, subtract the mean and square the difference
- Average the squared differences (this is the variance of the results) and take the square root
OR, as I do, simply use Excel’s formula to make it easy!
Use the ‘STDEV.P’ formula if your data are the entire population you are interested in, and use ‘STDEV.S’ if your data are a sample taken from a bigger population.
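For anyone who prefers code to spreadsheets, here is a rough Python sketch of the standard deviation recipe: find the mean, square each difference from it, average the squared differences and take the square root. The heart rates in `rates` are made up for illustration. Dividing by n corresponds to Excel’s STDEV.P and dividing by n - 1 corresponds to STDEV.S.

```python
import math
import statistics

# Hypothetical heart rates in beats per minute (made-up values).
rates = [62, 70, 68, 75, 80, 65]

mean = statistics.mean(rates)
squared_diffs = [(x - mean) ** 2 for x in rates]

# Population form: divide by n (like Excel's STDEV.P).
sd_population = math.sqrt(sum(squared_diffs) / len(rates))
# Sample form: divide by n - 1 (like Excel's STDEV.S).
sd_sample = math.sqrt(sum(squared_diffs) / (len(rates) - 1))

# The standard library's own functions should agree with the manual steps.
assert math.isclose(sd_population, statistics.pstdev(rates))
assert math.isclose(sd_sample, statistics.stdev(rates))

print(f"mean = {mean:.1f}")
print(f"population SD = {sd_population:.2f}, sample SD = {sd_sample:.2f}")
```

The sample version always comes out a little larger, because dividing by n - 1 compensates for estimating the mean from the same data.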
You can also compute the standard error of the results, which quantifies the precision of the mean. It is a measure of how far your sample mean is likely to be from the true population mean, and it is calculated by dividing the standard deviation by the square root of the number of results. Now we can state the average of our results and give the variation by using the plus or minus symbol: ‘±’.
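As a quick sketch, the standard error is conventionally the sample standard deviation divided by the square root of the number of results. The heart rates below are made up for illustration and the variable names are my own.

```python
import math
import statistics

# Hypothetical heart rates in beats per minute (made-up values).
rates = [62, 70, 68, 75, 80, 65]

mean = statistics.mean(rates)
sd = statistics.stdev(rates)        # sample standard deviation
se = sd / math.sqrt(len(rates))     # standard error of the mean

print(f"{mean:.1f} ± {se:.1f} (mean ± standard error, n = {len(rates)})")
```

Because of the square root of n on the bottom, the standard error shrinks as you collect more results, while the standard deviation does not.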
You can use either the standard error or the standard deviation after the ± as a measure of variation, as long as you say which one you used. But now we have something that looks even better:
‘….the average heart rate (beats per minute) of the 30 individuals before testing was 70.1 ±10.5, this increased to 86.0 ±8.4 after drug administration with an average increase of 15.9 ±11.1….’
Couple this with a bar chart of the averages plus variation and we’re already making our results look pretty good. Note: whenever you add a chart or graph to an essay, report or whatever, you should never just throw it in without first stating the results and referring to the graph, and the graph should always have a legend explaining what it shows. A rule of thumb I always work to is that you should be able to take the graph out of the report and still understand the results, and the graph should be able to stand alone and still make sense.
So we should now have something like this….
‘….the average heart rate (beats per minute) of the 30 individuals before testing was 70.1±10.5, this increased to 86.0±8.4 after drug administration with an average increase of 15.9±11.1 (Figure 1.)….’
Figure 1. The average heart rate (beats per minute) of 30 individuals taken before and after drug administration. Error bars represent standard error.
Now we have done some descriptive statistics we are already starting to see the bigger picture; that there is an increase in heart rate after administering the drug, but it doesn’t seem to be very substantial. The variance also seems to be large.
The next step is to determine whether we can say that the drug has had a significant effect on the heart rate of the individuals, or whether the change we see is just due to chance.
It is important to note that, according to Karl Popper, we can never prove a hypothesis; we can only falsify it, or determine that the result is unlikely to be due to chance. This means we must create a slightly different hypothesis for our research question. Originally our hypothesis would have been something along the lines of:
‘This drug, upon administering to patients, will increase their heart rate’
Now we must generate the null hypothesis, which is NOT the negative/opposite of our hypothesis. It simply states that there will be no effect whatsoever:
‘This drug, upon administering to patients, will have no effect on heart rate (and any difference is due to chance)’
Of course there is a third possible hypothesis, if we didn’t know what the effect of the drug was going to be:
‘This drug, upon administering to patients, will have an effect on their heart rate’
Statistical tests essentially calculate the difference between populations (in our results: population 1 is heart rate before drug and population 2 is heart rate after drug) to determine if they are in fact separate populations (and are therefore significantly different) or if the results are all from the same population and the observed change is simply due to chance. This is what the null hypothesis suggests.
So with statistical tests we are essentially trying to falsify the null-hypothesis.
When using statistical tests we never receive a simple YES or NO answer as to whether our results are significant. Instead we receive a p value: the probability of seeing a difference at least as large as ours if the null hypothesis were true. When deciding if results are significant we usually work to a 0.05 (5 %) significance level, often described as a 95 % confidence level. If we receive a p value that is below that 0.05 threshold then we can reject the null hypothesis.
While the first and third hypotheses are only subtly different, the difference matters when it comes to the p value you receive from your statistical test of choice. This is due to the one-tailed or two-tailed nature of the question. With the first hypothesis there is a direction (‘the drug will increase heart rate’), so statistical significance will only be tested in one direction: the whole 0.05 will be allotted to the ‘higher’ direction to determine significance. Whereas with a two-tailed hypothesis (‘the drug will have an effect’) the statistical test will test significance on both sides, and the 0.05 will be split into 0.025 to check significance at either end of the distribution.
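To see what splitting the 0.05 does in practice, here is a small sketch that finds the cut-off points on a standard normal curve. Using the normal curve is a simplifying assumption on my part (a t-test actually uses the t distribution, which has slightly wider tails for small samples), but the idea of one tail versus two tails is the same.

```python
from statistics import NormalDist

alpha = 0.05
z = NormalDist()  # standard normal curve: mean 0, standard deviation 1

# One-tailed: the whole 0.05 sits in the upper tail.
one_tailed_cutoff = z.inv_cdf(1 - alpha)
# Two-tailed: 0.025 sits in each tail, so the cut-off moves further out.
two_tailed_cutoff = z.inv_cdf(1 - alpha / 2)

print(f"one-tailed cut-off: {one_tailed_cutoff:.2f}")
print(f"two-tailed cut-off: ±{two_tailed_cutoff:.2f}")
```

The two-tailed cut-off is further from the centre, which is why the same result can be significant under a one-tailed hypothesis but not under a two-tailed one.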
What you’re seeing here is the ‘bell curve’ of a normal (or Gaussian) distribution. A normal distribution is defined by the mean and standard deviation of your results. Your mean makes up the central part and you move symmetrically away from it, with the frequency of results in each range reducing. Typically about 68% of values will be within one standard deviation of your mean and about 95% within two standard deviations.
This is important in statistical testing because, loosely speaking, the test asks how far out in the tails our result would sit if the null hypothesis were true. If it lands in the most extreme 5% of that distribution, we call the difference significant; otherwise we cannot conclude that we have two statistically different populations.
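The 68% and 95% figures for a normal distribution can be checked with a few lines of Python. The mean of 70 and standard deviation of 10 below are hypothetical values chosen only for illustration; any normal distribution gives the same proportions.

```python
from statistics import NormalDist

# Hypothetical normal distribution of heart rates (illustrative values).
mu, sigma = 70, 10
curve = NormalDist(mu=mu, sigma=sigma)

# Proportion of the curve lying within one and two standard deviations.
within_1sd = curve.cdf(mu + sigma) - curve.cdf(mu - sigma)
within_2sd = curve.cdf(mu + 2 * sigma) - curve.cdf(mu - 2 * sigma)

print(f"within 1 SD: {within_1sd:.1%}")
print(f"within 2 SD: {within_2sd:.1%}")
```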
So for our data; the first step is to determine if our data is normally distributed:
So yes, our data is in fact roughly normally distributed. We can say this because it is symmetrical either side of the mean (grey area=50% of results) and ~68% of all the results are found within one standard deviation.
Now that we have determined this we can pick a statistical test. Rather than explain each test, I will just use this flowchart to help you decide, which can be found here
We have two samples of results (because we are testing the difference between two sets of data) that are paired and normally distributed. This leads us to the paired t-test. Remember to check whether you are testing a one-tailed or two-tailed hypothesis.
For this part I used an online programme (which can be found here) because I don’t currently have Minitab installed.
The value of t= 7.743, which gives p <0.00001. This p value is far below 0.05, so the result is significant at the 0.05 level (incredibly significant, in fact). This means that we can reject the null hypothesis that the drug has no effect on heart rate, and we could still do so at a much stricter 0.0001 (99.99 %) level. (Not bad for made up data!) To report a paired t-test result you should use the format of:
(t(degrees of freedom)= _, p= _)
This means that our final presentation of these results will look something like this:
‘….the average heart rate (beats per minute) of the 30 individuals before testing was 70.1±10.5, upon administering the drug heart rate significantly increased to 86.0±8.4 (t(29)= 7.743, p= <0.05) with an average increase of 15.9±11.1 (Figure 1.)….’
Figure 1. The average heart rate (beats per minute) of 30 individuals taken before and after drug administration (t(29)= 7.743, p= <0.05). Error bars represent standard error.
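For the curious, here is a rough sketch of what a paired t-test calculates behind the scenes: take the difference for each individual, then divide the mean difference by its standard error. The function name `paired_t` and the five pairs of readings are my own hypothetical choices, not the data behind the results in this post. Turning t into a p value needs the t distribution, which Python’s standard library does not provide, so that step is left to a stats package or an online calculator.

```python
import math
import statistics

def paired_t(before, after):
    """Return (t, degrees of freedom) for a paired t-test."""
    if len(before) != len(after):
        raise ValueError("paired samples must have the same length")
    # One difference per individual: after minus before.
    diffs = [a - b for b, a in zip(before, after)]
    n = len(diffs)
    # Standard error of the mean difference.
    se = statistics.stdev(diffs) / math.sqrt(n)
    return statistics.mean(diffs) / se, n - 1

# Hypothetical paired heart rates (made-up values for illustration).
before = [68, 72, 65, 80, 74]
after = [80, 85, 70, 96, 88]

t, df = paired_t(before, after)
print(f"t({df}) = {t:.3f}")
```

The degrees of freedom are the number of pairs minus one, which is where the 29 in t(29) comes from for 30 individuals.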
But what if we had more than two sets of results?
Well, in my experience we often use a one-way ANOVA test for testing significance between groups and post-hoc Tukey tests to determine where this significance lies. So let’s expand our data to include the effect of three different drugs:
‘….the average heart rate (beats per minute) of the 30 individuals before testing was 70.1 ±10.5, the administration of drug A and C both saw an increase in heart rate to an average of 86.0 ±8.4 and 105 ±8.6 respectively. Drug B resulted in a decrease to an average of 63.8 ±9.5 (Figure 1.)….’
Figure 1. The average heart rate (beats per minute) of 30 individuals taken before drug administration and after the drugs A, B and C were given. Error bars represent standard error.
As before, we can see that the different drugs are clearly affecting the heart rate. But are the differences significant? Well, let’s find out.
Going back up to our flowchart, we have three or more samples that are normally distributed, leading us to a one-way ANOVA. Again I used an online programme to compute this result (which can be found here):
Don’t worry too much about all of these numbers; the main one is the p value, which is less than 0.05, so we can confidently reject our null hypothesis of ‘none of the 3 drugs have any effect on the heart rate of patients’. Now we know that the groups are different, but which ones are different, and where does the significant difference lie? Post-hoc Tukey test time (not to be confused with turkey…)
This test simply takes each group and checks it against each of the others, so A would equal our ‘heart rate before drug’ group, B is our ‘heart rate after drug B’ group and so on. This tells us that each group is significantly different; groups A and B are more different than A and C, but both differences are still significant. As with the paired t-test we report the ANOVA result with degrees of freedom (DF) like this:
(F(DF,DF)= _, p= _)
The first DF space is for the degrees of freedom of the treatment (number of groups minus 1) and the second is for the degrees of freedom of the error (total number of samples minus number of groups). So our final results section will look something like this:
‘….the average heart rate (beats per minute) of the 30 individuals before testing was 70.1 ±10.5, the administration of drug A and C both saw an increase in heart rate to an average of 86.0 ±8.4 and 105 ±8.6 respectively. Drug B resulted in a decrease to an average of 63.8 ±9.5 (Figure 1.). A one-way ANOVA test was done and the results were found to be significantly different (F(3,116)= 117.9, p= <0.05). A post-hoc Tukey test found that every group was significantly different….’
Figure 1. The average heart rate (beats per minute) of 30 individuals taken before drug administration and after the drugs A, B and C were given (F(3,116)= 117.0, p= <0.05). Error bars represent standard error.
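If you want to see where the F value and the two degrees of freedom come from, here is a minimal sketch of a one-way ANOVA done by hand. The function name `one_way_anova` and the three tiny groups are hypothetical and only there for illustration. As with the t-test, converting F into a p value needs the F distribution, which is not in Python’s standard library.

```python
import statistics

def one_way_anova(groups):
    """Return (F, df_treatment, df_error) for a one-way ANOVA."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [statistics.mean(g) for g in groups]

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, means))
    # Variation of each value around its own group mean.
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, means) for x in g)

    df_treatment = k - 1        # number of groups minus 1
    df_error = n_total - k      # total samples minus number of groups
    f = (ss_between / df_treatment) / (ss_within / df_error)
    return f, df_treatment, df_error

# Hypothetical heart rates for three groups (made-up values).
groups = [[70, 72, 68], [85, 88, 82], [64, 60, 62]]
f, df1, df2 = one_way_anova(groups)
print(f"F({df1},{df2}) = {f:.2f}")
```

With four groups of 30 readings each (before, drug A, drug B and drug C), the same two formulas give 4 - 1 = 3 and 120 - 4 = 116 degrees of freedom.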
While researching for this post I stumbled across this blog, which explains stats incredibly well and in much more detail than I can hope to, so if you’re struggling with stats or would like to know more, I would definitely recommend heading over there.
So, in this post I’ve tried to explain how to present data, do basic descriptive statistics and present the results of a paired t-test and one-way ANOVA.
With a thing like stats there is a huge amount more to it, with an incredible amount of maths behind it. I’ve barely scratched the surface here, but I hope this has at least helped a few of you. Please comment/make me aware of any mistakes or things I have got slightly wrong. I don’t claim to be anywhere near an expert on statistics and would hate to pass on bad information. Also, if you’d like me to explain any of the terms I’ve used, drop me a comment and I’ll try my best to clarify.
Before you leave, can you please just do this quick poll? That way I can tell if this helped and if you would want more of this.