Many pairs of variables are related by a straight line or, more strictly, the relationship between them can be expressed as a straight line over a certain range of both variables. Often we have to find the equation of the line from a table of raw data. This may be a table of temperatures at certain heights, the yield of wheat for different amounts of rainfall, or English exam scores and maths exam scores. We usually look for a linear relationship first because it is the simplest to calculate. Having found a linear relationship, we can test how good it is by calculating the correlation coefficient $r$. If $r$ is close to either 1 or -1, there is a good linear relationship. If $r$ is close to 1, the two quantities increase together. If $r$ is close to -1, then as one quantity increases, the other decreases.

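The equation of the regression line of $y$ on $x$ is given by

$$y = a + bx$$

where $b = \dfrac{S_{xy}}{S_{xx}}$, $S_{xy} = \sum xy - \dfrac{\sum x \sum y}{n}$ and $S_{xx} = \sum x^2 - \dfrac{\left(\sum x\right)^2}{n}$. We usually find $a$ last, using

$$a = \bar{y} - b\bar{x}$$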
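The calculation can be sketched in a few lines of Python. This is a minimal illustration rather than a definitive implementation: it assumes the usual least-squares definitions $b = S_{xy}/S_{xx}$ and $a = \bar{y} - b\bar{x}$, and the function name `regression_line` is made up for this example.

```python
from typing import Sequence, Tuple


def regression_line(xs: Sequence[float], ys: Sequence[float]) -> Tuple[float, float]:
    """Return (a, b) for the regression line y = a + b*x of y on x."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length samples with at least 2 points")

    # Summary statistics
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xx = sum(x * x for x in xs)
    sum_xy = sum(x * y for x, y in zip(xs, ys))

    # S_xx and S_xy as defined in the text
    s_xx = sum_xx - sum_x ** 2 / n
    s_xy = sum_xy - sum_x * sum_y / n
    if s_xx == 0:
        raise ValueError("all x values are equal, so the gradient is undefined")

    b = s_xy / s_xx                    # gradient first
    a = sum_y / n - b * sum_x / n      # intercept last: a = ybar - b * xbar
    return a, b
```

For instance, `regression_line([1, 2, 3, 4], [3, 5, 7, 9])` recovers the exact line $y = 1 + 2x$ on which those points lie.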

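The correlation coefficient $r = \dfrac{S_{xy}}{\sqrt{S_{xx}S_{yy}}}$, where $S_{yy} = \sum y^2 - \dfrac{\left(\sum y\right)^2}{n}$, can be calculated if necessary to find out how good a relationship we have.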
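As a rough sketch, the correlation coefficient can be computed from the same kind of sums. The code assumes the product moment formula $r = S_{xy}/\sqrt{S_{xx}S_{yy}}$, and the name `correlation_coefficient` is illustrative only.

```python
import math
from typing import Sequence


def correlation_coefficient(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Return the product moment correlation coefficient r of two samples."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length samples with at least 2 points")

    sum_x, sum_y = sum(xs), sum(ys)
    s_xx = sum(x * x for x in xs) - sum_x ** 2 / n
    s_yy = sum(y * y for y in ys) - sum_y ** 2 / n
    s_xy = sum(x * y for x, y in zip(xs, ys)) - sum_x * sum_y / n

    if s_xx == 0 or s_yy == 0:
        raise ValueError("r is undefined when one variable is constant")
    return s_xy / math.sqrt(s_xx * s_yy)
```

Points lying exactly on a rising line give $r = 1$, and points lying exactly on a falling line give $r = -1$.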

You may be given summary statistics such as $\sum x$, $\sum y$, $\sum x^2$ and $\sum xy$, but I will work through an example from scratch.

Example: Find the equation of the regression line of height $h$ above ground level against temperature $t$ using the data in the table.

| h | 100 | 1000 | 1500 | 2400 | 4000 | 5500 | 9000 | 10000 | 14000 |
|---|---|---|---|---|---|---|---|---|---|
| t | 24 | 10 | 8 | 2 | -4 | -8 | -20 | -22 | -28 |

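It is a good habit to calculate and write down the summary statistics:

$$n = 9, \quad \sum h = 47500, \quad \sum t = -38, \quad \sum h^2 = 432270000, \quad \sum t^2 = 2492, \quad \sum ht = -822800$$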

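Hence

$$S_{tt} = 2492 - \frac{(-38)^2}{9} \approx 2331.56, \quad S_{hh} = 432270000 - \frac{47500^2}{9} \approx 181575555.56, \quad S_{ht} = -822800 - \frac{47500 \times (-38)}{9} \approx -622244.44$$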
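The arithmetic for this example can be checked with a short script. It is a sketch under one stated assumption: "height against temperature" is read as the regression of $h$ on $t$, so $t$ plays the role of $x$ and the line has the form $h = a + bt$. All variable names are illustrative.

```python
import math

# Data from the table
h = [100, 1000, 1500, 2400, 4000, 5500, 9000, 10000, 14000]
t = [24, 10, 8, 2, -4, -8, -20, -22, -28]
n = len(h)

# Summary statistics
sum_h, sum_t = sum(h), sum(t)
sum_hh = sum(v * v for v in h)
sum_tt = sum(v * v for v in t)
sum_ht = sum(hv * tv for hv, tv in zip(h, t))

# Corrected sums of squares and products
s_hh = sum_hh - sum_h ** 2 / n
s_tt = sum_tt - sum_t ** 2 / n
s_ht = sum_ht - sum_h * sum_t / n

# Assumption: regression of h on t, so the gradient is S_ht / S_tt
b = s_ht / s_tt
a = sum_h / n - b * sum_t / n          # a = hbar - b * tbar
r = s_ht / math.sqrt(s_hh * s_tt)      # correlation coefficient

print(f"sums: {sum_h}, {sum_t}, {sum_hh}, {sum_tt}, {sum_ht}")
print(f"h = {a:.2f} + ({b:.2f}) t, r = {r:.4f}")
```

Under that reading the script gives a gradient of roughly -266.9, an intercept of roughly 4151 and $r$ of about -0.956, so height and temperature have a strong negative linear relationship over this range. If the regression of $t$ on $h$ is wanted instead, the gradient becomes $S_{ht}/S_{hh}$.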