Useful Excel Functions
Note: To make use of these functions you must have selected a cell in an Excel spreadsheet and each function must begin with an equal sign (=)
For Simple Calculations
Addition: =1+1
Subtraction: =1-1
Multiplication: =1*1
Division: =1/1
Exponents: =1^1
You can also use the values from other cells in your calculations.
e.g., =A1+(B1*C1)
Square root
=sqrt( )
Simply type inside the parentheses the number you want the square root for, or
the cell reference (e.g., A1) that contains the number you want the square root for.
You can also do =number^0.5, because taking the square root of a number is the same
as a number to the power of ½.
Absolute Value
=abs( )
Type the number (or reference to a cell with a number already in it) between the
parentheses to obtain the absolute value of that number. For example, -5 will become
5. When dealing with positive numbers, +5 will remain 5.
For Simple Statistics
The Mean
=average(A1:A10)
A1 is where the range begins and A10 is where it ends. You insert any values here, e.g., (B10:B59), (R210:R3732), or even across rows, like (D1:W1).
Note: This is also true for the other simple statistics functions listed here.
You can also define the range with your mouse. Once the opening parenthesis is in
place, just select and drag. You want to see the “marching ants” box, which will
usually be blue. You can also position this box with the arrow keys, and define the
range by holding SHIFT and pressing the arrows.
The Median
=median(A1:A10)
The Mode
=mode(A1:A10)
The Sum
=sum(A1:A10)
Standard Deviation
=stdevp(A1:A10)
This formula divides by N. Use it when you want only to describe a sample without
trying to relate it to a population.
OR
=stdev(A1:A10)
This formula divides by N-1. Use it when you are interested in relating a sample
to a population (as in inferential statistics).
Range
=min(A1:A10) tells you the smallest value in the range of A1 to A10
=max(A1:A10) tells you the largest value in the range of A1 to A10
Sum of Squares
=sumsq(A1:A10)
This formula will square each number in a range of numbers and then sum them.
Note: This will not automatically calculate the “Sum of Squares” (used in an ANOVA or the standard deviation) from a raw range of numbers. That value is typically some sort of difference score, so you would need to obtain that first. However, this formula will spare you the trouble of squaring and summing those difference scores.
For Statistical Techniques
Z-scores
=standardize (score, mean, standard deviation)
Whatever number you want to convert into a z-score, type in first. Then, type
in the mean of your range of numbers, and the standard deviation of that range of
numbers. The end result will be the z-score, which tells you how many standard deviations
away from the mean that score is.
Correlation Coefficient
=correl (first range of scores, second range of scores)
This formula will give you the Pearson Product-Moment Correlation Coefficient.
It will always be something between -1 and 1.
Note: ALWAYS remember to create a scatterplot anytime you calculate a correlation coefficient. You always need to check for outliers and linearity.
Regression
Note: It is very important that you keep track of which scores are Y (for example A1:A10) and which are X (for example B1:B10). If you enter them in the wrong order, you will get the wrong slope.
=slope (range of y-scores, range of x-scores)
This formula will produce the slope of the regression line for a range of raw
scores.
=intercept (range of y-scores, range of x-scores)
This formula will produce the y-intercept of the regression line for a range of
raw scores. Again, it is very important that you keep track of which scores are your
Y-scores and which are your X-scores.
=steyx (range of y-scores, range of x-scores)
This formula will produce the standard error of the estimate for your regression
line. Note that this formula divides by N-2, as opposed to simply N.
t-tests
=ttest (first range of numbers, second range of numbers, 1-tailed vs. 2-tailed test,
paired samples vs. independent samples)
This formula will produce the p-value for different types of t-tests. Simply
define the range for the first range of scores, enter a comma, then the second range
of scores. Enter another comma, and specify whether you want to conduct a one-tailed
or two-tailed test. For a one-tailed test, enter a “1”. For a two-tailed test, enter
a “2”. Enter another comma, and the last step is to specify what type of t-test you
want to run. For a paired-samples t-test, enter a “1”. For an independent-samples
t-test, enter a “2”. It is also possible to enter a “3”, which would indicate there
are two separate groups but the variance is unequal between the groups.
Note: This formula will not produce the t-score for the test. If you want to determine
the t-score for that p-value you need to use the following formula:
=tinv (p-value, degrees of freedom)
This formula will produce the t-score that corresponds to a given p-value. (The
formula name stands for “t inverse.”) Enter in the p-value (which you calculated
with the =ttest formula, or preferably refer to the cell where it is already calculated),
then the degrees of freedom for that test. For a paired-samples test, the degrees
of freedom are N-1, where N is the number of pairs of scores. For an independent-samples
test, the degrees of freedom are N1+N2-2, where N1 is the number of scores in group
1, and N2 is the number of scores in group 2.
If you already have the t-score but want to know the corresponding p-value, you will
need to use the following formula:
=tdist (t-score, degrees of freedom, 1-tailed vs. 2-tailed test)
Use this formula if you already obtained your t-score and you want to know what
p-value corresponds to it. First, enter in the t-score (or refer to a cell where
it is already calculated). Enter a comma, and then enter the degrees of freedom,
calculated in the same way as before. Enter another comma, and enter a “1” for a
one-tailed test or a “2” for a two-tailed test.
Chi square tests
=chitest (observed frequencies, expected frequencies)
This formula will produce the p-value for a chi square test. It only asks for
two things: the observed frequencies, which are the counts you observe in your sample,
and the expected frequencies, which are the counts you would expect to see due to
probability. Unfortunately, you have to calculate your expected frequencies yourself;
Excel cannot do it for you. Once you have both your observed and expected frequencies,
simply highlight the range for each. Make sure that the expected frequencies are
arranged in the same type of table as the observed frequencies, such that each expected
frequency is in the same relative location of the table as its corresponding observed
frequency.
Note: This formula will not produce the chi square value for the test. If you want
to determine the chi square value for that p-value you need to use the following formula:
=chiinv (p-value, degrees of freedom)
This formula is very similar to the =tinv formula. Enter the p-value (or refer
to the cell where it was calculated with the =chitest formula). Enter a comma, and
enter the degrees of freedom for that test. For most chi square tests, the degrees
of freedom are determined by the formula (R-1)*(C-1), where R is the number of Rows
in your contingency table and C is the number of Columns in your contingency table.
Exceptions to this rule occur when you only have one row, or one column. If you only
have one row, degrees of freedom are (C-1). Likewise, if you only have one column,
degrees of freedom are (R-1).
If you already have the chi square value but want to know the corresponding p-value,
you will need to use the following formula:
=chidist (chi square value, degrees of freedom)
Simply enter in your chi square value, or refer to the cell where it is calculated.
Then, enter your degrees of freedom (calculated in the same way as before).
Formulas Useful for Sorting and Analyzing Data
Count
=count(A1:A10)
This function will count how many cells have numbers in them. It will not count non-numerical
values, such as Y, N, or any words.
CountA
=counta(A1:A10)
This function will count non-numerical values. It basically counts a cell as long
as it has something inside of it. This includes numbers, letters, invisible spaces,
and even formulas.
Countif
=countif(A1:A10, “X”)
This function will count only the cells that meet some criterion. The range is defined
by the first part, and the “X” is what that criterion is. If the criterion is a number,
do not use quotes. But if the criterion is a letter or word, you must use quotes.
If
=if(A1=1,“True”,“False”)
The first part defines the logical test. You can also use > and < for greater than
and less than, and >= and <= for greater than or equal to and less than or equal to.
The middle part defines what happens if the logical test is met, and the last part
defines what happens if the logical test is not met. These don’t have to be True/False.
You can make it say Correct/Incorrect, Studied/Unstudied, Ice Cream/Frozen Yogurt,
or anything else you want it to say. You can even make it report values of different
cells. For instance, =if(A1=1,H10,””) is saying that if the value in A1 is one, then
report the value that is in H10. Otherwise, report nothing. (Because there is nothing
between the quotation marks, this means nothing will be reported.)
Note: H10 is not in quotes. If you did put it in quotes, it would be interpreted
literally and say H10. If you want the value that’s in the cell of H10, then don’t
use quotes.
Other Useful Stuff
Random numbers between 0 and 1.
=rand()
If you want to get bigger values, just add *10 or *23 after the last parenthesis.
Note: This function produces numbers with decimal points. If you don’t want decimal
points, use the following formula:
Whole random numbers
=randbetween(1,100)
(1,100) is an arbitrary range. You can make it anything; (2,373), or (1003,1815),
or whatever.
Note: If you want the numbers to stop changing, copy them and paste as values.
Sort
This is not a function. It’s just an operation. Just go to Dataïƒ Sort. From here, you can specify which column to sort by, and whether the order is ascending or descending. You can also have it sort by rows, if you click on options. Finally, there is a little button on the toolbar that sorts automatically. It is faster, but the downside is it always sorts by the first column you select, and this is not always what you will want.
These last three functions can be very useful if you want to randomize the order of things (e.g., names). Just produce a column of random numbers, highlight both the column of numbers and the column of names, and then sort by the random numbers. You will find the names are properly randomized.