
Download the associated Workbook for this lesson
This lesson will be used to look at the various methods available within Excel to extract required information from databases and tables. There are many ways in which this can be done, but some ways are far better than others. By far the most important factor in this is how the information is laid out in the first place. By following these six rules (also mentioned in Lesson 1), life will be much easier come time to extract the information.
When putting in headings bold the font, this will help Excel recognise them as headings when you use one of its functions eg; Data>Sort.
When putting data into the data area of your spreadsheet try to avoid blank rows and columns if possible. This is because a lot of Excels built in features will assume a blank row or column is the end of your data. It also helps with a lot of formulas.
Have your data sorted if possible. Excel is very rich in what are known in Lookup & Reference formulas and many of these rely on your data being sorted in a logical order.
Use real dates for headings and format them appropriately. By this I mean if you want the names of the months as headings type them in as 1/1/2005 , 1/2/2005 , 1/3/2005 etc then format them as mmmm. This is a very simple procedure that is all too often overlooked by many. If you have real dates as headings life will be much easier further down the road when you need to use them in formulas.
Avoid merging cells if possible. Use Centre across selection instead. This can be found by going to Format>Cells/Alignment and selecting from the Horizontal box. Merging cells can cause all sorts of problems later on that that will have you scratching your head.
Don't put in one cell what could go in more than one cell. For example, say you have the names of 100 people to put in your spreadsheet. Don’t put their full name in one cell. Instead put the First name in one cell then their surname in the next cell to the right. If you need to place them into one single cell at a later stage, this can be done very easily, the same cannot always be said for the reverse.
The Functions that we will look at in this lesson are best suited for extracting information. These are the Lookup & Reference Functions, the Database Functions and Array Formulas.
However, before we start, I must stress that we will only use array formulas as a very last resort. Our reason for this is that over the years we have been helping others use Excel, we have consistently found that when users first learn how to use and write array formulas, they tend to fall back on them all too often. The major drawback to this is that you will end up with a workbook that becomes extremely slow in recalculating. You must remember also, that Excel by default recalculates when you open, save or enter data into a cell that a formula is dependent on. All too often we have seen workbooks that have become virtually unusable due to the amount of array formulas.
The other subject we will look at is the means and methods of deciding which formula to use in which situation. As Excel has over 300 Functions available, this can sometimes seem a bit daunting. Keep in mind that although Excel is very rich in Functions, you will find that most likely you will be using the same set of Functions over and over again. You must always keep in mind that Excel was designed for all people from all walks of life and businesses to use, by this I mean it contains many Functions that would be of no use at all to some people, but be the "be all and end all" to others. A couple of brief examples of this would be some of Excel's engineering and financial Functions, unless you were in that field of expertise they would be of no use at all.
We will look at the means and methods of deciding which formula to use in which situation first and then we will move on to ways and means of extracting information from tables and databases.
One of the many problems that can crop up when using Excel is trying to figure out which function to use in which situation. There
are certainly no hard and fast rules to this as more often than not you will need to use a nested formula that will incorporate many types of Functions. Whichever way you go do not fall into the trap of trying to make your formula as short as possible.
We constantly see 'so called' experienced users replacing False with 0 (zero),
True with 1 and formulas like: =IF(A1,
Anyway enough of that. Before you construct your formula you should ask yourself these five questions.
Is the data text or numeric?
Do you want text or a number as your result?
Do you need to locate your data first?
Does the result depend on the data type or value?
Are you dealing with dates and/or times?
If your data is Numeric then you can possibly rule out the need for any of the Text functions.
If your data is Text and you want a Numeric result you will very likely need one of the Text functions.
If you need to locate your data first you will very likely need one of the Lookup & Reference functions.
If your result depends on the data type or value you will need one of the Logical functions.
If you are dealing with dates and/or times then you may need one of the Date & Time functions
Excels Insert Function (Shift+F3) is always a good place to start. Under the heading Or Select a Category are the Function categories. If you know the category you want, you can select from this list, or if you do not know which Function Category you want, once you have the Insert Function dialog on screen, click in the Search for a Function box and type in some key words of the result you want. Don't be too vague though and don't be to detailed. If you were wanting to find some data that is within some more data and you just type "Find" you will end up with a very long list. The same problem would arise (to a lesser degree ) if you typed "Find some data in a list of data". But simply type "Find data" and you will get a reasonable list.
Let's look at the ones you are most likely to need and when to use them.
We will use a real problem and work through it to see how we might apply the "Decision making" process. This is only one example, but I believe it will demonstrate how it can be applied.
If your data contains a period of time you should consider using one of the Date & Time functions. This is not always completely necessary because, if you only need to find the difference between two dates you can simply subtract one date from the other.
Suppose you have the date 22/12/2005 in cell A1 and 15/4/2005 in cell A2 you could use: =A1B1 and format your result cell as General to find out the number of days between the two dates. But what if we need to find out the months between the two dates? We would use: =MONTH(A1)MONTH(A2). Of course this is not a problem because the years in both dates are the same, but what if they are not! This is where we ask ourselves the 5 questions:
Is the data text or numeric? Answer= numeric!
Do you want text or a number as your result? Answer= number!
Do you need to locate your data first? Answer= no!
Does the result depend on the data type or value? Answer= no!
Are you dealing with dates and/or times? Answer= yes!
We need to take into account the year and month portion of our dates. From this it is most likely a Date & Time function is called. We are being totally honest here when we say we have never had to do this! You will see why soon. Let's work through this problem step by step. We will assume we have not used: : =MONTH(A1)MONTH(A2) yet. Make sure the dates: 22/12/2007 is cell A1 and 15/4/2005 in cell A2
Push Shift+F3 and select Date & Time from Or Select a Category
First thing we are looking for is a function that mentions Years and/or Months
You should be able to see MONTH so click on it. The help text at the bottom of the Paste function box tells me that the argument type must be a serial number ie; MONTH(Serial_Number). That's good, as our dates are serial numbers (as are all dates)
The description below this tells me: Returns the month a number from 1 (January) to 12 (December). Sounds like a good start, let's try it. With MONTH still selected click OK
Now click the collapse dialog box and select cell A1, click it again (the collapse dialog box ) and look at the "formula result" is reads 12. The result we expected!
Lets use the same function on our date in cell A2 and subtract the result from our MONTH(A1)
Click in the formula bar and type a  (minus sign) after MONTH(A1), ie; MONTH(A1)
Now from the box to the left of the formula bar select "MONTH" and repeat step 5, but do it on cell A2. You will notice the Function result is 4, and our formula result is 8/01/1900. This is just Excel assuming we want a date as our result and has formatted our cell for one. Ignore this and click OK. Our end formula is: =MONTH(A1)MONTH(A2)
Format the cell as General. You should now have 8, the number of months between 12 (December) and 4 (April). Now we have to do the same for the years.
So select any cell and Push Shift+F3 and select Date & Time. This time we are only interested in the year portion of our date, so we basically repeat steps 3 to 9 substituting months for years. Our end formula is: =YEAR(A1)YEAR(A2)*
We end up with the result 2, the number of years between 2007 and 2005. Now we know that there are 12 months in each year so we add to our Year formula to make it: =YEAR(A1)YEAR(A2)*12 Enter this and we get ###. This is because we need to change the syntax. Excel is multiplying YEAR(A2) by 12 then subtracting this from YEAR(A1). Which gives us a negative date. Excel cannot handle negative Dates and/or Times!
So we add some parenthesis to change this, eg; =(YEAR(A1)YEAR(A2))*12 and format the cell as General. Now we get 24. The number of months between the years 2007 and 2005.
Let's now add the two formulas together , eg; =MONTH(A1)MONTH(A2)+(YEAR(A1)YEAR(A2))*12 We get a result of 32 as soon as we format as General. This looks pretty good to me!
Before we went through this step by step we stated that we were "being totally honest here when we say we have never had to do this!" The reason we have never had to do this before is because there is a little known about Excel function that can do this for us! It's called DATEDIF, you will not find much documentation on this function as it is only supplied for compatibility with Lotus 123. The DATEDIF calculates the number of days, months, or years between two dates. It's syntax is: DATEDIF(start_date,end_date,unit). Unit is the number of what you want returned.
Y = The number of complete years in the period
M = The number of complete months in the period.
D = The number of days in the period.
MD = The difference between the days in start_date and end_date. The months and years of the dates are ignored.
YM = The difference between the months in start_date and end_date. The days and years of the dates are ignored.
YD = The difference between the days of start_date and end_date. The years of the dates are ignored.
So if we use the formula =DATEDIF(A2,A1,"M") on our dates to check if the formula we wrote is correct we get: 32 also.
Although this is only one situation out of a possible infinite number, it should give you a good idea of a logical process that could be applied to many other situations.