Hello! Suzanne here with Excel Intuition. This particular module is about separating text. So you might if you haven’t already done so, pause for a moment and download your own copy of the companion file that you can see in the video here, because then you’ll be able to follow along with the lesson and do the same steps that we’re doing in the demonstrations. You’ll be able to use the same functions in your own version of excel.
So what do we mean when we talk about separating text? Well in excel spreadsheet, you can store a string of characters in a single cell, and we call that a text string. But many times you might want to, you might be given a text string that you’d like to separate into the parts that it’s composed of.
For example here we have a complete name given the format: last name, comma, space, first name, space, middle initial. Well what if you would like to separate out the last name from that complete name? Or the first name or the middle name or all three parts into separate cells? If you have all of these text characters in one cell and you have a format that you can rely upon, such as knowing that there’s a comma directly following the last name, and a space directly following that comma. And if there is a middle initial or middle name, knowing that there’s a space between the first name and that middle initial or middle name. If you know what format to expect, then you can use functions in excel to extract the different parts of a complete name.
So for example if you wanted to extract the last name, a function that would allow you to pull all the characters to the left of the comma would be useful. In order to do that you’d have to know the location of the comma. Notice in this diagram, we’re telling you, in this text string, if this text string was in a single cell. The R, the capital R in the last name would be in position 1. The comma would be in position 5. The second space would be in position 11. And the entire length of the text string would be 13, 13 characters including the commas and the spaces. So we could extract the last name knowing the position of the comma because we would know that the last name had the number of characters equal to the comma position minus one. Notice the comma’s in position five, and there is four characters in the last name.
We can use similar logic to extract the first name if we know the position of the comma. And depending on whether we assume that there could be a middle initial or a middle name or not would determine the logic we would use extracting the first name. We would do it differently if we expected a middle initial then if we didn’t, or a middle name.
So we’ll first take a look at the functions used to separate text. we’ve got the LENGTH function that tells us the length of the entire text string in a single cell in Excel. The ELFT function will let us pull out characters from the left of a text string. It takes two arguments: the text string to pull from and the number of characters to extract. The SEARCH function allows us to look for the position location of a particular character such as the comma, which we would need to know to extract the last name from the left of the complete name. Then the RIGHT function works the same way that LEFT function works, it just extracts from the right of the complete text string. And then we have the MID function. The MID function has three arguments: the text, the position location to start from, and the number of characters to extract. It needs the starting number because it doesn’t automatically go from the left or the right the way those functions do. It starts at the position you request it to start at. So for example, if we look back at what a text string looks like, if we wanted to extract the first name we’d have to know to start 2 positions beyond the location of the comma. Because we would know that if the complete name is going to be formatted last name, comma, space, first name, then the first thing will always start two position locations beyond the location of the comma. So in this case the first character of the first name is in position 7, which is 2 positions beyond the common location.
So now let’s look at two different examples extracting parts from a complete name. The first one, we’re going to expect a format of last name, comma, first name, and we aren’t expecting a middle name or a middle initial. So in this case we’ve got two names to work, one actually does have a middle initial, even though we’re not expecting it, and one doesn’t. So we find the length of the name for both of them by using the length formula, the LENGTH function. And that’s going to come in handy when we’re extracting the first name.
But we’re extracting the last name, we need to know the comma location. So here we’re looking for the comma location using the SEARCH function. It searches for the comma within the string found in column A, and we see that the common location is position location is 7 for the first name in this column, and it’s at 5 for the second name in this column. Then we use the LEFT function to pull out the last name using comma position minus one to calculate the number of characters that the last name is composed of. And notice that there’s a comment in here that explains how that LEFT function works. It extracts only the last name from the left of the complete name, the position location of the comma minus one is the correct number of characters to extract.
Now when we go to extract the first name, we’re using the RIGHT function, and we tell it the text string to extract from. And the way we calculate the correct number of characters for the first name, when we assume that there’s no middle name or in middle initial, is we use the length of the entire complete string, minus the location of the comma, minus one more. So notice with this first one, the total string is 12, the comma location is at 7. 12 minus 7 is 5, and minus 1 more gives us 4, which as you can see is the length of the first name. So we want four characters when we extract the first name. And we see that Dave is correct and so that formula works.
For the second name in the column, we end up with a middle initial as part of the first name, because we’re assuming there isn’t going to be one. So that the string that includes the second space and the middle initial is extracted along with the first name, because we aren’t accommodating the possibility that there would be a middle name or a middle initial. So we just extract that whole string to the right of the comma and the space.
Well let’s take a look at the second example. In the second example we are expecting that there could be a middle name or middle initial. We’re also expecting that there may not be. That’s why I put the brackets around middle name or middle initial. It’s optional, there may or may not be, so we have to have to check to see if it is there. So we start out still checking the length of the complete name. And then in this case we’re going to search for the location of the first space instead of searching for the location of the comma. The reason we’re doing that is because we can use the first space just as effectively to extract the last name as we were able to use the comma, but then we can also use the location of the first space to help us find the location of the second space, if it exists. So we’re looking for the location of the first space in this column, using the SEARCH function, and we’re getting the same answers for the first two that we got looking for the comma, except for the addition of one more in the value, because the space is one character position further into the text string. So the comma was at seven for the first complete name, and the first space is at eight, which makes sense. We extract the last name this time using the LEFT function, but we use the first space location minus 2, instead of the comma location minus 1 to tell us how many characters. So 8 minus 2 is 6 and there are 6 characters in the last name of the first complete name in the column.
Because we’re expecting there could be a middle name or middle initial, we’re going to look and see if there is. And if there’s a second space, that’s going to indicate that there is either a middle initial or a middle name. And we’ve got an example of each here. So we did the search for a space again, just like we did here, but in the first space character position search. But this time we start our search… the start_num, this argument, is the location of the first space plus 1. We’re going to start searching in this complete name string at one character position beyond the location of the first space to look for the second space.
So notice, in this case we got an error because there is no second space. But there is a second space for the next two names in the complete name column. Whether there’s a second space or not is going to determine how we extract both the first name and the middle name or initial. So the first thing we do when we design our formula to extract the first name is, we test, the first thing we do is we test to see if there’s an error in the cell where we searched for the second space. If there is an error, if it’s true that there’s an error, then we’re going to extract a name with one MID function formula. If it’s not true then we’re going to extract it with a different MID function formula. So we’ve got an IF function here around, surrounding the entire formula, that starts with a logical test testing for an error in the second space search formula. If that resulted in an error, then our mid function to extract the first name is going to start at the first space location plus 1, and the length of the first name is going to be calculated by the length of the entire string, minus the position location of the first space. So in the case of this first complete name the entire length was 12, the space location is 8 and 12 minus 8 is four, which is consistent with that first name being for characters.
Now if it was false that there was an error, meaning there was a second space, then we’re going to use the second version of the MID function. We have the same start position, the location of their first space plus 1. But then the length of characters for the first name is going to be the location of the second space, minus the location of the first space, minus one. And if you notice with the second complete name, we take a look at the location of the second space is at 11 first space is at 6, 11 minus 6 is 5,minus 1 takes that down to 4, and the first name is indeed 4 characters. So we’ll end up with Jose, the first name.
So when we’re going to check to see if we have a middle name or initial, the same logic is applied as we used to extract the first name. We first test to see if there was an error when we looked for a second space. If it’s true that there is an error then we know there is no middle name so the result will just be a quote quote with no space between the quotes, that’s an empty value for the cell. There is no middle name, it’s not a space it’s just an empty value. But if there is no error, then we will start, we will extract from the right of the complete name, and the number of characters we’ll extract, let’s look at it in terms of the second complete name since there is a middle initial here, we’ll use the length of the entire complete name, which is in column B, minus the location of the second space, to tell you how much to extract from the right for the middle name or middle initial. In this case the complete name is 13 characters long, the second space position is 11, 13 minus eleven is 2, and “L.” is a two character middle initial text string.
So now you have a chance try this on your own, extracting the last name, first name, and middle name or initial of this set of names, which were mayoral candidates for Monterey County in the midterm elections in 2022. And we’ve got comments here giving you guidance to go forward, what to do for each of the steps, and you might want to pause this video and complete that before watching the solutions.
So i’m gonna go ahead and go through this now. We’re going to use the LENGTH function, you have to start a function with an equal sign. I’m going to click in the cell that I want to find the length of. Then I’m going to double click on the black plus propagate to the rest of the column. I’m going to search for the location of the first space in each of the names. The text I want to find is a space, so I do quote space quote. Then the text I want to find it within is within this complete name. And for the first space the default is position 1, so I’m not going to put a 1 there, just going to end the function, and again propagate it to the rest of the column.
Now i’m searching for the second space in each name. So i’m putting the same find text value quote space quote, the same within text value, the complete name, but my start_num is going to be the first space location plus one. You can see that many of these candidates do not have a middle name.
Now i’m going to extract the last name from each complete name by using the LEFT function, pulling from the same complete name text. And the number of characters is going to be my first space minus 2.
Now for the first name I have to take into consideration that there may or may not be a middle name. So i am using my IF function, I’m testing using my ISERROR function to test for an error from the function that searched for a second space. And if it’s true that there’s an error, the equals true, then i’m going to use the MID function, searching the complete name text, starting at the first space plus 1, and the number of characters of the first name that i’m going to extract is going to be the length of the complete name, minus the location of the first space. And we see that that is 12 minus 8 equals 4, which will be exactly correct for the first name for this complete name. This is the formula i want to use if it’s true that there’s an error when we looked for a second space, because we know there’s no middle name then.
Now if it’s false that there is an error, meaning there is a middle name, then i would use my MID function in a different way. Same complete name, same start_num, the first space. But now the number of characters is going to be calculated by the location of the second space, minus the location of the first space, minus one. Then we have to close up our if function. And let’s propagate this, and we see that all of the names look correct.
Now we’re doing the same type of logic for the middle name, but instead of using the MID function we’ll use the RIGHT function to pull out the middle name or initial, if there is a second space. So we’ve got an IF ISERROR, we can say if the error equals true, if it’s true that there is an error then we know there is no middle name so we just want to put an empty value in the cell with the middle name or middle initial, otherwise we’re using the RIGHT function, pulling out from the complete name text the number of characters of the complete name length, minus the location of the second space. Close the RIGHT function, close the IF function, and send that formula down. And that’s the complete solution for that exercise opportunity.
You can look at the outline of this lesson here to summarize the different steps, you can go back to the intro page and if you could click on any of the different worksheets here and review, you might want to review the different functions used, and understand clearly what their different arguments are. And we welcome you to submit any comments related to this module or make any requests or suggestions. Thank you very much!