logo

 

Part 1 Exploring The Data
(Advanced Spreadsheet Format)

This section of the tutorial is for data downloaded through the advanced spreadsheet format. For the basic spreadsheet format click here.

SurveyMonkey data does not flow into SPSS easily, even with the advanced spreadsheet option. In order to understand why, we need to first look at how SurveyMonkey exports the data SurveyMonkey.

Before we begin let's open the survey.

Action: Open Survey

Now read through the survey and examine the different types of questions. There are a total of 15. Look at question number seven. It has has six Likert scales below the main question. In reality this one question is actually six individual questions resulting in six variables. We will learn more about this when we explore the data in SPSS and Excel.

Action: Open File Advanced-Spreadsheet-1

 

Section 1 Understanding SurveyMonkey Data in SPSS

In this section we will take a look at how SurveyMonkey data imports into SPSS and Excel.

Before we begin the process of importing data into SPSS we must clean the data up. This can be done in SPSS or Excel. THe advanced spreadsheet option does take fewer steps for cleaning then the basic spreadsheet but it is still quite messy. To understand what I mean, let's first take a look at the data in SPSS without any editing. This is data downloaded directly from SurveyMonkey into Excel and then opened in SPSS. No changes have been made.

Action: Open SPSS

Action: Import the Excel file  File>Open>Data> Then find the file you just downloaded. If you do not see your file you may need to change the file type (See Below).

Now take a look at the SPSS file. Expand the name column. It is not pretty. Let's take a closer look at several variables. The first 9 variable names work well. Number 10, however, does not. It is one very long name! Notice that this name is the wording of the first question. The same is true for line 11. Line 12, however, is a bit different. It uses the wording of the first question and a letter.

 

If you look back at the survey itself, you will see that question 2 corresponds to variables 11-16. The first variable name of the group corresponds to the first choice (Brachiosaours). The letters A-E each correspond to the next five answers. The same holds true for variables17-21. The first variable name in the set corresponds to the first answer option and the following variable names are the remaining answers.

Now lets take a look at the value labels. These labels also use the full questions but they also add the answers as part of the value labels. Unfortunately, the label names are so long that some of the longer questions are very difficult to work with. But don't worry, we are going to fix this!

 

Notice that the values are missing. We will have to enter these by hand when we clean the data.

 

Lastly, lets take a look at how the values appear in SPSS. To do this you must switch to data view. Next scroller to the right until you see the image below. Remember the value names? This is how they appear in data view. Not very helpful. We also have numbers for values. This is a good thing.

 

Now take a look at the labels column. Some variables have labels and some do not. One last column to look at is the values column. In this case there is nothing useful at all. Needless to say, our data is not ready for SPSS.

Action: Close file one

Section 2 Understanding data organization using value labels

So we will leave SPSS for the time being and jump into Excel. Let's start by opening an excel file that has the actual value labels and take a look at how SurveyMonkey organizes the data. The file will be used to take a closer look at the data as we go through the tutorial.

Action: Open Excel file Advanced-Spreadsheet-labels.xls

Columns A through I are put there by SurveyMonkey and contain information such as ResponderID, CollectorID, StartDate etc. These columns are not part of our survey and right now we will ignore them. Instead let's examine column J which contains the data for the question "I have a pet dinosaur or have had a pet dinosaurs in the past." Below is the Excel spreadsheet and the corresponding survey question. The question is a yes/no type of question and the user can only select one answer. (Hint: questions that allow only one answer use round radio buttons.) The entire question is in the first row of column J. After that each row contains either a yes or a no.

The next set of columns is not as simple. Columns K through P contain data for the following question "Below is a list of the most common domesticated dinosaurs. Please select any that you have had as a pet. These include dinosaurs that you have as pets now and those you have had in the past."

This question is displayed in the top row of column K. Take a look at your own version of the file. This question is loooooong, It is too long to read. We can make this easier by opening the row and wrapping the text.

 

Action: Select top row in Excel document and pull down, making the row higher. Once the row in down, right click and choose format cells. Select alignment and wrap text. You should be able to read the text in the cells now. This may take a bit of trial and error. See the images below.


 

 

Now that we can actually read the text we can continue with the tutorial...

Back to column K, this question allows the user to give more then one answer. (Hint: a question that allows more then one answer contains square check boxes.) SurveyMonkey exports the data by placing each choice in its own column. If the user checks the box the answer is placed in that user's row. If the user has left the box unchecked, the cell is left empty. This may seem pretty clear now but when we get to the actual file we will import into SPSS it is not as easy to understand.

Now let's examine columns Q through U and the corresponding question. "Below is a list of the most common domestic dinos. Please rate their scariness."  This type of question is a Likert scale and every cell contains an answer. Once again there are radio buttons which means that the user can only choose on answer per dinosaur. As above, we see that questions that use radio buttons have answers in each cell of the spreadsheet.

At this point there is no need to go through the rest of the questions since all the questions will be variations of these three. We cannot, however, import this file into SPSS because we need to import the value labels as numbers instead of words. Let's do that now.

Action: Close all opened files

Section 3 - Understanding data organization using values (numbers)

For the next part of this exercise we will use a new Excel file.

Action: Open file AdvancedSpreadsheet-numbers.xls

Take a look at the following question. Below is a list of the most common domestic dinos. Please rate their scariness.

Instead of using words as we saw above. SurveyMonkey has exported the data as numbers. This is what we want for analyzing our data in SPSS. This question is pretty straight forward. It is a Likert scale which goes from 1 (not scary at all) to 5 (Terrifying!) As you can see in our Excel spreadsheet below. The rows below each dino have the numbers 1 through 5, instead of the text. This one is straightforward.

This next example is what stumps most people. Take a look at columns K through P. The rows under each column contain either a blank cell or a specific number. K has either blank cells or the number 1, J has blank cells or the number 2, and so on.

So what is happening here? Remember when we examined the data using text values? Well, look again.

Each dino (variable) has a value which uses a specific number instead of specific text. This is problem because instead of yes and no values, you only have yes and missing values. This is one of the things we will have to clean up in the next lesson. It is this specific problem that confuses many people..

Section 4 -Understanding variable names

When SPSS imports Excel data the variable names are taken from the first row in the spreadsheet. This is a problem. Take a look in the at the top row and notice that some of the cells have text while others are blank. This may be hard to see because the questions seem to stretch out across the top cells. This is what we saw in the previous examples. Try expanding column Q and notice that R S T and U have blank cells in the top row.

 

At the same time, row 2 has text as well but SPSS will not read that text as variable labels because it is on the second row. In the next part of the tutorial, we will create a new row and use that row to make our variable labels.

Go to part 2 - Cleaning Advanced SurveyMonkey Data

© 2010 Laura Kazan