Combining Cases and Adding Variables in SPSS

Question

I am having difficulty achieving this functionality in SPSS. The data set is formatted like this (apologies for the Excel format).

In this example, the AGGREGATE function was used to combine cases that share the same value of a variable. In other words, CITY (Tampa in the example) is the break variable.

Unfortunately, each entry for Tampa holds 10 distinct daily temperatures. The first entry for Tampa covers days 0-10 and the second covers days 10-20, and both provide useful information. I can't figure out how to use the AGGREGATE function to create new variables so that none of these days are lost. I want to do this so that I can run tests on the mean temperature in Tampa over days 0-20, relative to days 0-20 in other cities.

My current syntax is:

AGGREGATE
  /OUTFILE=* MODE=ADDVARIABLES
  /BREAK=CITY
  /Temp=Max(Temp).

But this doesn't create the new variables, and I'm not sure where to start on that end. I checked the SPSS manual and didn't see this as an option within AGGREGATE. Any idea which command might allow this?

eli-k · Accepted Answer · 2017-01-21T17:24:08

If I understand right, you are trying to reorganize all the CITY information into one line, and not to aggregate it. So what you are looking for is the restructure command casestovars.

First we'll create some fake data to demonstrate on:

data list list/City (a10) temp1 to temp10 (10f6).
begin data
Tampa 10 11 12 13 14 15 16 17 18 19  
Boston 20 21 22 23 24 25 26 27 28 29
Tampa 30 31 32 33 34 35 36 37 38 39 
NY 40 41 42 43 44 45 46 47 48 49 
Boston 50 51 52 53 54 55 56 57 58 59 
End data.

casestovars needs an index variable (e.g., the row number within each city). Your example data doesn't have an index, so the following commands will create one:

sort cases by CITY.
if $casenum=1 or city<>lag(city) IndVar=1.
if city=lag(city) IndVar=lag(IndVar)+1.
format IndVar(f2).

Now we can restructure:

sort cases by CITY IndVar.
casestovars /id=CITY /index=IndVar/separator="_"/groupby=index.

This will also work if you have more rows per city.
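Once the file is restructured, the mean across all of a city's days can be computed in one step. The following is only a sketch: meanTemp is a hypothetical variable name, and it assumes casestovars named the new variables temp1_1 through temp10_2 and placed them contiguously in that order, which is what the "_" separator and groupby=index settings above should produce for the demo data with two rows per city.

```spss
* Sketch only: meanTemp is a hypothetical name.
* Assumes the restructured variables run from temp1_1 to temp10_2 in file order.
* mean() skips missing values, so a city with a single row (NY here) still gets a mean.
compute meanTemp=mean(temp1_1 to temp10_2).
execute.
```

If your file has more rows per city, the last variable in the range would need to change accordingly (for example temp10_3 for three rows).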

Important note: my artificial index (IndVar) doesn't necessarily reflect the original order of rows in your file. If your file really doesn't contain an index and isn't ordered so that the first row holds the first measurements, and so on, the restructured file will not be ordered either: the earlier measurements might appear to the left or to the right of the later ones, depending on their order in the original file. To avoid this, you should try to define a real index and use it in casestovars.
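If the rows in your file are already in chronological order within each city (an assumption you would need to verify), one possible way to keep that order is to record the row position before sorting and use it as a tie-breaker. This is a minimal sketch; origOrder is a hypothetical helper variable, and it is dropped during the restructure so that it isn't spread across columns.

```spss
* Sketch only: origOrder is a hypothetical helper holding the original row position.
compute origOrder=$casenum.
* Sorting by CITY and then origOrder keeps each city's rows in their original order.
sort cases by CITY origOrder.
if $casenum=1 or city<>lag(city) IndVar=1.
if city=lag(city) IndVar=lag(IndVar)+1.
format IndVar(f2).
* Drop the helper so it is not restructured along with the temperatures.
casestovars /id=CITY /index=IndVar /separator="_" /groupby=index /drop=origOrder.
```

This only preserves whatever order the file already has. If the file order is not chronological, you still need a real day or date variable to build the index from.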
