0
votes

A particular field is supposed to be of the format "000000" (a 6 digit field containing only numbers & includes the leading zeros) but several records are coming through as unusual formats (ex. 10A3C1, 17, City, John Smith). Is there a way to group all of the records together that do not conform to a specified format?

3
Please show us your code, what you have tried.Andrew Truckle
agreed... sample data is key. be a good asker, get a good answer :)niko
I haven't tried anything other than having to manually click on the records to mark and then group them but some of these fields have thousands of unique records so it is very time consuming to go through them. Here are some examples of good, correctly formatted records: 012764, 109853, 836547, 378402, 000456, 907843. All have 6 digits, including leading zeroes (if any). Here are some examples of incorrectly formatted records: 2469, 57489435-1, 10A3C1, NYC, New York, 00354. They either don't have the correct # of digits or are not numeric onlyNAlverson
I need a way to either group these in a more automated way, or at least identify these them as being incorrect so that way I can filter on only the correct records. I want to say if format("record") = "000000", then "record", otherwise "invalid"NAlverson
If one of the answers below worked for you, please accept it as correct. @niko and mine should have worked, but his is cleaner hence the vote up (by me)scsimon

3 Answers

1
votes

Spotfire is not really great at validating data. it's best to do this outside of Spotfire.

that said, I think we can hack something together. if you create a calculated column with the following expression, you'll be able to identify matches:

if(RXReplace([column], "\\D", "X", "g") = [column] and Len([column]) = 6, "MATCH")

this Regular Expression replaces all non-digit characters with the character X. if we compare that to the original value, additionally checking its length, we can verify if the value fits your requirements.

I do want to re-emphasize that this is hacky and you Probably Shouldn't Be Doing This™ in Spotfire :)

0
votes

Insert a calculated column to validate length and convert the value to an Integer.

CASE WHEN Len([Your Column])=6 THEN Integer([Your Column]) ELSE null END

Anything that comes out empty does not meet your specs.

0
votes

I don't have enough rep points to comment on AmbivalentGeek's answer but his should work for you if you cast your column to an Integer after you check the length, or handle alpha numeric 6 length cases some other way.

case  when len([yourColumn])=6 then Integer([yourColumn]) else null end

This will remove the false positives you would get if you have an alpha numeric value of length 6, like 123A56. This will strip leading zeros though so you'll have to create an additional calculated column, reformatting to 6 digits.

Here is a long way since I can't think of a more elegant solution right now. test2 is the calculated column from above:

case  
when Len(String([test2]))=1 then Concatenate("00000",String([test2])) 
when Len(String([test2]))=2 then Concatenate("0000",String([test2])) 
when Len(String([test2]))=3 then Concatenate("000",String([test2])) 
when Len(String([test2]))=4 then Concatenate("00",String([test2])) 
when Len(String([test2]))=5 then Concatenate("0",String([test2])) 
when Len(String([test2]))=6 then String([test2])
else "000000"
end