Normalizing means to convert data to a predictable format.
This makes it easier to write validation and other code. Easy is good.
Let's see how you could do normalization while processing a Range
.
Goats like sweaters
Thanks to Teagan for the photo
Cthulhu is thinking about learning to knit. It tells its minions to survey some goaty friends, asking them which of three styles of sweater they like best:
- Goatneck
- Turtleneck
- Cardigan
Here's some of the data:
As you can see, the minions weren't careful about how they typed in the data. There are extra spaces everywhere. The capitalization is all over the place.
Hiding goat
Photo: Holly Vickery
Does a goat like goatnecks?
Suppose we want to test whether a goat likes goatneck sweaters. Because of all the variations, we would have to use an If
like this:
- If tGoatFaveSweater = "Goatneck" _
- Or tGoatFaveSweater = "goatneck" _
- Or tGoatFaveSweater = "GOATNECK" _
- Or tGoatFaveSweater = " goatneck " _
- ...
Yuck! What a pain!
It would be easier to normalize the data, that is, convert it to one format. The If
would be simpler, as we'll see.
The code pattern is something like:
- For each record:
- Grab the record's data from the Range
- Normalize the data
- Test the data
So, grab data each record's data, and test it. But before testing it, normalize the data, so that testing is easier.
Counting favorite sweater types
Here's code that will count the number of each sweater type.
- Set rSweaterData = Range(Cells(11, 1), Cells(11, 1).End(xlDown).End(xlToRight))
- sNumberOfRecords = rSweaterData.Rows.Count
- sGoatNeckCount = 0
- sTurtleNeckCount = 0
- sCardiganCount = 0
- 'Work through the records, one at a time.
- For sRecordNumber = 1 To sNumberOfRecords
- 'Grab the current record's fields.
- tGoatName = rSweaterData.Cells(sRecordNumber, NAME_COLUMN)
- tGoatFaveSweater = rSweaterData.Cells(sRecordNumber, FAVE_SWEATER_COLUMN)
- 'Normalize
- tGoatName = Trim(tGoatName)
- tGoatFaveSweater = LCase(Trim(tGoatFaveSweater))
- 'Add to counts.
- If tGoatFaveSweater = "goatneck" Then
- sGoatNeckCount = sGoatNeckCount + 1
- ElseIf tGoatFaveSweater = "turtleneck" Then
- sTurtleNeckCount = sTurtleNeckCount + 1
- ElseIf tGoatFaveSweater = "cardigan" Then
- sCardiganCount = sCardiganCount + 1
- End If
- Next sRecordNumber
- 'Output results.
- Cells(6, 2) = sGoatNeckCount
- Cells(7, 2) = sTurtleNeckCount
- Cells(8, 2) = sCardiganCount
Lines 8 to 10 get the current record's data, with one variable for each field. Lines 14 to 21 looks at the sweater type in the current record, and adds to the appropriate count. However, before the counting code starts, lines 11 to 13 normalize the data. The sweater type in the current record is stripped of spaces, and made all lowercase.
So instead of...
- If tGoatFaveSweater = "Goatneck" _
- Or tGoatFaveSweater = "goatneck" _
- Or tGoatFaveSweater = "GOATNECK" _
- Or tGoatFaveSweater = " goatneck " _
- ...
... we can write...
- If tGoatFaveSweater = "goatneck" Then
M0000t! Much easier.
You can download the workbook, if you want to try it out.
What's stranger than goats in sweaters? Goats in trees!
Up next
We've normalized the data read from the Range
. How about checking it for errors?