Noms!
Photo: Holly Vickery
The pattern
Here's what we did in the previous lesson.
- For each record:
- Grab the record's data from the Range
- Normalize the data
- Validate the data
- Process the data
Let's add to the last step. We'll keep it simple: computing the average of valid data.
Goat cuteness data
Cthulhu has some data on goat cuteness it wants you to analyze. Here's part of the start worksheet:
Cuteness should be a number, from 1 to 10.
When the Run button is pressed, show average cuteness, for valid data only.
Adela, could you give it a try?
Code outline
Adela
OK.
Let's see... I'll work it backwards, but instead of doing everything backwards, I'll work out the big steps backwards. Then, we'll see.
We want an average for valid data, and the count of valid data records.
You know what... I'll start VBE, and type in code comments as I go.
- 'Output average cuteness (valid data only)
- 'Output number of valid records
Note
I often do this, too. I start by typing comments into the code editor I'm using, for VBA, C#, PHP, JavaScript, whatever language I'm using
Adela
OK, to work out the average, I'll need the total of valid cuteness, and the count. Oh! The count is in the output as well.
- 'Compute average cuteness (valid data only) from total and count.
- 'Output average cuteness (valid data only)
- 'Output number of valid records
Adela
I know how to total. That's the accumulator pattern...
Loop over a data table. For each record, add a field to a total, count, or whatevs.
- 'For loop
- 'Total = Total + Cuteness
- 'Next
- 'Compute average cuteness (valid data only) from total and count.
- 'Output average cuteness (valid data only)
- 'Output number of valid records
Adela
Oh, but only for valid data. So I need to wrap the totaling thing in an If.
- 'For loop
- 'If cuteness is valid
- 'Total = Total + Cuteness
- 'Next
- 'Compute average cuteness (valid data only) from total and count.
- 'Output average cuteness (valid data only)
- 'Output number of valid records
Adela
Wait... I'll use the flag pattern, so I'll change the If to show that.
- 'For loop
- 'Data is OK = yes
- 'If cuteness is invalid:
- 'Data is OK = no
- 'If data is OK
- 'Total = Total + Cuteness
- 'Next
- 'Compute average cuteness (valid data only) from total and count.
- 'Output average cuteness (valid data only)
- 'Output number of valid records
Adela
Oh, it needs to say that cuteness is from the current record.
Some more finishing touches. Get cuteness for the current record, make the Range object, oh, and gotta count the valid records, too.
And... done!
- 'Make a Range
- 'For loop
- 'Get cuteness for current record
- 'Data is OK = yes
- 'If cuteness is invalid:
- 'Data is OK = no
- 'If data is OK
- 'Total = Total + Cuteness
- 'Count valid = Count valid + 1
- 'Next
- 'Compute average cuteness (valid data only) from total and count.
- 'Output average cuteness (valid data only)
- 'Output number of valid records
Georgina
Wow, great job, Adela! I really saw your thinking, there.
Ethan
Yeah, that was nice!
OK, let's look at that again, after the code is written.
Initialize
To compute an average, we need a total, and a count. Let's make those variables, and initialize them. Also, let's create the Range.
- Sub run()
- Dim rData As Range
- Dim sNumberOfRecords As Single
- Dim sNumberOfValidRecords As Single
- Dim sTotalCutenessValidOnly As Single
- Dim sAverageCutenessValidOnly As Single
- 'Initialize.
- sNumberOfValidRecords = 0
- sTotalCutenessValidOnly = 0
- 'Create the Range.
- Set rData = Range(Cells(4, 1), Cells(4, 1).End(xlToRight).End(xlDown))
- sNumberOfRecords = rData.Rows.Count
I named the variables to remind ourselves what we're doing. Instead of sTotalCuteness
, use sTotalCutenessValidOnly
. This is optional, but if it helps you, do it.
OK, add the loop, and the code that reads in the current record. I'm going to start leaving out code you've already seen. You can download the solution to see it all.
- Sub run()
- ...
- Dim sRecordNumber As Single
- Dim tUncheckedCurrentCuteness As String
- Dim tCurrentName As String
- Dim sCurrentCuteness As Single
- ...
- Const FIELD_NAME = 1
- Const FIELD_CUTENESS = 2
- 'Initialize.
- sNumberOfValidRecords = 0
- sTotalCutenessValidOnly = 0
- 'Create the Range.
- Set rData = Range(Cells(4, 1), Cells(4, 1).End(xlToRight).End(xlDown))
- sNumberOfRecords = rData.Rows.Count
- 'Loop over records.
- For sRecordNumber = 1 To sNumberOfRecords
- 'Get fields for current record
- tCurrentName = rData.Cells(sRecordNumber, FIELD_NAME)
- tUncheckedCurrentCuteness = rData.Cells(sRecordNumber, FIELD_CUTENESS)
- Next sRecordNumber
Notice the Consts
. They make the later code easier to read, and maintain.
Validation
We'll only validate the cuteness field. The flag pattern.
- 'Initialize validation flag.
- tIsDataOk = "yes"
- 'Validate
- If Not IsNumeric(tUncheckedCurrentCuteness) Then
- tIsDataOk = "no"
- Else
- sCurrentCuteness = tUncheckedCurrentCuteness
- If sCurrentCuteness < 1 Or sCurrentCuteness > 10 Then
- tIsDataOk = "no"
- End If
- End If
- 'Is the data valid?
- If tIsDataOk = "yes" Then
- 'Processing.
- End If
Processing
We want to have a count and total. If the data is valid, add to total and count variables.
- Option Explicit
- Sub run()
- ...
- 'Initialize.
- sNumberOfValidRecords = 0
- sTotalCutenessValidOnly = 0
- ...
- 'Loop over records.
- For sRecordNumber = 1 To sNumberOfRecords
- 'Get fields for current record
- ...
- tUncheckedCurrentCuteness = rData.Cells(sRecordNumber, FIELD_CUTENESS)
- '(Move into sCurrentCuteness if valid)
- ...
- 'Is the data valid?
- If tIsDataOk = "yes" Then
- 'Processing.
- sNumberOfValidRecords = sNumberOfValidRecords + 1
- sTotalCutenessValidOnly = sTotalCutenessValidOnly + sCurrentCuteness
- End If
- Next sRecordNumber
- ...
- End Sub
Lines 6 and 7 initialize variables for count and total.
Lines 10 to 22 loop over records. Line 13 gets a cuteness value for the current record. If it's valid, line 19 increases the count of valid records by one. Line 20 increases the total, by the new cuteness value.
Average and output
OK, after the loop ends, we'll have the count and total we need to compute the average.
- 'Compute average.
- sAverageCutenessValidOnly = sTotalCutenessValidOnly / sNumberOfValidRecords
- 'Output.
- Cells(4, 6) = sNumberOfValidRecords
- Cells(5, 6) = sAverageCutenessValidOnly
That's it! Remember, you can download the solution, and trace it with the debugger.
Ray
That backwards planning Adela did, that really helped me understand why the code was written that way.
Aye, I agree.
Notice Adela didn't get bogged down in detailed coding when she was working backwards. I recommend that. If you think about all the deets when you're planning, you'll lose track of the big picture. And it's the big picture you're working on.
Do the planning thing first. Then go back and write the code. Your brain will thank you for it.
Adela
A question. What if all of the data is invalid? The count will still be zero after the loop ends, right?
Oh, wow! Great thinking! Here's what Adela means:
- Sub run()
- ...
- 'Initialize.
- sNumberOfValidRecords = 0
- sTotalCutenessValidOnly = 0
- ...
- 'Loop over records.
- For sRecordNumber = 1 To sNumberOfRecords
- ...
- 'Is the data valid?
- If tIsDataOk = "yes" Then
- 'Processing.
- sNumberOfValidRecords = sNumberOfValidRecords + 1 Never happens!
- sTotalCutenessValidOnly = sTotalCutenessValidOnly + sCurrentCuteness
- End If
- Next sRecordNumber
- sNumberOfValidRecords is still 0
- 'Compute average.
- sAverageCutenessValidOnly = sTotalCutenessValidOnly / sNumberOfValidRecords
- Crashy crashy
- ...
- End Sub
sNumberOfValidRecords
starts at 0. If all of the data is invalid, it never increases. It's still 0 at the end of the loop.
What happens when you divide by 0? Crashy crashy.
You could use an If
to allow for this:
- Next sRecordNumber
- 'Guard against no valid data.
- If sNumberOfValidRecords = 0 Then
- Cells(4, 6) = "No valid records. Cannot compute average."
- End
- End If
- 'Compute average.
- sAverageCutenessValidOnly = sTotalCutenessValidOnly / sNumberOfValidRecords
- 'Output.
- Cells(4, 6) = sNumberOfValidRecords
- Cells(5, 6) = sAverageCutenessValidOnly
This is the guard pattern at work.
Exercises
Goat weight program
Cthulhu thinks its goat friends are looking a little thin, so it sponsors a weight gain program. Goats are weighed at the start and end of a week. The more weight gained, the better!
You can download the start workbook. It looks like this:
Write a program that computes the average start and end weights, and the average gain. Only include valid data. The names are all OK, so you don't need to check them. However, as you can see, there's some bad numeric data. Exclude it from the analysis. Also exclude negative weights. No floating goats!
Here's what the output should look like:
Upload your solution. The usual standards apply.
Goat ball spread
Cthulhu is thinking about betting on goat ball point spreads. It wants you to figure out some statistics.
Point spread is the difference between scores in a game. So if:
- The New York Nannies scored 10 points and the Buffalo Billies scored 7, the spread would be 3 for the game.
- The Brisbane Bed Bugs scored 11 points and the Yosemite Yolks scores 16, the spread would be 5.
- The average of the spreads for the games would be (3+5)/2 = 4.
You can download the start workbook with the data. It looks like this:
The scores should all be numbers that are zero or more. There are data errors.
Write a program to compute the two counts and the mean, as well as the score differences. Here's the output:
Differences are always positive, or zero.
You can assume that at least some of the data is valid.
Upload your solution. The usual standards apply.
Bouncy boots
Goats love to bounce. Like Tiggers.
Cthulhu got its goaty friends bouncy boots. Do the boots work? Write a program to find out.
Your worksheet should start like this:
You can download the start worksheet.
Each data row has three values:
- Goat name. Should not be blank. A name that is just spaces counts as blank.
- Height goat jumps, without bouncy boots (meters). Should be a number that's zero or more.
- Height goat jumps, with bouncy boots (meters). Should be a number that's zero or more.
Your program should validate the data, and show some statistics.
Upload your solution. The usual standards apply.
Database lesson
Goats love databases. They like nothing better than writing SQL queries.
Cthulhu teaches an SQL class. It had an idea for improving a lesson. It gave some goats the old lesson, then a test (the "before" test), then the new lesson, then another test (the "after" test). What happened to the scores?
Download the starting worksheet with the data. It's like this:
Notice that scores can go down. That's normal; scores have some randomness to them.
Each record has three values:
- Name.
- Before. Number, zero or more.
- After. Number, zero or more.
Some data might not fit the rules. You'll need to validate it.
Write a program to compute averages and counts. Here's the output.
Your program should warn the user if more than 10% of the records are invalid. Don't forget to remove the message each time the program runs.
Upload your solution. The usual standards apply.