Full CSV example

Log in

If you're a student, please log in, so you can get the most from this page.

Hey, Ray! Ready to write some code?

Ray
Ray

Ah... OK, but I'll need help.

No worries.

Adela
Adela

We got your back.

Goat

​Ray, Ray, Ray, Ray!

Ray
Ray

OK, let's give it a go.

The task

Here's the task.

- - CUT SCREEN HERE - -

There's a CSV file with data about dog happiness, collected from happiness recording collars. The file is named dogs-happiness.csv. For each dog, there's the dog's name, its happiness level, and phone number:

  • "Sima",8.12,"2486984674"
  • "Minnie",8.29,"2487307245"
  • "Meg",9.98,"2482451990"
  • "Shawana",25.46,"2484948544"

Sometimes the collars malfunction, because of excess drool. You need to check the data. Here are the validation rules:

  • Name must be present.
  • Happiness score must be from 1 to 10.
  • Phone number must be present.

Use case-wise deletion.

Treat the phone number as a string. You won't be doing any calculations on it.

You're worksheet should start like this:

Start

Here's the output:

Output

You can see what needs to be calculated.

The program should work no matter how many dogs there are. The usual coding standards apply.

- - CUT SCREEN HERE - -

The big picture

Ray
Ray

Let me check out the last lesson on CSVs...

OK, there are two loops. The first one reads records from the file, and puts data into arrays. It validates as it goes.

The second loop does the statistics.

Then output the results.

Is that right?

Georgina
Georgina

Sounds good so far.

Note

Ray looked at another program that did something similar, and reminded himself what the big picture is.

Ray
Ray

Great. Some coffee... Ah!

Now I'll start Excel, open VBE, put in a sub, and start with the big picture comments.

  • 'Read data from CSV file.
  •  
  • 'Compute stats.
  •  
  • 'Output.

W00t for planning

Ray didn't start to code. He planned, using comments to lay out the big steps.

Draw a diagram, use comments, sing your plan... whatever works for you.

Ray
Ray

Just checking the code from last time... Oh, yeah. I'll add some more comments to remind me what the code should do

  • 'Read data from CSV file.
  • 'Open the file.
  • 'Loop while there's more data.
  • 'Input a record.
  • 'Validate.
  • 'If data OK, store record in arrays.
  •  
  • 'Compute stats.
  • 'Initialize totals and counter.
  • 'Loop across the arrays.
  • 'Test for highest and lowest.
  •  
  • 'Output.

Input

Ray
Ray

OK, I know how the thing is going to hang together. Now the input piece.

Need to open the file. Where do I get the file name? Oh, it's given in the task.

"The file is named dogs-happiness.csv."

Open it for input. Add the loop statements, too. Oh, don't forget to close.

  • 'Read data from CSV file.
  • 'Open the file.
  • Open ThisWorkbook.Path & "\dogs-happiness.csv" For Input As #1
  • 'Loop while there's more data.
  • Do While Not EOF(1)
  •     'Input a record.
  •    'Validate.
  •    'If data OK, store record in arrays.
  •  
  • Loop
  • Close #1

Note

Ray is doing some copy-and-paste from an earlier example.

Ray
Ray

Input a record. Let's see... Oh, yeah, we did that into string variables, so we could validate them. There are three fields: name, happiness score, and phone.

Might as well Dim them, too.

  • Dim tNameIn As String
  • Dim tHappinessScoreIn As String
  • Dim tPhoneIn As String
  •  
  • 'Read data from CSV file.
  • 'Open the file.
  • Open ThisWorkbook.Path & "\dogs-happiness.csv" For Input As #1
  • 'Loop while there's more data.
  • Do While Not EOF(1)
  •     'Input a record.
  •     Input #1, tNameIn, tHappinessScoreIn, tPhoneIn
  •  
  •     'Validate.
  •     'If data OK, store record in arrays.
  •  
  • Loop
  • Close #1
Ray
Ray

Validation. The first field, name, it has to be there. Let me check the code from last time... Oh, wait, it's that flag thing. I'll add the flag, then the test. Like I did in the last exercise. Same thing.

The flag pattern

Ray learned the flag pattern, so he can drop it in as he needs it. Ray doesn't have to think too much about it, since he knows the pattern well.

Pattern

Flag

A variable that summarizes the results of a bunch of code.

  • Dim tNameIn As String
  • Dim tHappinessScoreIn As String
  • Dim tPhoneIn As String
  •  
  • Dim tDataOk As String
  •  
  • 'Read data from CSV file.
  • 'Open the file.
  • Open ThisWorkbook.Path & "\dogs-happiness.csv" For Input As #1
  • 'Loop while there's more data.
  • Do While Not EOF(1)
  •     'Data validation flag to true.
  •     tDataOk = "yes"
  •     'Input a record.
  •     Input #1, tNameIn, tHappinessScoreIn, tPhoneIn
  •     'Validate.
  •     'Is name missing?
  •     If tNameIn = "" Then
  •         tDataOk = "no"
  •     End If
  •     'If data OK, store record in arrays.
  •  
  • Loop
  • Close #1
Ray
Ray

Let's see... the phone is the same thing, just check for empty. I'll just do that now, while I'm thinking of it. Copy the code for checking the name, and change it a little.

  • Dim tNameIn As String
  • Dim tHappinessScoreIn As String
  • Dim tPhoneIn As String
  •  
  • Dim tDataOk As String
  •  
  • 'Read data from CSV file.
  • 'Open the file.
  • Open ThisWorkbook.Path & "\dogs-happiness.csv" For Input As #1
  • 'Loop while there's more data.
  • Do While Not EOF(1)
  •     'Data validation flag to true.
  •     tDataOk = "yes"
  •     'Input a record.
  •     Input #1, tNameIn, tHappinessScoreIn, tPhoneIn
  •     'Validate.
  •     'Is name missing?
  •     If tNameIn = "" Then
  •         tDataOk = "no"
  •     End If
  •     'Is phone present?
  •     If tPhoneIn = "" Then
  •         tDataOk = "no"
  •     End If
  •     'If data OK, store record in arrays.
  • Loop
  • Close #1
Ray
Ray

Happiness is numeric. Copy-and-paste the numeric check from last time, change it a little.

  • Dim tNameIn As String
  • Dim tHappinessScoreIn As String
  • Dim sHappinessScore As Single
  • Dim tPhoneIn As String
  •  
  • Dim tDataOk As String
  •  
  • 'Read data from CSV file.
  • 'Open the file.
  • Open ThisWorkbook.Path & "\dogs-happiness.csv" For Input As #1
  • 'Loop while there's more data.
  • Do While Not EOF(1)
  •     'Data validation flag to true.
  •     tDataOk = "yes"
  •     'Input a record.
  •     Input #1, tNameIn, tHappinessScoreIn, tPhoneIn
  •     'Validate.
  •     'Is name missing?
  •     If tNameIn = "" Then
  •         tDataOk = "no"
  •     End If
  •     'Is happiness score OK?
  •     If Not IsNumeric(tHappinessScoreIn) Then
  •         tDataOk = "no"
  •     Else
  •         sHappinessScore = tHappinessScoreIn
  •         If sHappinessScore < 1 Or sHappinessScore > 10 Then
  •             tDataOk = "no"
  •         End If
  •     End If
  •     'Is phone present?
  •     If tPhoneIn = "" Then
  •         tDataOk = "no"
  •     End If
  •     'If data OK, store record in arrays.
  • Loop
  • Close #1
Ray
Ray

Now the other fields... oh, there are none, just the three.

I'll look at the code from lad time...

OK, now check the flag, update the count of valid records... Hang on, I don't have that variable. Better Dim it. It's a count, so it probably has to be initialized. Check the other code... yes, there it is.

  • Dim tNameIn As String
  • Dim tHappinessScoreIn As String
  • Dim sHappinessScore As Single
  • Dim tPhoneIn As String
  •  
  • Dim sValidRecordCount As Single
  •  
  • Dim tDataOk As String
  •  
  • 'Read data from CSV file.
  • 'Open the file.
  • Open ThisWorkbook.Path & "\dogs-happiness.csv" For Input As #1
  • 'Loop while there's more data.
  • sValidRecordCount = 0
  • Do While Not EOF(1)
  •     'Data validation flag to true.
  •     tDataOk = "yes"
  •     'Input a record.
  •     Input #1, tNameIn, tHappinessScoreIn, tPhoneIn
  •     'Validate.
  •     'Is name missing?
  •     If tNameIn = "" Then
  •         tDataOk = "no"
  •     End If
  •     'Is happiness score OK?
  •     If Not IsNumeric(tHappinessScoreIn) Then
  •         tDataOk = "no"
  •     Else
  •         sHappinessScore = tHappinessScoreIn
  •         If sHappinessScore < 1 Or sHappinessScore > 10 Then
  •             tDataOk = "no"
  •         End If
  •     End If
  •     'Is phone present?
  •     If tPhoneIn = "" Then
  •         tDataOk = "no"
  •     End If
  •     'If data OK, store record in arrays.
  •     New stuff
  •     If tDataOk = "yes" Then
  •         'Data is OK.
  •         'Increment valid record counter.
  •         sValidRecordCount = sValidRecordCount + 1
  •  
  •     End If
  • Loop
  • Close #1
Ray
Ray

Let's see... the other code puts the data into arrays, one for each field. OK, better Dim them, comment that the phone number is a string... and move the data into them.

  • New stuff
  • 'Arrays for fields.
  • Dim atNames(1000) As String
  • Dim asHappinessScores(1000) As Single
  • 'Phone number is a string.
  • Dim atPhones(1000) As String
  •  
  • Dim tNameIn As String
  • Dim tHappinessScoreIn As String
  • Dim sHappinessScore As Single
  • Dim tPhoneIn As String
  •  
  • Dim sValidRecordCount As Single
  •  
  • Dim tDataOk As String
  •  
  • 'Read data from CSV file.
  • 'Open the file.
  • Open ThisWorkbook.Path & "\dogs-happiness.csv" For Input As #1
  • 'Loop while there's more data.
  • sValidRecordCount = 0
  • Do While Not EOF(1)
  •     'Data validation flag to true.
  •     tDataOk = "yes"
  •     'Input a record.
  •     Input #1, tNameIn, tHappinessScoreIn, tPhoneIn
  •     'Validate.
  •     'Is name missing?
  •     If tNameIn = "" Then
  •         tDataOk = "no"
  •     End If
  •     'Is happiness score OK?
  •     If Not IsNumeric(tHappinessScoreIn) Then
  •         tDataOk = "no"
  •     Else
  •         sHappinessScore = tHappinessScoreIn
  •         If sHappinessScore < 1 Or sHappinessScore > 10 Then
  •             tDataOk = "no"
  •         End If
  •     End If
  •     'Is phone present?
  •     If tPhoneIn = "" Then
  •         tDataOk = "no"
  •     End If
  •     'If data OK, store record in arrays.
  •     If tDataOk = "yes" Then
  •         'Data is OK.
  •         'Increment valid record counter.
  •         sValidRecordCount = sValidRecordCount + 1
  •         More new stuff
  •         'Store in arrays.
  •         atNames(sValidRecordCount) = tNameIn
  •         asHappinessScores(sValidRecordCount) = sHappinessScore
  •         atPhones(sValidRecordCount) = tPhoneIn
  •     End If
  • Loop
  • Close #1
Ray
Ray

Now, check the other code... need a counter for invalid records. Oh, wait, I forgot the overall record counter variable! Better add that, too.

  • 'Arrays for fields.
  • Dim atNames(1000) As String
  • Dim asHappinessScores(1000) As Single
  • 'Phone number is a string.
  • Dim atPhones(1000) As String
  •  
  • Dim tNameIn As String
  • Dim tHappinessScoreIn As String
  • Dim sHappinessScore As Single
  • Dim tPhoneIn As String
  •  
  • New stuff
  • Dim sNumberOfRecords As Single
  • Dim sValidRecordCount As Single
  • Dim sInvalidRecordCount As Single
  •  
  • Dim tDataOk As String
  •  
  • 'Read data from CSV file.
  • 'Open the file.
  • Open ThisWorkbook.Path & "\dogs-happiness.csv" For Input As #1
  • 'Loop while there's more data.
  • sValidRecordCount = 0
  • More new stuff
  • sNumberOfRecords = 0
  • sInvalidRecordCount = 0
  • Do While Not EOF(1)
  •     'Update record counter.
  •     sNumberOfRecords = sNumberOfRecords + 1
  •     'Data validation flag to true.
  •     tDataOk = "yes"
  •     'Input a record.
  •     Input #1, tNameIn, tHappinessScoreIn, tPhoneIn
  •     'Validate.
  •     'Is name missing?
  •     If tNameIn = "" Then
  •         tDataOk = "no"
  •     End If
  •     'Is happiness score OK?
  •     If Not IsNumeric(tHappinessScoreIn) Then
  •         tDataOk = "no"
  •     Else
  •         sHappinessScore = tHappinessScoreIn
  •         If sHappinessScore < 1 Or sHappinessScore > 10 Then
  •             tDataOk = "no"
  •         End If
  •     End If
  •     'Is phone present?
  •     If tPhoneIn = "" Then
  •         tDataOk = "no"
  •     End If
  •     'If data OK, store record in arrays.
  •     If tDataOk = "yes" Then
  •         'Data is OK.
  •         'Increment valid record counter.
  •         sValidRecordCount = sValidRecordCount + 1
  •         'Store in arrays.
  •         atNames(sValidRecordCount) = tNameIn
  •         asHappinessScores(sValidRecordCount) = sHappinessScore
  •         atPhones(sValidRecordCount) = tPhoneIn
  •     Even more new stuff
  •     Else
  •         'Data bad. Increment invalid record counter.
  •         sInvalidRecordCount = sInvalidRecordCount + 1
  •     End If
  • Loop
  • Close #1
Ray
Ray

Hey, buds. How does that look?

Adela
Adela

Looks good to me. You oughta test it.

Note

A good idea. Write a bit, test a bit, write a bit...

Ray
Ray

Thanks, Adela. But, wait, there's no output yet. Nothing for me to look at, check. Let me think...

Reflect

The program doesn't show any output. So how is Ray going to test that it's code is right?

(If you were logged in as a student, the lesson would pause here, and you'd be asked to type in a response. If you want to try that out, ask for an account on this site.)
Ray
Ray

Oh, right, put a breakpoint on the End Sub, and check the values in the Locals window... Hey! They're right!

Woohoo!

Marcus
Marcus

Nice, bro!

Processing

Ray
Ray

Time for processing. Let's see... need average happiness, and the dogs with the lowest and highest scores. For the average, I need a totaling variable. I've done that a bunch of times, with the accumulator pattern.

Also, I like to get the current record, and put the fields into temporary variables. Easier to think about. I'll do that.

Thinking about thinking

Ray pays attention to what makes his work easier. Like read a record into temporary variables, and then work with those variables. W00t for Ray!

  • 'Processing.
  • 'Init
  • sTotalHappiness = 0
  • 'Loop over records.
  • For sRecordNumber = 1 To sValidRecordCount
  •     'Get current record's fields.
  •     tName = atNames(sRecordNumber)
  •     sHappinessScore = asHappinessScores(sRecordNumber)
  •     tPhone = atPhones(sRecordNumber)
  •     'Add to total.
  •     sTotalHappiness = sTotalHappiness + sHappinessScore
  • Next sRecordNumber
  • 'Compute average.
  • sAverageHappiness = sTotalHappiness / sValidRecordCount
Ray
Ray

OK, now the happiest and saddest. That's in the min and max lesson. We made a pattern for it. Just remind myself of how that works.

Oh, yeah, that's right. Variable to keep track of highest so far... OK.

Hmmm... I'll also need variables to remember name and phone number...

Pattern

Find smallest/largest values

Loop across some records. Put an If in the loop, comparing the current value with the smallest/largest so far.

  • 'Processing.
  • 'Init
  • sTotalHappiness = 0
  • sScoreHappiest = asHappinessScores(1)
  • sScoreSaddest = asHappinessScores(1)
  • tNameHappiest = atNames(1)
  • tNameSaddest = atNames(1)
  • tPhoneHappiest = atPhones(1)
  • tPhoneSaddest = atPhones(1)'Loop over records.
  • For sRecordNumber = 1 To sValidRecordCount
  •     'Get current record's fields.
  •     tName = atNames(sRecordNumber)
  •     sHappinessScore = asHappinessScores(sRecordNumber)
  •     tPhone = atPhones(sRecordNumber)
  •     'Add to total.
  •     sTotalHappiness = sTotalHappiness + sHappinessScore
  •     'Compare to happiest/saddest so far.
  •     If sHappinessScore < sScoreSaddest Then
  •         tNameSaddest = tName
  •         sScoreSaddest = sHappinessScore
  •         tPhoneSaddest = tPhone
  •     End If
  •     If sHappinessScore > sScoreHappiest Then
  •         tNameHappiest = tName
  •         sScoreHappiest = sHappinessScore
  •         tPhoneHappiest = tPhone
  •     End If
  • Next sRecordNumber
  • 'Compute average.
  • sAverageHappiness = sTotalHappiness / sValidRecordCount
Ray
Ray

That's a lot of variables. Is that right?

Georgina
Georgina

Looks OK. You've got to do the highest and lowest, so there'll be two sets of variables. In each set, you need to track the score, and the name and phone number of the dog with the score.

So, yeah, it's a lot, but it looks right.

Ray
Ray

Thanks, Georgina. You know this stuff better than I do.

Output

Ray
Ray

Last thing, put the results in the cells. That should be easy.

  • 'Output
  • 'Record counts.
  • Cells(4, 2) = sValidRecordCount
  • Cells(5, 2) = sInvalidRecordCount
  • Cells(6, 2) = sNumberOfRecords
  • 'Average happiness.
  • Cells(8, 2) = sAverageHappiness
  • 'Happiest dog.
  • Cells(11, 2) = tNameHappiest
  • Cells(12, 2) = sScoreHappiest
  • Cells(13, 2) = tPhoneHappiest
  • 'Saddest dog.
  • Cells(16, 2) = tNameSaddest
  • Cells(17, 2) = sScoreSaddest
  • Cells(18, 2) = tPhoneSaddest
Ray
Ray

Test it...

It works! Right output. W00t!

W00t, indeed, my friend. W00t, indeed.

You can download Ray's solution.

Up next

Let's look at another worked example, with a range this time.