# Exercises

Exercise

Puffin length

Download a workbook with data on the length of puffins. It looks like this:

("Total obs" means "Total observations.")

Most puffins are between 24 cm and 34 cm in length. Puffins smaller than 24 cm or larger than 34 cm are outliers, possibly mutant puffins with superpowers. For example, the infamous super-villain The Dart was a short puffin. She sank 18 ships in the Atlantic before being taken out by Seal Team 3.

Seal Team 3

Write a VBA program that computes statistics. Here's the output:

Write your program so that it adjusts if the number of rows changes. So your code should work if there are 28 observations, 59 observations, or 1,322 observations.

Assume that all of the data is valid, that is, all lengths are numeric, and greater than zero.

Normal coding standards apply.

Exercise

• 1000,"Enrique","Burke",11,65.8
• 1001,"Trinity","Moran",17,56.6
• 1003,"Julia","Barton",47,79.6
• 1004,"Ryker","Mcintyre",39,73.2

The fields and validation rules are:

• Id number (number, 1000 or more)
• First name (must not be missing)
• Last name (must not be missing)
• Number of exercises completed (number, 0 to 50)
• Exam score (number, 0 to 100)

Write a VBA program to read the data, and (for valid data only) report statistics comparing exam scores for people who complete 30 or more exercises, with scores for people who complete less than 30. Use case-wise deletion.

The worksheet should look like this to start:

When the run button is clicked:

Write your code so that the number of records in the input file can vary.

If you were doing this analysis for real, would you add anything else? I took a statistics class, and I can think of some things.

Good point, Adela. I'd add at least a t test of the two means. However, comparing 30 or more with less then 30 sounds a bit arbitrary. I'd add a correlation analysis, as well.

Exercise

Extreme yoga

The Society for Extreme Yoga (SEY) has gathered data about extreme yoga poses from around the world. For each pose, they have recorded:

• The name of the pose, e.g., the rightward hog.
• The percentage of deaths each time the pose is used, e.g., 0.01%.
• The percentage of serious injuries each time the pose is used, e.g., 0.01%.

So, if you do the rightward hog, there is one chance in 10,000 that you will die, and one chance in 10,000 that you will be injured (only injuries that require hospitalization count).

Some poses are more dangerous. For example, if you do the downward mouse, there is more than one chance in 1,000 that you will die or be seriously injured. Since thousands of extreme yogists do the downward mouse every day, there are hundreds of downward mouse deaths every year.

Write a VBA program that identifies the most and least dangerous extreme yoga poses. The worksheet starts like this:

The data is in the file `yoga-poses.csv`. You can download the data file. Here's a sample:

• "Leftward Virus",.02,.06
• "Inferred Deer",.01,.02
• "Backwards Germ",.03,.06
• "Backwards Unicycle",.02,0

The fields and validation rules are:

• Pose name (must be present)
• The percentage of deaths each time the pose is used (numeric, between 0 and 10)
• The percentage of serious injuries each time the pose is used (numeric, between 0 and 10)

The danger rating of a pose is death rate * 2 + injury rate

After you run your program, you should see:

Write the code so that it works no matter how many poses there are in the data file.

Exercise

Milk for goat soap

Cthulhu's mother and sister sent it some goat soap, from Australia. It's made with goat milk. The soap, that is, not the continent. The continent is made of rocks, dirt, and beer.

Cthulhu likes the soap, and wants to make its own. However, it decides only to use milk from phlegmatic, kookie goats. It wants to know how much milk is available.

Cthulhu's minion, Pandush, is sent to gather the necessary data from Cthulhu's goaty friends. Write a program to figure out how much milk is available.

Here's some of the data Pandush gathers.

There are four fields for each goat:

• Name.
• Milk produced in gallons per day. Should be a number from 1 to 4. Anything else is invalid.
• Phlegmatism. Should be a number from 1 to 10. Anything else is invalid.
• Kookosity. Should be a number from 1 to 10. Anything else is invalid.

Unfortunately, Pandush is still on Dr Tooth Loosener's Blue Crystals of Fun. He might have made some mistakes when typing in the data.

Cthulhu will use milk from goats whose phlegmatism and kookosity scores are both seven or greater.

• The number of records, valid and invalid.
• The number of valid records, that is, the records that have valid data, whether or not they meet the phlegmatism and kookosity criteria.
• The number of records with invalid data.
• The number of valid records that meet the phlegmatism and kookosity criteria.
• The total milk from valid records that meet the phlegmatism and kookosity criteria (G14).

Cthulhu also wants one more count: the number of valid records that meet the phlegmatism and kookosity criteria, and are particularly kookie (have kookosity of 9 or more). Why does Cthulhu want to know this? It won't say.

Your validation code should use case-wise deletion. If you don't remember what that is, check the lesson Filtering: invalid data. Or ask Pandush, if he's compos mentis.

Here's the output your code should produce.

Use a `Range` object, and write the code so that it will work with no changes when new goat data is added. The usual coding standards apply.

Exercise

Dog happiness

The file dog-happiness.csv has data about dog happiness, collected from happiness-recording collars. For each dog, there's the dog's name, its happiness level, and phone number:

• "Sima",8.12,2486984674
• "Minnie",8.29,2487307245
• "Meg",9.98,2482451990

Happiness values should be numbers between 1 to 10. Sometimes the collars malfunction, because of excess drool, so you have to check the data.

Write a program that runs when the user clicks a Run button. The program gets the name of the CSV file from the worksheet. It shows an error if the file name is missing.

The program computes the average happiness and dog count, but only for valid data. That is, invalid happiness values are not included in the average and count.

The program also shows the happiest and saddest dogs. It shows their name, happiness, and phone number. Again, invalid data is excluded.

Here's the output:

If you want to reduce the number of decimal places in the output, check out the rounding tip.

Hint: treat phone numbers as strings.

Exercise

Receivables

Cthulhu does business with companies in the Midwest, selling them goating supplies. Most companies pay their bills on time, but some accounts are delinquent. Write an Excel VBA program that analyzes the data.

Cthulhu has given you a CSV file, with data on companies with delinquent accounts. Only delinquent accounts are in the file. The data is like this:

• "Runbam","MI",20179
• "Joyzarcity","IN",18891
• "Code-holding","MI",24465

The first field is the name of the company. The second is the state the company is in. The third is the balance owed.

There might be errors in the data. Exclude records with errors from the analysis. Here are the rules:

• Name: cannot be blank.
• State: must be one of IA, IL, IN, MI, or OH. Allow for case differences, and extra spaces at the start and end. So " iA " is OK.
• Balance: a number greater than zero.

Your worksheet should start like this:

The user is supposed to type in a file name, before pressing the button. If they do not, show an error message and stop the program:

Here's what the output should be:

(Hint: make sure error messages from prior runs are removed.)

The output shows the number of valid and invalid records, the total balances from each state, and the grand total. The totals only include data from valid records.

Cthulhu also wants to know the name of the company with the highest balance, and what that balance is. Why does it want to know? Best not to ask.

Wouldn't want to be them. Just sayin.