Archive for 4th September 2007

The Complexity Of Complexity

The problems I use aren’t, for the most part, mathematical brain teasers, because the real-world problems I’m trying to teach students to solve aren’t, for the most part, brain teasers. Complexity, or if you prefer, difficulty, exists on several different levels. In other words, a mathematically simple problem can still be highly complex. This is why I have a basic rule when introducing students to problems: Work from the familiar to the unfamiliar.

If I’m teaching statistics, I present the material in terms of the familiar (grades, for example) and then work toward the real-world problems, which, to students, are unfamiliar. When I’m teaching decision sciences, I start with familiar contexts, such as buying a car or selling football team T-shirts, using familiar variables, such as cost, revenue, and gross profit margins. I add unfamiliar variables, such as NPV, after students have a basic grasp of how to solve the problem, and work toward those unfamiliar real-world problem contexts.

This building block approach to problem solving is unfashionable, but it works.

I also never miss a chance to make a point, or teach students a valuable lesson. Faculty parking stickers cost $300 a year. I (and everyone I know) tend to get riled when I go to campus early in the morning, only to find all of the spaces taken up, many by student vehicles with no stickers. So when we’re learning how to construct and solve simulations, I start with this problem.

The university strictly enforces parking policies on campus. The first parking violation costs $40. The second costs $60, and all successive violations cost $75. Each hour a vehicle is parked on campus, there is a 17% chance of its being ticketed. A bus pass costs $53.47. Create a simulation that models the costs incurred over a semester in parking violations, and run 1000 iterations of the simulation. Assume 30 hours of (illegal) parking per week (15 hours of classes, and an additional 15 hours for other reasons). There are 16 weeks in the semester. Is it cheaper to park illegally, or buy a bus pass?
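A minimal sketch of this simulation in Python, which is not necessarily the tool used in class. It assumes that each parked hour is an independent 17% draw, that a car can get at most one ticket per hour, and that the fine schedule escalates across the whole semester. The function names and the seed are illustrative only.

```python
import random
import statistics

FINES = (40, 60)        # first and second violation
REPEAT_FINE = 75        # third and every later violation
P_TICKET = 0.17         # chance of a ticket in any one parked hour
HOURS_PER_WEEK = 30
WEEKS = 16
BUS_PASS = 53.47


def fine_for(ticket_number):
    """Fine for the n-th ticket of the semester (1-based)."""
    if ticket_number <= len(FINES):
        return FINES[ticket_number - 1]
    return REPEAT_FINE


def simulate_semester(rng):
    """Total fines for one simulated semester of illegal parking."""
    tickets = 0
    total = 0
    for _ in range(HOURS_PER_WEEK * WEEKS):
        # Assumption: hours are independent, at most one ticket per hour.
        if rng.random() < P_TICKET:
            tickets += 1
            total += fine_for(tickets)
    return total


def run(iterations=1000, seed=None):
    """Run the simulation `iterations` times and return every total."""
    rng = random.Random(seed)
    return [simulate_semester(rng) for _ in range(iterations)]


if __name__ == "__main__":
    costs = run(1000, seed=2007)
    print("mean fines:", round(statistics.mean(costs), 2))
    print("std dev:", round(statistics.stdev(costs), 2))
    print("cheapest semester:", min(costs), "vs bus pass:", BUS_PASS)
```

With 480 parked hours at 17% each, the expected number of tickets is 480 × 0.17 = 81.6, so the comparison with the bus pass is not close. The interesting part for students is the spread across iterations.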

I make as many connections to previous material as I can. It reinforces what they learned, and it makes the connections (and justifications) obvious to the students. So we revisit the problem later in the semester, when students are more skilled.

The university strictly enforces parking policies on campus. A first parking violation costs $40. A second costs $60, and all successive violations cost $75. A bus pass costs $53.47 per semester. The probability of being ticketed increases 20% over the base probability for every additional hour a vehicle is parked in the same lot. The base probability varies according to the season, as described in the table below:

Month   Weeks   Probability
AUG     1       21%
SEP     4       21%
OCT     4       21%
NOV     3       19%
DEC     3       17%
JAN     3       16%
FEB     4       16%
MAR     3       18%
APR     4       20%
MAY     2       21%

Create a simulation that models the costs incurred over a full school year in parking violations, and run 1000 iterations of the simulation. Use your class schedule in the model, using the data in the table above. Is it cheaper to park illegally, or buy a bus pass?
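One possible setup for this version, sketched in Python under several loudly stated assumptions. The weekly schedule is a made-up placeholder for "your class schedule". The 20% increase is read as 20% of the base rate added for each additional hour in the same lot, capped at 100%. Fines escalate across the whole year, and a full year is taken to need two bus passes. Other readings of the problem are just as defensible.

```python
import random
import statistics

# (month, weeks, base hourly ticket probability) from the table above
SEASONS = [
    ("AUG", 1, 0.21), ("SEP", 4, 0.21), ("OCT", 4, 0.21), ("NOV", 3, 0.19),
    ("DEC", 3, 0.17), ("JAN", 3, 0.16), ("FEB", 4, 0.16), ("MAR", 3, 0.18),
    ("APR", 4, 0.20), ("MAY", 2, 0.21),
]
# Hypothetical schedule: consecutive hours parked in one lot, per weekday.
SESSIONS_PER_WEEK = [3, 2, 3, 2, 1]
BUS_PASSES_PER_YEAR = 2 * 53.47   # assumption: two semesters per year


def hourly_probability(base, hour_in_session):
    """Assumed reading: each additional hour adds 20% of the base rate."""
    return min(1.0, base * (1 + 0.20 * (hour_in_session - 1)))


def fine_for(ticket_number):
    """$40, then $60, then $75 for every later ticket."""
    return {1: 40, 2: 60}.get(ticket_number, 75)


def simulate_year(rng):
    """Total fines for one simulated school year."""
    tickets, total = 0, 0
    for _month, weeks, base in SEASONS:
        for _ in range(weeks):
            for session_hours in SESSIONS_PER_WEEK:
                for hour in range(1, session_hours + 1):
                    if rng.random() < hourly_probability(base, hour):
                        tickets += 1
                        total += fine_for(tickets)
    return total


def run(iterations=1000, seed=None):
    rng = random.Random(seed)
    return [simulate_year(rng) for _ in range(iterations)]


if __name__ == "__main__":
    costs = run(1000, seed=2007)
    print("mean fines:", round(statistics.mean(costs), 2))
    print("std dev:", round(statistics.stdev(costs), 2))
    print("bus passes for the year:", BUS_PASSES_PER_YEAR)
```

The only structural change from the first simulation is that the ticket probability now depends on two things, the month and the hour within a parking session, which is why the model needs a schedule at all.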

We do the first of the parking ticket simulations in class, as a class. I walk them through it. The second problem students work on individually, while I run around helping and answering questions. Run. Often literally. I’ve sprained an ankle several times teaching. (There is another “life lesson” problem listed below: The CCAmerica problem.)

Back to complexity. One thing I have noticed with, say, MBA students new to teaching is that they have a simplistic idea of complexity. One of the problems is that they are familiar with the problems and how to solve them. The other problem is that they see complexity solely in terms of mathematics.

Problem complexity can be textual, that is, a relatively simple problem can be made highly complex just by the way it is worded. Consider the following:

You have gotten a job in State College, Pennsylvania, the home of Penn State. Like most small college towns, property values in State College are high, but property values in the communities surrounding State College are notably cheaper. You have looked at two houses that you really like, one in State College, and the other thirty miles away, and you want to calculate an amortization table so you can compare the total costs of both houses. To calculate commuting costs, assume that you will work 48 weeks in the year, 5 days a week. Assume a 5% per year increase in gas per gallon per month. Note that you will not owe property taxes the first year—but you will every year after the first (property tax rates are included in the Excel file, as are mortgage and interest data, your downpayment, and the market prices of the two houses).

Open which_house.xls and use the information first to calculate the missing information for each of the two houses (each house is on its own worksheet; the first worksheet has all the information on it that applies to both). Which house would over twenty years be cheaper?

Wordy? Yes. But consider the first version that was submitted:

Compare the total costs over time of buying two houses, assuming a 48-week work year and a 5-day work week, and a 5% increase in gasoline prices per month. Property taxes are due from the second year. Answer the questions on the Excel worksheet.

The initial version is too terse. It gives the student minimal information (the missing crucial data is in the Excel worksheet, but the problem doesn’t tell the students that). It is worded so tersely that students aren’t sure what they’re supposed to do with it: “Compare the total costs” all by itself doesn’t mean much. “Due from the second year” is vaguely worded. So even though it may be short, it introduces additional complexity into an otherwise mathematically simple problem. That’s why the initially submitted problem was reworded. Of course, you could object to the conversational tone of the revised problem, but since no student has ever complained about informal wording, I don’t consider it a problem.

“Mathematically complex” itself can mean several different things. You can add mathematical complexity by introducing more variables, for example. Contrast the two problems below.

Leary Chemical manufactures three chemicals: A, B, and C. These chemicals are produced via two production processes: 1 and 2. Running process 1 for an hour costs $4 and yields 3 units of A, 1 unit of B, and 1 unit of C. Running process 2 for an hour costs $1 and yields 1 unit of A and 1 unit of B. To meet customer demands, at least 10 units of A, 5 units of B, and 3 units of C must be produced daily. Determine the daily production that minimizes Leary Chemical’s production costs.
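For readers who want to check the Leary problem, here is a small Python sketch. It is not the method used in class. It writes the problem as a two-variable linear program (hours of process 1 and hours of process 2) and enumerates the corner points of the feasible region, which works here only because there are two variables and both hourly costs are positive. All names are illustrative.

```python
from fractions import Fraction
from itertools import combinations

COST = (4, 1)   # dollars per hour for process 1 and process 2

# Each constraint reads a1*x1 + a2*x2 >= rhs.
CONSTRAINTS = [
    ((3, 1), 10),   # units of A
    ((1, 1), 5),    # units of B
    ((1, 0), 3),    # units of C (only process 1 yields C)
    ((1, 0), 0),    # x1 >= 0
    ((0, 1), 0),    # x2 >= 0
]


def intersection(c1, c2):
    """Point where two constraint boundaries cross, or None if parallel."""
    (a1, b1), r1 = c1
    (a2, b2), r2 = c2
    det = a1 * b2 - a2 * b1
    if det == 0:
        return None
    # Cramer's rule, kept exact with fractions.
    x = Fraction(r1 * b2 - r2 * b1, det)
    y = Fraction(a1 * r2 - a2 * r1, det)
    return x, y


def feasible(point):
    """True if the point satisfies every constraint."""
    return all(a * point[0] + b * point[1] >= rhs
               for (a, b), rhs in CONSTRAINTS)


def daily_cost(point):
    return COST[0] * point[0] + COST[1] * point[1]


def solve():
    """Cheapest feasible corner point as (hours of 1, hours of 2)."""
    corners = []
    for c1, c2 in combinations(CONSTRAINTS, 2):
        p = intersection(c1, c2)
        if p is not None and feasible(p):
            corners.append(p)
    return min(corners, key=daily_cost)


if __name__ == "__main__":
    best = solve()
    print("hours of process 1:", best[0])
    print("hours of process 2:", best[1])
    print("daily cost:", daily_cost(best))
```

Under this formulation the sketch lands on 3 hours of process 1 and 2 hours of process 2, for $14 a day: process 1 is the only source of C, so it must run 3 hours, and the cheaper process 2 fills the remaining demand for B.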

The Monet Company produces four types of picture frames, which we label 1, 2, 3, and 4. The four types of frames differ with respect to size, shape, and materials used. Each type requires a certain amount of skilled labor, metal, and glass, as shown in Table A below. This table also lists the unit selling price Monet charges for each type of frame. During the coming week, Monet can purchase up to 4000 hours of skilled labor, 6000 ounces of metal, and 10,000 ounces of glass. The unit costs are $8.00 per labor hour, $0.50 per ounce of metal, and $0.75 per ounce of glass. Also, market constraints are such that it is impossible to sell more than 1000 type 1 frames, 2000 type 2 frames, 500 type 3 frames, and 1000 type 4 frames, and Monet does not want to keep any frames in inventory at the end of the week. What should the company do to maximize its profit for this week?

The two are very similar problems. The Monet problem, however, contains more variables (costs of different materials), and is therefore more mathematically complex. But mathematical complexity also arises in rather unlikely places. Compare either of the above two problems with the one below:

A customer requires during the next 4 months, respectively, 50, 65, 100, and 70 units of a commodity, and no backlogging is allowed (that is, the customer’s requirements must be met on time). Production costs are $5, $8, $4, and $7 per unit during these months. The storage cost from one month to the next is $2 per unit (assessed on ending inventory). It is estimated that each unit on hand at the end of month 4 can be sold for $6. Determine how to minimize the net cost incurred in meeting the demands for the next 4 months.

This problem seems on the surface to be of more or less the same mathematical complexity as the two preceding problems, but students find this one more difficult. This mystified me for a while, until I had talked to quite a few students about why they found it so complex. Note this passage in the problem:

The storage cost from one month to the next is $2 per unit (assessed on ending inventory).

This seems to be merely one more cost variable. It turns out, however, that students find repeated calculations of the same type, such as we see in either of the preceding problems (total material costs, etc.), significantly simpler than one non-repeated calculation, such as the storage cost variable above. Students seem to interpret inventory as a time-related variable rather than a cost-related variable.
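The sticking point can be made concrete with a short Python sketch that prices a production plan for the four-month problem. The storage charge depends on inventory, and inventory depends on everything produced and demanded in earlier months, which is what ties the months together. One assumption is labeled in the code: units left at the end of month 4 are sold for $6 rather than charged storage. The function and plan names are illustrative.

```python
DEMAND = [50, 65, 100, 70]
UNIT_COST = [5, 8, 4, 7]
STORAGE = 2      # dollars per unit of ending inventory
SALVAGE = 6      # dollars per unit left after month 4


def net_cost(production):
    """Net cost of a 4-month plan; raises ValueError if demand is missed."""
    inventory = 0
    total = 0
    months = zip(production, DEMAND, UNIT_COST)
    for month, (made, demand, cost) in enumerate(months, start=1):
        inventory += made - demand        # the balance that links the months
        if inventory < 0:
            raise ValueError(f"backlog in month {month}")
        total += made * cost
        if month < len(DEMAND):
            total += STORAGE * inventory  # carried into the next month
        else:
            # Assumption: month-4 leftovers are sold, not stored.
            total -= SALVAGE * inventory
    return total


if __name__ == "__main__":
    produce_to_demand = [50, 65, 100, 70]
    build_ahead = [115, 0, 170, 0]        # make months 2 and 4 early
    print("produce to demand:", net_cost(produce_to_demand))
    print("build ahead:", net_cost(build_ahead))
```

Producing exactly to demand costs $1,660. Building months 2 and 4 ahead in the cheaper months 1 and 3 costs $1,525, even after paying $2 a unit to store them. Seeing that the storage term changes with the timing of production, not just its quantity, is the step students tend to miss.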

Adding more variables adds more calculations. The more variables and calculations, the more mathematically complex the problem is. Sometimes, I will make a mathematically complex problem a bit easier for students to digest (the academese for this is “reducing cognitive load”) by introducing familiarity wherever possible, such as the Pigskin problem:

The Pigskin Company produces footballs. Pigskin must decide how many footballs to produce each month. The company has decided to use a 6-month planning horizon. The forecasted demands for the next 6 months are 10,000, 15,000, 30,000, 35,000, 25,000, and 10,000. Pigskin must meet these demands on time, knowing that it currently has 5000 footballs in inventory and that it can use a given month’s production to help meet the demand for that month. (For simplicity, we assume that production occurs during the month, and demand is met at the end of the month.) During each month there is enough production capacity to produce up to 30,000 footballs, and there is enough storage capacity to store up to 10,000 footballs at the end of the month, after demand has been met. The forecasted production costs per football for the next 6 months are $12.50, $12.55, $12.70, $12.80, $12.85, and $12.95, respectively. The holding cost per football held in inventory at the end of any month is figured at 5% of the production cost for that month. (This cost includes the cost of storage and also the cost of money tied up in inventory.) The selling price for footballs is not considered relevant to the production decision because Pigskin will satisfy all customer demand exactly when it occurs—at whatever the selling price is. Therefore, Pigskin wants to determine the production schedule that minimizes the total production and holding costs. Determine this production schedule.

Mathematical complexity also arises from the interpretation of the results. In statistics, for example, students usually pick descriptive statistics up quickly. When you move from descriptive statistics to inferential statistics, however, you introduce a great deal of complexity. For whatever reason, students have a great deal of trouble wrapping their brains around uncertainty.

Consider the parking ticket simulation (I’ll repeat it below so you don’t have to scroll back up).

The university strictly enforces parking policies on campus. The first parking violation costs $40. The second costs $60, and all successive violations cost $75. Each hour a vehicle is parked on campus, there is a 17% chance of its being ticketed. A bus pass costs $53.47. Create a simulation that models the costs incurred over a semester in parking violations, and run 1000 iterations of the simulation. Assume 30 hours of (illegal) parking per week (15 hours of classes, and an additional 15 hours for other reasons). There are 16 weeks in the semester. Is it cheaper to park illegally, or buy a bus pass?

Students don’t have much trouble understanding the variables, setting up the problem, or “solving” it. But this is a simulation. It rests on uncertainty, or probability. You can’t set it up, run it, and get a black and white solution. You have to run multiple iterations (or repetitions) of the simulation, and because you get different results for every iteration, you have to do a statistical analysis of the results and interpret the statistics. This is a great big cognitive roadblock for students. And even when you think they’ve got it, even after they’ve been doing simulations in class for two weeks or more, a student will invariably raise his hand in class and ask, “Why are my results different from hers?”

The only thing to do is repeat that we’re dealing with probability — uncertainty — and although the specific results will differ from student to student and iteration to iteration, the statistics of those results (the means, standard deviations, confidence intervals, and so forth) should not significantly differ — and then show them. It takes time, but it will eventually sink in.
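One way to show this, sketched in Python: give two "students" different random seeds and compare both their first few raw results and their summary statistics. The seeds, the function names, and the normal-approximation confidence interval are choices made for this illustration. The model itself is the basic parking simulation, with independent hourly draws assumed.

```python
import random
import statistics


def semester_fines(rng, hours=30 * 16, p_ticket=0.17):
    """Fines for one simulated semester: $40, $60, then $75 per ticket."""
    tickets = sum(rng.random() < p_ticket for _ in range(hours))
    return sum((40, 60)[n] if n < 2 else 75 for n in range(tickets))


def summarize(seed, iterations=1000):
    """First three raw results, the mean, and an approximate 95% CI."""
    rng = random.Random(seed)
    costs = [semester_fines(rng) for _ in range(iterations)]
    mean = statistics.mean(costs)
    sd = statistics.stdev(costs)
    half_width = 1.96 * sd / iterations ** 0.5   # normal approximation
    return costs[:3], mean, (mean - half_width, mean + half_width)


# Two students, two seeds: different iterations, similar statistics.
for student, seed in (("his", 1), ("hers", 2)):
    first_runs, mean, ci = summarize(seed)
    print(student, first_runs, round(mean), tuple(round(x) for x in ci))
```

The individual iterations will not match between the two runs, but the means should sit close together and the confidence intervals should largely overlap, which is the point to put in front of the class.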

Eventually, you can work students up to doing comparatively complex simulations like this:

CCAmerica is a credit card company that does its best to gain customers and keep their business in a highly competitive industry. The first year a customer signs up for service typically results in a loss to the company because of various administrative expenses. However, after the first year, the profit from a customer is typically positive, and this profit tends to increase through the years. The company has estimated the mean profit from a typical customer to be as shown in column B.

For example, the company expects to lose $40 in the customer’s first year but to gain $87 in the fifth year— provided that the customer stays loyal that long.

For modeling purposes, we will assume that the actual profit from a customer in the customer’s nth year of service is normally distributed with mean shown in Column B and standard deviation equal to 10% of the mean.

At the end of each year, the customer leaves the company, never to return, with probability 0.15, the churn rate. Alternatively, the customer stays with probability 0.85, the retention rate.

The company wants to estimate the NPV of the net profit from any such customer who has just signed up for service at the beginning of year 1, at a discount rate of 15%, assuming that the cash flow occurs in the middle of the year.

The company wants to see how sensitive this NPV is to the retention rate. Do this by showing various retention rates: .75, .80, .85, .90, .95.

Or even much more complex problems which I won’t list here, because they take an average of 5-6 pages in a Word document to list all the variables, and so forth.

Interestingly, complexity pops up in some extremely unlikely places. Consider the problem I posted earlier today. Yes, I’ll be here when you get back.

This is an extremely simple problem, except that students really shriek when you give it to them. The first question is usually, “Is everything we need to know here?” or sometimes, “You forgot part of the problem, didn’t you?” When I say, “No, it’s all there,” I always get, “Where’s the data? How can we solve this without numbers?”

It’s the very simplicity of the problem that students find complex. There is only one variable here: Is the city within 1000 miles of another city or not? All students have to do is use a binary variable. One variable. Two or three calculations. That’s it. (The answer, by the way, is three hubs.) It doesn’t even really make any difference what values they use for that binary variable as long as they’re consistent. They could use 1 and 0, or 10 and 5, or whatever numerical values they like and they’ll get the same answer.
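The binary-variable idea can be sketched in a few lines of Python. Each city is either a hub or not, and the code tries the smallest sets of hubs first until every city is within reach of one. This is brute-force enumeration, swapped in for whatever solver a class would use, and it is only practical because there are so few cities. The adjacency lists are copied from the table in the Republic Airlines problem; Portland is named in the problem text but has no row in that table, so it is left out here.

```python
from itertools import combinations

# Cities within 1000 miles of each city, from the Republic Airlines table.
WITHIN_1000 = {
    "AT": {"AT", "CH", "HO", "NO", "NY", "PI"},
    "BO": {"BO", "NY", "PI"},
    "CH": {"AT", "CH", "NY", "NO", "PI"},
    "DE": {"DE", "SL"},
    "HO": {"AT", "HO", "NO"},
    "LA": {"LA", "SL", "SF"},
    "NO": {"AT", "CH", "HO", "NO"},
    "NY": {"AT", "BO", "CH", "NY", "PI"},
    "PI": {"AT", "BO", "CH", "NY", "PI"},
    "SL": {"DE", "LA", "SL", "SF", "SE"},
    "SF": {"LA", "SL", "SF", "SE"},
    "SE": {"SL", "SF", "SE"},
}


def covers_all(hubs, reach):
    """True if every city is within reach of at least one chosen hub."""
    covered = set()
    for hub in hubs:
        covered |= reach[hub]
    return covered >= set(reach)


def min_hubs(reach):
    """Smallest set of hub cities that covers every city in `reach`."""
    cities = sorted(reach)
    for size in range(1, len(cities) + 1):
        for hubs in combinations(cities, size):
            if covers_all(hubs, reach):
                return hubs
    return None


if __name__ == "__main__":
    hubs = min_hubs(WITHIN_1000)
    print(len(hubs), "hubs, for example:", hubs)
```

Choosing which cities go into `hubs` is the same decision as setting each city's binary variable to 1 or 0, and the coverage check is the only constraint. Several different sets of three cities work, so the sketch reports one example rather than the unique answer.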

The moral of this story is that years of teaching have taught me that complexity is far more complex than I ever realized. I still run into things that students find complex but I do not. When students have trouble with the work you give them, sure, a lot of the time it’s going to be that they don’t have the basic skills they need, or they haven’t learned what they should have last week in class, or they didn’t do the reading, or they haven’t been coming to class, but don’t always assume that’s the problem. Always ask students why they find the work difficult, because it may be something that has never occurred to you. And listen closely, since students often have trouble telling you exactly what the problem is.

You can learn a lot by listening to your students, and make yourself a much better teacher.

Get Those Juices Flowing!

This should get all you math geeks back in the game after the summer. This is an in-class problem; projects and exams were case-based, that is, all of the problems on any one project or exam related to the same case, and built upon one another. This isn’t a difficult problem once you figure out how to do it — but figuring out how to do it seems to trip up a lot of people. Enjoy!

Republic Airlines will launch service in two years, but first, they have to figure out their hub system. Each hub is used to connect flights between cities within 1000 miles of one another. Republic will fly to Atlanta, Boston, Chicago, Denver, Houston, Los Angeles, New Orleans, New York, Pittsburgh, Salt Lake City, San Francisco, Seattle, and Portland. Republic Airlines must know the minimum number of hubs it will need to cover all these cities (each city must be within 1000 miles of at least one hub). Below are listed the cities, and which other cities are within 1000 miles.

City                  Cities within 1000 miles
Atlanta (AT)          AT CH HO NO NY PI
Boston (BO)           BO NY PI
Chicago (CH)          AT CH NY NO PI
Denver (DE)           DE SL
Houston (HO)          AT HO NO
Los Angeles (LA)      LA SL SF
New Orleans (NO)      AT CH HO NO
New York (NY)         AT BO CH NY PI
Pittsburgh (PI)       AT BO CH NY PI
Salt Lake City (SL)   DE LA SL SF SE
San Francisco (SF)    LA SL SF SE
Seattle (SE)          SL SF SE

Bonus: What will be the minimum number of hubs if the mileage is 750? 1500?

Monday Free Thread

Comment or trackback, as long as you link to here.