The Design of Software (CLOSED)

A public forum for discussing the design of software, from the user interface to the code architecture. Now closed.

The "Design of Software" discussion group has been merged with the main Joel on Software discussion group.

The archives will remain online indefinitely.

stats/probability question

Say you have invited 50 guests to your event.

- 10 haven't replied yet; the probability of each attending is 50%.
- 30 have accepted; the probability of each attending is 90%.
- 10 have declined; the probability of each attending is 0%.

The minimum number attending is 0
The maximum number attending is 40
The average (mean) number attending is 32 (10 x .5 + 30 x .9 + 10 x 0)

How would you work out the number of guests attending for a given probability? E.g. if there is a 90% chance of at least x guests attending, what is x? I can only think to try this using a simple Monte Carlo approach, e.g. using random numbers to simulate 100 events and ignoring the lowest 10. Is there a cleaner or more rigorous approach that can be calculated quickly?
Andy Brice
Saturday, October 14, 2006
 
 
I think http://en.wikipedia.org/wiki/Confidence_interval might be what you're looking for. I'd give you more details, but it's been a while since I took that exam.
Matthias Winkelmann
Saturday, October 14, 2006
 
 
Matthias,

Thanks for that. It looks on the right track, but the formula requires the s.d. I don't know if standard deviation is relevant to my example.
Andy Brice
Saturday, October 14, 2006
 
 
Okay, that wasn't really the right thing to look at. I've thought about it a bit longer, and I _know_ I've done stuff like that, but I can't find the correct terms.

Well, I think this is helpful:

The binomial distribution says that the probability of getting m hits
out of n trials where the probability of a hit is p is given by:

P =  (n choose m)*p^m*(1-p)^(n-m)

More cryptic but relevant info at http://en.wikipedia.org/wiki/Binomial_distribution

If you solve for P=0.9 you get the m you're looking for. However, you have different probabilities for different groups. You may be able to average them, but I'm not sure.
Matthias Winkelmann
Saturday, October 14, 2006
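
As a rough illustration of the formula above, here is a minimal Python sketch for a single group that shares one probability. The function names `binomial_pmf` and `prob_at_least` are made up for this example. Note that "at least m guests" is a tail sum of the formula over m, m+1, ..., n rather than a single term, so the lookup below works on that sum.

```python
from math import comb

def binomial_pmf(n, m, p):
    """Probability of exactly m hits in n trials, each with hit probability p."""
    return comb(n, m) * p**m * (1 - p)**(n - m)

def prob_at_least(n, m, p):
    """Probability of m or more hits: the formula summed over m, m+1, ..., n."""
    return sum(binomial_pmf(n, i, p) for i in range(m, n + 1))

# Example: only the 30 guests who accepted, each attending with probability 0.9.
n, p = 30, 0.9

# Largest m for which "at least m attend" still has probability >= 0.9.
m_90 = max(m for m in range(n + 1) if prob_at_least(n, m, p) >= 0.9)
print(m_90, prob_at_least(n, m_90, p))
```

This only covers one group; mixing groups with different probabilities needs more than a single binomial.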
 
 
I am fairly sure you can't average the probabilities.
Andy Brice
Saturday, October 14, 2006
 
 
"I am fairly sure you can't average the probabilities."

Well, I wouldn't do it if I were working on x-ray software.

However, let's say you have two bags with equal numbers of marbles in them, black in one, white in the other. If you fill a third bag with all the marbles, it's 50/50, so averaging would have worked.

You have to use weighted averages of course.
Matthias Winkelmann
Saturday, October 14, 2006
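
One way to see what a weighted average does and does not preserve is to compare the mean and variance of the two models. The sketch below assumes every guest decides independently, and the variable names are illustrative only. Pooling the 40 possible attendees into one binomial with the weighted-average probability (0.8) reproduces the mean of 32, but its variance is 6.4 rather than the 5.2 you get by keeping the groups separate, so the tail probabilities will differ somewhat.

```python
# (number of guests, probability each attends); the 10 who declined contribute nothing.
groups = [(10, 0.5), (30, 0.9)]

n_total = sum(n for n, _ in groups)          # 40 possible attendees
mean = sum(n * p for n, p in groups)         # 10*0.5 + 30*0.9 = 32
p_avg = mean / n_total                       # weighted-average probability = 0.8

# Variance if everyone is pooled into one binomial with probability p_avg.
var_pooled = n_total * p_avg * (1 - p_avg)   # 40 * 0.8 * 0.2 = 6.4

# Variance when each group keeps its own probability (independent guests).
var_grouped = sum(n * p * (1 - p) for n, p in groups)  # 2.5 + 2.7 = 5.2

print(mean, p_avg, var_pooled, var_grouped)
```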
 
 
You have to use two binomial distributions (one for guests who accepted and one for guests who didn't respond) and then enforce constraints on the number of actual guests who replied and those who didn't respond.

Let N1 be the number of guests who accepted the invitation. Let p1 be the probability that someone who accepts will actually attend. Let X1 be the number of guests at the party who had accepted. The probability that X1 takes the value j is given by the binomial distribution:

P(X1=j) = (N1!/(j!(N1-j)!)) p1^j (1-p1)^(N1-j) =
(30!/(j!(30-j)!)) (0.9)^j (0.1)^(30-j)

Similarly, let N2 be the number of people who didn't respond, and let X2 be the number of guests at the party who hadn't responded. Let p2 be the probability that someone who didn't respond will actually attend. Using the binomial distribution again:

P(X2=k) = (N2!/(k!(N2-k)!)) p2^k (1-p2)^(N2-k) =
(10!/(k!(10-k)!)) (0.5)^k (0.5)^(10-k)

But you're interested in the total number of people who attend. If N people attend, you have to sum over all the possible combinations of j and k that add up to N that are consistent with N1 and N2.

Since X1 and X2 are independent, P(X1=j AND X2=k) = P(X1=j)P(X2=k), so you get:

P(N=40)=P(X1=30)P(X2=10)
P(N=39)=P(X1=30)P(X2=9)+P(X1=29)P(X2=10)
P(N=38)=P(X1=30)P(X2=8)+P(X1=29)P(X2=9)+P(X1=28)P(X2=10)
...
P(N=0)=P(X1=0)P(X2=0)

You actually asked for the inverse of this: given a probability, how many people will attend? I don't think you can give a simple formula for this. It's a discrete problem, so not all probabilities are represented. I think you actually have to build the table I outlined above and do a reverse lookup. It would be a pain to do by hand, but in a program it's just a couple of nested loops.
Charles E. Grant
Saturday, October 14, 2006
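
The table and reverse lookup described above can be sketched in a few lines of Python. This is one possible reading of the approach, not a definitive implementation: the function names (`binomial_pmf`, `total_pmf`, `guests_for_confidence`) are hypothetical, and it assumes the two groups attend independently of each other.

```python
from math import comb

def binomial_pmf(n, p):
    """List of P(X = j) for j = 0..n."""
    return [comb(n, j) * p**j * (1 - p)**(n - j) for j in range(n + 1)]

def total_pmf(n1, p1, n2, p2):
    """P(N = t) for t = 0..n1+n2, summing P(X1=j)P(X2=k) over all j + k = t."""
    pmf1, pmf2 = binomial_pmf(n1, p1), binomial_pmf(n2, p2)
    total = [0.0] * (n1 + n2 + 1)
    for j, pj in enumerate(pmf1):        # the "couple of nested loops"
        for k, pk in enumerate(pmf2):
            total[j + k] += pj * pk
    return total

def guests_for_confidence(pmf, confidence):
    """Reverse lookup: largest x such that P(N >= x) >= confidence."""
    tail = 0.0
    for x in range(len(pmf) - 1, -1, -1):  # walk down from the maximum
        tail += pmf[x]                     # tail is now P(N >= x)
        if tail >= confidence:
            return x
    return 0  # fallback if rounding keeps the tail just under the target

pmf = total_pmf(30, 0.9, 10, 0.5)
x = guests_for_confidence(pmf, 0.9)
print(x, sum(pmf[x:]))  # x and the actual probability of at least x guests
```

Because the problem is discrete, the printed probability will generally be somewhat above 0.9 rather than exactly equal to it.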
 
 
I just looked at your web site, so I see this isn't just curiosity on your part. I'm reasonably sure that the model I laid out for you is correct. The trouble is that its usefulness depends entirely on how accurate the numbers p1 and p2 are. If you got them from folks who do a lot of event planning then the model might be worthwhile. If they are just your wild-assed guess, then the whole model is just a very fancy wild-assed guess and there is probably not much point in being so fancy.
Charles E. Grant
Saturday, October 14, 2006
 
 
>If they are just your wild-assed guess, then the whole model is just a very fancy wild-assed guess and there is probably not much point in being so fancy.

You are probably right. I just thought it was an interesting little problem! I am probably just going to go with a min, mean and max number of attendees. That is nice and simple to understand and implement, and probably all that the accuracy of the data warrants. Also I have visions of me trying to explain probability to event planners - something I could do without.

Thanks for taking the time to reply anyway.
Andy Brice
Saturday, October 14, 2006
 
 
Thanks Andy for the interesting question, thanks Charles for the detailed answer. There should be more threads like this.

... and I still think you could use weighted averages of the probabilities. I have a perfect proof of that, but, alas, this text box is too small for it.
Matthias Winkelmann
Saturday, October 14, 2006
 
 
"The maximum number attending is 40"

If these are real people, then I wouldn't be so sure. People can mark the wrong box by mistake, people change their mind, etc. etc. etc.
Steve Hirsch
Sunday, October 15, 2006
 
 
And that's helpful to notice ...why?

Are you suggesting he include the notice: If you type in the wrong number of people, the number of seats in your table plan will be wrong?
j.a.i.i.
Sunday, October 15, 2006
 
 
I'm going to allow people to change the probabilities, according to how fickle their friends are. ;0)

I did a simple mean number attending calculation on a spreadsheet for my own wedding. I think it's a useful thing to have. But not to be taken too seriously.
Andy Brice
Monday, October 16, 2006
 
 
"I think it's a useful thing to have. But not to be taken too seriously"

That's perfect. However, when people get a number, especially one with a percent sign, and one not especially even, they tend to give it more authority than it deserves.
Steve Hirsch
Monday, October 16, 2006
 
 
I thought this was a fascinating discussion intellectually speaking. Many of my peers probably would too.

However, I'm not sure my girlfriend would be at all interested. The caterer would charge us for the number of meals prepared, regardless of how many guests actually show up, and the number she would be most interested in would be the total number of invitations sent minus the number of definite cancellations, plus a slight hedge for wedding crashers.

The only time I could see the "90% chance X number of guests show up" metric being successful is if the night before the wedding, there's a major snowstorm, most of the guests cancel and we decide to go for a more intimate seating arrangement.

So no, probably not worth putting into the Perfect Table Plan, unless there's a professional edition in the works?
TheDavid
Monday, October 16, 2006
 
 
>However, I'm not sure my girlfriend would be at all interested

I've given up on confidence intervals and gone for simple min/mean/max numbers and associated costs.

I think where it is useful is in keeping an eye on your budget and controlling costs per guest or guest numbers accordingly (e.g. only inviting some additional guests if it looks as if you are going to fall short).

It's tucked away in a dialog, so it's not going to add a great deal of complexity.

>unless there's a professional edition in the works?

If I told you, I'd have to kill you. ;0)
Andy Brice
Tuesday, October 17, 2006
 
 
This could actually be calculated exactly using the binomial distribution.

In particular, for the 30 who've accepted, the expected 27 is so close to the maximum of 30 that standard deviations don't quite capture what you want.


Oh, and in real life, estimate generously. Nothing worse than a party where the food runs out. Especially those little cocktail sausages on toothpicks and the crackers and clam dip!
frustrated
Tuesday, October 17, 2006
 
 
Trying to get the equations right is a pain, and easy to get wrong. The alternative is a simulation approach. Run through 100 or 1000 simulated weddings. For each guest, choose if he or she is coming, using a random number generator and the probability assigned to that guest. Repeat for each guest. Tally the results. Repeat for another simulated wedding. Pool the results. Simple programming. Simple logic. Snazzy name: "Monte Carlo" method.
Harvey Motulsky
Wednesday, November 08, 2006
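
A minimal Python sketch of the simulation described above, using the guest numbers from the original question. The function name `simulate_attendance`, the fixed seed, and the choice of 10,000 trials are assumptions made for this example; the "90% chance of at least x" figure is read off by sorting the results and skipping the lowest 10%.

```python
import random

def simulate_attendance(groups, trials, seed=0):
    """Return a sorted list of total attendance, one entry per simulated event.

    groups is a list of (number_of_guests, probability_each_attends) pairs.
    """
    rng = random.Random(seed)  # fixed seed so repeated runs give the same numbers
    results = []
    for _ in range(trials):
        attending = 0
        for count, p in groups:
            # Each guest comes if a uniform random number falls below p.
            attending += sum(1 for _guest in range(count) if rng.random() < p)
        results.append(attending)
    results.sort()
    return results

groups = [(10, 0.5), (30, 0.9), (10, 0.0)]  # no reply, accepted, declined
results = simulate_attendance(groups, trials=10_000)

# Ignore the lowest 10% of simulated events; the smallest remaining value
# is an estimate of x where at least x guests attend about 90% of the time.
x_90 = results[len(results) // 10]
mean = sum(results) / len(results)
print(min(results), mean, max(results), x_90)
```

The estimate is only as good as the number of trials; more trials give a steadier answer at the cost of run time.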
 
 

This topic is archived. No further replies will be accepted.
