Talk:Birthday paradox

From Citizendium
Revision as of 03:56, 19 October 2010 by imported>Sandy Harris (→‎different calculation: question)
Jump to navigation Jump to search
This article is developing and not approved.
Main Article
Discussion
Related Articles  [?]
Bibliography  [?]
External Links  [?]
Citable Version  [?]
 
To learn how to update the categories for this article, see here. To update categories, edit the metadata template.
 Definition The counterintuitive result that for any (random) group of 23 or more people it is more likely than not that two of them celebrate their birthday on the same day of the year. [d] [e]
Checklist and Archives
 Workgroup category Mathematics [Categories OK]
 Talk Archive none  English language variant British English

Table for data

Is there an easy way to put data, like in this article, into a table? --David W Gillette 14:55, 21 July 2007 (CDT)

I've just put your data into a table - have a look at how it's constructed. Anthony Argyriou 10:58, 22 July 2007 (CDT)
Thanks, it looks great.--David W Gillette 15:45, 22 July 2007 (CDT)

Is there a mathematician in the house?

I've just added an article birthday attack on a cryptographic application of the birthday paradox, with a link from here.

Part of the text there is: "In general, for a cryptographic primitive of size n bits, the attack cost is 2n/2." That's a well-known rule among crypto folk; it's in the standard references and is used in government standards, see the last paragraph of birthday attack. However, I've never seen a proof and do not know if it is a theorem or just a handy approximation. Can anyone clarify this? Sandy Harris 13:32, 1 November 2008 (UTC)

I don't know what a cryptographic primitive is, and the article does not explain it (hint!). But I guess it is something like a hash function. If you use n-bit hash values (each of them equally likely) and then there is a 50% chance of a hash collision if you compute the hash of about 1.18 · 2n/2 pieces of data.
With n bits there are N = 2n possibilities, so if you take k pieces of data, the probability of a hash collision is (the same formula as in the article):
Using Stirling's approximation for the factorial and assuming that k is large and N even larger, we find the approximation
Let me know if you're interested in the details of the computation. So the probability is 50% if
and solving this yields
For instance, if N = 365 (which is the case in the article) then you get k = 1.18 · N1/2 = 22.5, which is a pretty good approximation for the size of the group you need to get two people with the same birthday. -- Jitse Niesen 00:48, 2 November 2008 (UTC)
That's all I needed. Thanks. Sandy Harris 02:56, 2 November 2008 (UTC)

Misleading formulation

Just for the record: While the calculation is correct, it is badly explained -- persons have no probability. Needs rewrite. --Peter Schmitt 23:57, 8 July 2010 (UTC)

Please look now. Boris Tsirelson 06:35, 9 July 2010 (UTC)
But why the advanced theory of Jitse Niesen, 2008, :-) is outside the article? Boris Tsirelson 06:40, 9 July 2010 (UTC)

A different calculation

Another way to look at it is that with people, there are Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle n{n-1)/2} possible pairs. For example, with 23 people there are 23*22/2 = 253 pairs. If each pair has an independent 1/365 chance of a match, the overall chance is 253/365.

This appears to give somewhat different results than the calculation in the article. What is wrong? Sandy Harris 09:56, 19 October 2010 (UTC)