Jump to content
BrainDen.com - Brain Teasers
  • 0


Guest

Question

I will choose two real numbers using a function that will produce numbers from -infinity to +infinity. You have no idea what function I'm using. It doesn't have to normally distributed.

I will write those two numbers on two different pieces of paper. You will look at one of the numbers. You can then choose whether to stay with that number or switch. You goal is to come up with a method to end up with the higher number more often than 50%. As long as your method is 50.000000.....000001% (probably the best you can do), you have solved it. Enjoy!

Link to post
Share on other sites

16 answers to this question

Recommended Posts

  • 0

The question seems simple on the surface. Most people would say that the solution is simple; if number higher than zero, stick with it, if lower than zero, then choose the other one. The problem with this simplistic solution is the assumption that infinity is a distinct quantitative amount but this is not so. No matter what the number written on the paper, there is still an infinite number of choices higher and an infinite number lower so you cannot improve your odds.

Link to post
Share on other sites
  • 0

The question seems simple on the surface. Most people would say that the solution is simple; if number higher than zero, stick with it, if lower than zero, then choose the other one. The problem with this simplistic solution is the assumption that infinity is a distinct quantitative amount but this is not so. No matter what the number written on the paper, there is still an infinite number of choices higher and an infinite number lower so you cannot improve your odds.

Clearly, whatever the answer is, it will hit on philosophical questions and theoretical math. I talked long and hard and asked many questions of my math PHd friend. He is completely aware of the fact that there are infinite real numbers between any two real numbers, or above and below a single real number. He is still convinced you can do better than 50-50.

Link to post
Share on other sites
  • 0

The question seems simple on the surface. Most people would say that the solution is simple; if number higher than zero, stick with it, if lower than zero, then choose the other one. The problem with this simplistic solution is the assumption that infinity is a distinct quantitative amount but this is not so. No matter what the number written on the paper, there is still an infinite number of choices higher and an infinite number lower so you cannot improve your odds.

I actually might agree with the simplistic answer at first, because infinity=infinity, so if it is above 0, it might have a tiny difference in probability. I understand that infinity is not a "real" number, and that it is just a concept, but it might increase it by a little bit. I also understand that infinity - infinity = undefined. Plus, I can't find another solution. That doesn't mean that there isn't one though.
Link to post
Share on other sites
  • 0

I will choose two real numbers using a function that will produce numbers from -infinity to +infinity. You have no idea what function I'm using. It doesn't have to normally distributed.

I will write those two numbers on two different pieces of paper. You will look at one of the numbers. You can then choose whether to stay with that number or switch. You goal is to come up with a method to end up with the higher number more often than 50%. As long as your method is 50.000000.....000001% (probably the best you can do), you have solved it. Enjoy!

if we rephrase the question to make the function produce a number within a finite boundary, say [-b to b], then i we picked a negative number, n, the probability of the second number being greater than the one we picked is: (b+|n|)/(2b-1)

keeping in mind that n is negative, (b+|n|)/(2b-1) > (b-|n|)/(2b-1) (the right hand side is the probability of getting a number less than n)

we can rewrite the inequality as:

b/(2b-1) + |n|/(2b-1) > b/(2b-1) - |n|/(2b-1)

the first term in both sides is equal, so it has no effect on the inequality

as b gets bigger, the second term in both sides gets smaller and smaller, but the will still have a magnitude, and the left will always be +ve and the right -ve. therefore, the chances of the second number being positive is always bigger, the only difference is that the difference in magnitude will get smaller and smaller, but never 0 since infinity cannot be reached.

in mathematics, the term infinity is not defined, a person should use a variable and state that it approaches infinity lim(x -> infinity) this means that x will keep getting bigger and bigger which is possible, but will never reach a 'number' that is not defined, which impossible.

Link to post
Share on other sites
  • 0

I will choose two real numbers using a function that will produce numbers from -infinity to +infinity. You have no idea what function I'm using. It doesn't have to normally distributed.

I will write those two numbers on two different pieces of paper. You will look at one of the numbers. You can then choose whether to stay with that number or switch. You goal is to come up with a method to end up with the higher number more often than 50%. As long as your method is 50.000000.....000001% (probably the best you can do), you have solved it. Enjoy!

Is it possible for the 2 generated number to be the same?

Link to post
Share on other sites
  • 0

if we rephrase the question to make the function produce a number within a finite boundary, say [-b to b], then i we picked a negative number, n, the probability of the second number being greater than the one we picked is: (b+|n|)/(2b-1)

keeping in mind that n is negative, (b+|n|)/(2b-1) > (b-|n|)/(2b-1) (the right hand side is the probability of getting a number less than n)

we can rewrite the inequality as:

b/(2b-1) + |n|/(2b-1) > b/(2b-1) - |n|/(2b-1)

the first term in both sides is equal, so it has no effect on the inequality

as b gets bigger, the second term in both sides gets smaller and smaller, but the will still have a magnitude, and the left will always be +ve and the right -ve. therefore, the chances of the second number being positive is always bigger, the only difference is that the difference in magnitude will get smaller and smaller, but never 0 since infinity cannot be reached.

in mathematics, the term infinity is not defined, a person should use a variable and state that it approaches infinity lim(x -> infinity) this means that x will keep getting bigger and bigger which is possible, but will never reach a 'number' that is not defined, which impossible.

Create a function that will produce a single number, for simplicity, it can be normally distributed across all real numbers. Generate a number. Then look at one of my numbers. If the your number is lower, stick with the first number you looked at. If your number is higher, then switch.

Here is the reasoning why that makes you slightly better than 50-50. Let's assume that the two numbers I picked are 1,000 and 2,000. If you use this strategy, you will win 100% of the time you generate a number in between 1,000 and 2,000. While, if you generate a number higher or lower, you will be exactly 50-50. For example, if your number is 10,000, then you win if you look at 1,000 first, and if your number is 0, you win if you look at 2,000 first. So, you can simply argue, that given any two of my numbers, you are exactly 50-50, unless you pick a number in between my two numbers, in which case you are 100%. Therefore you must be better than 50-50.

I can't think of a way to refute the logic.

Link to post
Share on other sites
  • 0

Is it possible for the 2 generated number to be the same?

I would imagine it's possible. Though I would assume, given a reasonable function, that the chances of that happening is very close to 0% and wouldn't affect your odds very much.

Link to post
Share on other sites
  • 0

Create a function that will produce a single number, for simplicity, it can be normally distributed across all real numbers. Generate a number. Then look at one of my numbers. If the your number is lower, stick with the first number you looked at. If your number is higher, then switch.

Here is the reasoning why that makes you slightly better than 50-50. Let's assume that the two numbers I picked are 1,000 and 2,000. If you use this strategy, you will win 100% of the time you generate a number in between 1,000 and 2,000. While, if you generate a number higher or lower, you will be exactly 50-50. For example, if your number is 10,000, then you win if you look at 1,000 first, and if your number is 0, you win if you look at 2,000 first. So, you can simply argue, that given any two of my numbers, you are exactly 50-50, unless you pick a number in between my two numbers, in which case you are 100%. Therefore you must be better than 50-50.

I can't think of a way to refute the logic.

I think the fact that we can not produce a single percentage for the performance of this strategy is telling. It may be suggesting that the performance is approaching 50% from above, similar to how .9999999.... is approaching 1 from below.

It seems to be that this strategy is somewhat equivalent to saying that if we randomly generate 2 numbers between -∞ to ∞, then the two numbers, call them a and b, will divide the number line into (-∞, a), [a, b], and (b, ∞). We then generate a number C, and then our strategy is 100% if C is in [a, b], and 50% if C is not in [a, b]. However, the fact that we are dealing with the infinite line makes this problem less straight forward than it seems.

Let's simplify things for a moment and say that we sample C from A to B, where A < a, and B > b. Then the chance that C falls within [a, b] is equal to (b -a )/(B - A). However, if we let A -> -∞, and B -> ∞, it is easy to see that the chance that a randomly chosen number falling between [a,b] is approaching 0. It seems that we are then back to the 50% limit.

Link to post
Share on other sites
  • 0

Create a function that will produce a single number, for simplicity, it can be normally distributed across all real numbers. Generate a number. Then look at one of my numbers. If the your number is lower, stick with the first number you looked at. If your number is higher, then switch.

Here is the reasoning why that makes you slightly better than 50-50. Let's assume that the two numbers I picked are 1,000 and 2,000. If you use this strategy, you will win 100% of the time you generate a number in between 1,000 and 2,000. While, if you generate a number higher or lower, you will be exactly 50-50. For example, if your number is 10,000, then you win if you look at 1,000 first, and if your number is 0, you win if you look at 2,000 first. So, you can simply argue, that given any two of my numbers, you are exactly 50-50, unless you pick a number in between my two numbers, in which case you are 100%. Therefore you must be better than 50-50.

I can't think of a way to refute the logic.

I actually like the approach with zero better. It's the same idea, but it doesn't require you picking a random number every time, but rather fixing it at zero. Essentially, we don't know anything about the function you're using to pick 2 numbers, but if the function always yields 2 positive or 2 negative numbers then you have exactly 50/50 chance of winning. If the function ever gives one positive and one negative number, then your odds just shifted to something more favorable than 50/50.

Link to post
Share on other sites
  • 0

it seems to me it doesn't matter what the range is, it can even be a single value such as 0, and if you don't switch when above, and do switch when below, then you have a greater than 50% chance. in the case zero in particular, you have three possibilites: 1 positive, 1 negative, both negative, or both positive. if 1 positive 1 negative, then you have a 100% chance of winning if you have the strategy of switching whenever below zero. thus you have three relatively infinite possiblities one of which gives you a garenteed win, the other two give you 50/50 chance of winning.

thus i'd say you have a 2/3 chance of winning using this strategy. you can pick any value and the probability will be the same.

Link to post
Share on other sites
  • 0

I can refute your logic. Your whole premise is dependent on which paper you choose to look at. If you chose to look at the paper that read 1000, you would win. If you picked the 2000 paper, you would lose. Your odds are 50%.

Create a function that will produce a single number, for simplicity, it can be normally distributed across all real numbers. Generate a number. Then look at one of my numbers. If the your number is lower, stick with the first number you looked at. If your number is higher, then switch.

Here is the reasoning why that makes you slightly better than 50-50. Let's assume that the two numbers I picked are 1,000 and 2,000. If you use this strategy, you will win 100% of the time you generate a number in between 1,000 and 2,000. While, if you generate a number higher or lower, you will be exactly 50-50. For example, if your number is 10,000, then you win if you look at 1,000 first, and if your number is 0, you win if you look at 2,000 first. So, you can simply argue, that given any two of my numbers, you are exactly 50-50, unless you pick a number in between my two numbers, in which case you are 100%. Therefore you must be better than 50-50.

I can't think of a way to refute the logic.

Link to post
Share on other sites
  • 0

I actually like the approach with zero better. It's the same idea, but it doesn't require you picking a random number every time, but rather fixing it at zero. Essentially, we don't know anything about the function you're using to pick 2 numbers, but if the function always yields 2 positive or 2 negative numbers then you have exactly 50/50 chance of winning. If the function ever gives one positive and one negative number, then your odds just shifted to something more favorable than 50/50.

Some comment on this approach

Let's clarify things first, and let's p(x) be the distribution from which we are drawing the two initial numbers. p(x) can be ANY distribution, e.g., a gaussian distribution centered at -1 with standard deviation 1, a uniform distribution on [10, 20], or a binomial distribution with outcomes 100 and 101, etc.

The approach where we select 0 and then apply the methodology will work the best if our p(x) is symmetric around 0. For example, it works the best if our distribution function is gaussian centered at 0 with some standard deviation, or if p(x) is uniformly distributed on [-2, 2]. This is because then we are equally likely to select a positive number as to select a negative number. It is easy to see then that the methodology doesn't do much better than random guessing if our distribution function is asymmetric with respect to 0, e.g. p(x) is uniformly distributed on [1, 10], or p(x) is gaussian with center 10 and standard deviation 1.

To simplify things, lets say that if p(x) if symmetric to 0, then our methodology will do better than 50%, and if p(x) is sufficiently asymmetric, then the methodology is only as good as 50%. To calculate the performance of the methodology, we will need to consider the relative 'size' of the two classes. My claim is that the asymmetric class is infinitely larger than the symmetric class, so the expected performance of the methodology is approaching 50%.

Now, imagine that for every distribution function p(x) that is symmetric, we can easily shift the entire distribution left or right by a distance R and get a distribution that is asymmetric. That is, if p(x) is symmetric, then we can simply make up a real number R, and then p( x - R ) is asymmetric. There are an infinite number of such R, so there are an infinite number of asymmetric shifted distribution for every symmetric p(x).

For example, let's consider the symmetric uniform distribution on [-1, 1]. This p(x) has performance that is better than 50%. However, we can keep shifting the entire interval by 1 to the right, and get an asymmetric uniformly distributed interval each time. For instance, the intervals [0, 2], [1, 3], [2, 4], [3, 5], [4, 6], ... will each have performance = 50%. Repeat ad infinitum.

Note that this argument assumes that in the OP, the two numbers are generated independently from the same univariate p(x). If the two numbers are jointly generated via a bivariate distribution function, then we can simply extend the argument above to 2-dimension and it will work exactly the same.

Edited by bushindo
Link to post
Share on other sites
  • 0

@Bushindo

I agree with everything you said. The only thing I'm not sure about is why you are talking about distributions within limited intervals. The OP states that the function produces the numbers in the range (-inf,+inf). Of course, all the subintervals will fit that description, but I assumed that any real number can be picked even though the distribution is not given.

The solution of fixing at Zero doesn't work for ALL generating functions, but works for those that are capable of producing one positive and one negative number in the pair.

Link to post
Share on other sites
  • 0

@Bushindo

I agree with everything you said. The only thing I'm not sure about is why you are talking about distributions within limited intervals. The OP states that the function produces the numbers in the range (-inf,+inf). Of course, all the subintervals will fit that description, but I assumed that any real number can be picked even though the distribution is not given.

The solution of fixing at Zero doesn't work for ALL generating functions, but works for those that are capable of producing one positive and one negative number in the pair.

Please note that there is no requirement that p(x) be a distribution with limited support (finite domain). The examples of the uniform distribution and binomial distribution that I gave have finite domain, but the gaussian example has unbounded support (-infinity to infinity). The gaussian distribution can potentially generate any number from -infinity to infinity, though certain numbers tend to show up more often. This is just as well, because while the OP states that any number from -infinity to infinity may be generated by this mysterious density distribution, it doesn't say that all numbers are equally likely to show up. In fact, we would run into trouble if the OP requires that all numbers between -infinity and infinity be equally likely. Here is a discussion on why sampling uniformly from the infinite number line is not possible (text).

In post #9, I used uniform sampling on the infinite number line to show that this methodology will not improve beyond 50%. Further research into the topic has shown that uniform sampling on the infinite line is not possible, as it may lead to some logical contradiction. Please ignore post #9.

The fact that the OP is can successfully generate two numbers from the unknown distribution p(x) implies that this distribution is a well-defined probability distribution. And thus well-definedness is the only requirement we impose on p(x).

Edited by bushindo
Link to post
Share on other sites
  • 0

Please note that there is no requirement that p(x) be a distribution with limited support (finite domain). The examples of the uniform distribution and binomial distribution that I gave have finite domain, but the gaussian example has unbounded support (-infinity to infinity). The gaussian distribution can potentially generate any number from -infinity to infinity, though certain numbers tend to show up more often. This is just as well, because while the OP states that any number from -infinity to infinity may be generated by this mysterious density distribution, it doesn't say that all numbers are equally likely to show up. In fact, we would run into trouble if the OP requires that all numbers between -infinity and infinity be equally likely. Here is a discussion on why sampling uniformly from the infinite number line is not possible (text).

In post #9, I used uniform sampling on the infinite number line to show that this methodology will not improve beyond 50%. Further research into the topic has shown that uniform sampling on the infinite line is not possible, as it may lead to some logical contradiction. Please ignore post #9.

The fact that the OP is can successfully generate two numbers from the unknown distribution p(x) implies that this distribution is a well-defined probability distribution. And thus well-definedness is the only requirement we impose on p(x).

I'm definitely in over my head from a math theory perspective, I just found the answer to be interesting from a logic perspective. Clearly, the answer approaches 50%, but I also think it approaches from the high side, which is all the puzzles was looking for.You definitely wouldn't be looking to make any large bets at even money.

I was talking to another friend about this puzzle and he preferred to think about a uniform function that produces numbers from 0,1, where each number is as likely as any other. So two questions.

1. Let's say that you know the function and after I use the function to produce two numbers, you use the function to produce one number. You then look at the first of my two numbers and switch if it's smaller than your number and stay if it's bigger. What are your odds of ending up with the bigger of my two numbers?

2. Since you know that the function produces a smooth distribution from 0,1, you use the optimal strategy, which is to stay with the first number is it's larger than .5 and switch if it's smaller. If you use that strategy, what are your odds of ending up with the larger number?

Link to post
Share on other sites
  • 0

I'm definitely in over my head from a math theory perspective, I just found the answer to be interesting from a logic perspective. Clearly, the answer approaches 50%, but I also think it approaches from the high side, which is all the puzzles was looking for.You definitely wouldn't be looking to make any large bets at even money.

I was talking to another friend about this puzzle and he preferred to think about a uniform function that produces numbers from 0,1, where each number is as likely as any other. So two questions.

1. Let's say that you know the function and after I use the function to produce two numbers, you use the function to produce one number. You then look at the first of my two numbers and switch if it's smaller than your number and stay if it's bigger. What are your odds of ending up with the bigger of my two numbers?

2. Since you know that the function produces a smooth distribution from 0,1, you use the optimal strategy, which is to stay with the first number is it's larger than .5 and switch if it's smaller. If you use that strategy, what are your odds of ending up with the larger number?

Some comments

Clearly, the answer is approaching 50%. But from the mathematical perspective, this is indistinguishable from 50 itself. For instance, if you try to capture the performance in a finite number, say 50.000.....0001% as in the OP where I assume there is a fixed but unknown number of 0's represented by the dots, we can easily prove that the performance is smaller than that fixed number.

An analogy to this curious behavior is the set of rational numbers and irrational numbers. Let's say we look at the interval [0,1] and sample uniformly from it. What is our chance of sampling a rational number? Obviously it is larger than 0. However, because the set of irrational numbers is infinitely larger than the set of rational numbers, our probability of sampling a rational number is approaching 0 from the positive side, so we essentially treat that probability as 0.

Back to the instance where your friend preferred to think of the prior distribution as the uniform distribution on [0,1]. Notice that this changes the whole problem dramatically. Essentially, your friend is adding prior information to the problem, and this prior information can be used to achieve performance better than 50%. In the original problem, the only information we know is 1 of the 2 numbers, and that the generating function p(x) could be ANY function. The fact that I have no idea what this function p(x) is an integral part of the original problem.

In this modified problem, we know 1 of the 2 numbers, and the generating function p(x) = Uniform( 0, 1) itself. The latter is extra information because now, for instance, we immediately know if our number is above or below the median of p(x). Let's say that the two generated numbers are A and B, and we chose A. Given the prior information, we immediately known the probability that B < A, and probability B > A. We couldn't do that with the original problem. The extra performance then, shouldn't be a surprise.

So, back to your questions, lets assume p(x) = Uniform( 0, 1)

1) Let's say we sample A and B as our two numbers, and C as a third number. Let A be the number that is revealed. We arrange the numbers in increasing order, there are 6 possibilities that are equally likely:

A,B,C

A,C,B

B,A,C

B,C,A

C,A,B,

C,B,A

It's easy to see that we would win 2/3 of the time if we generate C uniformly from [0, 1].

2) Let's say we sample A and B as our two numbers, and C as a third number is equal to .5. Then, our strategy would win if min( A, B) < .5 < max( A, B). That is, our strategy would win 100% of the time if .5 is between A and B, and 50% otherwise. The chance that min( A, B) < .5 < max( A, B) is 1/2, so the expected performance is

1/2 * (100%) + 1/2 * (50%) = 75%.

Link to post
Share on other sites

Join the conversation

You can post now and register later. If you have an account, sign in now to post with your account.

Guest
Answer this question...

×   Pasted as rich text.   Paste as plain text instead

  Only 75 emoji are allowed.

×   Your link has been automatically embedded.   Display as a link instead

×   Your previous content has been restored.   Clear editor

×   You cannot paste images directly. Upload or insert images from URL.

Loading...
  • Recently Browsing   0 members

    No registered users viewing this page.

×
×
  • Create New...