
March 14, 2008

Comments

It's not too uncommon for people to describe themselves as uncertain about their beliefs. "I'm not sure what I think about that," they will say on some issue. I wonder if they really mean that they don't know what they think, or if they mean that they do know what they think, and their thinking is that they are uncertain where the truth lies on the issue in question. Are there cases where people can be genuinely uncertain about their own beliefs?

If a coin has certain gross physical features such that a rational agent who knows those features (but NOT any details about how the coin is thrown) is forced to assign a probability p to the coin landing on "heads", then it seems reasonable to me to speak of discovering an "objective chance" or "propensity" or whatever. These would be "emergent" in the non-buzzword sense. For example, if a coin has two heads, then I don't see how it's problematic to say the objective chance of heads is 1.

If a coin has certain gross physical features such that a rational agent who knows those features (but NOT any details about how the coin is thrown) is forced to assign a probability p to the coin landing on "heads", then it seems reasonable to me to speak of discovering an "objective chance" or "propensity" or whatever.

You're saying "objective chance" or "propensity" depends on the information available to the rational agent. My understanding is that the "objective" qualifier usually denotes a probability that is thought to exist independently of any agent's point of view. Likewise, my understanding of the term "propensity" is that it is thought to be some inherent quality of the object in question. Neither of these phrases usually refers to information one might have about an object.

You've divided a coin-toss experiment's properties into two categories: "gross" (we know these) and "fine" (we don't know these). You can't point to any property of a coin-toss experiment and say that it is inherently, objectively gross or fine -- the distinction is entirely about what humans typically know.

In short, I'm saying you agree with Eliezer, but you want to appropriate the vocabulary of people who don't.

(I'd agree that such probabilities can be "objective" in the sense that two different agents with the exact same state of information are rationally required to have the same probability assessment. Probability isn't a function of an individual -- it's a function of the available information.)

You're saying "objective chance" or "propensity" depends on the information available to the rational agent.

Apparently he is, but it can be rephrased. "What information is available to the rational agent" can be rephrased as "what is constrained". In the particular example, we constrain the shape of the coin but not ways of throwing it. We can replace "probability" with "proportion" or "fraction". Thus, instead of asking, "what is the probability of the coin coming up heads", we can ask, "what proportion of possible throws will cause the coin to come up heads." Of course, talking about proportion requires the assignment of a measure on the space of possibilities. This measure in turn can be derived from the geometry of the world in much the same way as distance, area, volume, and so on can be derived. That is to say, just as there is an objective (and not merely subjective) sense in which two rods can have the same length, so is there an objective (and not merely subjective) sense in which two sets of possibilities can have the same measure.
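To make the "proportion of possible throws" idea concrete, here is a minimal sketch (not from the original thread) of a deterministic toy coin: the outcome is fixed by the initial velocity and spin, and the "probability" of heads is just the proportion of a box of initial conditions, under a uniform measure, that maps to heads. The physical model and all parameter ranges are assumptions chosen purely for illustration.

```python
import math
import random

G = 9.8  # gravitational acceleration, m/s^2

def outcome(v, omega):
    """Deterministic toy coin: launched heads-up with vertical
    velocity v (m/s) and angular velocity omega (rad/s). Flight
    time is 2v/G; the face showing at landing depends only on
    whether the coin completes an even or odd number of half-turns."""
    flight_time = 2.0 * v / G
    half_turns = int(omega * flight_time / math.pi)
    return "heads" if half_turns % 2 == 0 else "tails"

# The "probability" of heads as a *proportion*: the fraction of a
# box of initial conditions, weighted by a uniform measure, that
# maps to heads (estimated here by Monte Carlo).
random.seed(0)
n = 1_000_000
heads = sum(
    outcome(random.uniform(2.0, 3.0), random.uniform(150.0, 250.0)) == "heads"
    for _ in range(n)
)
print(heads / n)  # comes out very close to 0.5
```

Because the parity of the half-turn count alternates many times across the box, the proportion comes out near one half for almost any smooth choice of measure, which is the sense in which the geometry is doing the work.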

[nitpick]
That is to say, just as there is an objective (and not merely subjective) sense in which two rods can have the same length

Well, there are the effects of relativity to keep in mind, but if we specify an inertial frame of reference in advance and the rods aren't accelerating, we should be able to avoid those. ;)
[/nitpick]

I'm joking, of course; I know what you meant.

No, these sentences mean quite different things, which is how I can conceive of the possibility that my beliefs are false.

No, on both counts. The sentences do not mean quite different things, and that is not how you conceive of the possibility that your beliefs are false.

One is a statement of belief, and one is a meta-statement of belief. Except for one level of self-reference, they have exactly the same meaning. Given the statement, anyone can generate the meta-statement if they assume you're consistent, and given the meta-statement, the statement necessarily follows.

Caledonian: The statement "X is true" can be properly reworded as "X corresponds with the world." The statement "I believe X" can be properly reworded as "X corresponds with my mental state." Both are descriptive statements, but one is asserting a correspondence between a statement and the world outside your brain, while the other is describing a correspondence between the statement and what is in your brain.

There will be a great degree of overlap between these two correspondence relations. Most of our beliefs, after all, are (probably) true. That being said, the meanings are definitely not the same. Just because it is not sensible for us to say that "X is true" unless we also believe X (because we rarely have reason to assert what we do not believe) does not mean that the concepts of belief and truth are the same thing.

It is meaningful (if unusual) to say: "I believe X, but X is not true." No listener would have difficulty understanding the meaning of that sentence, even if they found it an odd thing to assert. Any highly reductionist account of truth or belief will always have difficulty explaining the content that everyday users of English would draw from that statement. Likewise, no normal user of English would think that "I believed X, but it isn't true" would necessarily mean "X used to be true, but now it is false," which, on your account, seems like the only possible reading.

Constant,

I agree with you that systems which are not totally constrained will show a variety of outcomes and that the relative frequencies of the outcomes are a function of the physics of the system. I'm not sure I'd agree that the relative frequencies can be derived solely from the geometry of the system in the same way as distance, etc. The critical factor missing from your exposition is the measure on the relative frequencies of the initial conditions.

In the case of the coin toss, we can say that if we positively, absolutely know that the measure on the relative frequencies of the initial conditions is not too sharply peaked -- that is, it is smooth on the scale of the fine details that flip the outcome -- then, thanks to the geometry of the system, we can make some reasonable approximations which imply that the relative frequencies of the outcomes will be very close to equal.
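A small sketch of that claim, under assumed toy dynamics (a fixed flight time and a single angular-velocity variable, both made up for the example): any initial-condition density that is smooth on the scale of one half-turn yields nearly equal frequencies, while a density sharper than that scale does not.

```python
import math
import random

T = 0.5  # fixed flight time in seconds -- an assumption for illustration

def heads(omega):
    # Heads iff the coin completes an even number of half-turns.
    return int(omega * T / math.pi) % 2 == 0

# One half-turn corresponds to a change of pi/T (about 6.3 rad/s) in
# omega. Densities smooth on that scale all give ~0.5; a density much
# sharper than that scale does not.
random.seed(0)
samplers = {
    "uniform":    lambda: random.uniform(100.0, 300.0),
    "gaussian":   lambda: random.gauss(200.0, 30.0),
    "triangular": lambda: random.triangular(100.0, 300.0, 220.0),
    "sharp":      lambda: random.gauss(200.0, 0.5),
}
n = 500_000
for name, draw in samplers.items():
    frac = sum(heads(draw()) for _ in range(n)) / n
    print(f"{name:10s} {frac:.3f}")  # first three ~0.500; "sharp" is far from it
```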

The prediction of equal frequencies is critically founded on a *state of information*, not a state of the world. It's objective only in the sense that anyone with the same state of information must make the same prediction.

Relative frequency really is a different type of thing than probability, and it's unfortunate that people want to use the same name for these two different things just because they both happen to obey Kolmogorov's axioms.

I agree with you that systems which are not totally constrained will show a variety of outcomes and that the relative frequencies of the outcomes are a function of the physics of the system. I'm not sure I'd agree that the relative frequencies can be derived solely from the geometry of the system in the same way as distance, etc. The critical factor missing from your exposition is the measure on the relative frequencies of the initial conditions.

I haven't actually made a statement about frequencies of outcomes. So far I've only been talking about the physics and geometry of the system. The relevant aspect of the geometry is the proportions of possibilities, and talking about proportions, as I said, requires the assignment of a measure analogous to length or area or volume -- only the measure in question is a "volume" in a phase space (a space of possibilities) rather than in ordinary space.

I do eventually want to say something about frequencies and how they relate to proportions of possibilities, but I didn't do that yet.

The prediction of equal frequencies is critically founded on a *state of information*, not a state of the world. It's objective only in the sense that anyone with the same state of information must make the same prediction.

Yes, but you're talking about the prediction of equal frequencies. Prediction is something that someone does, and so naturally it involves the state of information possessed by him. But there's more going on than people making predictions. There's also the coin's behavior itself. The coin falls heads-up with a certain frequency regardless of whether anyone ever made any prediction about it. If you toss a coin a few million times and it comes up heads about half the time, one question you might ask is this: what, if anything, caused the coin to come up heads about half the time (as opposed to, say, 3/4 of the time)? This isn't a question about whether it would be rational to predict the frequency. It's a question about a cause. If you want to understand what caused something to happen, look at the geometry and physics of it.

Probability isn't a function of an individual -- it's a function of the available information.

It's also a function of the individual. For one thing, it depends on initial priors and the CPU time available for evaluating the relevant information. If we had enough CPU time, we could build a working AI using AIXItl.

Both are descriptive statements, but one is asserting a correspondence between a statement and the world outside your brain, while the other is describing a correspondence between the statement and what is in your brain.

Yes, but - and here's the important part - what's being described as "in my brain" is an asserted correspondence between a statement and the world. Given one, we can infer the other either necessarily or by making a minimal assumption of consistency.

Given one, we can infer the other either necessarily or by making a minimal assumption of consistency.

No. A belief can be wrong, right? I can believe in the existence of a unicorn even if the world does not actually contain unicorns. Belief does not, therefore, necessarily imply existence. Likewise, something can be true, but not believed by me (e.g., my wife is having an affair, but I do not believe that to be the case). Thus, belief does not necessarily follow from truth.

If all you are saying is that truth conditionally implies belief, and vice versa, I of course agree; I think most of our beliefs do correspond with true facts about the world. Many do not, however, and your theory has a difficult time accommodating that.

Also: what do you mean by a "minimal assumption of consistency?" It is hard for me to understand how this can be of use to you, if it means anything other than, "I assume that beliefs are never wrong." And you can't assume that, because that is what you were trying to show.

No. A belief can be wrong, right?

So can an assertion. Just because you assert "snow is white" does not mean that snow is white. It means you believe that to be the case.

Technically, asserting that you believe snow to be white does not mean you do - but it's a pretty safe bet.

Likewise, something can be true, but not believed by me (e.g., my wife is having an affair, but I do not believe that to be the case).

Yes, but you didn't assert those things. If you had asserted "my wife is having an affair", we would conclude that you believe that your wife is having an affair. If you asserted "I believe my wife is having an affair", we would conclude that you would assert that "my wife is having an affair" is true.

Constant,

I see that I misinterpreted your "proportion or fraction" terminology as referring to outcomes, whereas you were actually referring to a labeling of the phase space of the system. In order to figure out if we're really disagreeing about anything substantive, I have to ask this question -- in your view, what is the role of initial conditions in determining (a) the "objective probability" and (b) the observed frequencies?

Sebastian Hagen,

I'm a "logical omniscience" kind of Bayesian, so the distinction you're making falls into the "in theory, theory and and practice are the same, but in practice, they're not" category. This is sort of like using Turing machines as a model of computation even though no computer we actually use has infinite memory.

If we had enough CPU time, we could build a working AI using AIXItl.

*Threadjack*

People go around saying this, but it isn't true:

1) Both AIXI and AIXItl will at some point drop an anvil on their own heads just to see what happens (test some hypothesis which asserts it should be rewarding), because they are incapable of conceiving that any event whatsoever in the outside universe could change the computational structure of their own operations. AIXI is theoretically incapable of comprehending the concept of drugs, let alone suicide. Also, the math of AIXI assumes the environment is separably divisible - no matter what you lose, you get a chance to win it back later.

2) If we had enough CPU time to build AIXItl, we would have enough CPU time to build other programs of similar size, and there would be things in the universe that AIXItl couldn't model.

3) AIXItl (but not AIXI, I think) contains a magical part: namely a theorem-prover which shows that policies never promise more than they deliver.

People go around saying this, but it isn't true: ...

I stand corrected.

I did know about the first issue (from one of Eliezer's postings elsewhere, IIRC), but figured that this wasn't absolutely critical as long as one didn't insist on building a self-improving AI and was willing to use some kludgy workarounds. I hadn't noticed the second one, but it's obvious in retrospect (and sufficient for me to retract my statement).

in your view, what is the role of initial conditions in determining (a) the "objective probability" and (b) the observed frequencies?

In a deterministic universe (about which I presume you to be talking because you are talking about initial conditions), the initial conditions determine the precise outcome (in complete detail), just as the outcome, in its turn, determines the initial conditions (i.e., given the deterministic laws and given the precise outcome, the initial conditions must be such-and-such). The precise outcome logically determines the observed frequency because the observed frequency is simply a high-level description of the precise outcome. So the initial conditions determine the observed frequency.

But the initial conditions do not determine the objective probability any more than the precise outcome determines the objective probability.

Probability can be applied both to the initial conditions and to the precise outcome. Just as we can classify all different possible precise outcomes as either "heads up" or "tails up" (ignoring "on edge" etc.), so can we also classify all possible initial conditions as either "producing an outcome of heads up" (call this Class A) or "producing an outcome of tails up". And just as we can talk about the probability that a flipped coin will belong to the class "heads up", so we can also talk about the probability that the initial condition of the flip will belong to Class A.

"because they are incapable of conceiving that any event whatsoever in the outside universe could change the computational structure of their own operations."

Self-modifying systems are Turing-equivalent to non-self-modifying systems. Suppose you have a self-modifying TM, which can have transition functions A1, A2, ..., An. Take the first Turing machine and append an additional ceil(log2(n)) bits to the state Q. Then construct a new transition function by summing together the Ai: take A1 and append (0000... 0) to the Q, take A2 and append (0000... 1) to the Q, and so on and so forth (appending different things to the domain and codomain Q when that particular state leads to self-modification). This non-self-modifying machine replicates the behavior of the self-modifying machine exactly, step for step.
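Here is one way the construction might look in code -- a toy sketch with hypothetical two-table machines, not a general TM library. The "self-modifying" behavior (switching transition tables) is absorbed into a fixed machine whose state carries the index of the active table.

```python
# Each table maps (state, symbol) -> (new_state, write, move, new_table);
# choosing a new_table is the "self-modification". The fixed machine
# below absorbs the table index into its own state, i.e. it extends Q
# by ceil(log2(n)) bits.
TABLES = [
    {("q0", 0): ("q0", 1, +1, 1),    # write 1, move right, switch to table 1
     ("q0", 1): ("halt", 1, 0, 0)},
    {("q0", 0): ("q0", 0, +1, 0),    # write 0, move right, switch to table 0
     ("q0", 1): ("halt", 0, 0, 1)},
]

def run_fixed_machine(tape, max_steps=100):
    """Simulate with a *non*-self-modifying machine whose state is the
    pair (q, table_index)."""
    state, table, pos = "q0", 0, 0
    for _ in range(max_steps):
        if state == "halt":
            break
        symbol = tape.get(pos, 0)
        state, write, move, table = TABLES[table][(state, symbol)]
        tape[pos] = write
        pos += move
    return tape

print(run_fixed_machine({0: 0, 1: 0, 2: 1}))  # {0: 1, 1: 0, 2: 1}
```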

Constant,

If I understand you correctly, we've got two different types of things to which we're applying the label "probability":

(1) A distribution on the phase space (either frequency or epistemic) for initial conditions/precise outcomes. (We can evolve this distribution forward or backward in time according to the dynamics of the system.)
(2) An "objective probability" distribution determined only the properties of the phase space.

I'm just not seeing why we should care about anything but distributions of type (1). Sure, you can put a uniform measure over points in phase space and count the proportion that ends up in a specified subset. But the only justification I can see for using the uniform measure -- or any other measure -- is as an approximation to a distribution of type (1).

Here's a new toy model: the phase space is the set of real numbers in the range [0,1]. The initial state is called x_0. The time dynamics are x(t) = (x_0)^(t+1) (positive time only). The coarse outcome is round[x(t)] at some specified t. What is the "objective probability"? If it truly does depend only on the phase space, I've given you everything you need to answer that question.

(For macroscopic model systems like coin tosses, I go with a deterministic universe.)
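For what it's worth, the toy model is easy to simulate -- but only after one smuggles in a measure. A minimal sketch, assuming (arbitrarily) a uniform measure on x_0:

```python
import random

def coarse_outcome(x0, t):
    """Cyan's toy dynamics: x(t) = x0**(t+1); the coarse outcome is
    round(x(t)) at a specified time t."""
    return round(x0 ** (t + 1))

# The number you get depends entirely on the measure placed on x0.
# Here: a uniform measure, chosen arbitrarily for the illustration.
random.seed(0)
t, n = 9, 1_000_000
frac_ones = sum(coarse_outcome(random.random(), t) for _ in range(n)) / n
print(frac_ones)  # ~0.067, since round(x0**10) = 1 only for x0 >= 0.5**0.1
```

A different measure on x_0 gives a different number, which is exactly the gap the challenge is probing: the phase space and dynamics alone don't pick one out.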

Tom, your statement is true but completely irrelevant.

"Tom, your statement is true but completely irrelevant."

There's nothing in the AIXI math prohibiting it from understanding self-reference, or even taking drugs (so long as such drugs don't affect the ultimate output). To steal your analogy, AIXI may be automagically immune to anvils, but that doesn't stop it from understanding what an anvil is, or whacking itself on the head with said anvil (i.e., spending ten thousand cycles looping through garbage before returning to its original calculations).

Cyan - Here's how I see it. Your toy world in effect does not move. You've defined the law so that everything shifts left. But from the point of view of the objects themselves, there is no motion, because motion is relative (recall that in our own world, motion is relative; every moving object has its own rest frame). Considered from the inside, your world is equivalent to [0,1] where x(t) = x_0. Your world is furthermore mappable one-to-one in a wide variety of ways to intervals. You can map the left half to itself (i.e., [0,.5]) and map the right half to [.5,5000] without changing the rule of x(t) = x_0. In short, it has no intrinsic geometry.

Since it has no intrinsic geometry, there is no question of applying probability to it. Which is okay, because nothing happens in it. The probability of nothing hardly matters.
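To illustrate the stretching point numerically (the particular bijection and event here are invented for the example): the same set of points, described in two coordinate systems related by the remapping above, assigns very different "proportions" to the same event -- so proportion talk does presuppose a geometry.

```python
import random

# Constant's remapping: identity on [0, 0.5], linear stretch of
# (0.5, 1] onto (0.5, 5000]. A bijection, so the bare set of points
# is unchanged -- but "uniform" now means something different.
def g_inv(y):
    return y if y <= 0.5 else 0.5 + (y - 0.5) * 0.5 / 4999.5

def event(x):
    return x > 0.5  # an arbitrary event: "the right half" in x-coordinates

random.seed(0)
n = 1_000_000
# Proportion under the uniform measure in the original coordinates:
p_x = sum(event(random.random()) for _ in range(n)) / n
# Proportion under the uniform measure in the stretched coordinates:
p_y = sum(event(g_inv(random.uniform(0.0, 5000.0))) for _ in range(n)) / n
print(p_x, p_y)  # ~0.5 vs ~0.9999: "proportion" depends on the coordinates
```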

"The second 'bug' is even stranger. A heuristic arose which (as part of a daring but ill-advised experiment EURISKO was conducting) said that all machine-synthesized heuristics were terrible and should be eliminated. Luckily, EURISKO chose this very heuristic as one of the first to eliminate, and the problem solved itself."

I know it's not strictly comparable, but reading a couple of comments brought this to mind.

Constant,

You haven't yet given me a reason to care about "objective probability" in my inferences. Leaving that aside -- if I understand your view correctly, your claim is that in order for a system to have an "objective probability", a system must have an "intrinsic geometry". Gotcha. Not unreasonable.

What is "intrinsic geometry" when translated into math? (Is it just symmetry? I'd like to tease apart the concepts of symmetry and "objective probability", if possible. Can you give an example of a system equipped with an intrinsic geometry (and therefore an "objective probability") where symmetry doesn't play a role?)

Why does your reasoning not apply to the coin toss? What's the mathematical property of the motion of the coin that motion in my system does not possess?

I want to know the ingredients that will help me build a system that meets your standards. Until I can do that, I can't truly claim to understand your view, much less argue against it.

Why does your reasoning not apply to the coin toss? What's the mathematical property of the motion of the coin that motion in my system does not possess?

The coin toss is (or we could imagine it to be) a deterministic system whose outcomes are entirely dependent on its initial states. So if we want to talk about probability of an outcome, we need first of all to talk about the probability of an initial state. The initial states come from outside the system. They are not supplied from within the system of the coin toss. Tossing the coin does not produce its own initial state. The initial states are supplied by the environment in which the experiment is conducted, i.e., our world, combined with the way in which the coin toss is realized (i.e., two systems can be mathematically equivalent but might be realized differently, which can affect the probabilities of their initial states). When you presented your toy model, you did not say how it would be realized in our world. I took you to be describing a self-contained toy universe.

What is "intrinsic geometry" when translated into math? (Is it just symmetry?

You can't have symmetry without geometry in which to find the symmetry. By intrinsic geometry I mean geometry implied by the physical laws. I don't have any general definition of this; I simply have an example: our own universe has a geometry, and its geometry is implied by its laws. If you don't understand what I'm talking about, I can explain with a thought experiment. Suppose that you encounter Flatland with its Flatlanders. Some of them are triangles, etc. Suppose you grab this Flatland and you stretch it out, so that everything becomes extremely elongated in one direction. But suppose that the physical laws of Flatland accommodate this change so that nobody who lives on Flatland notices that anything has changed. You look at something that to you looks like an extremely elongated ellipse, but it thinks it is a perfect circle, because when it regards itself through the lens of its own laws of physics, what it sees is a perfect circle. I would say that Flatland has an "intrinsic geometry" and that, in Flatland's intrinsic geometry, the occupant is a perfect circle.

Your toy model, considered as a self-contained universe, does not seem to have an intrinsic geometry. However, I don't have any general idea of what it takes for a universe to have a geometry.

Can you give an example of a system equipped with an intrinsic geometry (and therefore an "objective probability") where symmetry doesn't play a role?

I'm not sure that I can, because I think that symmetry is pretty powerful stuff. A lot of things that don't on the surface seem to have anything to do with symmetry, can be expressed in terms of symmetries.

This will be my last comment on this. I'm breaking two of Robin's rules - too many comments and too long.

There's a long story at the end of The Mind's Eye (or is it The Mind's I?) in which someone asks a question:

"What colour is this book.?"

"I believe it's red."

"Wrong"

There follows a wonderfully convoluted dialogue. The point seems to be that someone who believes the book is red would say "It's red," rather than "I believe it's red."

I believe it's The Mind's I.
