## October 29, 2008

Eliezer: have you given any thought to the problem of choosing a measure on the solution space? If you're going to count bits of optimization, you need some way of choosing a measure. In the real world solutions are not discrete and we cannot simply count them.

My (not so "fake") hint:

Think economics of ecologies. Coherence in terms of the average mutual information of the paths of trophic I/O provides a measure of relative ecological effectiveness (absent prediction or agency). Map this onto the information I/O of a self-organizing hierarchical Bayesian causal model (with, for example, four major strata for human-level environmental complexity) and you should expect predictive capability within a particular domain, effective in principle, in relation to the coherence of the hierarchical model over its context.

As to comparative evaluation of the intelligence of such models without actually running them, I suspect this is similar to trying to compare the intelligence of phenotypical organisms by comparing the algorithmic complexity of their DNA.

Eliezer,

I'm afraid that I'm not sure precisely what your measure is, and I think this is because you have given zero precise examples, even of its subcomponents. For example, here are two optimization problems:

1) You have to output 10 million bits. The goal is to output them so that no two consecutive bits are different.

2) You have to output 10 million bits. The goal is to output them so that when interpreted as an MP3 file, they would make a nice sounding song.

Now, the solution space for (1) consists of two possibilities (all 1s, all 0s) out of 2^10000000 possible outputs, for a total of 9,999,999 bits. The solution space for (2) is millions of times wider, leading to fewer bits. However, intuitively, (2) is a much harder problem, and things that optimize (2) are actually doing more of the work of intelligence. After all, (1) can be achieved in a few lines of code and very little time or space, while (2) takes much more of these resources.
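The arithmetic for problem (1) can be checked with a short sketch. It assumes the measure is the negative base-2 logarithm of the fraction of outputs that meet the goal, with every bitstring counted equally; the function name `optimization_bits` and the figure of two million solutions used for comparison are illustrative assumptions, not anything stated in the post.

```python
import math

def optimization_bits(n_bits: int, n_solutions: int) -> float:
    """Bits of optimization under a uniform count over all n_bits-long strings.

    Assumption: bits = -log2(n_solutions / 2**n_bits), rewritten as
    n_bits - log2(n_solutions) so that 2**n_bits is never materialized.
    """
    if n_bits < 0 or n_solutions < 1:
        raise ValueError("need a non-negative length and at least one solution")
    return n_bits - math.log2(n_solutions)

# Problem (1): only all-0s and all-1s qualify among 10-million-bit strings.
print(optimization_bits(10_000_000, 2))  # 9999999.0

# Hypothetical comparison: a solution set a million times wider.
print(optimization_bits(10_000_000, 2_000_000))
```

On this reading, widening the solution set by a factor of a million removes only about 20 bits, since log2 of a million is just under 20.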

(2) is a pretty complex problem, but can you give some specifics for (1)? Is it exactly 9,999,999 bits? If so, is this the 'optimization power'? Is this a function of the size of the solution space and the size of the problem space only? If there were another program attempting to produce a sequence of 100 million bits coding some complex solution to a large travelling salesman problem, such that only two bitstrings suffice, would this have the same amount of optimization power? Or is it a function of the solution space itself and not just its size?

Without even a single simple example, it is impossible to narrow down your answer enough to properly critique it. So far I see it as no more precise than Legg and Hutter's definition.

Toby, if you were too dumb to see the closed-form solution to problem 1, it might take an intense effort to tweak the bit on each occasion, or perhaps you might have trouble turning the global criterion of total success or failure into a local bit-fixer; now imagine that you are also a mind that finds it very easy to sing MP3s...

The reason you think one problem is simple is that you perceive a solution in closed form; you can imagine a short program, much shorter than 10 million bits, that solves it, and the work of inventing this program was done in your mind without apparent effort. So this problem is very trivial on the meta-level because the program that solves it optimally appears very quickly in the ordering of possible programs and is moreover prominent in that ordering relative to our instinctive transformations of the problem specification.
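A minimal sketch of the kind of short program meant here, in Python. The names `solve_problem_1` and `satisfies_goal` are hypothetical, and the checker is just one reading of "no two consecutive bits are different".

```python
def solve_problem_1(n_bits: int = 10_000_000) -> str:
    """Closed-form answer: a constant string has no two differing neighbours."""
    return "0" * n_bits

def satisfies_goal(bits: str) -> bool:
    """True when every bit equals the bit that follows it."""
    return all(a == b for a, b in zip(bits, bits[1:]))

output = solve_problem_1()
print(len(output), satisfies_goal(output))
```

The solver is one line long and far shorter than the 10 million bits it emits, which is the sense in which the problem is trivial on the meta-level.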

But if you were trying random solutions and the solution tester was a black box, then problem 1 (the identical-bits problem) would indeed be harder - so you can't be measuring the raw difficulty of optimization if you say that one is easier than the other.
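To make the black-box case concrete, here is a small simulation sketch. It scales problem 1 down to 12 bits so that blind guessing finishes quickly; the tester, the function names, the seed, and the choice of 12 bits are all assumptions made for illustration.

```python
import random

def black_box_tester(bits: int, n_bits: int) -> bool:
    """Opaque pass/fail check: passes only for all-0s or all-1s."""
    return bits == 0 or bits == (1 << n_bits) - 1

def guesses_until_success(n_bits: int, rng: random.Random) -> int:
    """Count uniform random guesses until the tester passes."""
    guesses = 0
    while True:
        guesses += 1
        if black_box_tester(rng.getrandbits(n_bits), n_bits):
            return guesses

rng = random.Random(0)  # fixed seed so runs are repeatable
n_bits = 12
trials = [guesses_until_success(n_bits, rng) for _ in range(200)]
mean_guesses = sum(trials) / len(trials)

print("mean guesses over 200 runs:", mean_guesses)
print("expected guesses, 2**(n_bits - 1):", 2 ** (n_bits - 1))
```

With two passing strings out of 2^n, the expected number of blind guesses is 2^(n-1), so each added bit doubles the work. At 10 million bits that is 2^9,999,999 guesses, which is what the 9,999,999-bit figure amounts to for a searcher with no closed form.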

This is why I say that the human notion of "impressiveness" is best constructed out of a more primitive notion of "optimization".

We also do, legitimately, find it more natural to talk about "optimized" performance on multiple problems than on a single problem - if we're talking about just a single problem, then it may not compress the message much to say "This is the goal" rather than just "This is the output."

I take it then that you agree that (1) is a problem of 9,999,999 bits and that the travelling salesman version is as well. Could you take these things and generate an example which doesn't just give 'optimization power', but 'intelligence' or maybe just 'intelligence-without-adjusting-for-resources-spent'? You say over a set of problem domains, but presumably not over all of them, given the no-free-lunch theorems. Any example, or is this vague?

I am not sure you are taking into account the possibility that an intelligence may yield optimal performance within a specific resource range. Would a human mind given a 10x increase in memory (and memories) operate even marginally better? Or would it be overwhelmed by an amount of information it was not prepared for? Similarly, would a human mind even be able to operate given half the computational resources? Comparing Mind A at 40 bits per 1 trillion FPO with Mind B at 80 bits per 2 trillion FPO may be a matter of how many resources are available, since we don't have any data points about how much each would yield given the other's resources.

So perhaps the trendy term of scalability might be one dimension of the intelligence metric you seek. Can a mind take advantage of additional resources if they are made available? I suspect that an intelligence A that can scale up and down (to a specific minimum) linearly may be thought of as superior to an intelligence B that may yield a higher optimization output for a specific amount of resources but is unable to scale up or down.

My parents just arrived for a week-long visit, so I've been distracted - I have meant no disrespect to the reasonable question posed. Will respond ASAP.

The concept of a resource can be defined within ordinary decision theory: something is a resource iff it can be used towards multiple goals and spending it on one goal makes the resource unavailable for spending on a different goal. In other words, it is a resource iff spending it has a nontrivial opportunity cost. Immediately we have two implications: (1) whether or not something is a resource to you depends on your ultimate goal, and (2) dividing by resources spent is useful only for intermediate goals: it never makes sense to care how efficiently an agent uses its resources to achieve its ultimate goal or to satisfy its entire system of terminal values.

If humanity were forced to choose a simple optimization process to submit itself to, I think capitalism would be our best bet.
