« Funding Bias | Main | Interpersonal Morality »

July 28, 2008


There is a good tradition of expecting intellectuals to summarize their positions. Even if they write long books elaborating their positions, intellectuals are still expected to write sentences that summarize their core claims. Those sentences may refer to new concepts they have elaborated elsewhere, but still the summary is important. I think you'd do well to try to write such summaries of your key positions, including this one.

You say morality is "the huge blob of a computation ... not just our present terminal values ... [but] includes the specification of those moral arguments, those justifications, that would sway us if we heard them." If you mean what would sway us to matching acts, then you mean morality is what we would want if we had thought everything through. But if you instead mean what would sway us only to assent that an act is "moral", even if we are not swayed to act that way, then there remains the question of what exactly it is that we are assenting.

I agree that it needs a summary. But I think it wiser to write first and summarize afterward - otherwise I am never quite sure what there is to summarize.

There needs to be a separate word for that subset of our values that is interpersonal, prosocial, to some extent expected to be agreed-upon, which subset does not always win out in the weighing; this subset is often also called "morality" but that would be confusing.

There needs to be a separate word for that subset of our values that is interpersonal, prosocial, to some extent expected to be agreed-upon, which subset does not always win out in the weighing; this subset is often also called "morality" but that would be confusing.

Are you maybe referring to manners/etiquette/propriety?

Eliezer: This actually kinda sounds (almost) like something I'd been thinking for a while, except that your version added one (well, many actually, but the one is one that's useful in getting it all to add back up to normality) "dang, I should have thought of that" insight.

But I'm not sure if these are equivalent. Is this more or less what you were saying: "When we're talking about 'shouldness', we mean something, or at least we think we mean something. It's not something we can fully explicitly articulate, but if we could somehow fully utterly completely understand the operation of the brain, scan it, and somehow extract and process all the relevant data associated with that feeling of 'shouldness', we'd actually get a definition/defining computation/something that we could then work with to do more detailed analysis of morality, and the reason that would actually be 'the computation we should care about' is that'd actually be, well... the very bit of us that's concerned with issues like that, more or less"?

If so, I'd say that what you have is useful, but not a full metamorality. I'd call it more a "metametamorality"; the metamorality would be what I'd know if I actually knew the specification of the computation, to some level of precision. To me, it seems like this answers a lot, but does leave an important black box that needs to be opened. Although I concede that opening this box will be tricky. Good luck with that. :)

Anyways, I'd consider the knowledge I'd have from actually knowing a bit more about the specification of that computation a metamorality, and the outputs, well, morality.

Incidentally, the key thing that I missed that you helped me see was this: "Hey, that implicit definition of 'shouldness' sitting there in your brain structure isn't just sitting there twiddling its thumbs. Where the heck do you think your moral feelings/suspicions/intuitions are coming from? it's what's computing them, however imprecisely. So you actually can trust, _at least as a starting point_ those moral intuitions as an approximation to what that implicit definition implies."

The most important part seems to be missing. You say that shouldness is about actions and consequences. I'm with you there. You say that it is a one-place function. I take that to mean that it encompasses a specific set of values, independent of who is asking. The part that still seems to be missing is: how are we to determine what this set of values is? What if we disagree? In your conclusion, you seem to be saying that the values we are aiming for are self-evident. Are they really?

It so happens that I agree with you about things like happiness, human life, and service to others. I even know that I could elaborate in writing on the ways and reasons these things are good. But - and this is the critical missing part that I can't find - I do not know how to make these values and reasons universal! I may define "right" in terms of these values, but if another does not, then how are they "wrong" in any sense beyond their disagreement with my values? For example, what could be the reason that another one's more ruthless pursuit of his own happiness is less right?

What claim could any person or group have to landing closer to the one-place function?

Eliezer, it's a pleasure to see you arrive at this point. With an effective understanding of the subjective/objective aspects supporting a realistic metaethics, I look forward to your continued progress and contributions on the dynamics of increasingly effective evolutionary (in the broadest sense) development for meaningful growth: promoting a model of (subjective) fine-grained, hierarchical values with increasing coherence over increasing context of meaning-making, which implements principles of (objective) instrumental action increasingly effective over increasing scope of consequences. Wash, rinse, repeat...

There's no escape from the Red Queen's race, but despite the lack of objective milestones or markers of "right", there's real progress to be made in the direction of increasing rightness.

Society has been doing pretty well at the increasingly objective model of instrumental action commonly known as warranted scientific knowledge. Now if we could get similar focus on the challenges of values-elicitation, inductive biases, etc., leading to an increasingly effective (and coherent) model of agent values...

- Jef

I second Robin's request that you summarize your positions. It helps other folks organize and think about your ideas.

I'm quite convinced about how you analyze the problem of what morality is and how we should think about it, up until the point about how universally it applies. I'm just not sure that humans' different shards of god-shatter add up to the same thing across people, a point that I think would become apparent as soon as you started to specify what the huge computation actually WAS.

I would think of the output as not being a yes/no answer, but something akin to 'What percentage of human beings would agree that this was a good outcome, or be able to be thus convinced by some set of arguments?'. Some things, like saving a child's life, would receive very widespread agreement. Others, like a global Islamic caliphate or widespread promiscuous sex would have more disagreement, including potentially disagreement that cannot be resolved by presenting any conceivable argument to the parties.

The question of 'how much' each person views something as moral comes into play as well. If different people can't all be convinced of a particular outcome's morality, the question ends up seeming remarkably similar to the question in economics of how to aggregate many people's preferences for goods. Because you never observe preferences in total, you let everyone trade and express their desires through revealed preference to get a Pareto solution. Here, a solution might be to assign each person a certain amount of 'morality dollars', let them spend as they wish across outcomes, and add it all up. As in economics, there's still the question of how to allocate the initial wealth (in this case, how much to weigh the opinions of each person).
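To make the 'morality dollars' idea concrete, here is a minimal sketch (my own toy code, with made-up people, outcomes, and budgets; equal initial wealth is just one assumption, as noted above):

```python
def aggregate_moral_preferences(allocations, budget=100.0):
    """Sum 'morality dollar' spending per outcome across people.

    `allocations` maps person -> {outcome: dollars spent}; each
    person's spending is scaled down to their budget if they
    overspend.  Equal budgets encode the (contestable) choice to
    weigh every person's opinion equally.
    """
    totals = {}
    for person, spending in allocations.items():
        spent = sum(spending.values())
        scale = min(1.0, budget / spent) if spent > 0 else 0.0
        for outcome, amount in spending.items():
            totals[outcome] = totals.get(outcome, 0.0) + amount * scale
    return totals

votes = {
    "alice": {"save_child": 90.0, "caliphate": 10.0},
    "bob":   {"save_child": 50.0, "promiscuity": 50.0},
    "carol": {"caliphate": 200.0},   # over budget, scaled to 100
}
print(aggregate_moral_preferences(votes))
```

Widespread agreement (saving the child) then shows up as a large total, while contested outcomes split the budget; the unresolved question of initial wealth is just the `budget` parameter.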

I don't know how much I'm distorting what you meant - it almost feels like we've just replaced 'morality as preference' with 'morality as aggregate preference', and I don't think that's what you had in mind.

2+2=4 no matter who's measuring. Right, for myself and my family, and right, for you and yours, may not always be the same.

If the child on the tracks were a bully who had been torturing my own child (which actions I had previously been powerless to prevent by any acceptable means afforded by my society, and assuming I had exhausted all reasonable alternatives), it might very well feel right to let the bully be annihilated by the locomotive.

Right is reducible to an aggregation of sympathetic conditioning: affection for a person, attachment to a conceptualization or to an expected or desired course of events, and so on.

Wow, there's a lot of ground to cover. For everyone who hasn't read Eliezer's previous writings, he talks about something very similar in Creating Friendly Artificial Intelligence, all the way back in 2001 (link = http://www.singinst.org/upload/CFAI/design/structure/external.html). With reference to Andy Wood's comment:

"What claim could any person or group have to landing closer to the one-place function?"

Next obvious question: For purposes of Friendly AI, and for correcting mistaken intuitions, how do we approximate the rightness function? How do we determine whether A(x) or B(x) is a closer approximation to Right(x)?

Next obvious answer: The rightness function can be computed by computing humanity's Coherent Extrapolated Volition, written about by Eliezer in 2004 (http://www.singinst.org/upload/CEV.html). The closer a given algorithm comes to humanity's CEV, the closer it should come to Right(x).

Note: I did *not* think of CFAI when I read Eliezer's previous post, although I did think of CEV as a candidate for morality's content. CFAI refers to the supergoals of agents in general, while all the previous posts referred to a tangle of stuff surrounding classic philosophical ideas of morality, so I didn't connect the dots.

Bravo. But:

Because when I say "that big function", and you say "right", we are dereferencing two different pointers to the same unverbalizable abstract computation.

No, the other person is dereferencing a pointer to their big function, which may or may not be the same as yours. This is the one place it doesn't add up to normality: not everyone need have the same function. Eliezer-rightness is objective, a one-place function, but it seems to me the ordinary usage of "right" goes further: it's assumed that everybody means the same thing by, not just "Eliezer-right", but "right". I don't see how this metamorality allows for that, or how any sensible one could. (Not that it bothers me.)

I'm going to need some help with this one.

It seems to me that the argument goes like this, at first:

  • There is a huge blob of computation; it is a 1-place function; it is identical to right.
  • This computation balances various values.
  • Our minds approximate that computation.

Even this little bit creates a lot of questions. I've been following Eliezer's writings for the past little while, although I may well have missed some key point.

Why is this computation a 1-place function? Eliezer says at first "Here we are treating morality as a 1-place function." and then jumps to "Since what's right is a 1-place function..." without justifying that status.

What values does this computation balance? Why those values?

What reason do we have to believe that our minds approximate that computation?

Sorry if these are extremely basic questions that have been answered in other places, or even in this article - I'm trying and having a difficult time with understanding how Eliezer's argument goes past these issues. Any help would be appreciated.

"You will find yourself saying, "If I wanted to kill someone - even if I thought it was right to kill someone - that wouldn't make it right." Why? Because what is right is a huge computational property- an abstract computation - not tied to the state of anyone's brain, including your own brain."

Coherent Extrapolated Volition (or any roughly similar system) protects against this failure for any specific human, but not in general. Eg., suppose that you use various lawmaking processes to approximate Right(x), and then one person tries to decide independently that Right(Murder) > 0. You can detect the mismatch between the person's actions and Right(x) by checking against the approximation (the legal code) and finding that murder is wrong. In the limit of the approximation, you can detect even mismatches that people at the time wouldn't notice (eg., slavery). CEV also protects against specific kinds of group failures, eg., convince everybody that the Christian God exists and that the Bible is literally accurate, and CEV will correct for it by replacing the false belief of "God is real" with the true belief of "God is imaginary", and then extrapolating the consequences.

However, CEV can't protect against features of human cognitive architecture that are consistent under reflection, factual accuracy, etc. Suppose that, tomorrow, you used magical powers to rewrite large portions of everyone's brain. You would expect that people now take actions with lower values of Right(x) than they previously did. But, now, there's no way to determine the value of anything under Right(x) as we currently understand it. You can't use previous records (these have all been changed, by act of magic), and you can't use human intuition (as it too has been changed). So while the external Right(x) still exists somewhere out in thingspace, it's a moot point, as nobody can access it. This wouldn't work for, say, arithmetic, as people would rapidly discover that assuming 2 + 2 = 5 in engineering calculations makes bridges fall down.

This little accident of the Gift doesn't seem like a good reason to throw away the Gift
We've been "gifted" with impulses to replicate our genes, but many of us elect not to. I'm not as old as Steven Pinker was when he seemingly bragged of it, but I've made no progress toward reproducing and don't have any plans for it in the immediate future, though I could easily donate to a sperm bank. I could engage in all sorts of fitness-lowering activities like attending grad school, becoming a Jain monk, engaging in amputative body modification, or committing suicide. People created by evolution do that every day, just as they kill, rob and rape.

Now let's say we want to make a little machine, one that will save the lives of children
Does it take into account the length, quantity or quality of lives when making tradeoffs between saving lives? If it seeks to avoid death will it commit to the apocalyptic imperative as anti-natalism would seem to me to suggest? Does it seek to save fetuses or ensure that a minimum of sperm and eggs die childless? Some of these questions a machine will have to decide and there is no decision simply coming from the axiom you gave. That's because there is no correct answer, no fact of the matter.

And then this value zero, in turn, equating to a moral imperative to wear black, feel awful, write gloomy poetry, betray friends, and commit suicide.
None of the moral skeptics here embrace such a position. I and many others deny ANY MORAL IMPERATIVE WHATSOEVER. That I don't wear all black, write any poetry, push my reputation among my friends below "no more reliable than average", or intentionally harm myself is simply what I've elected to do so far and I reserve the option of pursuing all those "nihilist" behaviors in the future for any reason as unjustified as a coin flip.

it certainly isn't an inescapable logical justification for wearing black.
There's none necessary. If you wear black you won't violate anyone's rights. There are none to violate.

the ideal morality that we would have if we heard all the arguments, to whatever extent such an extrapolation is coherent.
You've questionably asserted the existence of "moral error". You also know that people have cognitive biases that cause them to go off in crazy divergent directions when exposed to more and more of the same arguments. I would hypothesize that the asymptotic direction the human brain would go in, on an unresolved positivist question in the absence of empirical evidence, is way off, if only because the brain isn't designed with singularities in mind. I wouldn't hold up as ideal behavior the output of a program given an asymptote of input. It's liable to crash. You might respond that the ideal computer the program would run on would have an infinite memory or disk, but that would be a different computer. Should I defer to another human being similar to me but with a seemingly infinite memory (you hear about these savants every once in a while)? I can't say. I do know that if the computer could genuinely prove that if I heard all the arguments I'd devote my life to cleaning outhouses, I'd say so much the worse for the version of me that's heard all the arguments. He's not in charge, I am.

Also, how many people can you name who engaged in seriously reprehensible actions and changed their ways because of a really good ethical argument? I know that the anti-slavery movement didn't count Aristotle among its converts, nor did Amnesty International convince Hitler or Stalin. We may like to imagine our beliefs are so superior that they would convince those old baddies, but I doubt we could. If Carlyle were brought to the present I bet he'd dismay at what a bleeding-heart he'd been and widen his circle of those fit for the whip.

you just mentioned intuitions, rather than using them
The intuition led to the belief. What is the distinction? "It is my intuition that removing the child from the tracks is repugnant" - "You just mentioned rather than used intuitions".

Tom McCabe:
humanity's Coherent Extrapolated Volition
I think Arrow's Impossibility Theorem would argue against that being meaningful when applied to "humanity".

I found this post a lot more enlightening than the posts that it's a followup to.

TGGP, as far as I understand, Arrow's theorem is an artifact of forcing people to send only ordinal information in voting (and enforcing IIA, which throws away the information on the strength of preferences between two alternatives that is available from rankings relative to third alternatives). People voting strategically isn't an issue either when you're extrapolating them and reading off their opinions.


I think lots of people are misunderstanding the "1-place function" bit. It even took me a bit to understand, and I'm familiar with the functional programming roots of the analogy. The idea is that the "1-place morality" is a closure over (i.e. reference to) the 2-place function with arguments "person, situation" that implicitly includes the "person" argument. The 1-place function that you use references yourself. So the "1-place function" is one's subjective morality, and not some objective version. I think that could have been a lot clearer in the post. Not everyone has studied Lisp, Scheme, or Haskell.
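For those who haven't seen the functional-programming idiom, the closure reading might be sketched like this (a toy illustration with made-up value weights, not anything from the post itself):

```python
def rightness_2place(person, situation):
    """Toy 2-place function: score `situation` using the values of
    `person` (here just a dict of weighted value judgments)."""
    return sum(weight for value, weight in person["values"].items()
               if value in situation)

def make_rightness_1place(person):
    """Close over `person`, yielding the 1-place function that
    person actually uses: their own morality, with the first
    argument fixed implicitly."""
    return lambda situation: rightness_2place(person, situation)

eliezer = {"values": {"life": 2.0, "happiness": 1.0}}
right = make_rightness_1place(eliezer)   # person baked in by currying
print(right({"life", "happiness"}))      # 3.0
```

The point of the analogy is that `right` takes only a situation, yet it is not person-independent: the person it references is frozen inside the closure.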

Overall I'm a bit disappointed. I thought I was going to learn something. Although you did resolve some confusion I had about the metacircular parts of the reasoning, my conclusions are all the same. Perhaps if I were programming an FAI the explicitness of the argument would be impressive.

As other commenters have brought up, your argument doesn't address how your moral function interacts with others' functions, or how we can go about creating a social, shared morality. Granted, it's a topic for another post (or several) but you could at least acknowledge the issue.

Too much rhetoric (wear black, miracle, etc.), you wandered off the point three too many times, you used incoherent examples, you never actually defended realism, you never defend the assertion of "the big computation", and for that much text there was so little actually said. A poor offering.

This argument sounds too good to be true - when you apply it to your own idea of "right". It also works for, say, a psychopath unable to feel empathy who gets a tremendous kick out of killing. How is there not a problem with that?

that subset of our values that is interpersonal, prosocial, to some extent expected to be agreed-upon, which subset does not always win out in the weighing

Can we just say that evolution gave most of us such an identifiable subset, and declare a name for that? Even so, a key question remains whether we are mistaken in expecting agreement - are we built to actually agree given enough analysis and discussion, or only to mistakenly expect to agree?

I agree with Andy Wood and Nick Tarleton. To put what they have said another way, you have taken the 2-place rightness function and replaced it with a certain unspecified unary rightness function, which I will call "Eliezer's_big_computation( -- )". You have told us informally that we can approximate

Eliezer's_big_computation( X ) = happiness( X ) + survival( X ) + justice( X ) + individuality( X ) + ...

But others may define other "big computations". For example

God's_big_computation( X ) = submission( X ) + Oppression_of_women( X ) + Conquest_of_heathens( X ) + Worship_of_god( X ) + ...

How are we to decide which "big computation" encompasses that which we should pursue?

You have simply replaced the problem of deciding which actions are right with the equivalent problem of deciding which action-guiding computation we should use.

Your CEV algorithm is likely to return something more like God's_big_computation( - ) than Eliezer's_big_computation( - ), because God's_big_computation more closely resembles the beliefs of the 6 billion people on this planet. And even if it did return Eliezer's_big_computation( - ), I'm not sure I agree with that outcome. In any case, I don't think you said anything new or particularly useful here; I think that we all need to think about this issue more.
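The regress can be made concrete (a toy sketch; the value terms are crude stand-ins for the informal sums above, and the weights are invented):

```python
# Toy stand-ins for the two competing "big computations".
def eliezers_big_computation(x):
    return x.get("happiness", 0) + x.get("survival", 0) + x.get("justice", 0)

def gods_big_computation(x):
    return x.get("submission", 0) + x.get("worship", 0)

outcome = {"happiness": 3, "justice": 2, "submission": 5, "worship": 4}

# Each function scores outcomes by its own lights; nothing inside
# the code adjudicates which scoring function we *should* use.
print(eliezers_big_computation(outcome))  # 5
print(gods_big_computation(outcome))      # 9
```

Running either computation just reports its own verdict, which is exactly the complaint: picking between them requires a criterion that neither function supplies.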

As a matter of fact myself and Richard Hollerith have independently thought of a canonical notion of goodness which is objective. He calls it "goal system zero", I call it "universal instrumental values".

Let me see if I get this straight:

Our morality is composed of a big computation that includes a list of the things that we value (love, friendship, happiness, ...) and a list of valid moral arguments (contagion backward in time, symmetry, ...).
If so, then how do we discover those lists? I guess that the only way is to reflect on our own minds, but if we do that, then how do we know if a particular value comes from our big computation, or is it just part of our regular biases?
And if our biases are inextricably tangled with The Big Computation, then what hope can we possibly have?

Anyway, I think it would be useful to moral progress to list all of the valid moral arguments. Contagion backward in time and symmetry seem to be good ones. Any other suggestions?

I second Behemouth and Nick - what do we do in the mindspace in which individuals' feelings of right and wrong disagree? What if some people think retarded children absolutely *should* NOT be pulled off the track? Also, what about the pastrami-sandwich dilemma? What of those who would kill 1 million unknown people, with no consequence to themselves, for a delicious sandwich?

But generally, I loved the post. You should write another post on 'Adding Up to Normality.'

Just because I can't resist, a poem about human failing, the judgment of others we deem weaker than ourselves, and the desire to 'do better.' Can we?

"No Second Troy"
WB Yeats, 1916
Why should I blame her that she filled my days
With misery, or that she would of late
Have taught to ignorant men most violent ways,
Or hurled the little streets upon the great,
Had they but courage equal to desire?
What could have made her peaceful with a mind
That nobleness made simple as a fire,
With beauty like a tightened bow, a kind
That is not natural in an age like this,
Being high and solitary and most stern?
Why, what could she have done, being what she is?
Was there another Troy for her to burn?

P.S : My great "Aha!" moment from reading this post is the realisation that morality is not just a utility function that maps states of the world to real numbers, but also a set of intuitions for changing that utility function.

Added a section on non-universalizability:

If you hoped that morality would be universalizable - sorry, that one I really can't give back. Well, unless we're just talking about humans. Between neurologically intact humans, there is indeed much cause to hope for overlap and coherence; and a great and reasonable doubt as to whether any present disagreement is really unresolvable, even if it seems to be about "values". The obvious reason for hope is the psychological unity of humankind, and the intuitions of symmetry, universalizability, and simplicity that we execute in the course of our moral arguments. (In retrospect, I should have done a post on Interpersonal Morality before this...)

If I tell you that three people have found a pie and are arguing about how to divide it up, the thought "Give one-third of the pie to each" is bound to occur to you - and if the three people are humans, it's bound to occur to them, too. If one of them is a psychopath and insists on getting the whole pie, though, there may be nothing for it but to say: "Sorry, fairness is not 'what everyone thinks is fair', fairness is everyone getting a third of the pie". You might be able to resolve the remaining disagreement by politics and game theory, short of violence - but that is not the same as coming to agreement on values. (Maybe you could persuade the psychopath that taking a pill to be more human, if one were available, would make them happier? Would you be justified in forcing them to swallow the pill? These get us into stranger waters that deserve a separate post.)

If I define rightness to include the space of arguments that move me, then when you and I argue about what is right, we are arguing our approximations to what we would come to believe if we knew all empirical facts and had a million years to think about it - and that might be a lot closer than the present and heated argument. Or it might not. This gets into the notion of 'construing an extrapolated volition' which would be, again, a separate post.

But if you were stepping outside the human and hoping for moral arguments that would persuade any possible mind, even a mind that just wanted to maximize the number of paperclips in the universe, then sorry - the space of possible mind designs is too large to permit universally compelling arguments. You are better off treating your intuition that your moral arguments ought to persuade others, as applying only to other humans who are more or less neurologically intact. Trying it on human psychopaths would be dangerous, yet perhaps possible. But a paperclip maximizer is just not the sort of mind that would be moved by a moral argument. (This will definitely be a separate post.)

Dear Eliezer,
First of all, great post, thank you, I truly love you Eli!! It was really the kind of beautiful endpoint to your dance that I was waiting for, and it is very much along the lines of my own reasoning, just a lot more detailed. I also think this could be labeled metametamorality, so some of the justified complaints do not yet apply. But the people complaining about different moral preferences are doing so with their own morality (what else could they be using?), and in doing so they are acting according to the arguments of this post. Metametamorality would be about the ontological reductionist framework in which metamorality and morality take place. Metamorality would be about simplifying and generalizing morality into certain values, principles, and rules that we ought to follow, to which groups of different sizes would try to approximate. Morality would be about applying the metamorality. But this terminology may also complicate things too much, and I might even prefer Eliezer's usage: metametamorality could also be called metamorality, and metamorality, metaethics. Anyway, I found this an amazing summary of your viewpoints, and it helped me a lot in grasping this course in Bayescraft.

I have been working for a long time on my own metamorality, or metaethics, which you may take a peek at in this diagram: http://docs.google.com/File?id=d4pc9b6_188cgj9zgwz_b.
It works from the same metametamoral assumptions that Eliezer's does, and I have done my best to use my inbuilt moral computation to construct a coherent metamoral guide for myself and others who may be like me.
For me the basic principle is, and has been for a couple of years now: there are no values outside humanity, so everything has 0 value. But being an agent with a certain set of feelings, morality, and goals, I may as well feel them and use them (some of them), because it is rather fantastic, after all, that there is anything at all. Although this amazement is a human feeling produced by my evolved psychology for curiosity, it seems rather nice. It is just beautiful that there is something rather than nothing (especially in a Tegmark MUH universe), so I assign +1 to everything: particles, energies, all events, everything that exists, every piece of information. It is good, it is beautiful because it exists, so hence I love everything!! But this is only the first level of morality; I cannot and do not want to be left only with that, because that would leave me sitting still and doing nothing. So I let my evolved psychology discriminate at the highest possible general level. I let more feelings slip in gently, building up a hierarchy of morality and values on many different levels. A universe with three particles is better than one with only one. Complexity and information are beautiful; diversity is, but simplicity is also. Beauty in general, although elusive as a concept, is a very important but complicated terminal value for me, consisting of a lot of parts, and I believe it is in some sense for everybody, although seldom explicitly. Truth is also one of the great terminal values, and I think David Deutsch expressed it nicely in a TED talk once. The Good is also of great importance, since it allows the expansion of beauty and of knowledge about truth.
So important for me is:
* To construct a metamorality that is very general
* That is not tied only to experience, although the values may originate from our experience and psychology and may very often be the same as pleasure in our experience. This is mostly because of elegance and a stability concern, also because it may affect our belief in it more, i.e. stronger placebo.
* For me, a universe with matter distributed randomly is uglier than a universe consisting only of a great book or an intelligent, complex construction, even though nobody might experience it.
* Of course, experience and intelligence are among the greatest and most beautiful things ever. So I value them extremely highly, higher than anything else, because having someone to experience everything else makes it so much more beautiful.

The hierarchical structure of my value system is not complete and never will be, but I will try to continue to approximate it, and I find it valuable and moral to do so, as it might help myself and others in deciding on values and choosing wisely. It might not be the value system of choice for most individuals, but some people might find it appealing.
Sorry for the sketchy nature of this comment; I just needed to get it out. I hope I could get some comment from Eliezer, but I may as well wait until I have enough strength to make this thing better and mail it then...

P.S. So my addition is really: choose a stable value structure that feels right, try to maximize it, and try to make it better and change it when you feel it is right. I have my own high-level suggestion of Beauty, Truth, and the Good, and I later discovered that Plato and a lot of others seem to argue for the same three...

There are some good thoughts here, but I don't think the story is a correct and complete account of metamorality (or as the rest of the world calls it: metaethics). I imagine that there will be more posts on Eliezer's theory later and more opportunities to voice concerns, but for now I just want to take issue with the account of 'shouldness' flowing back through the causal links.

'Shouldness' doesn't always flow backwards in the way Eliezer mentioned. For example, suppose that in order to push the button, you need to shoot someone who will fall down on it. This would make the whole thing impermissible. If we started by judging saving the child as something we should do, then the backwards chain prematurely terminates when we find that the only way to achieve it involves killing someone. Obviously, we would really want to consider not just the end state of the chain when working out whether we should save the child, but to evaluate the whole sequence in the first place. For if the end state is only possible given something that is impermissible, then it wasn't something we should bring about in the first place. Indeed, I think this flowing back from 'should' is a rather useless description. It is true that if we should (all things considered) do X, then we should do all the things necessary for X, but we can only know whether we should do X (all things considered) if we have already evaluated the other actions in the chain. It is a much more fruitful account to look forward, searching the available paths and then selecting the best one. This is how it is described by many philosophers, including a particularly precise treatment by Fred Feldman in his paper 'World Utilitarianism' and his book Doing the Best We Can.

(Note also that this does not assume consequentialism is true: deontologists can define the goodness of paths in a way that involves things other than the goodness of the consequences of the path.)

On reflection, there should be a separate name for the space of arguments that change our terminal values. Using "metaethics" to indicate beliefs about the nature (ontology) of morality would free up "metamorals" to indicate those arguments that change our terminal values. So I bow to Zubon and standard usage - even though it still sounds wrong to me.

Toby, the case of needing to shoot someone who will fall down on the button, is of course very easy for a consequentialist to handle; wrongness flows backward from the shooting, as rightness flows backward from the button, and the wrongness outweighs the rightness.

Nonetheless a great deal of human ethics can be understood in terms of deontological-type rules for handling such conflicts - at any particular event-node - and these in turn can often be understood in terms of simple rules that try to compensate for human biases. I.e., "The end doesn't justify the means" makes sense for humans because of a systematic human tendency to overestimate how likely the end is to be achieved, and to underestimate the negative consequences of dangerous means, especially if the means involves taking power for yourself and the end is the good of the tribe. I have always called such categorically phrased injunctions ethics, which would make "ethics" an entirely different subject from "metaethics". This Deserves A Separate Post.

There may also be moral values that only make sense when we value a 4D crystal, not a 3D slice - or to put it more precisely, moral values that only make sense when we value a thick 4D slice rather than a thin 4D slice; it's not as if you can have an instantaneous experience of happiness. "People being in control of their own lives" might only make sense in these terms, because of the connection between past and future. This too is an advanced topic.

It seems that despite all attempts at preparation, there are many other topics I should have posted on first.

As I've stated before, we are all morally obliged to prevent Eliezer from programming an AI. For according to this system, he is morally obliged to make his AI instantiate his personal morality. But it is quite impossible that the complicated calculation in Eliezer's brain should be exactly the same as the one in any of us: and so by our standards, Eliezer's morality is immoral. And this opinion is subjectively objective, i.e. his morality is immoral and would be even if all of us disagreed. So we are all morally obliged to prevent him from inflicting his immoral AI on us.

Suggested summary: "There is nothing else." That is the key sentence. After much discussion of morals and metas, it comes down to: "You go on with the same morals as before, and the same moral arguments as before." The insight offered is that there is no deeper insight to offer. The recursion will bottom out, so bite the bullet and move on.

Yet another agreement on the 1-Place and 2-Place problem, and I read it after the addition. CEV goes around most of that for neurologically intact humans, but the principle of "no universally compelling arguments" means that we still have right-Eliezer and right-Robin, even if those return the same values to 42 decimal places. If we shut up and multiply sufficiently large values, that 43rd decimal place is a lot of specks and torture.

(Lots of English usage sounds wrong. You know enough Japanese to know how wrong "I bow to Zubon" sounds. But maybe you can kick off some re-definition of terms. A century of precedent isn't much in philosophy.)

wrongness flows backward from the shooting, as rightness flows backward from the button, and the wrongness outweighs the rightness.

I suppose you could say this, but if I understand you correctly, then it goes against common usage. Usually those who study ethics would say that rightness is not the type of thing that can add with wrongness to get net wrongness (or net rightness for that matter). That is, if they were talking about that kind of thing, they wouldn't use the word 'rightness'. The same goes for 'should' or 'ought'. Terms used for this kind of stuff that can add together: [goodness / badness], [pro tanto reason for / pro tanto reason against].

If you merely meant that any wrong act on the chain trumps any right act further in the future, then I suppose these words would be (almost) normal usage, but in this case it doesn't deal with ethical examples very well. For instance, in the consequentialist case above, we need to know the degree of goodness and badness of the two events to know whether the child-saving event outweighs the person-shooting event. Wrongness trumping rightness is not a useful explanation of what is going on if a consequentialist agent were considering whether to shoot the person. If you want the kind of additivity of value that is relevant in such a case, then call it goodness, not rightness/shouldness. And if this is the type of thing you are talking about, then why not just look at each path, sum the goodness in it, and choose the path with the highest sum? Why say that we sum the goodness in a path in reverse chronological order? How does this help?
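Toby's forward-looking procedure - evaluate each available path as a whole and pick the one with the highest summed goodness - can be sketched in a few lines. (The paths and goodness numbers below are purely hypothetical illustrations, not anyone's actual moral theory.)

```python
# Each candidate path is a whole sequence of events, each tagged with a
# hypothetical goodness value. We score paths as wholes, rather than
# back-chaining shouldness from a single end state.
paths = [
    [("do nothing", 0.0)],
    [("shoot bystander", -100.0), ("button pressed", 1.0), ("child saved", 50.0)],
]

def path_goodness(path):
    """Sum the goodness of every event in the path."""
    return sum(goodness for _, goodness in path)

best = max(paths, key=path_goodness)
# Here the shooting's badness outweighs the rescue's goodness,
# so the "do nothing" path wins.
```

Note that the order of summation is irrelevant, which is the point of the objection: nothing is gained by summing a path's goodness in reverse chronological order rather than simply summing it.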

Regarding the terms 'ethics' and 'morality', philosophers use them to mean the same thing. Thus, 'metamorality' would mean the same thing as 'metaethics'; it is just that no-one else uses the former term (Overcoming Bias is the top page on Google for that term). There is nothing stopping you from using 'ethics' and 'morality' to mean different things, but it is not standard usage and would lead to a lot of confusion when trying to explain your views.

Eliezer: "if you were stepping outside the human and hoping for moral arguments that would persuade any possible mind, even a mind that just wanted to maximize the number of paperclips in the universe, then sorry - the space of possible mind designs is too large to permit universally compelling arguments."

- I disagree.

[THIS WOULD GET DELETED]Other than wasting time and effort, there's nothing wrong with reinventing the wheel. At least there's a useful result, even if it's a duplicated and redundant one.

But this is just reinventing the perpetual motion machine. Not only is it a massive falling-into-error, it's retracing the steps of countless others that have spent the last few centuries wandering in circles. Following in the footsteps of others is only laudable when those footsteps lead somewhere.

There's not a single operational definition here, just claims that refer to unspoken, unspecified assumptions and assertions of desired conclusions. Eliezer, you've missed your calling. You should have been a theologian. Making definitive proclamations about things you can't define and have a bunch of incoherent beliefs about is clearly your raison d'être.[/THIS WOULD GET DELETED]

Quick addition, in response to Roko's dissention:

Mathematicians routinely prove things about infinitely large sets. They do this by determining what properties the sets have, then seeing how those properties interact logically. The size of the space of all potential minds has nothing to do with whether we can construct universally compelling arguments about that space. It is in fact guaranteed that we can make universal arguments about it, because the space has a definition that determines what is and is not included within.

[THIS WOULD GET DELETED]The reason you are unable to make such arguments is that you're unwilling to do any of the rudimentary tasks necessary to do so. You've accomplished nothing but making up names for ill-defined ideas and then acting as though you'd made a breakthrough.

On the off-chance that you actually want to contribute something meaningful to the future of humanity, I suggest you take a good, hard look at your other motivations - and the gap between what you've actually accomplished and your espoused goals.[/THIS WOULD GET DELETED]

After thinking more about it, I might be wrong: actually the calculation might end up giving the same result for every human being.

Caledonian: what kind of motivations do you have?

Okay, for the future I'll just delete the content-free parts of Caledonian's posts, like those above. There do seem to be many readers who would prefer that he not be banned outright. But given the otherwise high quality of the comments on Overcoming Bias, I really don't think it's a good idea to let him go on throwing up on the blog.

Watching the ensuing commentary, I'm drawn to wishfully imagine a highly advanced Musashi, wielding his high-dimensional blade of rationality such that in one stroke he delineates and separates the surrounding confusion from the nascent clarity. Of course no such vorpal katana could exist, for if it did, it would serve only to better clear the way for its successors.

I see a preponderance of viewpoints representing, in effect, the belief that "this is all well and good, but how will this guide me to the one true prior, from which Archimedian point one might judge True Value?"

I see some who, given a method for reliably discarding much which is not true, say scornfully in effect "How can this help me? It says nothing whatsoever about Truth itself!"

And then there are the few who recognize we are each like leaves of a tree rooted in reality, and while we should never expect exact agreement between our differing subjective models, we can most certainly expect increasing agreement -- in principle -- as we move toward the root of increasing probability, pragmatically supporting, rather than unrealistically affirming, the ongoing growth of branches of increasing possibility. [Ignoring the progressive enfeeblement of the branches necessitating not just growth but eventual transformation.]

Eliezer, I greatly appreciate the considerable time and effort you must put into your essays. Here are some suggested topics that might help reinforce and extend this line of thought:

* Two communities, separated by a chasm
Would it be seen as better (perhaps obviously) to build a great bridge between them, or to consider the problem in terms of an abstract hierarchy of values, for example involving impediments to transfer of goods, people, ... ultimately information, for which building a bridge is only a special-case solution? In general, is any goal not merely a special case (and utterly dependent on its specifiability) of values-promotion?

* Fair division, etc.
Probably nearly all readers of Overcoming Bias are familiar with a principled approach to fair division of a cake into two pieces, and higher-order solutions have been shown to be possible with attendant computational demands. Similarly, Rawls proposed that we ought to be satisfied with social choice implemented by best-known methods behind a veil of ignorance as to specific outcomes in relation to specific beneficiaries. Given the inherent uncertainty of specific future states within any evolving system of sufficient complexity to be of moral interest, what does this imply about shifting moral attention away from expected consequences, and toward increasingly effective **principles** reasonably optimizing our expectation of improving, but unspecified and indeed unspecifiable, consequences? Bonus question: How might this apply to Parfit's Repugnant Conclusion and other well-recognized "paradoxes" of consequentialist utilitarianism?

* Constraints essential for **meaningful** growth
Widespread throughout the "transhumanist" community appears the belief that considerable, if not indefinite, progress can be attained via the "overcoming of constraints." Paradoxically, the accelerating growth of possibilities that we experience arises not from overcoming constraints, but from embracing them in ever-increasing technical detail. Meaningful growth is necessarily growth within an increasingly constrained possibility space -- fortunately there's plenty of fractal interaction area within any space of real numbers -- while unconstrained growth is akin to a cancer. An effective understanding of **meaningful** growth depends on an effective understanding of the subjective/objective dichotomy.

Thanks again for your substantial efforts.

Caledonian: uh... he didn't say you couldn't make arguments _about_ all possible minds, he was saying you couldn't construct an argument that's so persuasive, so convincing that every possible mind, no matter how unusual its nature, would automatically be convinced by that argument.

It's not a matter of talking about minds, it's a matter of talking _to_ minds.

Mathematicians figure out things about sets. But they're not trying to convince the sets themselves about those things. :)

You know, I think Caledonian is the only one who has the right idea about the nature of what's being written on this blog. I will miss him because I don't have the energy to battle this intellectual vomit every single day. And yet, somehow I am forced to continue looking. Eliezer, how does your metamorality explain the desire to keep watching a trainwreck?

Roko: You think you can convince a paperclip maximizer to value human life? Or do you think paperclip maximizers are impossible?

Eliezer: It's because when I say right, I am referring to a 1-place function

Like many others, I fall over at this point. I understand that Morality_8472 has a definite meaning, and therefore it's a matter of objective fact whether any act is right or wrong according to that morality. The problem is why we should choose it over Morality_11283.

Of course you can say, "according to Morality_8472, Morality_8472 is correct" but that's hardly helpful.

Ultimately, I think you've given us another type of anti-realist relativism.

Eliezer: But if you were stepping outside the human and hoping for moral arguments that would persuade any possible mind, even a mind that just wanted to maximize the number of paperclips in the universe, then sorry - the space of possible mind designs is too large to permit universally compelling arguments.

It's at least conceivable that there could be objective morality without universally compelling moral arguments. I personally think there could be an objective foundation for morality, but I wouldn't expect to persuade a paperclip maximizer.

Caledonian: He isn't using "too-big" in the way you are interpreting it.

The point is not: Mindspace has a size X, X > Y, and any set of minds of size > Y cannot admit universal arguments.

The point is: For any putative universal argument you can cook up, I can cook up a mind design that isn't convinced by it.

The reason that we say it is too big is because there are subsets of Mindspace that do admit universally compelling arguments, such as (we hope) neurologically intact humans.

Unknown: As I've stated before, we are all morally obliged to prevent Eliezer from programming an AI. For according to this system, he is morally obliged to make his AI instantiate his personal morality.

Unknown, do I really strike you as the sort of person who would do something that awful just because I was "morally obliged" to do it? Screw moral obligation. I can be nice in defiance of morality itself, if I have to be.

Of course this really amounts to saying that I disagree with your notion of what I am "morally obliged" to do. Exercise: Find a way of construing 'moral obligation' that does not automatically 'morally obligate' someone to take over the world. Hint: Use a morality more complicated than that involved in maximizing paperclips.

Allan: The problem is why we should choose it over Morality_11283.

You just used the word "should". If it doesn't mean Morality_8472, or some Morality_X, what does it mean? How do you expect to choose between successor moralities without initial morality?

I personally think there could be an objective foundation for morality, but I wouldn't expect to persuade a paperclip maximizer.

This just amounts to defining should as an abstract computation, and then excluding all minds that calculate a different rule-of-action as "choosing based on something other than morality". In what sense is the morality objective, besides the several senses I've already defined, if it doesn't persuade a paperclip maximizer?

Eliezer: You go on with the same morals as before, and the same moral arguments as before. There is no sudden Grand Overlord Procedure to which you can appeal to get a perfectly trustworthy answer.

'Same moral arguments as before' doesn't seem like an answer, in the same sense that 'you should continue as before' is not good advice for cavemen (who could benefit from being brought into modern civilization). If cavemen can vaguely describe what they want from their environment, this vague explanation can be used to produce an optimized environment by a sufficiently powerful optimization process external to the cavemen, based on the precise structure of the current environment. It won't go all the way there (otherwise, the problem of the ignorant jinn will kick in), but it can really help.

Likewise, the problem of 'metamorality' is in producing a specification of goals that is better than the vague explanations of moral philosophers. For that, we need to produce a vague explanation of what we think morality is, and set an optimization process on these explanations to produce a better description of morality, based on the current state of the environment (or, specifically, humanity and human cognitive architecture).

These posts sure clarify something for the confused, but what is the content in the sense I described? I hope the above quotation was not a curiosity stopper.

Thank you for this post. "should" being a label for results of the human planning algorithm in backward-chaining mode the same way that "could" is a label for results of the forward-chaining mode explains a lot. It's obvious in retrospect (and unfortunately, only in retrospect) to me that the human brain would do both kinds of search in parallel; in big search spaces, the computational advantages are too big not to do it.

I found two minor syntax errors in the post:
"Could make sense to ..." - did you mean "Could it make sense to ..."?
"(something that has a charge of should-ness" - that parenthesis is never closed.
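The could/should distinction in the comment above - forward-chaining from the current state versus backward-chaining from a goal - can be sketched on a toy state graph. (The graph and state names here are made up for illustration; this is a sketch of the two search modes, not of the post's actual model of human planning.)

```python
# A toy action graph: from each state, which states can follow it.
edges = {"start": ["a", "b"], "a": ["goal"], "b": [], "goal": []}

def forward_reachable(state):
    """'Could': forward-chain to every state reachable from here."""
    seen, frontier = set(), [state]
    while frontier:
        s = frontier.pop()
        if s not in seen:
            seen.add(s)
            frontier.extend(edges[s])
    return seen

def backward_chain(goal):
    """'Should': chain backward from the goal through its predecessors."""
    preds = {}
    for s, nexts in edges.items():
        for n in nexts:
            preds.setdefault(n, []).append(s)
    seen, frontier = set(), [goal]
    while frontier:
        s = frontier.pop()
        if s not in seen:
            seen.add(s)
            frontier.extend(preds.get(s, []))
    return seen

print(forward_reachable("start"))  # states we *could* reach
print(backward_chain("goal"))      # states through which should-ness flows back
```

Running both searches and intersecting their results yields the states that are both reachable and on a route to the goal, which is one way to see why doing the two searches in parallel pays off in large spaces.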

Unknown wrote:

As I've stated before, we are all morally obliged to prevent Eliezer from programming an AI.
Speak for yourself. I don't think EliezerYudkowsky::Right is quite the same function as SebastianHagen::Right, but I don't see a real chance of getting an AI that optimizes only for SebastianHagen::Right accepted as sysop. I'd rather settle for an acceptable compromise in what values our successor-civilization will be built on than see our civilization being stomped into dust by an entirely alien RPOP, or destroyed by another kind of existential catastrophe.

Suppose we were to write down all (input, output) pairs for the ideal "one-place function" described by Eliezer on an oblong stone tablet somewhere. This stone tablet would then contain perfect moral wisdom. It would tell us the right course of action in any possible situation.

This tablet would be the result of computation, but it's computation that nobody can actually do, as we currently only have access to approximations of the ideal Morality(X) function. Thus, as far as we're concerned, this tablet is just a giant look-up table. Its contents are a brute fact about the universe, like the ratio of the masses of the proton and the electron. If we are confronted with a moral dilemma, and our personal ideas of right and wrong contradict the tablet, this will always be a result of our own morality functions poorly approximating the ideal. In such a situation, we should override our instincts and go with the tablet every time.

In other words, according to Eliezer's model, in a universe where this tablet exists morality is given.

This is also true of a universe where the tablet does not exist (such as ours--it wouldn't fit!).

So Eliezer has just rediscovered "morality is the will of God", except he's replacing "God" with a giant block of stone somewhere in a hypothetical universe. It's not clear to me that this is an improvement.

It seems to me that the functional difference is that Eliezer believes he can successfully approximate the will of the Giant Hypothetical Oblong Stone Tablet out of his own head. If George Washington says "Slavery is sometimes just," Eliezer does not take this assertion seriously; he does not start trying to re-work his personal GHOST-approximator to take Washington's views into account. Rather he says, "I know that slavery is wrong, and I approximate the GHOST, so slavery is wrong," ignoring the fact that all men--including Washington--approximate the GHOST as best they can. Worse, by emphasizing the process of making, weighing and pondering moral "arguments", he privileges the verbally and quantitatively quick over the less intelligent, even though the correlation between being good with words and having a good GHOST-approximator is nowhere shown.

Everyone's GHOST-approximator is shaped by his environment. If the modern world encourages people to deny the GHOST in particular ways, and Eliezer indeed does so, then he would not be able to tell. His tool for measuring, his personal GHOST-finder, would have been twisted. His friends' and respected peers' GHOST-approximators might all be twisted in the same way, so nobody would point out his error and he would have no opportunity to correct it. He would use his great skill with words to try to convince everyone that his personal morality was correct. He and people like him might well succeed. His assertion of moral progress would then merely be the statement that the modern world reflects his personal biases--or perhaps that he reflects the biases of the modern world.

I'm concerned that the metamorality described by Eliezer will encourage self-named rationalists to worship their own egos, placing their personal imperfect GHOST-approximators--all shaped by the moral environment of the modern world--at the same level as those in past ages placed the will of God. Perhaps this is not Eliezer's intention. But to do otherwise, to look beyond the biases of the present day, one would have to acknowledge that the GHOST-readers of our ancestors may have in some ways have been better than ours. This would require humility; and pride cures humility.

Caledonian: [THIS WOULD GET DELETED]The reason you are unable to make such arguments is that you're unwilling to do any of the rudimentary tasks necessary to do so. You've accomplished nothing but making up names for ill-defined ideas and then acting as though you'd made a breakthrough.
On the off-chance that you actually want to contribute something meaningful to the future of humanity, I suggest you take a good, hard look at your other motivations - and the gap between what you've actually accomplished and your espoused goals.[/THIS WOULD GET DELETED]

This is NOT that bad a point! Don't delete that! If we're considering cognitive biases, then it makes sense to consider the biases of our beloved leader, who might be so clever as to convince all of us to walk directly off a cliff... Who is the pirate king at the helm of our ship? 'What are your motivations?' is a good question indeed - though not one I expect answered in one post or right away.

Also, I found reading this post very *satisfying*, but that might just be because it's brain candy confirming my justness in believing what I already believed... It's good to be skeptical, especially of things that say, 'You can *feel* it's right! And it's ok that there's no external validation...'
Tell that to the Nazis who thought Jews were *not* part of the human species...

I'm still wrestling with this here -

Do you claim that the CEV of a pygmy father would assert that his daughter's clitoris should not be sliced off? Or that the CEV of a petty thief would assert that he should not possess my iPod?
