Avoiding bad value drift is as important as solving value alignment

Written in Tractatus style: each sublist justifies the point immediately above it.

Note: this argument might, deep down, actually be a reductio of folk notions of human value.


  1. Avoiding bad value drift is as important as solving value alignment.

    1. Bad value drift is possible.

      1. Value drift is possible.

        1. Human values are a function of the contents or structure of human minds, and human minds can be altered in a way that changes human values.

      2. Value drift could occur in several plausible ways.

        1. Value drift could occur through persuasion, propaganda, or warfare that changes the composition of human society and its beliefs. Narrow AI will accelerate this.

        2. Value drift could occur through the use of neuromodulation technology, such as a lithiated water supply, nootropics, or brain interfaces. The economic advantages of using such technologies will drive their rapid adoption.

        3. Value drift could occur by genetic alteration to human minds via synthetic biology.

      3. Value drift can result in bad values.

        1. Only unintentional value drift can result in bad values. Intentional value drift cannot, inasmuch as intentional value changes are aligned with current human values.

        2. Unintentional value drift that results in bad values is possible.

    2. Bad value drift will plausibly occur before strong AI is built.

      1. Technologies that lead to strong AI may also lead to any of the items listed in 1.1.2 occurring first.

    3. Allowing bad value drift to occur before building strong AI is tantamount to failing at value alignment.

      1. If strong AI is built before value alignment is solved, then value alignment has failed by definition.

      2. If strong AI is built after value alignment is solved, but also after bad value drift has occurred, the resulting AIs won’t possess current human values. They will instead possess bad values, which amounts to failing at value alignment.


Definitions: