[A brainfuck implementation of a universal Turing machine from Yurii Rogozhin's article "Small universal Turing machines", in Theoretical Computer Science, 168(2):215-240, 20 November 1996. As such, it is a very direct proof that brainfuck is Turing-complete. For i/o formats and so on, read below; for fuller detail, dig up the article.

If you just want a quick and complete test case, the input b1b1bbb1c1c11111d should produce the output 1c11111. (The trailing d only serves as the input terminator.)

Daniel B. Cristofani (cristofdathevanetdotcom)
http://www.hevanet.com/cristofd/brainfuck/


This Turing machine achieves Turing-completeness not by simulating other Turing machines directly, but by simulating a Turing-complete class of tag-systems. A tag-system transforms strings over an alphabet A = {a[1], a[2], ... a[n], a[n+1]} as follows: a positive integer m is chosen, and so is a function P that maps each a[i] for 1<=i<=n to a string P(a[i]) over the alphabet A. Now:

1. If the string being transformed has fewer than m elements, the whole process stops now.
2. m elements are removed from the beginning of the string.
3. Call the first element removed a[k]; if k=n+1, the whole process stops now.
4. P(a[k]) is appended to the string.
5. The process repeats from step 1.

The particular class of tag-systems this Turing machine simulates is the class where m=2, the initial string has length at least 2, and all P(a[i]) where 1<=i<=n are of the form a[n]a[n]B[i] where B[i] is some string over the alphabet A (B[i] is the empty string if and only if i=n).
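
As an illustration only (this is not part of the program), the five steps above can be sketched in Python. The name run_tag_system, the use of plain integers k to stand for a[k], and the step limit are all assumptions of the sketch.

```python
def run_tag_system(rules, word, m=2, max_steps=10_000):
    """Run a tag-system; rules maps k to P(a[k]) as a list of indices.

    Symbols are plain integers k standing for a[k]. The halting symbol
    a[n+1] is any index that has no entry in rules.
    """
    word = list(word)
    for _ in range(max_steps):
        if len(word) < m:            # step 1: too short, stop
            return word
        first = word[0]
        word = word[m:]              # step 2: remove m elements
        if first not in rules:       # step 3: a[n+1] came first, stop
            return word
        word += rules[first]         # step 4: append P(a[k])
    raise RuntimeError("step limit reached; the tag-system may not halt")


# A two-symbol system: P(a[1]) = a[1]a[1], and a[2] halts.
print(run_tag_system({1: [1, 1]}, [1, 1, 2]))   # prints [1]
```

The step limit is there only because a tag-system need not halt.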

The input for this program is mildly complex, and there is no error checking.

-The representation of a symbol a[i] from the alphabet A is a string of 1s which is one element longer than twice the combined length of all P(a[j]) where 1<=j<i.

-a value like P(a[i]) = a[n]a[n]a[w]a[x]...a[y]a[z] is represented as follows:
b 1
b 111...(as many as required to represent a[z] as described above) b
b 111...(to represent a[y] as described above) b
.
.
.
b 111...(to represent a[x]) b
b 111...(to represent a[w]) b
b 111...(to represent a[n]) b
b 111...(as many as for a[n] as described above, MINUS the number of 1s that represent a[i]; and no final b)

-The function P is represented by listing all its outputs in the order P(a[n]), P(a[n-1]),...,P(a[2]),P(a[1]). The representation of P(a[n+1])=STOP is done for you.

-The initial string a[q]a[r]...a[s]a[t] to be transformed by the tag-system is represented as
111...(as many as required to represent a[q] as above) c
111...(to represent a[r]) c
.
.
.
111...(to represent a[s]) c
111...(to represent a[t]; note that there is no final c.)

-The input to this program is a function P as described above, then a single b, then the initial string to be transformed. Run all the 1s, bs, and cs together in one line with nothing between, followed by a linefeed, or a period if a linefeed is problematic for your implementation.

-The output format, if the program terminates, is the same as the input format for the initial string, and represents the final state of the transformed string, with a final a[n+1] appended due to a peculiarity of the Turing machine's algorithm.

-An example.
A tag-system over the alphabet A = {a[1], a[2], a[3], a[4]}, where m = 2 and:
P(a[1]) = a[3]a[3]a[2]a[1]a[4]
P(a[2]) = a[3]a[3]a[1]
P(a[3]) = a[3]a[3]
P(a[4]) = STOP

meets the criteria above; and when applied to the initial string a[2]a[1]a[1] it gives:
a[2]a[1]a[1]
        a[1]a[3]a[3]a[1]
                a[3]a[1]a[3]a[3]a[2]a[1]a[4]
                        a[3]a[3]a[2]a[1]a[4]a[3]a[3]
                                a[2]a[1]a[4]a[3]a[3]a[3]a[3]
                                        a[4]a[3]a[3]a[3]a[3]a[3]a[3]a[1]
                                                a[3]a[3]a[3]a[3]a[3]a[1]
and then it's done.

Now, the encoding:

a[1] is 1
a[2] is 11111111111
a[3] is 11111111111111111
a[4] is 111111111111111111111

P(a[1]) is b1 b111111111111111111111b b1b b11111111111b b11111111111111111b b1111111111111111
P(a[2]) is b1 b1b b11111111111111111b b111111
P(a[3]) is b1 b11111111111111111b b

the initial string is 11111111111c1c1

and so the whole input is

b1 b11111111111111111b b
b1 b1b b11111111111111111b b111111
b1 b111111111111111111111b b1b b11111111111b b11111111111111111b b1111111111111111
b
11111111111c1c1

which when run together for input to the program becomes

b1b11111111111111111bbb1b1bb11111111111111111bb111111b1b111111111111111111111bb1bb11111111111bb11111111111111111bb1111111111111111b11111111111c1c1

The output should be
11111111111111111 c
11111111111111111 c
11111111111111111 c
11111111111111111 c
11111111111111111 c
1 c
111111111111111111111

--that is, 11111111111111111c11111111111111111c11111111111111111c11111111111111111c11111111111111111c1c111111111111111111111
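
The encoding rules above are mechanical enough to automate. A rough Python sketch follows; it is not part of the original program. The name encode and the list-of-lists representation of P are assumptions of the sketch, and like the program itself it does no error checking.

```python
def encode(rules, word):
    """Build the program's input line (without terminator).

    rules is a list where rules[i-1] is P(a[i]) as a list of symbol
    indices, for i = 1..n. word is the initial string as a list of indices.
    """
    n = len(rules)
    # ones[k] = number of 1s representing a[k], for k = 1..n+1.
    ones = {}
    total = 0
    for k in range(1, n + 2):
        ones[k] = 1 + 2 * total
        if k <= n:
            total += len(rules[k - 1])

    def encode_rule(i):
        body = rules[i - 1][2:]              # drop the leading a[n]a[n]
        parts = ["b1"]
        for k in reversed(body):             # a[z] first, a[w] last
            parts.append("b" + "1" * ones[k] + "b")
        parts.append("b" + "1" * ones[n] + "b")
        parts.append("b" + "1" * (ones[n] - ones[i]))
        return "".join(parts)

    table = "".join(encode_rule(i) for i in range(n, 0, -1))   # P(a[n]) first
    start = "c".join("1" * ones[k] for k in word)
    return table + "b" + start


print(encode([[1, 1]], [1, 1, 2]))   # prints b1b1bbb1c1c11111
```

Called as encode([[3, 3, 2, 1, 4], [3, 3, 1], [3, 3]], [2, 1, 1]), it should reproduce the long input line of the example above.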


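For those interested, the state table of the Turing machine itself is given below. Each entry lists, in order, the current state, the symbol read, the symbol written, the direction to move (L or R), and the next state; "3< H" means the machine halts on reading "<" in state 3, and "4<" has no instruction.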
10<L1    201L2    30cR1    40cL2
11<L1    210R2    311R3    410R4
1b>R1    2b>L3    3b<R4    4bcL2
1<0R1    2<>L2    3< H     4<   
1>bL1    2><R2    3>bR3    4><R4
1c0R4    2cbR2    3c1R1    4cbR4
The initial state is 1 and the blank symbol is "0". Tape cells are set as per the input, but with the termination code P(a[n+1])=STOP represented as a "< b" at the left end; the head is initially at the first "1" in the code for the initial string. If and when the machine halts, the head is at the leftmost cell, a "<". The representations of the rules are intact, in a form isomorphic to their original form (each "b" replaced with ">" and each "1" unchanged). They are followed by a series of "1" cells, then a "c" (the leftmost one at that time), then the cells representing the final state of the transformed string, then a "c" and a sequence of "1" cells representing a[n+1] as mentioned.
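
For experimenting with the table without brainfuck, here is a small Python simulation of it. This is a sketch under assumptions: the names RULES and run_utm, the dictionary used as a tape, and the step limit are mine, and the output is read off the tape the same way the brainfuck program does it (everything after the leftmost "c", up to the first blank).

```python
# Entries of the state table: state, symbol read, symbol written,
# direction, next state. "3<" (halt) and "4<" are left out on purpose.
TABLE_TEXT = """
10<L1 201L2 30cR1 40cL2
11<L1 210R2 311R3 410R4
1b>R1 2b>L3 3b<R4 4bcL2
1<0R1 2<>L2
1>bL1 2><R2 3>bR3 4><R4
1c0R4 2cbR2 3c1R1 4cbR4
"""

RULES = {}
for token in TABLE_TEXT.split():
    state, read, write, move, nxt = token
    RULES[(int(state), read)] = (write, move, int(nxt))


def run_utm(program_input, max_steps=1_000_000):
    """Simulate the machine on an input line (terminator already removed)."""
    text = "<b" + program_input
    tape = dict(enumerate(text))
    head = text.rindex("b") + 1          # first "1" of the initial string
    state = 1
    for _ in range(max_steps):
        symbol = tape.get(head, "0")
        if (state, symbol) not in RULES:     # "3<" is the halt case
            break
        write, move, state = RULES[(state, symbol)]
        tape[head] = write
        head += 1 if move == "R" else -1
    else:
        raise RuntimeError("step limit reached")
    cells = "".join(tape.get(i, "0") for i in range(max(tape) + 1))
    after_c = cells[cells.index("c") + 1:]
    return after_c.split("0")[0]


print(run_utm("b1b1bbb1c1c11111"))
```

Going by the test case quoted at the top of this file, the last line should print 1c11111.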


The minimal test case b1b1bbb1c1c11111 represents the tag-system where P(a[1]) = a[1]a[1] and P(a[2]) = STOP, applied to the string a[1]a[1]a[2]. This runs for 518 steps of the Turing machine, exercising all 23 Turing machine instructions, before halting with the output string a[1].


Here is the brainfuck program that implements this Turing machine. The basic memory layout is as follows.
Each Turing machine cell is represented by a brainfuck cell, with the symbols "0 1 b < > c" represented by 0, 1, 2, 3, 4, 5 respectively. The brainfuck cells representing the Turing machine cells are laid out contiguously from the beginning of the tape, except that:
The head of the Turing machine is represented by a gap of three brainfuck cells, just to the left of the brainfuck cell that represents the current Turing machine cell. At the start of each cycle, the rightmost of these three cells holds the Turing machine state, where states 1-4 are represented by 1-4 and "halt" (here treated as a separate state) is represented by 0. The other two cells hold zeroes.
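
To make the layout concrete, here is a small Python sketch that turns a Turing machine configuration into the corresponding brainfuck cell values. It is only a model; CODES and to_cells are hypothetical names.

```python
CODES = {"0": 0, "1": 1, "b": 2, "<": 3, ">": 4, "c": 5}


def to_cells(tape, head, state):
    """Lay out a Turing machine configuration as brainfuck cell values.

    tape is a string of symbols, head is the index of the current tape
    cell, and state is 1-4 (or 0 for halt).
    """
    values = [CODES[ch] for ch in tape]
    # The three-cell gap (0, 0, state) sits just left of the current cell.
    return values[:head] + [0, 0, state] + values[head:]


print(to_cells("<b", 2, 1))   # prints [3, 2, 0, 0, 1]
```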

Now to walk through the code:

+++>++>>>+[

Set up 3 2 0 0 1, representing "< b" and the Turing machine head, in the initial state 1; we can put this at the left end of the brainfuck array because the Turing machine will never go left from the "<".
Next, start the main input-processing loop. Each time through this loop, we begin at the rightmost tape cell that we have filled so far, or at the state cell of the Turing machine head if it is to the right of all tape cells (as it is initially). Each time through, we read a character; if it is "1" or "c", we add the appropriate code to the right end of the tape; if it is a "b", we not only add the code to the end of the tape but also move the head to the right of it, since the head must follow the rightmost "b" when the Turing machine starts; if the input character is a linefeed or other terminator, we add nothing to the tape but position the brainfuck pointer at the zero that follows the last filled tape cell, thus ending this loop.

>>,[>+++++<[[->]<<]<[>]>]

Read input, producing
... x x x 0 'i 0 ...
where "x" are already-filled tape cells and "i" is the input.
While the input lasts, run this loop:
-set up
x x x 0 'i 5 0 ... 
-If the input was six or greater, the [[->]<<] part will five times decrement both i and 5; the sixth time, it will only decrement i, and move to the cell left of i, producing 
... x x x '0 i 0 0 ...
following which the code <[>]> will restore the pointer to i:
... x x x 0 'i 0 0 ...
In short, while i is at least six, the net effect of each iteration of the loop [>+++++<[[->]<<]<[>]>] is to reduce i by 6; so repeated iterations will change i to i mod 6; call this j. Then the loop will be run once more.
Now legitimate input characters give the values 1, 2, 3, 4 when reduced mod 6; "1" gives 1, "b" gives 2, "c" gives 3, and linefeed and "." and "d" all give 4. On the last run through the loop [>+++++<[[->]<<]<[>]>] , the [[->]<<] part will decrement both j and 5 repeatedly until j is zeroed, i.e. it will zero j while reducing 5 by j, leaving
... x x x 0 '0 r 0 ...
where r is 5-j. The code <[>]> leaves this configuration unchanged, and the loop exits.
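
To check this reasoning, one can run the fragment in a throwaway interpreter. The Python sketch below is an illustration only: run_bf is a minimal brainfuck runner without i/o written for this purpose, and reduce_input sets up the layout from the text with a single filled cell x (value 3), a zero, and the input value.

```python
def run_bf(code, tape, ptr):
    """Minimal brainfuck runner without i/o; returns the final pointer."""
    OPEN, CLOSE = "[]"
    stack, match = [], {}
    for pos, ch in enumerate(code):
        if ch == OPEN:
            stack.append(pos)
        elif ch == CLOSE:
            start = stack.pop()
            match[start], match[pos] = pos, start
    pc = 0
    while pc < len(code):
        ch = code[pc]
        if ch == "+":
            tape[ptr] += 1
        elif ch == "-":
            tape[ptr] -= 1
        elif ch == ">":
            ptr += 1
        elif ch == "<":
            ptr -= 1
        elif ch == OPEN and tape[ptr] == 0:
            pc = match[pc]
        elif ch == CLOSE and tape[ptr] != 0:
            pc = match[pc]
        pc += 1
    return ptr


def reduce_input(ch):
    tape = [3, 0, ord(ch), 0, 0]
    ptr = run_bf("[>+++++<[[->]<<]<[>]>]", tape, 2)
    return tape, ptr


for ch in "1bc\n":
    print(repr(ch), reduce_input(ch))
```

In each case the input cell ends up zeroed, the pointer is back on it, and the cell to its right holds r = 5 - (i mod 6).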

>-[<<+++++>>-[<<---->>-[->]<]]

If r was 1, i mod 6 was 4, meaning a terminator. So we don't fill any tape cell but leave
... x x x 0 0 '0 ...
If r was 2, i mod 6 was 3, meaning "c". So we set up
... x x x 5 0 '0 ...
If r was 3, i mod 6 was 2, meaning "b". In this case we set up 
... x x x 1 0 '0 ...
and skip the innermost [->] loop, then step left leaving
... x x x 1 '0 0 ...
(note the pointer position; only in this case is the pointer immediately to the right of a nonzero cell.)
If r was 4, i mod 6 was 1, meaning "1". In this case we set up
... x x x 1 0 '1 0 ...
and enter the inner [->] loop, resulting in
... x x x 1 0 0 '0 ...
after which we step left, producing
... x x x 1 0 '0 ...

<[+[<]>->[[<<<+>>>-]>]<+>]

Now we step left. If and only if i was "b", we enter this loop which will move the Turing machine head. First we set up
... x x x '2 0 0 0 ...
since 2 is the proper code for "b". We can describe this more fully, including the head which is somewhere off to the left:
x x x ... x 0 0 1 x x x ... x '2 0 0 0 ...
Now we scan left to the head and then clear out the state cell of the head, producing
... x 0 0 0 'x x x ... x 2 0 0 0 ...
and copy each cell right of the head left by three, effectively moving the head one cell right each time. The outcome is:
... x x x x ... x 2 0 0 0 '0 0 0 ...
then we reset the state cell of the head, producing
... x x x x ... x 2 0 0 1 '0 0 0 ...
and end the loop.

At this point there are four possibilities corresponding to the four possible inputs.
... x x x x 1 '0 ... if the input was a "1"
... x 2 0 0 1 '0 ... if the input was a "b"
... x x x x 5 '0 ... if the input was a "c"
... x x x x 0 '0 ... if the input was a terminator.

<]<[<]>[

Now we move left and close the loop. If the input was anything but a terminator, this puts us at the rightmost nonzero cell and we repeat the input-processing loop. If the input was a terminator, this puts us at the zero after the rightmost nonzero cell, and the input is already finished; then we scan left to the gap that represents the Turing machine head, and position the pointer at the state cell. Then we begin the main loop, which will be executed once for each step of the Turing machine's operation, stopping when the state cell holds 0 (representing the halt state).
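
Seen from a distance, the whole input loop just builds the initial tape. The Python sketch below models the result rather than transcribing the brainfuck; load_input and its list model are assumptions, and, as in the program, there is no error checking.

```python
def load_input(line):
    """Return the brainfuck cell values once the terminator is read."""
    cells = [3, 2, 0, 0, 1]          # "< b" and the head in state 1
    head = 2                         # index of the first cell of the gap
    for ch in line:
        kind = ord(ch) % 6
        if kind == 4:                # linefeed, "." or "d": stop reading
            break
        cells.append({1: 1, 2: 2, 3: 5}[kind])
        if kind == 2:
            # A "b" drags the head along: the gap ends up right after it.
            del cells[head:head + 3]
            head = len(cells)
            cells += [0, 0, 1]
    return cells


print(load_input("b1b1bbb1c1c11111\n"))
```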

-[>++++++<-]>[<+>-]

At the beginning of each iteration of the main processing loop, the configuration is
... w x 0 0 's y z ...
where w, x, y, z are tape cells, with y being the current tape cell, and s is the state (1-4). First we combine the current state with the current symbol; we add (6*(s-1)) onto y, then move the result back into the former location of s. Call the combination g; its values range from 0 to 23, one for each state-symbol combination.
... w x 0 0 '0 g z ...
... w x 0 0 g '0 z ...
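
The combination is just a base-6 packing of state and symbol. A small Python sketch of it, for illustration only; the names combine and split are made up here.

```python
def combine(s, y):
    """Pack state s (1-4) and symbol code y (0-5) into g (0-23)."""
    return 6 * (s - 1) + y


def split(g):
    """Recover (s, y) from g."""
    q, y = divmod(g, 6)
    return q + 1, y


print(combine(3, 3))   # state 3 reading "<", the halt case: prints 15
```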

Now we have to use g to select a new symbol, a new state, and a direction to move the head. We will provisionally move the head right one cell, and set a direction flag; if that flag is nonzero, we will move the head left two cells, resulting in a total movement of one cell to the left. That is, we want to use g in
... w x 0 0 g 0 z ...
to construct an appropriate
... w x y d 0 s z ...
where d is the direction flag, s is the new state, and y is the new symbol; then if d is nonzero ("move left after all") we want to shift this to produce
... w 0 0 s x y z ...

The way we use g to pick new values for s, y, and d is a very general scheme for mathematical functions of one variable, and this UTM is one place we need the generality. (See my rot13.b for a place where I didn't need the generality, and used this method anyway, leading to comically inconcise but fast code.) The basic idea is like
>set f(0)<[>set f(1)<-[>set f(2)<-[>set f(3)<-[...&c...]]]]
At some point, which depends on the input value, the function's input will be decremented to zero; execution then skips to the block of right brackets, trails through the rest of them, and is done; and we make sure that when that happens, the right value is in the output space(s).
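
The same scheme can be modelled in Python, as a sketch only (nested_lookup is a hypothetical name, and the real code adjusts three output cells rather than one):

```python
def nested_lookup(values, x):
    """Model the nested-bracket scheme; values is the list f(0), f(1), ...

    The output starts at f(0). Each nesting level is entered only while
    the input is still nonzero; it adjusts the output from f(k-1) to
    f(k) and decrements the input.
    """
    out = values[0]
    for k in range(1, len(values)):
        if x == 0:
            break                    # fall through the right brackets
        out += values[k] - values[k - 1]
        x -= 1
    return out


# The new-symbol values y for g = 0..5, taken from the state table.
print(nested_lookup([3, 3, 4, 0, 2, 0], 4))   # prints 2
```

Where consecutive values are equal the adjustment is zero, which is why the program only needs code where the table changes.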

Here we get the outputs from the state table, naturally, reading down the columns. Recall that "01b<>c" map to 0 1 2 3 4 5, and that "d=1" means "left".
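
For reference, the target values for every g can be tabulated directly from the state table. The following Python sketch does that; ENTRIES, CODES and outputs are names invented for the sketch. It returns None for the halt entry and for the unused one, whereas the brainfuck code produces y=3, d=0, s=0 on halt.

```python
CODES = {"0": 0, "1": 1, "b": 2, "<": 3, ">": 4, "c": 5}

# Symbol written, direction, next state; read down the columns of the table.
ENTRIES = (
    "<L1 <L1 >R1 0R1 bL1 0R4 "    # state 1 reading 0 1 b < > c
    "1L2 0R2 >L3 >L2 <R2 bR2 "    # state 2
    "cR1 1R3 <R4 H bR3 1R1 "      # state 3 ("H" = halt)
    "cL2 0R4 cL2 ? <R4 bR4"       # state 4 ("?" = never occurs)
).split()


def outputs(g):
    """Return (y, d, s) for combination g, or None where no move is defined."""
    entry = ENTRIES[g]
    if len(entry) != 3:
        return None
    write, move, nxt = entry
    return CODES[write], 1 if move == "L" else 0, int(nxt)


for g in range(24):
    print(g, outputs(g))
```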

+<<<+++>+>
g==0; set s=1, y=3, d=1

[-
g==1; same values. There's some continuity in the table; only changes from one combination to the next will be commented hereafter.

[<<+>->-
g==2; y=4, d=0

[<<[-]>>-
g==3; y=0

[<<++>+>-
g==4; y=2, d=1

[<<-->->>+++<-
g==5; y=0, d=0, s=4 

[<<+>+>>--<-
g==6; y=1, d=1, s=2

[<<->->-
g==7; y=0, d=0

[<<++++>+>>+<-
g==8; y=4, d=1, s=3

[>-<-
g==9; s=2

[<<->->-
g==10; y=3, d=0

[<<->>-
g==11; y=2

[<<+++>>>-<-
g==12; y=5, s=1

[<<---->>>++<-
g==13; y=1, s=3

[<<++>>>+<-
g==14; y=3, s=4

[>[-]<-
g==15; s=0 (this is the halt condition; having it produce d=0 is useful, since moving left would take us outside the brainfuck array, and the capability of actually not moving the head has been omitted as unnecessary, given that we're only going to output the part of the tape that holds the final state of the transformed string.)

[<<->>>+++<-
g==16; y=2, s=3

[<<->>>--<-
g==17; y=1, s=1

[<<++++>+>>+<-
g==18; y=5, d=1, s=2

[<<[-]>->>++<-
g==19; y=0, d=0, s=4

[<<+++++>+>>--<-
g==20; y=5, d=1, s=2

[<->>++<[<<->>-]]
Here's a tricky part. g==21 (state 4 reading "<") is never produced by the Turing machine from correct input, so the remaining cases to consider are g==22 and g==23, which should give (y=3, d=0, s=4) and (y=2, d=0, s=4) respectively. So we set up d=0, s=4 which are common to both; y still holds 5 from g==20, and g has been decremented 20 times by now, so we take the 2 or 3 that remains in g and subtract it from y to produce the right result for y.

]]]]]]]]]]]]]]]]]]]]

We arrive here when g has been zeroed. Again, the layout now is
... w x y d '0 s z ...

<[->>[<<+>>-]<<<[>>>+<<<-]<[>>>+<<<-]]

If d==1 we do:
... w x y '0 0 s z ...
... w x y s 0 '0 z ...
... w x '0 s 0 y z ...
... w '0 0 s x y z ...
to move the head left and leave the pointer at the leftmost cell of the head, where it would be if d had been 0 also.
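
The same shuffle on a Python list, as a sketch (shift_head_left is a hypothetical name; p is the index of the d cell, and the return value is the index the pointer ends on):

```python
def shift_head_left(cells, p):
    """cells around index p are laid out as: w x y d 0 s z (d at index p)."""
    if cells[p] == 0:
        return p                                   # d == 0: nothing to do
    cells[p] = 0
    cells[p], cells[p + 2] = cells[p + 2], 0       # s moves two cells left
    cells[p + 2], cells[p - 1] = cells[p - 1], 0   # y moves three cells right
    cells[p + 1], cells[p - 2] = cells[p - 2], 0   # x moves three cells right
    return p - 2


cells = [7, 8, 9, 1, 0, 2, 6]        # w=7 x=8 y=9 d=1 0 s=2 z=6
print(shift_head_left(cells, 3), cells)
```

This prints 1 and the list 7, 0, 0, 2, 8, 9, 6, that is, w 0 0 s x y z with the pointer on the leftmost cell of the gap.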

>>]

Go to the state cell, and if it is nonzero (not the halt state) go through the main Turing machine loop again.

>[-[---[-<]]>]

Now the situation is 
3 0 0 '0 x x ... x x 0 0 0 ...
where each x is either 1, 4, or 5, and the leftmost 5 ("c") marks the start of the transformed string. So we scan through the tape looking for that 5, and incidentally clearing the tape as we go. When we find a 1, we decrement it, skip the loop [---[-<]], move right, and start fresh. When we find a 4, we decrement it down to 0, skip the loop [-<], move right, and start fresh. When we find a 5, we decrement it down to 0, move left, move right to the space the 5 occupied (now 0), and end this loop.
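
In Python terms the effect of this loop is roughly the following. This is a sketch with a made-up name, and it assumes, as the brainfuck code does, that a 5 is actually present to the right of the starting cell.

```python
def clear_to_first_c(cells, start):
    """Zero cells from start up to and including the leftmost 5.

    Returns the index of the cell that held the 5, which is where the
    brainfuck pointer ends up.
    """
    p = start
    while cells[p] != 5:
        cells[p] = 0
        p += 1
    cells[p] = 0
    return p


cells = [3, 0, 0, 0, 1, 4, 1, 1, 5, 1, 5, 1, 0]
print(clear_to_first_c(cells, 4), cells)
```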

>[+++[<+++++>--]>]

Now we want to output the part of the tape that represents the transformed string; the situation is 
... 0 '0 x x x ... x x x 0 0 0 ...
where each x is either 1, representing "1", or 5, representing "c". To transform each to the right value we first add 3, producing 4 for "1" and 8 for "c", then multiply the resulting value by 5/2 while moving it left; this is safe because we know the value is even.  We scan through the string this way, after which it consists of 10s (for "1") and 20s (for "c"), and the whole string is shifted one cell left.

+<++[[>+++++<-]<]

Now we set two more cells, producing
... 0 x x x ... x x x '2 1 0 0 ...
and multiply each cell by five while moving it right, producing
... '0 0 x x x ... x x x 11 0 0 ...
where each x is either 50 (for "1") or 100 (for "c").
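
The arithmetic of these two passes, together with the final decrement performed by the output loop, can be checked in a few lines of Python. This is a sketch; to_ascii and LINEFEED are hypothetical names.

```python
def to_ascii(code):
    """Follow one tape code (1 or 5) through the output scaling."""
    v = code + 3        # 1 becomes 4, 5 becomes 8
    v = v * 5 // 2      # 10 or 20; exact because v is even
    v = v * 5           # 50 or 100
    return v - 1        # 49 or 99, after the output loop's decrement


# The extra pair of cells: 2 is multiplied by five onto the 1, then decremented.
LINEFEED = 2 * 5 + 1 - 1

print(chr(to_ascii(1)), chr(to_ascii(5)), LINEFEED)   # prints 1 c 10
```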

>>[-.>]

Now we scan right, reducing each by one and outputting it; the values are 49 (ASCII for "1"), 99 (ASCII for "c"), and 10 (ASCII for the final linefeed). Then the program terminates.]


The entire program again without comments:

+++>++>>>+[>>,[>+++++<[[->]<<]<[>]>]>-[<<+++++>>-[<<---->>-[->]<]]
<[+[<]>->[[<<<+>>>-]>]<+>]<]<[<]>[-[>++++++<-]>[<+>-]+<<<+++>+>
  [-
    [<<+>->-
      [<<[-]>>-
        [<<++>+>-
          [<<-->->>+++<-
            [<<+>+>>--<-
              [<<->->-
                [<<++++>+>>+<-
                  [>-<-
                    [<<->->-
                      [<<->>-
                        [<<+++>>>-<-
                          [<<---->>>++<-
                            [<<++>>>+<-
                              [>[-]<-
                                [<<->>>+++<-
                                  [<<->>>--<-
                                    [<<++++>+>>+<-
                                      [<<[-]>->>++<-
                                        [<<+++++>+>>--<-
                                          [<->>++<
                                            [<<->>-
]]]]]]]]]]]]]]]]]]]]]]<[->>[<<+>>-]<<<[>>>+<<<-]<[>>>+<<<-]]>>]
>[-[---[-<]]>]>[+++[<+++++>--]>]+<++[[>+++++<-]<]>>[-.>]