Context Navigation

utm.b

Last change on this file was 11, checked in by chronos, 13 years ago
Added: Some other examples from web.
File size: 18.4 KB

Line
1	[A universal Turing machine from Yurii Rogozhin's article "Small universal
2	Turing machines", in Theoretical Computer Science, 168(2):215-240, 20 November
3	1996. Thus, a very direct proof that brainfuck is Turing-complete. For i/o
4	formats and so on, read below; for fuller detail, dig up the article.
5
6	If you just want a quick and complete test case, the input b1b1bbb1c1c11111d
7	should produce the output 1c11111.
8
9	Daniel B Cristofani (cristofdathevanetdotcom)
10	http://www.hevanet.com/cristofd/brainfuck/
11
12
13
14	This Turing machine achieves Turing-completeness not by simulating other
15	Turing machines directly, but by simulating a Turing-complete class of
16	tag-systems (a computational model invented by Emil Post and named after the
17	children's game "tag"). A tag-system transforms strings over an alphabet A =
18	{a[1], a[2], ... a[n], a[n+1]} as follows: a positive integer m is chosen, and
19	so is a function P that maps each a[i] for 1<=i<=n to a string P(a[i]) over
20	the alphabet A. Now:
21
22	1. if the string being transformed has fewer than m elements, the whole
23	process stops now.
24	2. m elements are removed from the beginning of the string
25	3. call the first element removed a[k]; if k=n+1 the whole process stops now.
26	4. P(a[k]) is appended to the string.
27	5. steps 1-5 are repeated.
28
29	The particular class of tag-systems this Turing machine simulates is the class
30	where m=2, the initial string has length at least 2, and all P(a[i]) where
31	1<=i<=n are of the form a[n]a[n]B[i] where B[i] is some string over the
32	alphabet A (B[i] is the empty string if and only if i=n).
33
34	The input for this brainfuck program is mildly complex, and there is no error
35	checking. The complexity comes from the encoding of tag-systems in terms of
36	Turing machine tape configurations. Note that the set of initial tape
37	configurations that represent tag-systems from the above class is a small,
38	though Turing-complete, subset of the set of possible initial tape
39	configurations for this Turing machine; and the following brainfuck program is
40	only designed to accept inputs from that subset.
41
42	-The representation of a symbol a[i] from the alphabet A is a string of 1s
43	which is one element longer than twice the combined length of all P(a[j])
44	where 1<=j<i.
45
46	-a value like P(a[i]) = a[n]a[n]a[w]a[x]...a[y]a[z] is represented as follows:
47	b 1
48	b 111...(as many as required to represent a[z] as described above) b
49	b 111...(to represent a[y] as described above) b
50	.
51	.
52	.
53	b 111...(to represent a[x]) b
54	b 111...(to represent a[w]) b
55	b 111...(to represent a[n]) b
56	b 111...(as many as for a[n] as described above, MINUS the number of 1s that
57	represent a[i]; and no final b)
58
59	-The function P is represented by listing all its outputs in the order
60	P(a[n]), P(a[n-1]), ... P(a[2]), P(a[1]). The representation of P(a[n+1])=STOP
61	is done for you by the brainfuck program.
62
63	-The initial string a[q]a[r]...a[s]a[t] to be transformed by the tag-system is
64	represented as
65	111...(as many as required to represent a[q] as above) c
66	111...(to represent a[r]) c
67	.
68	.
69	.
70	111...(to represent a[s]) c
71	111...(to represent a[t]; note that there is no final c.)
72
73	-The input to this program is a function P as described above, then a single
74	b, then the initial string to be transformed. Run all the 1s, bs, and cs
75	together in one line with nothing between, followed by a linefeed, or a period
76	if a linefeed is problematic for your implementation.
77
78	-The output format, if the program terminates, is the same as the input format
79	for the initial string, and represents the final state of the transformed
80	string, with a final a[n+1] appended due to a peculiarity of the Turing
81	machine's algorithm.
82
83	-An example.
84	A tag-system over the alphabet A = {a[1], a[2], a[3], a[4]}, where m = 2 and:
85	P(a[1]) = a[3]a[3]a[2]a[1]a[4]
86	P(a[2]) = a[3]a[3]a[1]
87	P(a[3]) = a[3]a[3]
88	P(a[4]) = STOP
89
90	meets the criteria above; and when applied to the initial string a[2]a[1]a[1]
91	it gives:
92	a[2]a[1]a[1]
93	a[1]a[3]a[3]a[1]
94	a[3]a[1]a[3]a[3]a[2]a[1]a[4]
95	a[3]a[3]a[2]a[1]a[4]a[3]a[3]
96	a[2]a[1]a[4]a[3]a[3]a[3]a[3]
97	a[4]a[3]a[3]a[3]a[3]a[3]a[3]a[1]
98	a[3]a[3]a[3]a[3]a[3]a[1]
99	and then it's done.
100
101	Now, the encoding:
102
103	a[1] is 1
104	a[2] is 11111111111
105	a[3] is 11111111111111111
106	a[4] is 111111111111111111111
107
108	P(a[1]) is b1 b111111111111111111111b b1b b11111111111b b11111111111111111b
109	b1111111111111111
110	P(a[2]) is b1 b1b b11111111111111111b b111111
111	P(a[3]) is b1 b11111111111111111b b
112
113	the initial string is 11111111111c1c1
114
115	and so the whole input is
116
117	b1 b11111111111111111b b
118	b1 b1b b11111111111111111b b111111
119	b1 b111111111111111111111b b1b b11111111111b b11111111111111111b
120	b1111111111111111
121	b
122	11111111111c1c1
123
124	which when run together for input to the program becomes
125
126	b1b11111111111111111bbb1b1bb11111111111111111bb111111b1b111111111111111111111bb1bb11111111111bb11111111111111111bb1111111111111111b11111111111c1c1
127
128	The output should be
129	11111111111111111 c
130	11111111111111111 c
131	11111111111111111 c
132	11111111111111111 c
133	11111111111111111 c
134	1 c
135	111111111111111111111
136
137	--that is,
138	11111111111111111c11111111111111111c11111111111111111c11111111111111111c11111111111111111c1c111111111111111111111
139
140
141	For those interested, the state table of the Turing machine itself is
142	10<L1 201L2 30cR1 40cL2
143	11<L1 210R2 311R3 410R4
144	1b>R1 2b>L3 3b<R4 4bcL2
145	1<0R1 2<>L2 3< H 4<
146	1>bL1 2><R2 3>bR3 4><R4
147	1c0R4 2cbR2 3c1R1 4cbR4
148	where the initial state is 1, the blank symbol is "0", tape cells are set as
149	per the input but with the termination code P(a[n+1])=STOP represented as a
150	"< b" at the left end, and the head is initially at the first "1" in the code
151	for the initial string. If and when the machine halts, the head is at the
152	leftmost cell, a "<"; the representations of the rules are intact, in a form
153	isomorphic to their original form (each "b" replaced with ">" and each "1"
154	unchanged); they are followed by a series of "1" cells, then a "c" (the
155	leftmost one at that time), then the cells representing the final state of the
156	transformed string, then a "c" and a sequence of "1" cells representing a[n+1]
157	as mentioned.
158
159	The minimal test case b1b1bbb1c1c11111 represents the tag-system where P(a[1])
160	= a[1]a[1] and P(a[2]) = STOP, applied to the string a[1]a[1]a[2]. This runs
161	for 518 steps of the Turing machine, exercising all 23 Turing machine
162	instructions, before halting with the output string a[1].
163
164
165	Here is the brainfuck program that implements this Turing machine. The basic
166	memory layout is as follows.
167	Each Turing machine cell is represented by a brainfuck cell, with the symbols
168	"0 1 b < > c" represented by 0, 1, 2, 3, 4, 5 respectively. The brainfuck
169	cells representing the Turing machine cells are laid out contiguously from the
170	beginning of the tape, except that the head of the Turing machine is
171	represented by a gap of three brainfuck cells, just to the left of the
172	brainfuck cell that represents the current Turing machine cell. At the start
173	of each cycle, the rightmost of these three cells holds the Turing machine
174	state, where states 1-4 are represented by 1-4 and "halt" (here treated as a
175	separate state) is represented by 0. The other two cells hold zeroes.
176
177	Now to walk through the code:
178
179	+++>++>>>+[
180
181	Set up 3 2 0 0 1, representing "< b" and the Turing machine head, in the
182	initial state 1; we can put this at the left end of the brainfuck array
183	because the Turing machine will never go left from the "<".
184	Next, start the main input-processing loop. Each time through this loop, we
185	begin at the rightmost tape cell that we have filled so far, or at the state
186	cell of the Turing machine head if it is to the right of all tape cells (as it
187	is initially). Each time through, we read a character; if it is "1" or "c", we
188	add the appropriate code to the right end of the tape; if it is a "b", we not
189	only add the code to the end of the tape but also move the head to the right
190	of it, since the head must follow the rightmost "b" when the Turing machine
191	starts; if the input character is a linefeed or other terminator, we add
192	nothing to the tape but position the brainfuck pointer at the zero that
193	follows the last filled tape cell, thus ending this loop.
194
195	>>,[>+++++<[[->]<<]<[>]>]
196
197	Read input, producing
198	... x 0 'i 0 ...
199	where "x" is a nonzero cell and "i" is the input.
200	While i is nonzero, run this loop:
201	-set up
202	... x 0 'i 5 0 ...
203	-If the input was six or greater, the [[->]<<] part will five times decrement
204	both i and 5; the sixth time, it will only decrement i, and move to the cell
205	left of i, producing
206	... x '0 i 0 0 ...
207	following which the code <[>]> will restore the pointer to i:
208	... x 0 'i 0 0 ...
209	In short, while i is at least six, the net effect of each iteration of the
210	loop [>+++++<[[->]<<]<[>]>] is to reduce i by 6; so repeated iterations will
211	change i to i mod 6; call this j. Then the loop will be run once more.
212	Now legitimate input characters give the values 1, 2, 3, 4 when reduced mod 6;
213	"1" gives 1, "b" gives 2, "c" gives 3, and linefeed and "." and "d" all give
214	4. On the last run through the loop [>+++++<[[->]<<]<[>]>] , the [[->]<<] part
215	will decrement both j and 5 repeatedly until j is zeroed, i.e. it will zero j
216	while reducing 5 by j, leaving
217	... x 0 '0 r 0 ...
218	where r is 5-j. The code <[>]> leaves this configuration unchanged, and the
219	loop exits.
220
221	>-[<<+++++>>-[<<---->>-[->]<]]
222
223	If r was 1, i mod 6 was 4, meaning a terminator. So we don't fill any tape
224	cell but leave
225	... x 0 0 '0 ...
226	If r was 2, i mod 6 was 3, meaning "c". So we set up
227	... x 5 0 '0 ...
228	If r was 3, i mod 6 was 2, meaning "b". In this case we set up
229	... x 1 0 '0 ...
230	and skip the innermost [->] loop, then step left leaving
231	... x 1 '0 0 ...
232	(note the pointer position; only in this case is the pointer immediately to
233	the right of a nonzero cell.)
234	If r was 4, i mod 6 was 1, meaning "1". In this case we set up
235	... x 1 0 '1 0 ...
236	and enter the inner [->] loop, resulting in
237	... x 1 0 0 '0 ...
238	after which we step left, producing
239	... x 1 0 '0 ...
240
241	<[<-<[<]+<+[>]<<+>->>>]
242
243	Now we step left. If and only if i was "b", we will enter this loop which will
244	move the Turing machine head. Now the situation, including the head which is
245	somewhere off to the left, is
246	... 2 0 0 1 y ... y y y y '1 0 0 0 ...
247	where each y is a 1 (since that is the only symbol that occurs between one b
248	and the next b, in input strings that represent tag-systems).
249	Now we zero one y and scan left:
250	... 2 0 '0 1 y ... y y y 0 1 0 0 0 ...
251	set two more 1's, and scan right (these two, and the 1 that was formerly the
252	state cell of the head, now serve as tape symbols, so we label them y);
253	... 2 y y y y ... y y y '0 1 0 0 0 ...
254	next, we set up the head at its new position, and we set the cell to the left
255	of it to b, and move the pointer just right of the 1 which is the state:
256	... 2 y y y y ... y 2 0 0 1 '0 0 0 ...
257	and now we're to the right of the rightmost nonzero cell, as we should be, so
258	we end the head-moving loop.
259
260	(In the degenerate case where we received two b's in a row, the process goes:
261	...2 0 0 1 '1 0 0 0 ...
262	...2 0 '0 0 1 0 0 0 ...
263	...2 y y '0 1 0 0 0 ...
264	...2 2 0 0 1 '0 0 0 ...
265	as it should.)
266
267	After either performing or skipping that loop, there are four possibilities
268	corresponding to the four possible inputs.
269	... x x x x 1 '0 ... if the input was a "1"
270	... x 2 0 0 1 '0 ... if the input was a "b"
271	... x x x x 5 '0 ... if the input was a "c"
272	... x x x x 0 '0 ... if the input was a terminator.
273
274	<]<[<]>[
275
276	Now we move left and close the loop. If the input was anything but a
277	terminator, this puts us at the rightmost nonzero cell and we repeat the
278	input-processing loop. If the input was a terminator, this puts us at the zero
279	after the rightmost nonzero cell, and the input is already finished; then we
280	scan left to the gap that represents the Turing machine head, and position the
281	pointer at the state cell. Then we begin the main loop, which will be executed
282	once for each step of the Turing machine's operation, stopping when the state
283	cell holds 0 (representing the halt state).
284
285	-[>++++++<-]>[<+>-]
286
287	At the beginning of each iteration of the main processing loop, the
288	configuration is
289	... w x 0 0 's y z ...
290	where w, x, y, z are tape cells, with y being the current tape cell, and s is
291	the state (1-4). First we combine the current state with the current symbol;
292	we add (6*(s-1)) onto y, then move the result back into the former location of
293	s. Call the combination g; its values range from 0 to 23, one for each
294	state-symbol combination.
295	... w x 0 0 '0 g z ...
296	... w x 0 0 g '0 z ...
297
298	Now we have to use g to select a new symbol, a new state, and a direction to
299	move the head. We will provisionally move the head right one cell, and set a
300	direction flag; if that flag is nonzero, we will move the head left two cells,
301	resulting in a total movement of one cell to the left. That is, we want to use
302	g in
303	... w x 0 0 g 0 z ...
304	to construct an appropriate
305	... w x y d 0 s z ...
306	where d is the direction flag, s is the new state, and y is the new symbol;
307	then if d is nonzero ("move left after all") we want to shift this to produce
308	... w 0 0 s x y z ...
309
310	The way we use g to pick new values for s, y, and d is a very general scheme
311	for mathematical functions of one variable, and this UTM is one place we need
312	the generality. (See my rot13.b for a place where I didn't need the
313	generality, and used this method anyway, leading to comically inconcise but
314	fast code.) The basic idea is like
315	>set f(0)<[>set f(1)<-[>set f(2)<-[>set f(3)<-[...&c...]]]]
316
317	The argument (in this case, g) is gradually decremented; if it was (say) five,
318	the program will enter the five outermost loops of this nest but will skip the
319	sixth and those inside it, since by that point the argument will have been
320	zeroed. So we just make sure the output cells have the right values for f(5)
321	at that point, by setting them inside the fifth loop but outside the sixth.
322
323
324	We get the outputs from the state table, naturally, reading down the columns.
325	Recall that "01b<>c" map to 0 1 2 3 4 5, and that "d=1" means "left".
326
327	+<<<+++>+>
328	g>=0; set s=1, y=3, d=1
329
330	[-
331	g>=1; same values. There's some continuity in the table; only changes from one
332	combination to the next will be commented hereafter.
333
334	[<<+>->-
335	g>=2; y=4, d=0
336
337	[<<[-]>>-
338	g>=3; y=0
339
340	[<<++>+>-
341	g>=4; y=2, d=1
342
343	[<<-->->>+++<-
344	g>=5; y=0, d=0, s=4
345
346	[<<+>+>>--<-
347	g>=6; y=1, d=1, s=2
348
349	[<<->->-
350	g>=7; y=0, d=0
351
352	[<<++++>+>>+<-
353	g>=8; y=4, d=1, s=3
354
355	[>-<-
356	g>=9; s=2
357
358	[<<->->-
359	g>=10; y=3, d=0
360
361	[<<->>-
362	g>=11; y=2
363
364	[<<+++>>>-<-
365	g>=12; y=5, s=1
366
367	[<<---->>>++<-
368	g>=13; y=1, s=3
369
370	[<<++>>>+<-
371	g>=14; y=3, s=4
372
373	[>[-]<-
374	g>=15; s=0 (this is the halt condition; having it produce d=0 is useful, since
375	moving left would take us outside the brainfuck array, and the capability of
376	actually not moving the head has been omitted as unnecessary, given that we're
377	only going to output the part of the tape that holds the final state of the
378	transformed string.)
379
380	[<<->>>+++<-
381	g>=16; y=2, s=3
382
383	[<<->>>--<-
384	g>=17; y=1, s=1
385
386	[<<++++>+>>+<-
387	g>=18; y=5, d=1, s=2
388
389	[<<[-]>->>++<-
390	g>=19; y=0, d=0, s=4
391
392	[<<+++++>+>>--<-
393	g>=20; y=5, d=1, s=2
394
395	[<->>++<[<<->>-]]
396	Here's a tricky part. g==21 is never produced by the Turing machine from
397	correct input, so the remaining states to consider are g==22 and g==23, which
398	should give (y=3, d=0, s=4) and (y=2, d=0, s=4) respectively. So we set up
399	d=0, s=4 which are common to both, then we take the 2 or 3 that remains in g
400	and subtract it from y to produce the right result for y.
401
402	]]]]]]]]]]]]]]]]]]]]
403
404	We arrive here when g has been zeroed. Again, the layout now is
405	... w x y d '0 s z ...
406	and y, d, and s have the correct values.
407
408	<[->>[<<+>>-]<<<[>>>+<<<-]<[>>>+<<<-]]
409
410	If d==1 we do:
411	... w x y '0 0 s z ...
412	... w x y s 0 '0 z ...
413	... w x '0 s 0 y z ...
414	... w '0 0 s x y z ...
415	to move the head left and leave the pointer at the leftmost cell of the head,
416	where it would be if d had been 0 also.
417
418	>>]
419
420	Go to the state cell, and if it is nonzero (not the halt state) go through the
421	main Turing machine loop again.
422
423	>[-[---[-<]]>]
424
425	Now the Turing machine has halted. The situation is
426	4 0 0 '0 x x ... x x 0 0 0 ...
427	where each x is either 1, 4, or 5, and the leftmost 5 ("c") marks the start of
428	the transformed string. So we scan through the tape looking for that 5, and
429	incidentally clearing the tape as we go. When we find a 1, we decrement it,
430	skip the loop [---[-<]], move right, and start fresh. When we find a 4, we
431	decrement it down to 0, skip the loop [-<], move right, and start fresh. When
432	we find a 5, we decrement it down to 0, move left, move right to the space the
433	5 occupied (now 0), and end this loop.
434
435	>[+++[<+++++>--]>]
436
437	Now we want to output the part of the tape that represents the transformed
438	string; the situation is
439	... 0 '0 x x x ... x x x 0 0 0 ...
440	where each x is either 1, representing "1", or 5, representing "c". To
441	transform each to the right ASCII value we first add 3, producing 4 for "1"
442	and 8 for "c", then multiply the resulting value by 5/2 while moving it left;
443	this is safe because we know the value is even. We scan through the string
444	this way, after which it consists of 10s (for "1") and 20s (for "c"), and the
445	whole string has been shifted one cell left.
446
447	+<++[[>+++++<-]<]
448
449	Now we set two more cells to make the linefeed, producing
450	... 0 x x x ... x x x '2 1 0 0 ...
451	and multiply each cell by five while moving it right, producing
452	... '0 0 x x x ... x x x 11 0 0 ...
453	where each x is either 50 (for "1") or 100 (for "c").
454
455	>>[-.>]
456
457	Now we scan right, reducing each cell by one and outputting it; the values are
458	49 (ASCII for "1"), 99 (ASCII for "c"), and 10 (ASCII for the final linefeed).
459	After this loop there are no more commands, so the program terminates.]
460
461
462	The entire program again without comments:
463
464	+++>++>>>+[>>,[>+++++<[[->]<<]<[>]>]>-[<<+++++>>-[<<---->>-[->]<]]
465	<[<-<[<]+<+[>]<<+>->>>]<]<[<]>[-[>++++++<-]>[<+>-]+<<<+++>+>
466	[-
467	[<<+>->-
468	[<<[-]>>-
469	[<<++>+>-
470	[<<-->->>+++<-
471	[<<+>+>>--<-
472	[<<->->-
473	[<<++++>+>>+<-
474	[>-<-
475	[<<->->-
476	[<<->>-
477	[<<+++>>>-<-
478	[<<---->>>++<-
479	[<<++>>>+<-
480	[>[-]<-
481	[<<->>>+++<-
482	[<<->>>--<-
483	[<<++++>+>>+<-
484	[<<[-]>->>++<-
485	[<<+++++>+>>--<-
486	[<->>++<
487	[<<->>-
488	]]]]]]]]]]]]]]]]]]]]]]<[->>[<<+>>-]<<<[>>>+<<<-]<[>>>+<<<-]]>>]
489	>[-[---[-<]]>]>[+++[<+++++>--]>]+<++[[>+++++<-]<]>>[-.>]
490
491

Note: See TracBrowser for help on using the repository browser.

Context Navigation

source: trunk/Examples/Others/utm.b

Download in other formats: