Monday, January 22, 2007

A000002: Kolakoski Sequence
1, 2, 2, 1, 1, 2, 1, 2, 2, 1, 2, 2, 1, 1, 2, 1, 1, ...

As you may have guessed, this sequence consists entirely of 1s and 2s. Also A[1]=1, A[2]=2, and A[n] is the length of the nth run of consecutive identical terms. So since the 2nd run of terms is of length 2, A[3]=2 and A[4]!=2, i.e., A[4]=1. But since A[3]=2, the 3rd run has length 2, which makes A[5]=1 as well. And so on.

This is another fractal sequence! Apply run-length encoding to the sequence, assuming it starts with 1 and omitting value information, and we get back the original sequence. It carries a homework assignment, too - no one has proven whether the density of 1s is 1/2.

A007782 is a sequence that came out of attempts to prove just that. It is a count of the number of distinct consecutive subsequences of A000002 of length n. I'm not sure how it was generated, but when I looked it had only 25 terms and the tag "hard", which means more terms are desired. So I generated some more! There's probably a better way, but mine was: "generated by starting with a set containing '1', taking the first unused element X from that set, adding to that set the two strings resulting from interpreting X as a run-length encoding of a string starting with '1' or '2' and containing only '1's and '2's and any unadded substrings of those strings, and repeating 100000 times."

Of course, there's no real reason to just use '1' and '2'. Two symbols is more elegant than more, but how about '1' and '3'? Then we get
1, 3,3,3,1,1,1,3,3,3, 1,3,1, 3,3,3,1,1,1,3,3,3, 1,3,3,3,1, 3,3,3,1,1,1,3,3,3, 1,3,1, 3,3,3,1,1,1,3,3,3, 1, 3,3,3,1,1,1,3,3,3, 1, 3,3,3,1,1,1,3,3,3, 1,3,1, 3,3,3,1,1,1,3,3,3, 1,3,3,3,1, 3,3,3,1,1,1,3,3,3, 1,3,1, 3,3,3,1,1,1,3,3,3, 1, 3,3,3,1,1,1,3,3,3, 1,3,1, 3,3,3,1,1,1,3,3,3, 1, 3,3,3,1,1,1,3,3,3, 1,3,1, 3,3,3,1,1,1,3,3,3, 1,3,3,3,1, 3,3,3,1,1,1,3,3,3, 1,3,1, 3,3,3,1,1,1,3,3,3, 1, 3,3,3,1,1,1,3,3,3, 1, ...
Let's give these clumps names! A=1, B=333111333, C=131, and D=13331. Now what does the sequence look like? ABCBDBCBABABCBDBCBABCBABCBDBCBABA... Every second character is a B. Is there a pattern of non-Bs? ACDCAACDCACACDCAA... It's not clear from so few, but regardless, we can tell one thing. The density of '1's in this sequence is definitely lower than 1/2. One upper bound is the density of '1's in CBCBCBCBCB..., which is 5/12. This happens, of course, because 1 and 3 have the same parity. What about something like 1 and 4?
1, 4,4,4,4,1,1,1,1,4,4,4,4,1,1,1,1, 4,1,4,1, 4,4,4,4,1,1,1,1,4,4,4,4,1,1,1,1, 4,1,4,1, 4,4,4,4,1,4,4,4,4,1, 4,4,4,4,1,1,1,1,4,4,4,4,1,1,1,1, 4,1,4,1, 4,4,4,4,1,1,1,1,4,4,4,4,1,1,1,1, 4,1,4,1, 4,4,4,4,1,4,4,4,4,1, 4,4,4,4,1,1,1,1,4,4,4,4,1,1,1,1,4,1,1,1,1,4,4,4,4,1,1,1,1,4,4,4,4,1, 4,4,4,4,1,1,1,1,4,4,4,4,1,1,1,1, 4,1,4,1, 4,4,4,4,1,1,1,1,4,4,4,4,1,1,1,1, 4,1,4,1, 4,4,4,4,1,4,4,4,4,1, 4,4,4,4,1,1,1,1,4,4,4,4,1,1,1,1, 4,1,4,1, 4,4,4,4,1,1,1,1,4,4,4,4,1,1,1,1, 4,1,4,1, 4,4,4,4,1,4,4,4,4,1, 4,4,4,4,1,1,1,1,4,4,4,4,1,1,1,1,4,1,1,1,1,4,4,4,4,1,1,1,1,4,4,4,4,1, 4,4,4,4,1,1,1,1,4,4,4,4,1,1,1,1, 4,1,4,1, 4,4,4,4,1,1,1,1,4,4,4,4,1,1,1,1, 4,1,4,1, 4, 1,4,1,4, 1,1,1,1,4,4,4,4,1,1,1,1,4,4,4,4, ...
Relabeling with A=1, B=4444111144441111, C=4141, D=4444144441, we get ABCBCDBCBCD*BCBCDBCBCD*BCBC???...
That * is a sequence that doesn't quite match the others... and later when it is expanded we suddenly get a whole string of new patterns. The difference in parity means that some of our patterns have more 4s and some have more 1s, and it quickly becomes chaotic trying to figure out if one dominates the other. It turns out we're not going to solve that unsolved problem in the space of this post!

Monday, January 15, 2007

A006960: Missing the Palindrome
196, 887, 1675, 7436, 13783, 52514, 94039, 187088, 1067869, ...

Let's start with any easy sequence. 1, 2, 4, 8, 16, 77, 154, ... Why the jump from 16 to 77 when everything else is just multiplied by 2? Because this sequence is not A[n+1]=A[n]+A[n], it's actually A[n+1]=A[n]+ReverseTheDigitsOf(A[n]). Here are some other instances of it, starting with different initial values:
  • 1, 2, 4, 8, 16, 77, 154, ...
  • 3, 6, 12, 33, 66, 132, 363, ...
  • 5, 10, 11, 22, 44, 88, 176, ...
  • 7, 14, 55, 110, 121, 242, 484, ...
  • 9, 18, 99, 198, 1089, 10890, 20691, ...
  • 13, 44, 88, 176, 847, 1595, 7546, ...

Notice that in each, the sequence quickly hits a palindromic number - a number x such that x=ReverseTheDigitsOf(x). Actually most of them start with a palindomic number since every single digit number is palindromic. Say you have a number with N digits (we'll let N=2k). Once you know the first k digits, you know constr

Sunday, January 14, 2007

A005228: The Difference Sequence
1, 3, 7, 12, 18, 26, 35, 45, 56, 69, 83, ...

This self-generating sequence from Godel, Escher Bach: An Eternal Golden Braid is fairly cute. It starts with 1 and its terms are those terms whose forward differences are not the terms. The differences in the sequence are exactly the differences of the sequence from the natural numbers. 2 is not in the sequence, for instance, because if it was then 2-1=1 would be one of the differences, but 1 is in the sequence. Furthermore, it is the lexicographically earliest such sequence. Here's one way to construct it:
  1. initialize a variable k to 1
  2. if adding k to the sequence does not introduce a forward difference that is already in the sequence, add k to the sequence
  3. increment k and goto 2.

But there's another, better way. Call the sequence S, and call the sequence of differences (2, 4, 5, 6, 8, ...) T. Then S[n+1]=S[n]+T[n]. In other words, the next term in the sequence is formed by adding the next needed difference to the last term. Neat. The terms grow fast enough that there are always known differences available to add on. How fast do they grow?
  1. Consider the first k terms of S. A lower bound on the sum of the first differences is sum(i,i=1..k-1) = k^2/2-k/2, so S is Omega(n^2).
  2. Since S and T together make up the positive integers, at least one must be Theta(n).
  3. Thus T is Theta(n) and S[n+1]=sum(T[i],i=1..n), so S is Theta(n^2).

Another question is whether there is a monotonic sequence X which has that every positive integer is either in X or a difference of some pair of terms of X, exactly once. The difference sequence, but not limited to differences between consecutive terms. Does anyone know?

Tuesday, January 09, 2007

A003602: Fractal
1, 1, 2, 1, 3, 2, 4, 1, 5, 3, 6, 2, 7, 4, 8, 1, 9, ...

Fractals! Remove every second number, or alternately and equivalently, remove the first appearance of each positive integer. That's right, alternating terms are just the integers. What do you have left? 1, 1, 2, 1, 3, 2, 4, 1, ... The same sequence! Which means if you remove every second number again, you get the same sequence. Over and over. But this also means that if you look only at every fourth term, starting at the fourth number, you get the same sequence. And eighth, etc. Sk[n]=S[n*2^k]=S[n], as long as we start indexing from 1.

How fast does this grow? Well, it keeps going back to 1, so let's ask instead how fast its sum grows. Since every second term is the integers, the sum T[n=2k] is at least k^2/2+k/2. And since the other terms are the sequence, T[n=2k] = k^2/2 + k/2 + T[k], and T[n=2k+1] = k + 1 + T[2k].

To make it easy, we'll just find T[n=2^k]=(n/2)^2/2+(n/2)/2+T[n/2]. It happens to be 2^(k-1) + 2^(2k-1)/3 + 1/3. Divide by 2^k to get the average of the first n terms: Ave[n]=1/2+(1/2)^n/3+n/6~=n/6.

What if instead we said S[n*3^k]=S[n], filling in unused numbers with ascending integers as with A003602? We'd get 1,2,1,3,4,2,5,6,1,7,8,3,9,10,4,11,12,2,13,14,5,... with similar properties. Its average value? Ave[n=3k]~=((2k)^2/2/3k+Ave[k]/3=(2/3)(n/3)+(1/3)Ave[n/3]. That's (2/9)(9/8)n = n/4.

In general, then. For the S_z[n*z^k]=S_z[n] function, the average value is Ave[n=zk]=n*(z-1)^2/2z^2+Ave[n/z]/z=(n/2)(z-1)^2/(z^2-1).

Thursday, January 04, 2007

Number sequences are awesome, like Harlan Ellison awesome. First: the sequence of nonnegative integers? 0, 1, 2, 3, 4, 5, 6, ...? A001477?

Pretty elegant, wouldn't you say?