## Why array indexing starting at 0 is cleaner than starting at 1

> would work perfectly well if the range returned [1, 2, 3, 4] and the

> list was indexed starting with 1. Basically, range(4) has to produce a

> list of four items, we just differ on what those items should be.

If indexes started at 1, then maybe so should ranges. However,

having read the followups to this message, I think there are still

advantages of simplicity and regularity in having arrays (even if

one calls them lists:-) indexed from 0. My master thesis, lo that

many years ago, included a large program in Fortran IV (1-based

index only) and had to do a lot of +1/-1 twiddling because of

that. I didn’t understand why at the time (having not yet met

Koenig’s book “C traps & pitfalls”, which introduced me to the

many advantages of half-open ranges — array indexing being a

case of that), but now I think I do.

Suppose my 1-dimensional array/list needs at some point to

be ‘seen’ as composed of several adjacent subarrays, each of

length N — just for example. OK, what’s M, the index in the

array of element I of subarray K?

If everything starts from 0:

M = I+K*N

nice and simple. If everything starts from 1:

M = 1 + (I-1) + (K-1)*N

= I + (K-1)*N

darn — an unavoidable ‘-1’… 1-based indexing just isn’t as

arithmetically nice as 0-based when you start having to

compute your indices.

And the reverse, too — given M and N,

K, I = divmod(M,N)

with 0-based indexing throughout — nice, isn’t it…? What

about *one*-based indexing throughout…? Hmmm, looks

like we’ll have to do the -1 dance on M first, then the +1

one on both of the subresults…:

K, I = divmod(M-1, N)

K += 1

I += 1

Doesn’t it start to look as if indices WANT to be zero-based,

and coercing them to 1-based is simply a pretty artificial

choice requiring many +1’s and -1’s strewn around…?

OK, what about the “main diagonal” of the intrinsic 2D

array embedded in my 1D one — first element of first

subarray, second element of second subarray, etc. Can

I get the indices of that easily and naturally? What’s the

index M in the big array of the I-th element of this main

diagonal? Well, when 0-based indexing is being used

throughout, M=I*(N+1) seems right. When 1-based…:

M = 1 + (I-1)*(N+1)

Again we have to do a -1 on I to move it from 1-based

to 0-based for the computation, and +1 on the result

to move the natural 0-based one to 1-based. Just take

care to NOT do more or fewer -1’s and +1’s than needed

or a bug may emerge…:-).

OK, forget subarrays. Say we just have two very long

arrays A and B. We need to consider them starting

from indices IA and IB, and obtain a result by summing

corresponding elements — the first element in our

result array is A[IA]+B[IB], and so on. Simple, right?

OK, so, what’s the I-th element of our result?

C[I] = A[IA+I] + B[IB+I]

when all indices are 0-based. If 1-based, though:

C[I] = A[IA+I-1] + B[IB+I-1]

darn — once again the -1 emerges! Once again we

have to just about “translate” arithmetically-funny

1-based indices to the “natural” 0-based ones, and

so the -1 (or +1, depending).

You will no doubt find some counterexamples too, but

in general I think you’ll notice that anytime two indices

need to be added, or other kinds of arithmetic on

indices are required, 0-based indices tend to behave

better, 1-based ones need some ‘translation’ (-1, +1)

far more often than the reverse case.

Alex