Speed-up deque indexing by changing the deque block length to a power of two.
The division and modulo calculation in deque_item() can be compiled
to fast bitwise operations when the BLOCKLEN is a power of two.
Timing before:
~/cpython $ py -m timeit -r7 -s 'from collections import deque' -s 'd=deque(range(10))' 'd[5]'
10000000 loops, best of 7: 0.0627 usec per loop
Timing after:
~/cpython $ py -m timeit -r7 -s 'from collections import deque' -s 'd=deque(range(10))' 'd[5]'
10000000 loops, best of 7: 0.0581 usec per loop