(A) Schematic showing how we convert the sequence of action potentials into discrete ‘words’, that is, sequences of zeros and ones. As an example, at the top we show the stimulus and spike arrival times (red dots) in a 64 ms segment of the experiment. We may treat this as two successive segments of duration *T* = 32 ms, and divide these segments into bins of duration *τ* = 2, 8, or 32 ms. For sufficiently small *τ* (here, *τ* = 2 ms), each bin contains either zero or one spike, and so each neural response becomes a binary word with *T*/*τ* bits; larger values of *τ* generate larger alphabets, until at *τ* = *T* the response of the neuron is just the spike count in the window of duration *T*. Note that the words are shown here as non-overlapping; this is just for graphical convenience. (B) The distribution of words with *τ* = 1 ms, for various values of *T*; words are plotted in rank order. We see that, for large *T* (*T* = 40 or 50 ms) but not for small *T* (*T* = 20 ms), the distribution of words has a large segment in which the probability of a word is *P* ∝ 1/rank^{*α*}, corresponding to a straight line on this double logarithmic plot. Similar behavior is commonly observed for words in English, with *α* = 1, which we show for comparison (solid line); this is sometimes referred to as Zipf's law. (C) The entropy of a *T* = 25 ms segment of the spike train, as a function of the time resolution *τ* with which we record the spikes. We plot this as an entropy rate, *S*(*T*,*τ*)/*T*, in bits/s; this value of *T* was chosen because it is the time scale on which visual motion drives motor behavior. For comparison we show the theoretical results (valid at small *τ*) for a Poisson process, and a Poisson process with a refractory period, with spike rates and refractory periods matched to the data. Note that the real spike train has significantly less entropy than do these simple models.
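The discretization described in (A) can be sketched as follows. This is a minimal illustration, not the authors' analysis code; the function name, the example spike times, and the use of NumPy histogramming are all our own assumptions.

```python
import numpy as np

def spikes_to_words(spike_times_ms, T, tau, t_total):
    """Discretize a spike train into words: split [0, t_total) into
    windows of duration T, each divided into bins of width tau.
    Each bin holds its spike count (0 or 1 when tau is small enough),
    so each row of the result is one word with T/tau 'letters'."""
    n_bins_total = int(t_total // tau)
    counts, _ = np.histogram(spike_times_ms,
                             bins=n_bins_total,
                             range=(0.0, n_bins_total * tau))
    bins_per_word = int(T // tau)
    n_words = n_bins_total // bins_per_word
    return counts[:n_words * bins_per_word].reshape(n_words, bins_per_word)

# Invented example: 7 spikes in a 64 ms segment, read as two successive
# T = 32 ms words at resolution tau = 2 ms (16 binary letters each)
spikes = [3.1, 9.8, 14.2, 30.5, 41.0, 47.7, 60.3]
words = spikes_to_words(spikes, T=32.0, tau=2.0, t_total=64.0)
print(words.shape)  # (2, 16)
```

Setting tau equal to T collapses each word to a single letter, the spike count in the window, as noted in the caption.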
In earlier work we showed that our estimation methods can recover the correct results for the refractory Poisson model using data sets comparable in size to the one analyzed here; thus our conclusion that real entropies are smaller cannot be the result of undersampling. Error bars are smaller than the data points. (D) The information content of *T* = 25 ms words, as a function of time resolution *τ*; again we plot this as a rate, *R*_{info}(*T*,*τ*) = *I*(*T*,*τ*)/*T*, in bits/s.
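The entropy rate plotted in (C) can be illustrated with a naive plug-in estimate from the empirical word distribution. This is only a sketch under our own assumptions (random toy words, invented function name); the caption's point about undersampling is exactly why the authors rely on carefully validated estimators rather than this raw plug-in formula.

```python
import numpy as np
from collections import Counter

def plugin_entropy_rate(words, T_ms):
    """Naive plug-in estimate of the word entropy S(T, tau) in bits,
    returned as a rate S/T in bits/s. With limited data this estimate
    is biased, so real analyses require bias correction."""
    counts = np.array(list(Counter(map(tuple, words)).values()), dtype=float)
    p = counts / counts.sum()
    S_bits = -np.sum(p * np.log2(p))
    return S_bits / (T_ms / 1000.0)  # convert per-word bits to bits/s

# Toy usage: random 25-bit words (T = 25 ms, tau = 1 ms), ~40 spikes/s
rng = np.random.default_rng(0)
words = (rng.random((5000, 25)) < 0.04).astype(int)
rate = plugin_entropy_rate(words, T_ms=25.0)
print(rate)  # bits/s; bounded above by (25 bits)/(25 ms) = 1000 bits/s
```

The rate can never exceed (T/τ bits)/T here, and a structured (non-Poisson) spike train falls well below that ceiling, which is the comparison panel (C) makes.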
