Data-driven classification of individual cells by their non-Markovian motion

We present a method to differentiate organisms solely by their motion based on the generalized Langevin equation (GLE) and use it to distinguish two different swimming modes of strongly confined unicellular microalgae Chlamydomonas reinhardtii. The GLE is a general model for active or passive motion of organisms and particles that can be derived from a time-dependent general many-body Hamiltonian and in particular includes non-Markovian effects (i.e., the trajectory memory of its past). We extract all GLE parameters from individual cell trajectories and perform an unbiased cluster analysis to group them into different classes. For the specific cell population employed in the experiments, the GLE-based assignment into the two different swimming modes works perfectly, as checked by control experiments. The classification and sorting of single cells and organisms is important in different areas; our method, which is based on motion trajectories, offers wide-ranging applications in biology and medicine.


Introduction
Classifying individual cells or organisms is a challenging task that has been approached in many different ways and has ample applications. Distinguishing different types of cancer cells [1,2], foodborne pathogens [3], sperm cells [4], or types of neurons [5] are just a few examples. Different techniques have been introduced to distinguish and classify organisms on the multi-cell down to the single-cell level. One approach involves markers that bind cell-specifically [2,6,7]. Since individual cells contain unique genetic and epigenetic information, it is also possible to distinguish cells by their specific DNA or RNA content. Indeed, biotechnological advances enable single-cell RNA sequencing [8,9], which can be used in combination with machine-learning approaches [10,11] to efficiently distinguish single cells. However, RNA sequencing as well as the use of markers require cell perturbation or even destruction for data acquisition. In many cases it is desirable to classify cells without perturbing them, which requires label-free techniques such as spectroscopic approaches [3,12] or microscopy [13]. In this way, cell information can be extracted almost instantaneously [14] from living organisms [1]. Spectroscopic and microscopic images can be processed using machine learning to classify cells; however, the outcomes can be hard to interpret and massive training data are required. One way to simplify the processing of cell-image data is to reduce the parameter space.
This can be achieved by feature selection [15] or by projection onto important parameters, for instance by principal component analysis (PCA) [11]. A prime feature of mobile cells is their positional trajectory, which is relatively easy to obtain in experiments and contains hidden information on the motion-generating processes within the cell [16-20]. Machine-learning algorithms have been proposed to classify trajectories [21] and some have been applied to single-cell trajectories [15]. These approaches mostly focus on classification and not on the interpretation of the motion patterns. In order to yield a mechanistic interpretation, some specific model is usually assumed [22-27], which makes the interpretation model-dependent. The most general model that describes the motion of a particle in a complex environment and captures the stochasticity of its motion is the GLE, which has been shown to accurately describe the motion of different cell types [28-32]. In fact, the GLE is not an ad-hoc model but can be derived from the underlying many-body Hamiltonian [33-35]. Living cells are intrinsically out of equilibrium [36], a fact that can be properly accounted for by the GLE used to describe the cell motion [37].
Here, we present a method to classify individual organisms based on the GLE parameters extracted from their motion trajectories. We apply our method to experimental trajectories of individual unicellular biflagellate algae Chlamydomonas reinhardtii (CR) [38] and find two distinct groups of swimmers, which are illustrated in Figs. 1a and b. In contrast to other existing methods for cell sorting and classification, our method requires only trajectories as input, does not require any training of a network and avoids human bias in the selection of relevant features. Additionally, our approach allows us to interpret motion characteristics in terms of simple mechanistic models derived from the GLE parameters. In the case of CR cells, the data suggest some type of elastic coupling that presumably involves the anchoring of the flagella [38-40], as schematically depicted in Fig. 1c. Our approach is applicable to any kind of cell or organism motility data.

Results
Experimental trajectories: Our analysis is based on videos of 59 CR cells that are strongly confined between two glass plates separated by a distance similar to the cell diameter ∼ 10 µm, which resembles the natural habitat of CR in soil [38] and simplifies the recording of long two-dimensional trajectories, as the cells cannot move out of the image plane of the microscope objective (see Methods 4.1 and 4.2 for experimental details). Videos that resolve the flagella motion are shown in SI Sec. I and reveal two different types of flagellar motion [41]. In one type, called 'synchro', the flagella move synchronously as in a breaststroke (Fig. 1a and Video 1); in the other type, called 'wobbler', the flagella move asynchronously, which results in a wobbling cell motion (Fig. 1b and Video 2). The emergence of two different swimming modes reflects cell-size variation and constitutes the tactile cell response to the confining surfaces [41], where the synchros tend to exhibit slightly larger cell bodies and are therefore more strongly confined. Switching events between synchronous and asynchronous flagella motion are never observed in our recordings and are therefore neglected. We use the classification into synchros and wobblers based on the flagella motion in the high-resolution video data as a test of our classification method that is based on the cell-center trajectories.
Theoretical trajectory model: The experiments yield two-dimensional trajectories $x(t)$, $y(t)$ for the cell center position, which we describe by the GLE equation of motion

$$\ddot{x}(t) = -\int_{t_0}^{t} \Gamma_v(t-t')\,\dot{x}(t')\,dt' + F_R(t) \tag{1}$$

with an identical equation for $y(t)$. Here $\ddot{x}(t) = \dot{v}(t)$ denotes the acceleration of the cell position, $\Gamma_v(t)$ is a memory kernel that describes how the acceleration at time $t$ depends on the cell velocity $\dot{x}(t') = v(t')$ at previous times and therefore accounts for non-Markovian friction effects, and $F_R(t)$ is a random force that describes interactions with the surrounding and the interior of the cell. Since the experimental system is isotropic and homogeneous in space, no deterministic force term appears in the GLE. In fact, the GLE in eq. (1) is the most general equation of motion for an unconfined object and can be derived by projection at time $t_0$ from the underlying many-body Hamiltonian even in the presence of non-equilibrium effects, which obviously are present for living organisms [33-35,37].

Fig. 1 (caption fragment): […] The white halo around the cells is typical for phase-contrast microscopy [42]. (c) Sketch of a CR cell: the distal striated fiber (DSF) connects the two basal bodies [39], which anchor the flagella and are connected to the nucleus by nuclear basal-body connectors (NBBC) [38,40].
If the cell motion can be described as a Gaussian process, which for CR cells is suggested by the fact that the single-cell velocity distributions are perfectly Gaussian, as will be explained further below, the random force is a Gaussian process with correlations given by

$$\langle F_R(t)\,F_R(t') \rangle = B\,\Gamma_R(t-t') \tag{2}$$

where $B = \langle v^2 \rangle$ denotes the mean-squared cell velocity and the symmetric random-force kernel is denoted as $\Gamma_R(t)$. In this case the equation of motion is linear and there is no coupling between the motion in the x and y directions; we thus average all cell-trajectory data over the two directions. For an equilibrium system, the fluctuation dissipation theorem (FDT) predicts $\Gamma_R(t) = \Gamma_v(|t|)$ with the mean-squared velocity given by $B = k_B T/m$ according to the equipartition theorem, where $m$ is the mass of the moving object and $k_B T$ denotes the thermal energy [33,34]. For living cells, neither the FDT nor the equipartition theorem holds in general and thus there is no a priori reason why $\Gamma_v(|t|)$ and $\Gamma_R(t)$ should be equal [36,43]. Nevertheless, one can construct a surrogate model with an effective kernel $\Gamma(|t|) = \Gamma_R(t) = \Gamma_v(|t|)$ that exactly reproduces the dynamics described by the non-equilibrium GLE with $\Gamma_R(t) \neq \Gamma_v(|t|)$. This can be most easily seen by considering the velocity autocorrelation function (VACF) defined by

$$C_{vv}(t) = \langle v(0)\,v(t) \rangle \tag{3}$$

which in Fourier space follows from eqs. (1) and (2) as [30]

$$\tilde{C}_{vv}(\omega) = \frac{B\,\tilde{\Gamma}_R(\omega)}{\left|\tilde{\Gamma}^+_v(\omega) + i\omega\right|^2} \tag{4}$$

where $\tilde{\Gamma}^+_v(\omega)$ denotes the single-sided Fourier transform of $\Gamma_v(t)$ (see Methods 4.3 for the derivation). The VACF of the surrogate model with $\Gamma(|t|) = \Gamma_R(t) = \Gamma_v(|t|)$ follows from eq. (4) as

$$\tilde{C}^{\rm sur}_{vv}(\omega) = \frac{B\,\tilde{\Gamma}(\omega)}{\left|\tilde{\Gamma}^+(\omega) + i\omega\right|^2} \tag{5}$$

For each combination of $\Gamma_R(t)$ and $\Gamma_v(t)$ there is a unique $\Gamma(t)$ that produces the same VACF, i.e., for which $\tilde{C}^{\rm sur}_{vv}(\omega) = \tilde{C}_{vv}(\omega)$ holds (see SI Sec. II for the derivation) and which can be uniquely extracted from trajectories via the VACF (as shown in Methods 4.4). Since the VACF completely determines the dynamics of a Gaussian system, as shown in Methods 4.5 and in more detail in SI Sec. III, this implies that the extracted effective kernel $\Gamma(t)$ not only describes the VACF exactly but also characterizes the system completely [44]. In fact, recent work, where the non-equilibrium GLE is derived from a time-dependent Hamiltonian, shows that for Gaussian non-equilibrium observables the condition $\Gamma_R(t) = \Gamma_v(|t|)$ is actually satisfied [37], in line with our method that is based on extracting an effective kernel $\Gamma(t)$.
Trajectories and velocity distributions: Due to the asynchronous flagella motion of the wobblers, the cells turn in the flagella-beating rhythm and exhibit monotonically forward-moving wiggly trajectories, as shown in Fig. 2a.In contrast, the synchronous flagella beating of synchros leads to fast switching between forward and backward motion, as shown in Fig. 2b.As a consequence, synchros exhibit much slower net-forward motion than wobblers, as seen in Figs.2a,b, where trajectories with total duration 2.2 s (wobbler) and 8.2 s (synchro) are compared.Synchros also exhibit a somewhat narrower instantaneous velocity distribution, as seen in Figs.2c,d.However, a differentiation of the two cell types solely based on their speed does not work, as we will show later.
Even though individual cells exhibit pronounced variations in their velocity distributions, as seen from the large spread in Figs. 2c,d, their velocity distributions are Gaussian, as demonstrated in Fig. 2e: When subtracting from the cell velocities the mean velocity of each individual cell and dividing by the corresponding velocity standard deviation, $(v_{\rm ind} - \bar{v}_{\rm ind})/\sigma_{\rm ind}$, all individual velocity distributions collapse onto the Gaussian (normal) distribution (dashed line in Fig. 2e). In contrast, when subtracting from the cell velocities the cell-ensemble mean velocity and dividing by the cell-ensemble standard deviation, $(v_{\rm ind} - \bar{v})/\sigma$, the velocity distribution of all cells deviates strongly from a Gaussian (green circles in Fig. 2f), whereas the individually rescaled velocity distribution over all cells (black dots) perfectly agrees with the Gaussian distribution (dashed line in Fig. 2f). Thus, single cells exhibit perfectly Gaussian velocity distributions, which suggests that the GLE with Gaussian noise is appropriate to analyze experimental single-cell trajectories.
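The two rescalings compared above can be illustrated with a short Python sketch. It uses synthetic Gaussian velocities rather than the experimental data; the helper names and the chosen per-cell widths are assumptions made purely for illustration, and the excess kurtosis is used here only as a simple stand-in measure of non-Gaussianity.

```python
import numpy as np

def standardize_per_cell(velocities_by_cell):
    """Rescale each cell's velocities by that cell's own mean and standard deviation."""
    return [(v - v.mean()) / v.std() for v in map(np.asarray, velocities_by_cell)]

def standardize_ensemble(velocities_by_cell):
    """Rescale the pooled velocities by the ensemble mean and standard deviation."""
    pooled = np.concatenate([np.asarray(v) for v in velocities_by_cell])
    return (pooled - pooled.mean()) / pooled.std()

def excess_kurtosis(z):
    """Zero for a Gaussian; positive for heavier-than-Gaussian tails."""
    z = np.asarray(z, dtype=float)
    z = (z - z.mean()) / z.std()
    return float(np.mean(z**4) - 3.0)

# Synthetic stand-in data (assumption): every cell is Gaussian, but the
# velocity width differs strongly from cell to cell.
rng = np.random.default_rng(0)
widths = [5.0, 50.0, 150.0]
cells = [rng.normal(0.0, w, size=5000) for w in widths]

individual = np.concatenate(standardize_per_cell(cells))  # collapses onto a normal distribution
ensemble = standardize_ensemble(cells)                    # mixture of widths, heavy tails

print("excess kurtosis, per-cell rescaling:", excess_kurtosis(individual))
print("excess kurtosis, ensemble rescaling:", excess_kurtosis(ensemble))
```

In this toy setting the per-cell rescaled velocities are close to Gaussian, while the ensemble-rescaled velocities are strongly non-Gaussian simply because cells with different widths are mixed.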
The GLE in eq. (1) features time-independent parameters and thus describes a stationary process. That the cell velocity distribution does not change over the observational time is demonstrated in Fig. 2g, where velocity distributions in three consecutive time intervals are compared (again subtracting the individual cell mean velocities and dividing by the corresponding velocity standard deviations of the entire trajectory). This suggests that the motion of individual CR algae can indeed be modeled by the GLE in eq. (1). In this context, it is to be noted that wobblers are relatively fast and tend to move out of the camera window more quickly than synchros, leading to slightly shorter wobbler trajectories, as shown in Fig. 2h.
Trajectory analysis and friction kernel extraction: Trajectories are standardly characterized by the VACF or by the mean-squared displacement (MSD)

$$C_{\rm MSD}(t) = \left\langle \left( x(t) - x(0) \right)^2 \right\rangle \tag{6}$$

Although the VACF is simply the curvature of the MSD, $C_{vv}(t) = \frac{1}{2}\frac{d^2}{dt^2} C_{\rm MSD}(t)$, the MSD and the VACF highlight different aspects of the trajectories. In fact, the different propulsion modes of wobblers and synchros lead to drastically different MSDs: The wobblers exhibit ballistic behavior $C_{\rm MSD}(t) \propto t^2$ on both short and long time scales (Fig. 3a) with an intermediate crossover at $t \sim 0.04$ s. In contrast, the forward-backward motion of the synchros leads to short-time diffusive behavior $C_{\rm MSD}(t) \propto t$ up to $t \sim 0.01$ s followed by a long-time ballistic regime for $t > 0.04$ s (Fig. 3d). The transition to asymptotic diffusive behavior, expected for long times, is observed for the synchros for $t > 2$ s in extended low-resolution microscopy data, whereas wobblers stay in the ballistic regime for the entire observation time of tens of seconds (see SI Sec. IV). As wobblers move faster than synchros, their absolute VACF values are higher than those of the synchros, as seen in Figs. 3b,e. The synchros exhibit less variation of the VACF among individual cells, which leads to slowly decaying oscillations in the VACF averaged over all cells (black line in Fig. 3e compared to Fig. 3b). These oscillations reflect the flagella beating cycle.
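As a rough illustration of how the MSD and the VACF can be estimated from a single sampled trajectory, the following Python sketch uses time averages along the trajectory and finite-difference velocities at half steps. The function names and the ballistic test trajectory are assumptions for illustration and do not reproduce the authors' analysis code.

```python
import numpy as np

def msd(x, max_lag):
    """Time-averaged MSD <(x(t0 + t) - x(t0))^2> for lags 0..max_lag (in steps)."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    return np.array([np.mean((x[lag:] - x[:n - lag]) ** 2)
                     for lag in range(max_lag + 1)])

def vacf(x, dt, max_lag):
    """Time-averaged VACF from finite-difference velocities at half steps."""
    v = np.diff(np.asarray(x, dtype=float)) / dt
    n = len(v)
    return np.array([np.mean(v[:n - lag] * v[lag:])
                     for lag in range(max_lag + 1)])

# Ballistic test trajectory with constant speed 3 (arbitrary units).
dt = 0.002
t = np.arange(500) * dt
x = 3.0 * t

m = msd(x, 10)
c = vacf(x, dt, 10)

# The VACF is half the second time derivative of the MSD; on a grid this
# becomes a second difference of the MSD divided by 2 dt^2.
c_from_msd = (m[2:] - 2.0 * m[1:-1] + m[:-2]) / (2.0 * dt**2)
print(c[:3], c_from_msd[:3])
```

For the ballistic test case the MSD grows as the square of the lag time and both VACF estimates return the squared speed, which is a quick consistency check before applying the estimators to real trajectories.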
We extract effective friction kernels Γ(t) from the VACF of individual cells, as described in Methods 4.4.The results, shown as colored lines in Figs.3c,f, demonstrate that the algal motion deviates strongly from the simple persistent random walk model, which is widely used to describe the motion of organism [28,29,32] and which in the GLE formulation would correspond to Γ(t) exhibiting a delta peak at t = 0 and otherwise being zero.The extracted memory kernels also reveal a substantially higher friction for synchros compared to wobblers, in line with the fact that synchros are larger and thus interact more strongly with the confining surfaces.
Comparing Fig. 3b with Fig. 3c or Fig. 3e with Fig. 3f, one notes that the oscillation period of the friction kernel is substantially longer than that of the VACF.The complex relation between the kernel and the VACF is discussed in SI Sec.V, where it is shown that the extracted values of the kernel decay time and oscillation amplitude achieve high directionality and speed of CR cells.
For both wobblers and synchros, the friction kernels exhibit an initial sharp peak followed by a decaying oscillation and are well described by

$$\Gamma(t) = 2a\,\delta(t) + b\,e^{-t/\tau}\cos(\Omega t) \tag{7}$$

with δ-peak amplitude $a$, oscillation amplitude $b$, exponential decay time $\tau$ and oscillation frequency $\Omega$. This is demonstrated in Figs. 4c,f for one exemplary synchro and wobbler, where we compare the extracted memory kernels with fits according to eq. (7), see Methods 4.6 for details. From these fits, we thus obtain four memory kernel parameters for each cell. Before further analysis of the obtained individual cell parameters, we test whether the GLE eq. (1) actually describes the cell motion. We thus compare the experimental MSD of a single wobbler and synchro in Figs. 4a,d (orange dots) with the analytical prediction based on the GLE using the fit result for the friction kernel and the mean-squared velocity $B = C_{vv}(0)$ (black broken lines; the derivation of the analytical MSD expression is given in SI Sec. VI). The agreement is very good, which demonstrates that the memory extraction works well and that the GLE is an accurate model for the motion of these cells.
The MSD and VACF calculated from the GLE neglect the finite experimental recording time step of 0.002 s. They also neglect the localization noise of the cell position, which is due to the finite spatial resolution of the microscopy images and the projection of a three-dimensional object onto a two-dimensional point position [30] (see Methods 4.7 for details). The blue crosses in Figs. 4b,e represent fits of the GLE-based analytical expression for the VACF that includes localization noise and discretization effects, given in eq. (24), to the experimental data. The blue crosses in Figs. 4a,d show the corresponding MSD results with the same parameters according to eq. (22). The agreement between experimental data and the discretized model is perfect; the fitted localization noise strength, defined in eq. (22), is of the order of $\sigma_{\rm loc} \sim 0.02$ µm, similar to the pixel size, as expected (see discussion in SI Sec. VII). This means that temporal and spatial discretization effects in the experimental data can be straightforwardly incorporated in the GLE model.

Fig. 3 (caption fragment): Results for the MSD, $C_{\rm MSD}(t)$ defined in eq. (6), the VACF, $C_{vv}(t)$ defined in eq. (3), and the friction kernel $\Gamma(t)$, extracted according to eq. (15) […]

Fig. 4 (caption fragment): […] eq. (24); blue crosses in (a), (d) denote the corresponding prediction for the MSD, eq. (22), using the same parameters as in (b), (e); blue crosses are connected by blue straight lines.
Clustering of single-cell parameters: The GLE (1) in conjunction with the random force strength B defined in eq. ( 2) and the effective friction kernel eq. ( 7) has five parameters.This gives rise to 10 distinct two-dimensional projections, which are shown in Fig. 5.Each data point corresponds to a single cell.The parameters exhibit substantial spread among individual cells, but an unambiguous separation between wobbler and synchros, here colored in blue and red, is not obvious.As can already be seen in Figs.3c,f, the friction amplitudes a and b are larger for the synchros, while the mean squared velocity B is larger for wobblers, which leads to a separation of the two populations in Figs.5b,c.Each flagellar beating cycle leads to a net forward cell motion, which is reflected by the positive correlation between memory oscillation frequency Ω and mean-squared velocity B in Fig. 5g for each cell type.
We perform an unbiased cluster analysis using the X-means algorithm [46], which is a general version of the k-means algorithm [47] that self-consistently determines the optimal number of clusters (details are given in Methods 4.8).Applying our unbiased cluster analysis to the five dimensional parameter space, we obtain two distinct groups, which perfectly coincide with the assignment into wobblers and synchros from visual analysis of the flagellar motion in the video data [41].This means that we can classify the cells by just knowing their center of mass trajectories with an accuracy of 100 %.In comparison, a cluster analysis solely based on the mean-squared velocity B leads to an accuracy of only 69 %, while a cluster analysis using the first two PCA components leads to an accuracy of 90 % (see SI Sec.VIII).Using only the four friction kernel parameters for a cluster analysis without the mean squared velocity B still reaches an accuracy of 92 %.This clearly demonstrates that the GLE model, which parameterizes cell motion based on friction kernel parameters in eq. ( 7) together with the mean squared velocity B, allows for accurate classification of cells based only on their motion.

Discussion and Conclusions
We demonstrate that the rather complex motion of individual CR algae can be accurately described by the GLE (1) and extract all GLE model parameters for individual algae from their cell-center trajectories in a data-driven manner.Based on the extracted GLE parameters, we detect two distinct algae classes by an unbiased cluster analysis, this unsupervised clustering result is confirmed by comparison with a categorization based on visual inspection of the flagella beating patterns.Our method is applicable to any kind of cells and even higher organisms.Since the GLE (1) is not restricted to positional degrees of freedom, our approach can be applied to any observable, for instance cell extension or deformation.As the only input needed are trajectories, our method requires minimal interaction with the organisms and can be easily used as a stand-alone tool or to improve existing machine-learning algorithms for cell classification.
Moreover, our approach allows for a mechanistic interpretation of cell-motion characteristics.In fact, a friction kernel model that is very similar to eq. ( 7) and describes the data equally well can be derived from the equation of motion of two elastically coupled objects (see Methods 4.9 and SI Sec.IX for details).Without further experimental input our approach does not reveal what these objects are, so we can only speculate that the elastic coupling between the cell body and the flagella, which presumably involves the connection between the flagellar basal bodies by the distal striated fiber (schematically shown in Fig. 1c), causes the slowly decaying oscillations in the memory kernel.This seems in line with previous models for CR algae motion and flagella synchronization [48,49].Clearly, more experiments that resolve the relative motion of the cell center and the flagella are needed to resolve these issues.
Our extraction of GLE parameters from cell trajectories yields a single effective memory kernel and does not allow to detect the non-equilibrium character of cell motion, in agreement with recent general arguments [44].Conversely, based on an explicit non-equilibrium model for cell motion, it is rather straightforward to derive the GLE (1) and the functional form of the extracted kernel eq. ( 7), as shown in SI Sec.X.A similar GLE model can also be derived for motion in a confining potential to describe confined neurons [5] or bacteria moving in mucus [50] (see SI Sec.III for details).
As a final note, we mention that the angular orientation of CR algae can be accurately extracted from their cell-center trajectory, as shown in SI Sec.XI.Thus, the orientational cell dynamics is included in our GLE model.
In summary, our approach allows for cell classification by positional cell-center trajectories or any other kind of time series data and at the same time for interpretation of the motion pattern in terms of intracellular interactions.We anticipate numerous applications in biology and medicine that require the label-free distinction of individual cells and organisms.

Cell growth and sample preparation
Wild-type CR cell cultures (strain CC-1690) are grown in Tris-Acetate-Phosphate (TAP+P) medium under alternating light:dark (12:12 hr) cycles for three days. We collect the cell suspension in its actively growing phase (between the 3rd and 6th day of culture) 2-3 hours after the beginning of the light cycle and re-suspend it in fresh TAP+P medium. After 30 to 40 minutes of equilibration to recover from the mechanosensitive shock during re-suspension [51], we inject the cells into a rectangular quasi-2D microfluidic chamber of height 10 µm and area 18 mm × 6 mm. This chamber is assembled from a glass slide and a coverslip separated by a 10 µm double-sided tape (Nitto Denko corporation) as spacer.
The glass surfaces are pre-cleaned and coated with a polyacrylamide brush to suppress non-specific adhesion of cell body and flagella [52].The chamber height is determined as 10.88 ± 0.68 µm across different samples [41].

Recording of trajectories
The cells in the chamber are placed under red light illumination (> 610 nm) to prevent phototaxis [53] and flagellar adhesion [54] of CR [41]. We use high-speed video microscopy (Olympus IX83/IX73) at 500 frames per second with a 40X phase-contrast objective (Olympus, 0.65 NA, Plan N, PH2) connected to a complementary metal-oxide-semiconductor (CMOS) camera (Phantom Miro C110, Vision Research, pixel size = 5.6 µm) for imaging the mid-plane between the confining glass plates. This setup enables us to simultaneously image cell position and flagellar shape. To capture very long trajectories that probe the long-time diffusive behavior in SI Sec. IV, we use a 10X bright-field objective (Olympus, 0.25 NA, PlanC N) connected to a high-speed CMOS camera with larger pixel size (pco.1200hs, pixel size = 12 µm) at 50 frames per second. We determine cell trajectories by binarizing the image sequences with appropriate threshold parameters and tracking the cell centers using standard MATLAB routines [45].

Velocity autocorrelation function
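Fourier transformation of eq. (1) and eq. (2) leads to

$$\tilde{v}(\omega) = \frac{\tilde{F}_R(\omega)}{i\omega + \tilde{\Gamma}^+_v(\omega)} \tag{8}$$

$$\langle \tilde{F}_R(\omega)\,\tilde{F}_R(\omega') \rangle = 2\pi B\,\delta(\omega + \omega')\,\tilde{\Gamma}_R(\omega) \tag{9}$$

with the single-sided Fourier transform defined as

$$\tilde{\Gamma}^+_v(\omega) = \int_0^{\infty} dt\, e^{-i\omega t}\,\Gamma_v(t) \tag{10}$$

From the Fourier transform of the VACF we obtain by inserting eqs. (8) and (9)

$$\tilde{C}_{vv}(\omega) = \int_{-\infty}^{\infty} dt\, e^{-i\omega t}\,\langle v(0)\,v(t) \rangle = \frac{B\,\tilde{\Gamma}_R(\omega)}{\left|\tilde{\Gamma}^+_v(\omega) + i\omega\right|^2} \tag{11}$$

Equating the non-equilibrium and the surrogate VACF in eqs. (4) and (5) leads to

$$\frac{\tilde{\Gamma}(\omega)}{\left|\tilde{\Gamma}^+(\omega) + i\omega\right|^2} = \frac{\tilde{\Gamma}_R(\omega)}{\left|\tilde{\Gamma}^+_v(\omega) + i\omega\right|^2} \tag{12}$$

In SI Sec. II we show that for every correlation function $C_{vv}(t)$ we can determine a unique $\Gamma(t)$.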

Memory kernel extraction
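Multiplying the GLE eq. (1) by $\dot{x}(t_0)$, averaging over the random force and integrating from $t_0$ to $t$ leads to

$$C_{vv}(t) - C_{vv}(0) = -\int_0^t G(t-t')\,C_{vv}(t')\,dt' \tag{13}$$

where we used that $\langle \dot{x}(t_0) F_R(t) \rangle = 0$ [33-35], set $t_0 = 0$ and introduced the integral kernel

$$G(t) = \int_0^t \Gamma(t')\,dt' \tag{14}$$

In order to invert eq. (13), we discretize it. Since $C_{vv}(t)$ is even while $G(t)$ is odd, we discretize $G(t)$ on half steps and $C_{vv}(t)$ on full steps and obtain [30]

$$G_{i+1/2} = \frac{1}{C^{0}_{vv} + C^{1}_{vv}} \left[ \frac{2\left(C^{0}_{vv} - C^{i+1}_{vv}\right)}{\Delta} - \sum_{j=1}^{i} G_{i-j+1/2} \left(C^{j+1}_{vv} + C^{j}_{vv}\right) \right] \tag{15}$$

The kernel $\Gamma_i$ is obtained by the discrete derivative

$$\Gamma_i = \frac{G_{i+1/2} - G_{i-1/2}}{\Delta} \tag{16}$$

with the initial value $\Gamma_0 = 2G_{1/2}/\Delta$.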
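The inversion described above can be sketched in Python as follows. The sketch assumes a midpoint-type discretization of the convolution integral, with the integrated kernel G stored at half steps and the VACF at full steps; the function name `extract_kernel` and the exponential test VACF are assumptions for illustration, not the authors' implementation.

```python
import numpy as np

def extract_kernel(C, dt):
    """Invert C(t) - C(0) = -int_0^t G(t - t') C(t') dt' on a grid.

    C  : VACF sampled at t = 0, dt, 2 dt, ...
    dt : sampling time step
    Returns (gamma, G): gamma[i] approximates the kernel at t = i dt and
    G[k] approximates the integrated kernel at t = (k + 1/2) dt.
    """
    C = np.asarray(C, dtype=float)
    n = len(C) - 1
    G = np.zeros(n)
    for i in range(1, n + 1):
        # Contributions of the already known values G_{1/2}, ..., G_{i-3/2}.
        acc = 0.0
        for j in range(1, i):
            acc += G[i - j - 1] * (C[j + 1] + C[j])
        G[i - 1] = (2.0 * (C[0] - C[i]) / dt - acc) / (C[0] + C[1])
    gamma = np.empty(n)
    gamma[0] = 2.0 * G[0] / dt          # G is odd, so G_{-1/2} = -G_{1/2}
    gamma[1:] = (G[1:] - G[:-1]) / dt   # discrete derivative of G
    return gamma, G

# Test case: a purely Markovian kernel 2 a delta(t) gives an exponentially
# decaying VACF, C(t) = C(0) exp(-a t), and a constant G(t) = a for t > 0.
a, dt, n = 5.0, 0.002, 50
C = np.exp(-a * np.arange(n + 1) * dt)
gamma, G = extract_kernel(C, dt)
print(G[:3], gamma[:3])
```

In the Markovian test case the integrated kernel comes out constant and close to `a`, the first kernel value is close to `2 a / dt` (the discretized delta peak), and all later kernel values vanish up to rounding errors.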

Two-point velocity distribution
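A stationary Gaussian process is completely described by its two-point probability distribution, as shown in SI Sec. III. Here we show that the joint and conditional velocity distributions only depend on the VACF. For velocities with zero mean, the joint probability to observe $v_2$ at time $t_2$ and $v_1$ at time $t_1$ can be written in terms of the velocity vector $\mathbf{v} = (v_1, v_2)^T$ as

$$p(v_2, t_2; v_1, t_1) = \frac{\exp\left(-\frac{1}{2}\mathbf{v}^T \Sigma^{-1} \mathbf{v}\right)}{2\pi\sqrt{\det\Sigma}}, \qquad \Sigma = \begin{pmatrix} C_{vv}(0) & C_{vv}(t_2-t_1) \\ C_{vv}(t_2-t_1) & C_{vv}(0) \end{pmatrix} \tag{17}$$

Using the normally distributed velocities

$$p(v_1) = \frac{\exp\left(-v_1^2/[2C_{vv}(0)]\right)}{\sqrt{2\pi C_{vv}(0)}} \tag{18}$$

we obtain the conditional probability that $v(t_2) = v_2$ given that $v(t_1) = v_1$ as

$$p(v_2, t_2 | v_1, t_1) = \frac{p(v_2, t_2; v_1, t_1)}{p(v_1)} = \frac{1}{\sqrt{2\pi\sigma_c^2}} \exp\left(-\frac{\left[v_2 - v_1 C_{vv}(t_2-t_1)/C_{vv}(0)\right]^2}{2\sigma_c^2}\right), \qquad \sigma_c^2 = C_{vv}(0) - \frac{C_{vv}^2(t_2-t_1)}{C_{vv}(0)} \tag{19}$$

Thus, the joint and conditional distributions only depend on the VACF $C_{vv}(t)$.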
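To make the statement concrete, the following Python sketch evaluates the conditional mean and variance of a zero-mean stationary Gaussian velocity from two VACF values and checks them against samples drawn from the corresponding bivariate normal distribution. The closed-form expressions are the standard conditioning formulas for a bivariate Gaussian; the function name and the numerical values are assumptions for illustration.

```python
import numpy as np

def conditional_velocity_stats(v1, c0, ct):
    """Mean and variance of v(t1 + t) given v(t1) = v1.

    Assumes a zero-mean stationary Gaussian process with
    c0 = C_vv(0) and ct = C_vv(t).
    """
    mean = v1 * ct / c0
    var = c0 - ct**2 / c0
    return mean, var

# Illustrative VACF values (assumption, arbitrary units).
c0, ct = 4.0, 1.0
mean, var = conditional_velocity_stats(2.0, c0, ct)

# Monte Carlo check: sample (v1, v2) pairs with covariance given by the VACF.
rng = np.random.default_rng(1)
cov = np.array([[c0, ct], [ct, c0]])
samples = rng.multivariate_normal([0.0, 0.0], cov, size=200_000)
v1s, v2s = samples[:, 0], samples[:, 1]
slope = np.mean(v1s * v2s) / np.mean(v1s**2)  # estimates C_vv(t) / C_vv(0)
resid_var = np.var(v2s - slope * v1s)         # estimates the conditional variance
print(mean, var, slope, resid_var)
```

The conditional mean decays with the normalized VACF and the conditional variance grows from zero at coinciding times to C_vv(0) once the velocities are decorrelated, so no information beyond the VACF enters.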

Fitting of friction kernel
The extracted friction kernel of each individual cell is fitted to eq. (7) by least-squares minimization (using the curve_fit function of Python's SciPy) over the first 0.2 s of the data. The δ-peak in eq. (7) leads to the initial kernel value $\Gamma_0 = 2a/\Delta + b$, where $\Delta$ denotes the discretization time, see Sec. 4.4. Since single cells exhibit large variations of the friction kernels, we constrain $a$ and $b$ in eq. (7) to be between 0.1 % and 99.9 % of $\Gamma_0$. The decay time $\tau$ is constrained to be between 0.05 s and 3 s and the frequency $\Omega$ is constrained to be between 20 s$^{-1}$ and 250 s$^{-1}$.
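A bounded fit of this kind can be sketched with `scipy.optimize.curve_fit` as follows. The sketch makes several assumptions that are not taken from the text: the kernel is taken as a discretized delta peak plus a damped cosine, the amplitude bounds are read as bounds on the two contributions 2a/Δ and b to Γ_0, the starting values are simple heuristics, and the function names and synthetic data are hypothetical.

```python
import numpy as np
from scipy.optimize import curve_fit

def kernel_model(t, a, b, tau, omega, dt):
    """Discretized kernel: delta peak of weight 2a/dt at t = 0 plus a damped cosine."""
    osc = b * np.exp(-t / tau) * np.cos(omega * t)
    return np.where(t == 0, 2.0 * a / dt + osc, osc)

def fit_kernel(t, gamma, dt):
    """Bounded least-squares fit; returns (a, b, tau, omega)."""
    gamma0 = gamma[0]
    # Assumed reading of the amplitude constraints: 2a/dt and b each lie
    # between 0.1 % and 99.9 % of the initial kernel value gamma0.
    lower = [0.001 * gamma0 * dt / 2.0, 0.001 * gamma0, 0.05, 20.0]
    upper = [0.999 * gamma0 * dt / 2.0, 0.999 * gamma0, 3.0, 250.0]
    # Heuristic starting values (assumption): b from the first point after the
    # peak, omega from the first sign change of the oscillation.
    b0 = gamma[1]
    a0 = 0.5 * dt * (gamma0 - b0)
    negative = np.where(gamma[1:] < 0)[0]
    omega0 = np.pi / (2.0 * t[negative[0] + 1]) if len(negative) else 100.0
    p0 = np.clip([a0, b0, 0.5, omega0], lower, upper)
    popt, _ = curve_fit(
        lambda tt, a, b, tau, omega: kernel_model(tt, a, b, tau, omega, dt),
        t, gamma, p0=p0, bounds=(lower, upper))
    return popt

# Synthetic, noise-free kernel sampled every 2 ms over the first 0.2 s.
dt = 0.002
t = np.arange(101) * dt
true_params = (50.0, 2000.0, 0.3, 100.0)   # a, b, tau, omega (hypothetical values)
gamma = kernel_model(t, *true_params, dt)
fitted = fit_kernel(t, gamma, dt)
print(fitted)
```

On real, noisy kernels the result can depend on the starting values, in particular for the frequency, so it is worth checking a few initial guesses before trusting a fit.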

Discretized VACF including localization noise
In order to test whether the cell motion is actually described by the friction kernel eq. ( 7) and not originating from the experimental finite time step or noise, we fit the experimental VACF of individual cells with an analytical model that accounts for finite time discretization and noise [30].Here we explain the fitting procedure, the analytical expression for the MSD using a friction kernel in the form of eq. ( 7) is derived in SI Sec.VI.
We denote discrete values of a function $f(t)$ as $f(i\Delta) = f_i = f^i$ and the discretization time step as $\Delta$. After smoothing the data by averaging over consecutive positions to reduce the localization noise, as discussed in detail in SI Sec. XII, the velocities at half time steps follow as

$$v_{i+1/2} = \frac{x_{i+1} - x_i}{\Delta} \tag{20}$$

From the velocities the VACF defined by eq. (3) is calculated according to

$$C^{i}_{vv} = \frac{1}{N-i} \sum_{j=0}^{N-i-1} v_{j+1/2}\,v_{j+i+1/2} \tag{21}$$

with $N$ being the number of trajectory steps.
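In order to account for localization noise, we assume Gaussian uncorrelated noise of width $\sigma_{\rm loc}$ at every time step, which gives the noisy MSD as [30]

$$C^{\rm noise}_{\rm MSD}(t) = C^{\rm theo}_{\rm MSD}(t) + 2\left(1 - \delta_{t0}\right)\sigma^2_{\rm loc} \tag{22}$$

where $C^{\rm theo}_{\rm MSD}(t)$ is the theoretical expression for the model MSD given in SI Sec. VI, eq. (S51), and $\delta_{t0}$ is the Kronecker delta reflecting the uncorrelated nature of the localization noise. Since the observed trajectories are sampled with a finite time step $\Delta$, we discretize the relation

$$C_{vv}(t) = \frac{1}{2}\frac{d^2}{dt^2} C_{\rm MSD}(t) \tag{23}$$

which leads to

$$C^{i}_{vv} = \frac{C^{{\rm noise},\,i+1}_{\rm MSD} - 2\,C^{{\rm noise},\,i}_{\rm MSD} + C^{{\rm noise},\,i-1}_{\rm MSD}}{2\Delta^2} \tag{24}$$

Finally, fits are performed by minimizing the cost function, eq. (25), which measures the squared deviation between the experimental VACF and $C^{\rm fit}_{vv}(t)$, with SciPy's least_squares function in Python, using eq. (24) to determine $C^{\rm fit}_{vv}(t)$ at discrete time points. As the MSD and VACF follow from the GLE (1) and the friction kernel eq. (7), the parameters to optimize are the kernel parameters $a$, $b$, $\tau$, $\Omega$, the mean-squared velocity $B$ and the localization noise width $\sigma_{\rm loc}$. Resulting fits are shown in Figs. 4b,e, where the data is fitted up to 0.2 s in order to disregard the noisy part of the VACF (see SI Sec. VII for details).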
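The effect of localization noise and time discretization on the VACF can be sketched in Python as follows. The sketch assumes that uncorrelated position noise of width sigma_loc adds 2 sigma_loc^2 to the MSD at all nonzero lags and that the discrete VACF is the second difference of the MSD divided by 2 dt^2; it uses a ballistic model MSD as a stand-in for the analytical GLE expression, and the function names are hypothetical.

```python
import numpy as np

def noisy_msd(msd_theo, sigma_loc):
    """Add uncorrelated localization noise to a model MSD sampled at lags 0, dt, 2 dt, ..."""
    out = np.asarray(msd_theo, dtype=float).copy()
    out[1:] += 2.0 * sigma_loc**2   # no noise contribution at zero lag
    return out

def discretized_vacf(msd, dt):
    """VACF on the time grid as the second difference of the MSD over 2 dt^2.

    The MSD is even in time, so the value at lag -dt equals the value at +dt.
    Returns one value fewer than the input, since the last lag has no successor.
    """
    m = np.asarray(msd, dtype=float)
    full = np.concatenate(([m[1]], m))
    return (full[2:] - 2.0 * full[1:-1] + full[:-2]) / (2.0 * dt**2)

# Stand-in model (assumption): ballistic motion with speed v0.
dt, v0, sigma_loc = 0.002, 3.0, 0.02
lags = np.arange(20) * dt
msd_theo = (v0 * lags) ** 2

c_clean = discretized_vacf(msd_theo, dt)
c_noisy = discretized_vacf(noisy_msd(msd_theo, sigma_loc), dt)
print(c_noisy[:4] - c_clean[:4])
```

In this sketch the noise raises the VACF at zero lag by 2 sigma_loc^2 / dt^2, lowers it at the first lag by sigma_loc^2 / dt^2 and leaves all later lags unchanged, which is why a noise width can be fitted from the first few VACF points.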

Cluster analysis
The friction kernel in eq. (7) contains four parameters; together with the mean-squared velocity $B$, each individual cell is thus characterized by five parameters. We perform an X-means cluster analysis [46], which is a generalization of the k-means algorithm [47]. The k-means algorithm assigns unlabeled data to a predetermined number of k clusters by minimizing distances to the cluster centers. In the X-means algorithm the number of clusters is not predetermined; we allow cluster numbers from 2 to 20. The algorithm starts with the minimal number of clusters and finds the cluster centers using k-means. It then splits every cluster into two subclusters whose centers are again determined by k-means. New subclusters are accepted if they improve the clustering quality, accounting for the increased number of parameters. For this we use the minimal noiseless description length (MNDL) criterion [55,56]. We use an implementation of the X-means algorithm in Python [57] and use individual cell parameters as initial cluster centers [58]. The X-means algorithm can converge to different final results depending on the initial cluster centers. Thus, we use all 59 · 58/2 = 1711 possible combinations of two cells as initial cluster centers and use the result that occurs most often. Before clustering, we rescale each parameter by the median of its distribution.
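The rescaling and the scan over initial cluster centers can be illustrated with the following Python sketch. It deliberately swaps in plain k-means with a fixed number of two clusters for the X-means algorithm, so it does not determine the number of clusters and does not use the MNDL criterion; the function names and the synthetic five-parameter data are assumptions for illustration only.

```python
import numpy as np
from itertools import combinations
from collections import Counter

def rescale_by_median(params):
    """Divide every parameter (column) by the median of its distribution."""
    params = np.asarray(params, dtype=float)
    return params / np.median(params, axis=0)

def kmeans(data, centers, n_iter=100):
    """Plain k-means starting from the given initial centers; returns labels."""
    centers = np.array(centers, dtype=float)
    for _ in range(n_iter):
        dist = np.linalg.norm(data[:, None, :] - centers[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        new_centers = np.array([
            data[labels == k].mean(axis=0) if np.any(labels == k) else centers[k]
            for k in range(len(centers))])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels

def most_frequent_partition(data):
    """Use every pair of data points as initial centers; keep the most common result."""
    counts = Counter()
    for i, j in combinations(range(len(data)), 2):
        labels = kmeans(data, data[[i, j]])
        # Canonical labeling: the first data point always belongs to cluster 0.
        counts[tuple(int(l != labels[0]) for l in labels)] += 1
    return np.array(counts.most_common(1)[0][0])

# Synthetic stand-in for the five per-cell parameters (a, b, tau, Omega, B).
rng = np.random.default_rng(2)
group_a = rng.normal(1.0, 0.05, size=(10, 5))
group_b = rng.normal(3.0, 0.05, size=(12, 5))
data = rescale_by_median(np.vstack([group_a, group_b]))
labels = most_frequent_partition(data)
print(labels)
```

Scanning all pairs of cells as starting centers removes the dependence on a single random initialization, at the cost of N(N-1)/2 clustering runs, which is inexpensive for a few dozen cells.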

Markovian embedding
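A similar kernel to eq. (7), namely

$$\Gamma(t) = 2a\,\delta(t) + b\,e^{-t/\tau}\left[\cos(\Omega t) + \frac{1}{\tau\Omega}\sin(\Omega t)\right] \tag{26}$$

can be derived from a system of harmonically coupled particles. In fact, eq. (26) becomes equivalent to eq. (7) if the oscillation period $1/\Omega$ is much smaller than the decay time $\tau$, which is the case for the extracted algae kernels. The Hamiltonian describing the coupled particles takes the form

$$H = \frac{m}{2}\dot{x}^2 + \frac{m_y}{2}\dot{y}^2 + \frac{K}{2}(x-y)^2 \tag{27}$$

where $m_x \equiv m$ and $m_y$ are the masses of the two particles and $K = bm$ is the harmonic coupling strength. In the presence of friction, quantified by friction coefficients $\gamma_i$ for each component, and coupling the particles to a heat bath at temperature $T$, the coupled equations of motion are given by

$$m\,\ddot{x}(t) = -\gamma_x \dot{x}(t) - K\left[x(t) - y(t)\right] + F_{Rx}(t), \qquad m_y\,\ddot{y}(t) = -\gamma_y \dot{y}(t) + K\left[x(t) - y(t)\right] + F_{Ry}(t) \tag{28}$$

where $F_{Rx}(t)$ and $F_{Ry}(t)$ are random forces with zero mean and second moment $\langle F_{Ri}(0) F_{Rj}(t) \rangle = \delta_{ij}\, 2\gamma_i k_B T\, \delta(t)$. In SI Sec. XIII it is shown that the coupled equations of motion (28) are equivalent to a GLE in the form of eq. (1) for the coordinate $x(t)$ with a memory kernel $\Gamma(t)$ given by eq. (26) [59]. The friction of the first particle leads to the δ-contribution of $\Gamma(t)$ and the harmonic coupling to the second particle leads to the oscillating exponentially decaying contribution. The parameters of eq. (28) translate into the parameters of eq. (26) as

$$a = \frac{\gamma_x}{m}, \qquad b = \frac{K}{m}, \qquad \tau = \frac{2 m_y}{\gamma_y}, \qquad \Omega = \sqrt{\frac{K}{m_y} - \left(\frac{\gamma_y}{2 m_y}\right)^2} \tag{29}$$

Supplementary information: Data-driven classification of single cells by their non-Markovian motion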

II Effective kernel follows uniquely from correlation functions
We can write the GLE with an arbitrary potential $U(x(t))$ as […] the unique solution for $\Gamma(t)$ in Laplace space is given by

$$\hat{\Gamma}(q) = -\frac{1}{\hat{C}_{vv}(q)} \left[ q\,\hat{C}_{vv}(q) - C_{vv}(0) + \hat{C}_{v\nabla U}(q) \right] \tag{S4}$$

The existence of a unique friction kernel for any given input correlation functions $C_{vv}(t)$ and $C_{v\nabla U}(t)$ assures that one can always find an effective friction kernel $\Gamma(t)$ that, when employed in the GLE and using $\Gamma(|t|) = \Gamma_R(t) = \Gamma_v(|t|)$, reproduces the two-point correlation functions. Thus, every non-equilibrium […]

[…] we arrive at the Gaussian form […] and […] (S19). For simplicity we set $t' = 0$, as the position correlation function only depends on the time difference $t - t'$. Inserting eqs. (S18) and (S19) into eq. (S10), we finally obtain the position Green's function as a Gaussian with mean and standard deviation […] Thus the Green's function is entirely described in terms of the two-point […]

We next calculate the effective harmonic potential strength $K_{\rm eff}$. The condition to find an effective description of the GLE eq. (S1) arises from setting eq. (S17) equal to its effective version, which assures that the effective model has the same Green's function as the non-equilibrium model, as shown by eqs. (S20) and (S21). Here, $\tilde{\chi}_{\rm eff}(\omega)$ is the effective response function and $\Gamma(t)$ is the effective kernel. Inserting eq. (S9) into eq. (S22) leads to […]

[…] extended MSDs for synchros exhibit a transition from the ballistic to the long-time diffusive regime at a time of about $t \approx 2$ s, whereas for the wobblers the ballistic regime extends over more than ten seconds. The prefactor of the MSD in the diffusive regime, i.e., the diffusivity $D$, is shown for the synchros in Fig. S1.

V Relation between the oscillation periods of the friction kernel and the VACF
In Fig. 4 in the main text one can note a subtle difference between the oscillation period of the friction kernel and that of the VACF when comparing Fig. 4b to 4c or Fig. 4e to 4f. Since we derive an analytical expression for the VACF in eq. (S52) for the friction kernel given in eq. (7), we can evaluate the dependence of the VACF oscillation frequency, denoted by $\omega_{vv}$, on the kernel frequency $\Omega$. From eq. (S52) we know that there are three complex frequencies in the VACF that can lead to oscillations, where the frequency $\omega_i$ with the smallest real part dictates $\omega_{vv}$. In Fig. S2a we show the dependence of $\omega_{vv}$ on $\Omega$ for different $b$. We find $\omega_{vv}$ to be constant for small kernel frequencies $\Omega$, followed by a transition to $\omega_{vv} = \Omega$ at a threshold value $\Omega_t = \omega_{vv}(\Omega \to 0)$, defined as the point where the constant small-$\Omega$ value $\omega_{vv}(\Omega \to 0)$ intersects the line $\omega_{vv} = \Omega$. This threshold value depends on the kernel parameters $a$, $b$, $\tau$. It can be described by eq. (S27), which follows from Figs. S2a-c. Since the median kernel frequency of the CR cells is of the order $\Omega \sim 100\,\tau^{-1}$ and the oscillation amplitude is of the order $b \sim 10^4\,\tau^{-2}$, the data lie in the transition regime between the constant VACF frequency and the linear regime $\omega_{vv} = \Omega$, as can be seen in Fig. S2a. From Fig. S2a we know that the VACF frequency is always greater than or equal to the kernel frequency, i.e. $\omega_{vv} \geq \Omega$. Moreover, the oscillation amplitudes of CR usually fulfill $b > a^2/4$, and the inverse decay time $\tau^{-1}$ is small compared to the other kernel parameters, such that $a > \tau^{-1}$ and $b > \tau^{-2}/4$ are always fulfilled. Therefore, the cells exhibit parameters for which $b > \max(a^2/4, \tau^{-2}/4)$ holds. Eq. (S27) thus tells us that the threshold frequency is given by $\Omega_t = \sqrt{b}$. Combining this with $\omega_{vv} \geq \Omega$ leads to $\omega_{vv} \geq \sqrt{b}$. Since the CR data lie in the transition regime between $\omega_{vv} = \sqrt{b}$ and $\omega_{vv} = \Omega$, the VACF frequency $\omega_{vv}$ is strongly influenced by the oscillation amplitude $b$. To further estimate the exhibited parameter range, the reader is referred to Fig. 5 of the main text, which includes all parameters of all cells.
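The regime argument above can be checked with a few lines of arithmetic. The sketch below is only illustrative: eq. (S27) is not reproduced in this section, so the function `threshold_frequency` (a hypothetical name) assumes that the threshold is the largest of the three plateau values $a/2$, $\sqrt{b}$ and $\tau^{-1}/2$ quoted in the caption of Fig. S2. The numerical values are the orders of magnitude stated in the text, in units where $\tau = 1$.

```python
import math

def threshold_frequency(a, b, tau):
    """Threshold kernel frequency under an ASSUMED reading of eq. (S27)."""
    # Assumption: Omega_t is the largest of the three plateau values
    # a/2, sqrt(b) and 1/(2*tau) quoted in the caption of Fig. S2.
    return max(a / 2.0, math.sqrt(b), 1.0 / (2.0 * tau))

# Order-of-magnitude CR values from the text, in units where tau = 1.
tau = 1.0
a = 100.0     # delta-peak amplitude used in Fig. S2a
b = 1.0e4     # oscillation amplitude, b ~ 10^4 tau^-2
omega_t = threshold_frequency(a, b, tau)
print(omega_t)  # sqrt(b) dominates because b > max(a^2/4, tau^-2/4)
```

Under this assumption the threshold coincides with the typical kernel frequency $\Omega \sim 100\,\tau^{-1}$, which is consistent with the statement that the CR data lie in the transition regime.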
The fact that the VACF frequency $\omega_{vv}$ increases with increasing kernel oscillation amplitude $b$ can be interpreted in the context of the Markovian embedding model in Sec. XIII, where the coupling strength of the harmonically coupled parts is proportional to the oscillation amplitude $b$, as shown in eqs. (S75) and (S77), which leads to faster oscillations for higher $b$. Since every oscillation leads to a net forward motion of the cell, cells move forward quickly if they exhibit a large kernel oscillation amplitude $b$ compared to the squared inverse decay time $\tau^{-2}$ and the squared $\delta$-amplitude $a^2$, which is the case in our data set (see Fig. 5). By realizing that the decay time $\tau$ is related to how long a cell 'remembers' its past trajectory, one can conclude that a fast decay of the oscillation would lead to a cell not being able to maintain its direction for a long time. Thus, it is useful to achieve fast net forward

For the decaying oscillation model friction kernel of eq. (7) and using the effective form $\Gamma(|t|) = \Gamma_R(t) = \Gamma_v(|t|)$, one obtains the half-sided Fourier transform of the kernel as given in eq. (S36). Inserting the result of eq. (S36) into eq. (S29), we find the response function, which by inserting into eq. (S35) leads to eq. (S38), where the constants $c_i$ and $k_i$ are given by eqs. (S39)-(S44). Interpreting the integrand in eq. (S38) as a sum of three terms, each proportional to one $k_i$, we see that all terms have the same poles, where the term proportional to $k_1$ has an additional double pole at $\omega = 0$. The solutions for $\omega^2$ of eq. (S45) define the remaining poles, which we denote by $\pm\sqrt{\omega_i^2}$ and which we compute by numerically solving eq. (S45). Next we use partial fraction decompositions to rewrite the fraction of eq. (S38) as a sum of terms proportional to $(\omega^2 - \omega_i^2)^{-1}$ and one term proportional to $\omega^{-2}$. Using the solution of the integrals for $t > 0$ with the condition $\mathrm{Re}(\omega_i^2) < 0 \lor \mathrm{Im}(\omega_i^2) \neq 0 \land \mathrm{Im}(\sqrt{\omega_i^2}) \neq 0$, we can rewrite the integral of the MSD eq. (S38) as eq. (S51). The VACF can be computed from the MSD by using eq. (23), as given in eq. (S52).

The calculation of the MSD for the friction kernel of eq. (26) proceeds in the same way as shown for the kernel eq. (7). The response function then takes a slightly different form. The integral eq. (S35) resulting in the MSD can be written in the same form as before, shown in eq. (S38), where the constants $c_1$, $c_2$, $k_1$, $k_2$ take the different values given in eqs. (S54)-(S57). Thus, the MSD of the GLE eq. (1) with the friction kernel eq. (26) can be described by the same form eq. (S51) as for the friction kernel of eq. (7), where only the constants $c_i$ and $k_i$ have slightly different values, as seen by comparing eqs. (S39)-(S44) to eqs. (S54)-(S57).

VII Localization noise fit
The fitting procedure including finite time step and localization noise described in Sec. 4.7 of the main text sometimes does not converge to results that agree well with the VACF of the data. In Fig. S3 we show two examples for which the fit including the localization noise (Methods eq. (24)) does not agree well with the VACF data. For 14 out of the 59 cells, the fits of the VACF using eq. (25) do not converge to a stable set of parameters. Still, the direct fit of the friction kernel model eq. (7) agrees perfectly with the extracted friction kernels, which is why we use this direct fit to extract the parameters that are later used in the cluster analysis.
A typical size of the localization noise width resulting from the fits is $\sigma_{\mathrm{loc}} \approx 0.02\,\mu$m. Considering the resolution of the microscope of roughly $0.5\,\mu$m and approximating the area of a cell as roughly $25\pi\,\mu\mathrm{m}^2$, one can estimate the number of pixels per cell as $100\pi$. Thus, the error of the mean position of a cell evaluated over all pixels is estimated as $0.5\,\mu\mathrm{m}/\sqrt{100\pi} \approx 0.03\,\mu$m, which is very close to the fitted localization noise width of $\sigma_{\mathrm{loc}} \approx 0.02\,\mu$m. For a typical velocity around $v = 100\,\mu$m/s, the displacement during one time step $\Delta = 0.002$ s is $0.2\,\mu$m. Therefore, the localization noise accounts for $\sim 10\,\%$ of the cell displacement. This relatively small localization noise explains why the direct fit of the model eq. (7) to the extracted friction kernels works so well and why the inclusion of localization noise effects is not necessary for our data set.
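The estimate above is simple arithmetic and can be reproduced as follows. This is a minimal sketch using only the approximate numbers quoted in the text; the variable names are illustrative.

```python
import math

resolution_um = 0.5                # approximate microscope resolution
cell_area_um2 = 25.0 * math.pi     # approximate cell area
sigma_fit_um = 0.02                # typical fitted localization noise width
velocity_um_s = 100.0              # typical cell velocity
dt_s = 0.002                       # time step

# Number of resolution elements ("pixels") covering one cell.
n_pixels = cell_area_um2 / resolution_um**2

# Standard error of the mean position over all pixels of the cell.
sigma_est_um = resolution_um / math.sqrt(n_pixels)

# Displacement per time step and the relative size of the fitted noise.
step_um = velocity_um_s * dt_s
noise_fraction = sigma_fit_um / step_um

print(n_pixels, sigma_est_um, step_um, noise_fraction)
```

The estimated width comes out slightly below $0.03\,\mu$m, and the fitted noise corresponds to about one tenth of the displacement per time step.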

VIII Cluster analysis in lower dimensions
Since the wobblers are found to exhibit higher average speeds than the synchros, an intuitive and simple approach would be to distinguish the cells solely by their mean squared velocity $B$. Applying the cluster analysis described in Sec. 4.8 of the Methods in one dimension for $B$ results in an assignment that agrees with the full classification result for only 69 % of the cells, see Fig. S4a. In particular, slow wobblers are wrongly assigned as synchros, as indicated in Fig. S4a by the black crosses.
As discussed in the main text, high-dimensional data sets are often analyzed with principal component analysis (PCA), which determines the directions explaining most of the data variance. Applying a PCA to the five-dimensional parameter space of the extracted friction kernel parameters eq. (7) and $B$ shown in Fig. 5, we find the first two components of the PCA to explain 83 % of the total parameter variance. Here, we use the PCA tool implemented in sklearn for Python. The normalized vectors of the two PCA components shown, in the order $(a, b, \tau, \Omega, B)$, are given by $(-0.49, -0.29, 0.16, -0.01, 0.81)$ and $(0.63, 0.41, -0.5, 0.05, 0.43)$, respectively. This indicates that no parameter alone can describe the complete variance and therefore no parameter alone can explain all the differences between synchros and wobblers. The parameters most important for the variance and discriminability of the single CR cells are $a$, $b$ and $B$. However, the distinction into the two groups of wobblers and synchros by our cluster analysis (Sec. 4.8 in the Methods) works best using the complete five-dimensional set of parameters. Applying the cluster analysis to the first two PCA components results in an accuracy of 90 %, as shown in Fig. S4b. Six wobblers are assigned to the cluster of synchros (defined by the classification of [2] that agrees with our cluster analysis of the complete parameter set, Fig. 5) as they lie in the transition area between the two clusters; they are indicated by the black crosses in Fig. S4b.
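To illustrate what the PCA step computes, the following sketch projects a five-column parameter table onto its first two principal components. It is not the analysis pipeline used for the data: the table is synthetic random data standing in for the $(a, b, \tau, \Omega, B)$ values of 59 cells, the group offset is hypothetical, the standardization of each column is an assumption, and the decomposition is done with a plain NumPy SVD rather than with the sklearn tool mentioned in the text.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for 59 cells x 5 parameters (a, b, tau, Omega, B).
params = rng.normal(size=(59, 5))
# Hypothetical offset that separates one group of cells from the other.
params[:30] += np.array([2.0, 1.0, -1.0, 0.0, 3.0])

# Assumption: standardize each parameter before the decomposition.
scaled = (params - params.mean(axis=0)) / params.std(axis=0)

# PCA via singular value decomposition of the standardized data.
_, singular_values, vt = np.linalg.svd(scaled, full_matrices=False)
explained_ratio = singular_values**2 / np.sum(singular_values**2)

components = vt[:2]                  # unit vectors of the first two components
projected = scaled @ components.T    # 2D coordinates of every cell

print(explained_ratio[:2])
print(components)
```

The rows of `components` play the role of the normalized component vectors quoted above, and `explained_ratio` corresponds to the fractions of explained variance. The sign of each component vector is arbitrary.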
IX Comparing friction kernel expressions eq. (7) and eq. (26)
The friction kernel of eq. (7) in the main text, used to extract parameters from the data, is very similar to the friction kernel eq. (26), which is derived in Sec. XIII from a system of harmonically coupled particles. The term $1/(\tau\Omega)$ in front of the sine in eq. (26) becomes small when the decay time $\tau$ is longer than the oscillation period $1/\Omega$. For the CR data in Fig. 3 in the main text and Fig. S5, several oscillations occur before the friction kernel has decayed. Thus, the term $1/(\tau\Omega)$ is indeed small and the two friction kernel models have very similar shapes, as we show in Fig. S5. Here, we insert the parameters extracted with the friction kernel eq. (7) for two example cells into eq. (26); the deviations are seen to be very small.
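The size of the deviation between the two kernel shapes can be estimated numerically. The sketch below is hedged in two ways: eqs. (7) and (26) are not reproduced in this section, so the functional forms are assumed to be $b\,e^{-t/\tau}\cos(\Omega t)$ and $b\,e^{-t/\tau}[\cos(\Omega t) + \sin(\Omega t)/(\tau\Omega)]$ for $t > 0$ (the $\delta$-contribution is left out), and the parameter values are only the orders of magnitude quoted in Sec. V, in units where $\tau = 1$.

```python
import numpy as np

def kernel_eq7(t, b, tau, omega):
    """ASSUMED oscillating part of eq. (7) for t > 0."""
    return b * np.exp(-t / tau) * np.cos(omega * t)

def kernel_eq26(t, b, tau, omega):
    """ASSUMED oscillating part of eq. (26) for t > 0, with the extra sine term."""
    return b * np.exp(-t / tau) * (
        np.cos(omega * t) + np.sin(omega * t) / (tau * omega)
    )

# Order-of-magnitude parameters from Sec. V, in units where tau = 1.
tau, omega, b = 1.0, 100.0, 1.0e4
t = np.linspace(0.0, 5.0 * tau, 5001)

# Largest deviation between the two kernels, relative to the amplitude b.
difference = kernel_eq26(t, b, tau, omega) - kernel_eq7(t, b, tau, omega)
max_rel_diff = np.max(np.abs(difference)) / b
print(max_rel_diff)
```

Under these assumptions the relative deviation is bounded by $1/(\tau\Omega)$, i.e. about one percent of the oscillation amplitude for the quoted parameter values.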

X Exemplary non-equilibrium model describing CR motion
The effective friction kernel $\Gamma(t)$ given by eq. (26) leads to the Fourier transform of eq. (S58). Inserting eq. (S58) into eq. (11) yields the Fourier-transformed VACF $\tilde{C}_{vv}(\omega)$, eq. (S59). In order to find a non-equilibrium model that leads to the same VACF as the kernel in eq. (26), we assume a specific form of the velocity friction kernel, eq. (S60), which leads to the Fourier transform $\tilde{\Gamma}_v^+(\omega) = a_v$. We insert this together with the result from eq. (S59) into eq. (12) and solve for $\tilde{\Gamma}_R(\omega)$. By Fourier back transformation the result can be written in the form of eq. (S61) with constants $c_i$ and $k_i$. We solve this Fourier transform by performing the same steps as for the calculation of the integral (S38) derived in Sec. VI, where $\omega_i^2$ are the roots of the denominator of eq. (S61) in $\omega^2$, which we obtain numerically. The friction kernel in the time domain is then retrieved as eq. (S63). This exemplary mapping shows that a simple Markovian friction kernel given by eq. (S60), combined with a random force correlation function that contains additional oscillating components, given by eq. (S63), leads to the same correlation function as described by the GLE with the effective friction kernel given by eq. (26). We have thus derived one possible non-equilibrium model that describes CR cell motion. Furthermore, this non-equilibrium model with $\Gamma_R(t)$ given by eq. (S63) and $\Gamma_v(t)$ given by eq. (S60) corresponds to the coupled system of differential equations (S64)-(S66) with $\langle \xi_x(0)\xi_x(t) \rangle = 2aB\delta(t)$ and $\langle \xi_i^R(0)\xi_j^R(t) \rangle = 2a_i^R B\,\delta_{ij}\,\delta(t)$. Solving eq. (S66) for $F_i(t)$ yields $\langle F_i(0)F_i(t) \rangle = B\,\frac{a_i^R}{\tau_i}\,e^{-t/\tau_i}$. Now defining the random force as $F_R(t) = \xi_x(t) +$

and $\Gamma_v(t) = 2a_v\delta(t)$. Thus, the system of eqs. (S64)-(S66) is equivalent to the GLE eq. (1) with $\Gamma_R(t)$ given by eq. (S63) and $\Gamma_v(t)$ given by eq. (S60), when the parameters are given by eq. (S70).

with $n$ being the number of smoothing iterations. We show the effect of the smoothing level on the VACF and the localization noise fit in Fig. S7 and the effect on the friction kernel in Fig. S8. For high smoothing iterations, the cell speed is underestimated, which is reflected by a decreasing mean squared velocity $B$ for increasing $n$ in Fig. S7. At the same time, the fitted localization noise strength $\sigma_{\mathrm{loc}}$ decreases with increasing $n$. This is reflected by the smaller difference between the first two data points of the VACF. The estimate of the localization noise strength for high smoothing iterations $n$ is no longer reliable because it becomes very small, which explains why it does not decrease monotonically with growing $n$, see Fig. S7. Since the first smoothing iteration $n = 1$ simply adds two consecutive points according to eq. (S73), one in principle expects the localization noise width $\sigma_{\mathrm{loc}}$ to be halved compared to the non-smoothed data. This is indeed seen for the synchro shown in Fig. S7.
Since the smoothing of the data decreases the amplitude of the VACF, as seen in Fig. S7, it consequently decreases the amplitudes of the friction kernel parameters $a$ and $b$. At the same time it reduces the kernel frequency $\Omega$, as $\Omega$ exhibits a complex relation with the frequency of the VACF, which is explained in detail in Sec. V. Moreover, we find the dip in the friction kernel at the first time step to decrease for higher smoothing iterations, as shown in Fig. S8, which indicates that the dip in the friction kernel originates from the localization noise. This is explained by eq. (15), since the first point of the VACF is overestimated and the second point of the VACF is underestimated due to the localization noise, which then propagates into the friction kernel and leads to a dip at short times.

Figure S8 The friction kernel $\Gamma(t)$ extracted according to eq. (15) from data at different smoothing iterations described by eq. (S73) is given for a wobbler on the left and for a synchro on the right. The solid lines connecting the data points are guides to the eye.
The VACF of the smoothed data is less influenced by the localization noise. Nevertheless, with every smoothing iteration, the deviation of the smoothed VACF from the non-smoothed VACF becomes larger, as shown in Fig. S7. Thus, we choose $n = 1$ smoothing iteration for our data as a compromise between minimizing the effect of noise and keeping a good resolution that accurately describes the actual cell velocities.
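The effect of localization noise on the first two VACF points, and its reduction by one smoothing iteration, can be illustrated with synthetic data. The sketch below rests on stated assumptions: the "true" trajectory is a hypothetical smooth oscillation, the localization noise is independent Gaussian noise of width $\sigma = 0.02\,\mu$m on every position, and, because eq. (S73) is not reproduced in this section, one smoothing iteration is taken to be the average of two consecutive positions. For such noise, finite-difference velocities with time step $\Delta$ shift the zero-lag VACF by about $+2\sigma^2/\Delta^2$ and the first-lag VACF by about $-\sigma^2/\Delta^2$.

```python
import numpy as np

def vacf(v, max_lag):
    """Velocity autocorrelation <v(0) v(k*dt)> for lags k = 0 .. max_lag-1."""
    return np.array([np.mean(v[: len(v) - k] * v[k:]) for k in range(max_lag)])

def smooth(x, n):
    """n iterations of averaging neighbouring points (ASSUMED form of eq. (S73))."""
    for _ in range(n):
        x = 0.5 * (x[:-1] + x[1:])
    return x

rng = np.random.default_rng(1)
dt, sigma, n_steps = 0.002, 0.02, 200_000
t = np.arange(n_steps) * dt

# Hypothetical smooth trajectory with a velocity amplitude of 100 um/s.
x_true = 100.0 / (2.0 * np.pi * 5.0) * np.sin(2.0 * np.pi * 5.0 * t)
x_obs = x_true + rng.normal(scale=sigma, size=n_steps)

# Noise-induced shift of the first two VACF points without smoothing.
shift_raw = vacf(np.diff(x_obs) / dt, 2) - vacf(np.diff(x_true) / dt, 2)

# The same comparison after one smoothing iteration of both trajectories.
shift_smooth = (vacf(np.diff(smooth(x_obs, 1)) / dt, 2)
                - vacf(np.diff(smooth(x_true, 1)) / dt, 2))

print(shift_raw, 2 * sigma**2 / dt**2, -sigma**2 / dt**2)
print(shift_smooth)
```

Under the two-point-average assumption, the zero-lag noise contribution drops by roughly a factor of four after one iteration, which corresponds to a halved effective noise width and is in line with the expectation stated in the text.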
The friction kernel of eq. (26) in the Methods is equivalent to the friction kernel of eq. (S89) with the parameters given by $a = \gamma_x/m$, $\tau = 2m_y/\gamma_y$, $\Omega = i\omega_0$.

Fig. 1 Sequences of phase-contrast microscopy images of Chlamydomonas reinhardtii (CR) algae exhibiting (a) synchro and (b) wobbler-type flagellar motion. The white halo around the cells is typical for phase-contrast microscopy [42]. (c) Sketch of a CR cell: the distal striated fiber (DSF) connects the two basal bodies [39], which anchor the flagella and are connected to the nucleus by nuclear basal-body connectors (NBBC) [38,40].

Fig. 2 Exemplary cell-center trajectory [45] (a) of a wobbler of duration 2.2 s and (b) of a synchro of duration 8.2 s. The insets show trajectory fragments of duration 0.2 s each. Velocity distributions of (c) wobblers and (d) synchros; individual cells are distinguished by color. (e) Velocity distributions of individual cells rescaled by subtracting the mean velocity of individual cells $v_{\mathrm{ind}}$ and dividing by their standard deviation $\sigma_{\mathrm{ind}}$ for wobblers (cyan crosses) and synchros (red stars). The dashed line is the normal distribution. (f) Mean velocity distribution averaged over all cells (wobblers and synchros). For the green circles the cell velocities are rescaled by subtracting the ensemble mean velocity $v$ and dividing by the ensemble standard deviation $\sigma$; for the black circles the cell velocities are rescaled by subtracting the mean velocity of individual cells $v_{\mathrm{ind}}$ and dividing by their standard deviation $\sigma_{\mathrm{ind}}$ as in (e). The normal distribution is indicated by a dashed line. (g) Individually rescaled velocity distributions of all cells for three different time windows. (h) Distribution of recorded trajectory lengths for wobblers and synchros.
Fig. 3 Results for the MSD, $C_{\mathrm{MSD}}(t)$ defined in eq. (6), the VACF, $C_{vv}(t)$ defined in eq. (3), and the friction kernel $\Gamma(t)$, extracted according to eq. (15), for wobblers in (a)-(c) and synchros in (d)-(f). Different colors represent results for individual cells; the black lines in (a), (b), (d), (e) denote the average over all cells. For the friction kernels the black line is computed from the average VACF. The dashed lines in (a), (d) indicate ballistic and diffusive scaling. Insets show the long-time behavior of the average quantities on a lin-log scale.

Fig. 4 Results for the MSD, $C_{\mathrm{MSD}}(t)$, VACF, $C_{vv}(t)$, and friction kernel $\Gamma(t)$ of a single (a)-(c) wobbler and (d)-(f) synchro (orange dots). The black dashed lines in (c), (f) denote fits of eq. (7) to the extracted friction kernel, determining the individual cell parameters shown in Fig. 5. The black dashed lines in (a), (d) denote the analytical result eq. (S51) obtained from the friction-kernel fit in (c), (f) and the mean-squared velocity $B = C_{vv}(0)$. Blue crosses in (b), (e) denote a fit of the discretized expression for the VACF including localization noise, eq. (24); blue crosses in (a), (d) denote the corresponding prediction for the MSD, eq. (22), using the same parameters as in (b), (e). Blue crosses are connected by blue straight lines.

Fig. 5 Scatter correlation plots of individual CR cell parameters, consisting of the friction kernel parameters $(a, b, \tau, \Omega)$ defined in eq. (7) and the mean-squared velocity $B$. All parameters except $B$ are presented on a linear scale. Synchros are shown in red and wobblers in cyan according to our cluster analysis, which perfectly matches a categorization based on the visual analysis of flagella motion.
II Effective kernel follows uniquely from correlation functions
III Green's function is given in terms of positional two-point correlation function
IV Long-time MSDs
V Relation between the oscillation periods of the friction kernel and the VACF
VI Derivation of the analytical expression for the MSD
VII Localization noise fit
VIII Cluster analysis in lower dimensions
IX Comparing friction kernel expressions eq. (7) and eq. (26)
X Exemplary non-equilibrium model describing CR motion
XI Information on the cell orientation is contained in the cell-center trajectories
XII Effects of smoothing trajectory data
XIII Markovian embedding of the friction kernel eq. (26)

arXiv:2311.16753v1 [physics.bio-ph] 28 Nov 2023

I Video captions
Video 1: A representative synchro. High-speed video microscopy at 500 frames per second obtained by phase-contrast imaging of a synchro CR cell showing the planar and synchronous breaststroke motion of the flagella. Between $t \sim 330$ ms and $390$ ms the synchronous beat of the flagella exhibits a phase slip, meaning the synchronicity of the flagella is disturbed in that short time interval.
Video 2: A representative wobbler. High-speed video microscopy at 500 frames per second obtained by phase-contrast imaging of a CR cell which paddles the flagella in an asynchronous and irregular manner, resulting in the wobbling motion of the cell body.

Figure S1 MSDs from low-resolution data with 10x magnification and a time step of $\Delta = 0.02$ s of (a) wobblers and (b) synchros according to an approximate classification scheme explained in the text. The average of the colored single-cell MSDs is shown in black; the grey line represents the average MSD from Fig. 3 with higher temporal and spatial resolution of 40x and $\Delta = 0.002$ s.

Figure S2 (a) The dependence of the VACF frequency $\omega_{vv}$ on the kernel frequency $\Omega$ of the kernel eq. (7) is shown for different kernel oscillation amplitudes $b$ and fixed $\delta$-peak amplitude $a = 100\,\tau^{-1}$. The black line indicates the linear behavior $\omega_{vv} = \Omega$. (b) The dependence of the VACF frequency $\omega_{vv}$ on the kernel oscillation amplitude $b$ is shown for varying kernel $\delta$-peak amplitude $a$ and $\Omega = \tau^{-1}$, which represents the low kernel frequency plateau value shown in (a) for small $\Omega$. The horizontal lines represent $\omega_{vv} = a/2$ for the respective values of $a$, the diagonal line represents $\omega_{vv} = \sqrt{b}$, and the vertical lines represent the intersection of the other lines, which mark a minimum in the parameter $b$ positioned at $b = a^2/4$. (c) The dependence of the VACF frequency $\omega_{vv}$ on the kernel decay time $\tau$ is shown for different $\delta$-peak amplitudes $a$ and fixed $b = 10^5\,\Omega^2$. The regime shown for $\tau/\Omega^{-1}$ represents the low kernel frequency plateau value shown in (a) for small $\Omega\tau < 100$. The black line indicates the scaling $\omega_{vv} = \tau^{-1}/2$.

Figure S3 Two example cells for which the localization-noise fit to the VACF does not agree well with the data. The MSD $C_{\mathrm{MSD}}(t)$ (eq. (6)), VACF $C_{vv}(t)$ (eq. (3)) and friction kernel $\Gamma(t)$ extracted by eq. (15) of a single (a)-(c) wobbler and (d)-(f) synchro are displayed as orange dots. Blue crosses in the VACF figures represent the results from the localization noise fit according to eq. (24), and the blue crosses in the MSD figures result from inserting the fit parameters of the VACF into eq. (22). The black dashed line in (c) and (f) is the direct fit of eq. (7) to the extracted kernels, which is used to obtain parameters for the cell classification, see Methods Sec. 4.6. The dashed black lines in the MSD figures result from inserting the fitting result of the friction kernel (dashed lines in (c), (f), respectively) and the mean squared velocity $B = C_{vv}(0)$ into the analytical solution of the MSD eq. (S51).


Figure S4 (a) Mean squared velocities $B$ for all cells; a cluster analysis solely based on $B$ leads to the wrong assignment of 18 wobblers as synchros, indicated by black crosses. (b) Projection of the five-dimensional extracted parameters shown in Fig. 5 on the first two components of the PCA, which explain 61 % and 22 % of the total parameter variance, respectively. The black crosses indicate six wobbler cells that are wrongly assigned as synchros when applying the cluster analysis only to the two PCA components shown.

Figure S5 Friction kernel $\Gamma(t)$ extracted by eq. (15) of (a) a single wobbler and (b) a single synchro shown as green data points. The black line shows the direct fit of eq. (7) to the data points, and the red dotted line results from inserting the fitted parameters from the black dashed line into eq. (26) derived from the Hamiltonian eq. (27) in Sec. XIII. The agreement of the black dashed line and the red dotted line means that in the parameter range exhibited by the CR cells, the two kernels eq. (7) and eq. (26) describe the data equally well.

Figure S6 (a) Trajectory of a synchro shown in black; the running average position defined by eqs. (S71), (S72) with a decay time $T = 0.1$ s is shown in red. The inset in the middle shows a zoom into the first 0.2 s of the trajectory. (b) The angle between the x-axis and the cell orientation directly extracted from video data is shown for the trajectory in (a) as black crosses. The angle between the x-axis and the running-average velocity $\vec{V}(t)$ determined by eq. (S71) is shown in red.

Figure S7 The VACF $C_{vv}(t)$ of a wobbler (left) and a synchro (right) is shown for different smoothing iterations $n$ described by eq. (S73), where $n = 0$ denotes the original non-smoothed data. The mean-squared velocities $B$ in units of $\mu\mathrm{m}^2/\mathrm{s}^2$ and the fitted localization noise width $\sigma_{\mathrm{loc}}$ in units of $\mu$m are both given in the color of the corresponding smoothing iteration. The fitting result of the VACF including localization noise, described in Sec. VII, is shown as solid lines for $n = 0$ and $n = 9$ in the respective color.