Flap Dynamics in Pepsin-Like Aspartic Proteases: A Computational Perspective Using Plasmepsin-II and BACE-1 as Model Systems

The flexibility of β hairpin structure known as the flap plays a key role in catalytic activity and substrate intake in pepsin-like aspartic proteases. Most of these enzymes share structural and sequential similarity. In this study, we have used apo Plm-II and BACE-1 as model systems. In the apo form of the proteases, a conserved tyrosine residue in the flap region remains in a dynamic equilibrium between the normal and flipped states through rotation of the χ1 and χ2 angles. Independent MD simulations of Plm-II and BACE-1 remained stuck either in the normal or flipped state. Metadynamics simulations using side-chain torsion angles (χ1 and χ2 of tyrosine) as collective variables sampled the transition between the normal and flipped states. Qualitatively, the two states were predicted to be equally populated. The normal and flipped states were stabilized by H-bond interactions to a tryptophan residue and to the catalytic aspartate, respectively. Further, mutation of tyrosine to an amino-acid with smaller side-chain, such as alanine, reduced the flexibility of the flap and resulted in a flap collapse (flap loses flexibility and remains stuck in a particular state). This is in accordance with previous experimental studies, which showed that mutation to alanine resulted in loss of activity in pepsin-like aspartic proteases. Our results suggest that the ring flipping associated with the tyrosine side-chain is the key order parameter that governs flap dynamics and opening of the binding pocket in most pepsin-like aspartic proteases.

Unbinding using metadynamics: Plm-II-ligand complex Prior to metadynamics simulation on Plm-ligand complex (PDB ID: 4Y6M ), 1 we used the following procedures to prepare the system for simulation. The ligand ( Figure S18) was extracted from the complex and was parameterised by the general amber force field (GAFF). 2 The ligand was positively charged (+1) and charges were assigned using the AM1-BCC. 3 The protein was treated using the FF14SB force field. The complex was then neutralised using 5 sodium ions and immersed into a truncated octahedral box with TIP3P water mod- Figure S2: A simplified view of the COM CVs used in this study. A, B and C denotes centre of mass corresponds to flap (residue 58-88), coil (residue 282-302) and catalytic aspartic dyads respectively.

A B
C D E Figure S3: Coefficients corresponding first 5 TICs. The X axis corresponds to the features which are described in Table S3. One can see that TIC1 and TIC3 able to capture rotational degrees of freedom associated with Tyr side-chain. TIC1 gives higher weight to sinχ 2 which makes χ 2 the slowest degrees of freedom.  els such that no solute atom is within 1.0 nm of the box edge. Initial minimisation with steepest descent algorithm was performed for 5000 steps followed by restrained equilibration (a restrained of 500 kJ/mol was applied on protein C α and ligand heavy atoms) for 2 ns in the NPT ensemble with Berendsen barostat used to maintain the pressure at 1 bar. The temperature was fixed at 300 K in both cases using the velocity rescaling algorithm. All bonds lengths were constrained using the LINCS algorithm 4 to enable a time-step of 2 fs. A non-bonded cut-off of 1.2 nm was used, and long-range electrostatics were treated by PME using a grid spacing of 0.1 nm. The restrained equilibration was followed by an unrestrained equilibration for 5 ns in the NPT ensemble keeping all other simulation settings identical as before. Finally a snapshot with a volume near the average of this ensemble was used as a starting point for the metadynamics simulation. We have also performed 100ns unbiased MD simulation of the protein-ligand complex to check the stability of the ligand in the active site ( Figure S23).
In order to estimate the motion necessary for ligand unbinding, we have performed welltempered metadynamics using the distance between centre of mass of the active site residues 1 and the ligand heavy atoms as CV. This kind of CV was used recently by Dodda et al. 5 in order to study unbinding of a small molecule inhibitor from HIV NNRTI. The metadynamics simulation was performed at 300K with a gaussian height of 1.2 kJ/mol and a width at 0.011 nm, deposited every 1 ps. The simulation was stopped when CV distance reached ∼ 5 nm.

Reweighting with distance
To understand the conformational flexibility of flap and coil region region in apo Plm-II, we devised two distances (DIST2 and DIST3) which quantifies conformational flexibility across all simulations ( Figure S5). In case of apo BACE-1, C α distance between Asp-32 and Thr-72 is referred as DIST2 whereas the C α distance between Ash-228 and Gln-326 is referred as DIST3. The distances were not used as CVs. In our initial test calculations, we observed that 1 active site residues are: Tyr77, Asp34, Asp214, Ile123, Ser37, Ile32, Val78, Thr217.
a moving harmonic potential on the distances led to distorted flap conformations. However, we reweighted the free energy landscapes as a function of DIST2 and DIST3 as the distances gave a better quantitative view of the movement of the flap and the coil region. Reweighting the free energy surfaces on DIST2 and DIST3 also gives an understanding of which is the dominant movement between the flap and the coil region. Reweighting was also performed on several H-bond distances involving Tyr.
A reweighting algorithm has been used to calculate the unbiased probability distribution of variables in the well-tempered metadynamics simulations. 6 If a bias is acting on a system, it is constantly changing the biased probability distribution of the system. Using the reweighting algorithm, we can remove the effect of bias and recover the unbiased probability distribution along chosen degrees of freedom.

Computational tools
All simulations were performed using Gromacs 2018 7 patched with Plumed . 8 The tLeap module of Amber was used to generate topology and co-ordinate files. Acpype 9 was used to convert the Amber topology and co-ordinates to a Gromacs compatible version.

Normal state
The normal state can be divided into two parts: N − (− π 3 radian) and N + ( π 3 radian). N + conformation is unstable because the γ atom (CG) of tyrosine is in close contact with backbone CO group ( Figure S6). In BACE-1, H-bond to Lys was formed due to N + orientation of tyrosine. Figure S6: N + conformation of tyrosine.

Clustering
Clustering was performed using TIP3P MD simulation (denoted as R1-T3P) of apo Plm II.
We used K-centers clustering algorithm integrated with Python package enspara 10 (https: //github.com/bowman-lab/enspara) to generate 10000 cluster centers. Clusters were defined geometrically using backbone RMSD of residues 73-88. For each cluster centers we have calculated DIST2, DIST3, distance between Tyr-Trp and Tyr-Asp using Plumed.

Statistical analysis
We have calculated time dependent free energy difference between flipped and normal (− π 3 radian) state in Torsion-Metad simulations. We integrated multiple free-energy profiles in two basins defined by the following intervals on χ 1 space: flipped (−2.5 < χ 1 < −3.14); normal (−1.5 < χ 1 < −0.5). We have also calculated free energy difference between H-bond to Trp and Asp and different basins along DIST2 ( Figure S6, S7 and S8).
Free energy difference was also calculated between N − (− π 3 radian along χ 1 ) and N + ( π 3 radian along χ 1 ) conformations of the normal state. The block averages are calculated using reweighted free energy surface along χ 1 over a period of 50 ns and the precision is approximated by standard deviation in each block.
Statistical analyses was performed using a bootstrapping procedure with 100 repeats.      Table S8 in the Supplementary Information.