DNA- and RNA-binding residues generally possess electropositive atoms that interact with the electronegative atoms of DNA/RNA or with water oxygen atoms. In the absence of DNA/RNA or water, these DNA/RNA-binding residues would be in an unfavorable electrostatic environment because of the electrostatic repulsion among the electropositive atoms, and would therefore be energetically unstable. On the other hand, DNA/RNA-binding residues within the same family are known to be highly conserved. Because of their critical functional roles, they would be expected to preserve not only their physico-chemical features (i.e., amino acid (aa) type and solvent accessibility), but also their energetic features. Hence, solvent-accessible residues that share the highest evolutionary conservation of aa type, as well as of structural and energetic features, within the same family are predicted to bind DNA/RNA.
Given an l-residue DNA-binding protein structure, all Asp/Glu residues were deprotonated and all Arg/Lys residues were protonated; His residues were protonated or deprotonated depending on the availability of hydrogen-bond acceptors in the structure. Next, l mutant structures, one for each residue position i, were generated by replacing Ala, Asn, Asp, Cys, Gly, Ser, Thr, or Val in the wild-type structure with Asp− and any other residue with Glu−. The side-chain replacements were carried out using SCWRL, followed by energy minimization with strong constraints on all heavy atoms using AMBER to relieve any bad contacts. From the wild-type and mutant structures, the gas-phase (ε = 1) electrostatic energy of the wild-type ($E_{\mathrm{elec}}^{\mathrm{wt}}$) or mutant ($E_{\mathrm{elec}}^{\mathrm{mut}}$) protein in the folded state, relative to that in an extended reference state ($E^{\prime\,\mathrm{wt}}_{\mathrm{elec}}$ or $E^{\prime\,\mathrm{mut}}_{\mathrm{elec}}$), was computed using AMBER with the all-hydrogen-atom AMBER force field. In this extended reference state, the residues do not interact with one another; hence, the difference in electrostatic energy between the wild-type and mutant unfolded proteins equals the difference between the electrostatic energy of the native residue at position i ($E^{\prime\,i}_{\mathrm{elec}}$) and that of the corresponding mutant Asp−/Glu− ($E^{\prime\,\mathrm{D/E}}_{\mathrm{elec}}$). The change in the gas-phase electrostatic energy, $\Delta\Delta_{\mathrm{elec}}^{i}$, upon mutation of residue i to Asp−/Glu− is given by:
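$$\Delta\Delta_{\mathrm{elec}}^{i} = \left(E_{\mathrm{elec}}^{\mathrm{mut},i} - E_{\mathrm{elec}}^{\mathrm{wt}}\right) - \left(E^{\prime\,\mathrm{D/E}}_{\mathrm{elec}} - E^{\prime\,i}_{\mathrm{elec}}\right) \tag{1}$$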
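As an illustration of Eq. 1, the following minimal Python sketch chooses the Asp−/Glu− replacement for a residue and combines four precomputed energies into the electrostatic energy change. The function and argument names are hypothetical, and the energies are assumed to have been obtained separately (e.g., from AMBER) in consistent units; the sketch does not run SCWRL or AMBER.

```python
# Residue types replaced by Asp-; every other residue type is replaced by Glu-
# (three-letter codes; the set follows the replacement rule stated in the text).
REPLACED_BY_ASP = {"ALA", "ASN", "ASP", "CYS", "GLY", "SER", "THR", "VAL"}


def mutation_target(resname: str) -> str:
    """Return 'ASP' or 'GLU' as the replacement for a native residue type."""
    return "ASP" if resname.upper() in REPLACED_BY_ASP else "GLU"


def delta_delta_elec(e_mut_folded: float,
                     e_wt_folded: float,
                     e_ref_mutant_residue: float,
                     e_ref_native_residue: float) -> float:
    """Eq. 1: folded-state energy change minus reference-state energy change.

    All four inputs are assumed to be gas-phase electrostatic energies
    in the same units (hypothetical argument names).
    """
    folded_change = e_mut_folded - e_wt_folded
    reference_change = e_ref_mutant_residue - e_ref_native_residue
    return folded_change - reference_change
```

A negative value means that, relative to the extended reference state, the folded protein is electrostatically stabilized when residue i is replaced by Asp−/Glu−.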
Next, the structure of protein X was aligned with that of each homologous protein representative using the MASPCI program to determine the correspondence between the N residues of protein X and the respective residues in the homologous proteins. Of the N residues of protein X, N′ were selected because their corresponding residues in any of the homologous proteins were also solvent accessible with $\mathrm{Rank}_{\mathrm{ele}} + C \geq \mathrm{Max}$. If N′ = 0, the original N residues of protein X were used instead. The N′ (or N) residues were grouped according to their cleft number, and the cleft containing the most residues was predicted to be the DNA/RNA-binding site. If two or more clefts contained the same largest number of residues, the residues in all of these clefts were predicted to bind DNA/RNA.
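The selection and cleft-grouping steps can be sketched in Python as follows. This is only an illustration under stated assumptions: the function and argument names are hypothetical, the structural alignment and the rank criterion are assumed to have been evaluated beforehand (giving the set `conserved`), and every candidate residue is assumed to have a cleft number.

```python
from collections import defaultdict


def predict_binding_residues(candidates, conserved, cleft_of):
    """Pick the residues in the most populated cleft(s).

    candidates: residue ids of the N residues of protein X
    conserved:  set of residue ids that also satisfy the criterion in a
                homolog (the N' residues); may be empty
    cleft_of:   dict mapping residue id -> cleft number (assumed complete)
    """
    # Use the N' residues if there are any, otherwise fall back to all N.
    selected = [r for r in candidates if r in conserved] or list(candidates)

    # Group the selected residues by cleft number.
    by_cleft = defaultdict(list)
    for residue in selected:
        by_cleft[cleft_of[residue]].append(residue)
    if not by_cleft:
        return []

    # Keep every cleft that ties for the largest residue count.
    largest = max(len(members) for members in by_cleft.values())
    predicted = []
    for cleft in sorted(by_cleft):
        if len(by_cleft[cleft]) == largest:
            predicted.extend(by_cleft[cleft])
    return predicted
```

For example, with candidates `[10, 11, 12, 40, 41]`, conserved residues `{10, 11, 40}`, and residues 10 to 12 in cleft 1 and 40 to 41 in cleft 2, the function returns `[10, 11]`. If only residues 10 and 40 were conserved, the two clefts would tie and both residues would be returned.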