BINF731 - Assignment 2. Protein modeling.

a) Select a sequence of a human enzyme from one of the protein sequence databases that has a homolog in PDB with at least 30%, but not more than 60% of sequence identity. (Recommended length of the selected sequence is 150-250 residues).

I started with the UniProt (TrEMBL) database, searching for organism ‘Homo sapiens’ and a sequence length of 150-250 residues. To speed up the search, I focused on transmembrane proteins. For each candidate, I checked that its UniProt page listed no PDB structure and then searched PDB directly to confirm that its structure is unknown. The human protein selected is Claudin (gene CLDN1), UniProtKB - A5JSJ9 (A5JSJ9_HUMAN).

>tr|A5JSJ9|A5JSJ9_HUMAN Claudin OS=Homo sapiens OX=9606 GN=CLDN1 PE=2 SV=1
MANAGLQLLGFILAFLGWIGAIVSTALPQWRIYSYAGDNIVTAQAMYEGLWMSCVSQSTG
QIQCKVFDSLLNLSSTLQATRALMVVGILLGVIAIFVATVGMKCMKCLEDDEVQKMRMAV
IGGAIFLLAGLAILVATAWYGNRIVQEFYDPMTPVNARYEFGQALFTGWAAASLCLLGGA
LLCCSCPRKTTSYPTPRPYPKPAPSSGKDYV
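As a quick sanity check on the selected sequence, the following is a minimal Python sketch that reads the FASTA record above and tests whether its length falls in the recommended 150-250 residue range. The helper names `read_single_fasta` and `in_recommended_range` are illustrative and not part of any tool used in this assignment.

```python
# FASTA record copied from this report (header and sequence are from the source).
FASTA = """>tr|A5JSJ9|A5JSJ9_HUMAN Claudin OS=Homo sapiens OX=9606 GN=CLDN1 PE=2 SV=1
MANAGLQLLGFILAFLGWIGAIVSTALPQWRIYSYAGDNIVTAQAMYEGLWMSCVSQSTG
QIQCKVFDSLLNLSSTLQATRALMVVGILLGVIAIFVATVGMKCMKCLEDDEVQKMRMAV
IGGAIFLLAGLAILVATAWYGNRIVQEFYDPMTPVNARYEFGQALFTGWAAASLCLLGGA
LLCCSCPRKTTSYPTPRPYPKPAPSSGKDYV
"""


def read_single_fasta(text):
    """Return (header, sequence) for a single-record FASTA string."""
    lines = [ln.strip() for ln in text.strip().splitlines() if ln.strip()]
    if not lines or not lines[0].startswith(">"):
        raise ValueError("expected a FASTA header line starting with '>'")
    # Drop the leading '>' and join the wrapped sequence lines.
    return lines[0][1:], "".join(lines[1:])


def in_recommended_range(seq, low=150, high=250):
    """True if the sequence length lies in the assignment's recommended range."""
    return low <= len(seq) <= high


header, seq = read_single_fasta(FASTA)
print(len(seq), in_recommended_range(seq))  # prints: 211 True
```

The sequence is 211 residues long, which is inside the recommended range.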

Searching PDB with this sequence gives 9 hits ranging from 34% to 60% sequence identity, four of which have Homo sapiens as the source organism. (Default search parameters were used, with ‘Display results as: Polymer entities’ to show the sequence identity percentage.)

b) Predict the three-dimensional structure for the selected sequence using at least two methods, e.g., fold recognition and homology modeling (Phyre2, I-TASSER, RaptorX, SwissModel, Modweb, etc.).

The amino acid sequence was used to predict the three-dimensional structure using Phyre2 and SwissModel.

For Phyre2, the modelling mode was set to ‘normal’, and the following 3D model was produced.
For SwissModel, the parameters were left at their defaults, and 7kp4.1.A was selected as the template for the homology model. The following model was produced.
c) Analyze the quality of your model using one of the structure validation or verification tools.

While the Global Model Quality Estimate (GMQE) of 0.67 ± 0.06 indicates a higher reliability, GMQE is estimated from the target-template alignment and is meant to reflect the expected accuracy of the built model. Therefore, I analyze the quality through QMEAN instead. The QMEAN score is -5.05. The QMEAN Z-score can be used to assess quality because it combines several individual terms. A QMEAN score of -4.0 or below indicates a low-quality model. After a brief consultation of the QMEAN publication, this suggests that our model does not agree with experimental structures of similar size. This is not too surprising, considering that it is a transmembrane protein and that the proteins of known structure belong to other claudin gene classes.
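The decision rule used above can be written down as a short Python sketch. The -4.0 cutoff and the -5.05 score are the values quoted in this report; the function name `qmean_verdict` is hypothetical and is not part of SwissModel.

```python
# Cutoff quoted in this report: a QMEAN score of -4.0 or below means low quality.
LOW_QUALITY_CUTOFF = -4.0


def qmean_verdict(qmean, cutoff=LOW_QUALITY_CUTOFF):
    """Classify a QMEAN score against the low-quality cutoff."""
    if qmean <= cutoff:
        return "low quality"
    return "above the low-quality cutoff"


print(qmean_verdict(-5.05))  # prints: low quality
```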

Benkert, P., Biasini, M., Schwede, T. Toward the estimation of the absolute quality of individual protein structure models. Bioinformatics 27, 343-350 (2011). https://swissmodel.expasy.org/docs/help#GMQE

The difference between the models is highlighted here in green.

d) Evaluate separately one of the biologically important parts of the structure (e.g., an active site or binding site)

Claudins are transmembrane proteins of the tight junction; they play a role in cell-to-cell contact between two plasma membranes and are thought to regulate the selective movement of solutes across membranes. As seen with the ‘tunnelview’ buttons, both models show the 4 alpha-helices, which contain a high density of hydrophobic residues, consistent with a transmembrane location. There is a slight difference between the two generated models, located in the extracellular loops; it can be highlighted by clicking ‘Diff-Pyhre2’ and ‘Swiss-Dif’. These extracellular loops are thought to play a role in the homophilic interactions that contribute to tight junction function(1).
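The hydropathy of candidate membrane-spanning helices can also be checked from the sequence alone. The sketch below is not what Phyre2 or SwissModel do; it is a simple Kyte-Doolittle sliding-window scan, and the 19-residue window and 1.6 cutoff are the conventional values for that scale, taken here as assumptions. The function names are illustrative, and the toy input is synthetic rather than the claudin sequence.

```python
# Kyte-Doolittle hydropathy scale (positive = hydrophobic).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def window_hydropathy(seq, window=19):
    """Mean hydropathy of every full-length window along the sequence."""
    return [
        sum(KD[aa] for aa in seq[i:i + window]) / window
        for i in range(len(seq) - window + 1)
    ]


def hydrophobic_segments(seq, window=19, cutoff=1.6):
    """1-based (start, end) ranges covered by windows at or above the cutoff."""
    segments = []
    for i, score in enumerate(window_hydropathy(seq, window)):
        if score < cutoff:
            continue
        start, end = i + 1, i + window
        if segments and start <= segments[-1][1] + 1:
            # Overlapping or adjacent window: extend the previous segment.
            segments[-1] = (segments[-1][0], max(segments[-1][1], end))
        else:
            segments.append((start, end))
    return segments


# Synthetic example: a 19-residue leucine stretch flanked by lysines.
toy = "K" * 20 + "L" * 19 + "K" * 20
print(hydrophobic_segments(toy))
```

Passing the claudin sequence instead of `toy` would list the stretches hydrophobic enough to be candidate membrane-spanning helices, which can then be compared with the helices seen in the two models.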
e) Introduce residue substitution(s) at one or more positions of your original sequence, which would likely affect the structure and/or function of this protein. Model the structure of this mutant and compare it with the structure of the wild type model.
Glutamine can be highlighted in the wild type by clicking ‘GLN’ for the SwissModel model and ‘GLN-Phyre2’ for the Phyre2 model.
The introduced mutation, Q57E, changes glutamine to glutamic acid, that is, from a neutral to a negatively charged amino acid. The mutation is visualized through the ‘Q57E’ buttons. The Q57 residue of the claudin proteins is highly conserved throughout evolution. This mutation, in the paracellular space part of CLDN19, has been reported to be associated with renal failure and renal magnesium wasting(2), which suggests that this conserved region is functionally important.
I should additionally note that I tried introducing several other mutations, focusing on changes to amino acids with significantly different properties. I also introduced 1-3 mutations in regions that I thought should affect folding. No significant alteration to the secondary structure was visible; however, the different side chains could affect channel function, for example by preventing selected solutes from passing through because of interference from charged side chains within the tunnel.
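For reference, the mutant sequence submitted for modeling can be generated with a small Python sketch like the one below. It checks that position 57 of the wild-type sequence really is glutamine before substituting it. The function name `apply_substitution` is illustrative and not part of Phyre2 or SwissModel.

```python
import re

# Wild-type sequence from this report (UniProtKB A5JSJ9).
WILD_TYPE = (
    "MANAGLQLLGFILAFLGWIGAIVSTALPQWRIYSYAGDNIVTAQAMYEGLWMSCVSQSTG"
    "QIQCKVFDSLLNLSSTLQATRALMVVGILLGVIAIFVATVGMKCMKCLEDDEVQKMRMAV"
    "IGGAIFLLAGLAILVATAWYGNRIVQEFYDPMTPVNARYEFGQALFTGWAAASLCLLGGA"
    "LLCCSCPRKTTSYPTPRPYPKPAPSSGKDYV"
)


def apply_substitution(seq, mutation):
    """Apply a point substitution written like 'Q57E' (1-based position)."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", mutation)
    if match is None:
        raise ValueError(f"cannot parse mutation {mutation!r}")
    ref, pos, alt = match.group(1), int(match.group(2)), match.group(3)
    if not 1 <= pos <= len(seq):
        raise ValueError(f"position {pos} is outside the sequence")
    if seq[pos - 1] != ref:
        raise ValueError(f"expected {ref} at position {pos}, found {seq[pos - 1]}")
    return seq[:pos - 1] + alt + seq[pos:]


mutant = apply_substitution(WILD_TYPE, "Q57E")
# Residues 51-60 before and after the substitution.
print(WILD_TYPE[50:60], "->", mutant[50:60])  # prints: WMSCVSQSTG -> WMSCVSESTG
```

The reference-residue check guards against off-by-one errors in numbering, which matters when the same mutation is described for a different claudin such as CLDN19.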
1. Krause G, Winkler L, Mueller SL, Haseloff RF, Piontek J, Blasig IE. Structure and function of claudins. Biochim Biophys Acta. 2008 Mar;1778(3):631-45. doi: 10.1016/j.bbamem.2007.10.018. Epub 2007 Oct 25. PMID: 18036336

2. Konrad M, Schaller A, Seelow D, et al. Mutations in the tight-junction gene claudin 19 (CLDN19) are associated with renal magnesium wasting, renal failure, and severe ocular involvement. Am J Hum Genet. 2006;79(5):949-957. doi: 10.1086/508617

-Mohammed El Allam

Page skeleton and JavaScript generated by the Export to Web module of Jmol 14.30.2 2020-04-03 17:32 on Nov 19, 2021.
Based on a template by A. Herráez and J. Gutow