1 - PSI-BLAST tutorial
Read and try the PSI-BLAST tutorial on the MyHits web site, where the PSI-BLAST program is used in conjunction with other pieces of software (e.g. Jalview). The NCBI also provides a tutorial for its own web interface.
Run four cycles of PSI-BLAST against the Swiss-Prot database, using the sequence below as the initial query.
Simply launch each new iteration with the ...next cycle
option. At every cycle, record the number of matches at or below the threshold and the E-values obtained for the proteins
ERCC5_XENLA, FEN1_HUMAN and DIN7_YEAST. Complete the table below and explain what you observe.
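The bookkeeping asked for above can also be done with a short script if the results of each cycle are saved in BLAST tabular form. The following is a minimal sketch, not part of the original exercise: it assumes the twelve default columns of BLAST+ tabular output, and the function name `summarize_round`, the identifiers `PROT_A`, `PROT_B`, `PROT_C` and every number in `toy_round` are made-up placeholders, not real results. The threshold of 0.005 is likewise an assumption; use the value shown by your interface.

```python
# Default column order of BLAST+ tabular output (-outfmt 6).
FIELDS = ("qseqid sseqid pident length mismatch gapopen "
          "qstart qend sstart send evalue bitscore").split()


def summarize_round(tabular_text, tracked, threshold):
    """Summarize one PSI-BLAST cycle.

    Returns the number of distinct subjects with at least one hit whose
    E-value is <= threshold, and the best (lowest) E-value seen for each
    tracked entry name.
    """
    below = set()
    best = {}
    for line in tabular_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        cols = line.split("\t")
        if len(cols) < len(FIELDS):
            continue  # skip lines that are not full result rows
        row = dict(zip(FIELDS, cols))
        subject = row["sseqid"]
        evalue = float(row["evalue"])
        if evalue <= threshold:
            below.add(subject)
        for name in tracked:
            # Substring match, so "sp|...|NAME" style identifiers also work.
            if name in subject:
                best[name] = min(evalue, best.get(name, evalue))
    return len(below), best


# Placeholder rows (hypothetical identifiers and values, for illustration only).
toy_round = "\n".join([
    "query\tPROT_A\t35.0\t200\t120\t5\t1\t200\t10\t210\t1e-30\t120",
    "query\tPROT_B\t25.0\t180\t130\t6\t5\t185\t3\t182\t0.002\t40.1",
    "query\tPROT_B\t22.0\t60\t45\t2\t100\t160\t300\t360\t0.5\t28.0",
    "query\tPROT_C\t21.0\t150\t110\t8\t20\t170\t15\t165\t3.2\t25.0",
])

n_hits, best = summarize_round(toy_round, ["PROT_B", "PROT_C"], threshold=0.005)
print(n_hits, best)
```

For the exercise itself, one would call the function once per cycle with the three entry names listed above as `tracked`, and copy the returned values into the table.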
3 - Building a "Model" for the Thioredoxin Domain
Using PSI-BLAST, retrieve all homologs of the human
thioredoxin protein THIO_HUMAN in Swiss-Prot. The
"brute force" approach is the following:
Using MAFFT, re-align the matched sequences: the alignment obtained
from the automated iterative mode of PSI-BLAST has accumulated many small alignment errors. Using Jalview2, trim the extremities of the MSA, i.e. remove the highly gapped regions at both ends, where the alignment
quality is poor and the columns carry little information. You now have an MSA made of a set of sequences that are (hopefully) representative of the diversity of thioredoxin sequences. You can view it as a kind of "model" of the thioredoxin domain. Save it locally (in FASTA format) for future use. At this stage, you may consider improving your model by adding or removing sequences, or by manually editing the alignment. This, however, requires some biological expertise.
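The trimming step is done interactively in Jalview2, but the idea behind it can be sketched in a few lines of code. The example below is only an illustration under stated assumptions: it treats a column as "highly gapped" when more than half of the sequences have a gap there (the cutoff `max_gap=0.5` is an arbitrary choice, not a value from the exercise), it trims only the two ends of the alignment, and the three short sequences are made-up toy data rather than real thioredoxins.

```python
def read_fasta(text):
    """Parse FASTA text into a list of (name, sequence) tuples."""
    records, name, chunks = [], None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                records.append((name, "".join(chunks)))
            name, chunks = line[1:], []
        else:
            chunks.append(line)
    if name is not None:
        records.append((name, "".join(chunks)))
    return records


def gap_fraction(records, col):
    """Fraction of sequences that have a gap character in column `col`."""
    return sum(seq[col] in "-." for _, seq in records) / len(records)


def trim_ends(records, max_gap=0.5):
    """Drop leading and trailing columns whose gap fraction exceeds max_gap."""
    length = len(records[0][1])
    if any(len(seq) != length for _, seq in records):
        raise ValueError("sequences in an alignment must have equal length")
    start = 0
    while start < length and gap_fraction(records, start) > max_gap:
        start += 1
    end = length
    while end > start and gap_fraction(records, end - 1) > max_gap:
        end -= 1
    return [(name, seq[start:end]) for name, seq in records]


def write_fasta(records):
    """Serialize (name, sequence) tuples back to FASTA text."""
    return "".join(f">{name}\n{seq}\n" for name, seq in records)


# Toy alignment (made-up sequences, for illustration only).
toy_msa = """>seq1
--MKWCGPCK--
>seq2
-AMKWCGPCKL-
>seq3
---KWCGPC---
"""

trimmed = trim_ends(read_fasta(toy_msa))
print(write_fasta(trimmed))
```

Internal gaps are left untouched on purpose: only the ragged ends are removed, which mirrors the manual trimming described above.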
4 - Exploiting the Thioredoxin "Model"
To look at all the human proteins with a thioredoxin domain found in
Now pay attention to the graphics that appear in the output. You should be able to recognize
the thioredoxin active site.
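As a complement to the graphical output, the active site can also be located directly in a sequence. The thioredoxin active site is commonly described as a Cys-x-x-Cys motif (Cys-Gly-Pro-Cys in classical thioredoxins). The sketch below simply scans for that loose pattern; it is not the method used by the web tool, the helper name `find_cxxc` is hypothetical, and the sequence is a made-up fragment. A bare C-x-x-C match is far less specific than a profile or a curated pattern, so hits should be read as candidates only.

```python
import re

# Lookahead so that overlapping C-x-x-C occurrences are all reported.
CXXC = re.compile(r"(?=(C..C))")


def find_cxxc(sequence):
    """Return (1-based position, matched motif) for every C-x-x-C occurrence."""
    return [(m.start() + 1, m.group(1)) for m in CXXC.finditer(sequence.upper())]


# Made-up fragment, for illustration only.
toy_fragment = "MKLVVVDFWCGPCKMIAP"
hits = find_cxxc(toy_fragment)
print(hits)
```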