Almost every page of the MyHits site contains an example that can be cut and pasted, as well as brief instructions or links to documentation. Hence, how to do something should not be a problem.
On the other hand, we expect some users to run into trouble of the "What the hell am I doing?" kind. We designed MyHits as a toolbox, and because of this it can be used to rapidly produce lots of nice-looking but meaningless output.
Although we cannot replace any bioinformatics course, the few principles given below and the examples from our collection might help you find your way.
See the general scheme of the web site.
Protein, motif, match... these are what Hits is all about.
Protein |
- A database entry containing information about a protein, i.e. a sequence and annotations.
|
Motif |
- Any structure (not necessarily tertiary) of known or unknown function that is found in several proteins and can be recognized from its sequence. By extension, a prediction tool is used to detect a motif in a protein sequence.
|
Match/Hit |
- The combination of a motif and a protein, i.e. the information that a motif M is present in protein sequence S from position P1 to P2.
|
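The definition of a match can be pictured as a small record. The sketch below is illustrative only: the class name `Match`, its field names, and the assumption that positions are 1-based and inclusive are ours, not something MyHits prescribes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Match:
    """Hypothetical record: motif M is present in protein S from P1 to P2."""
    motif: str    # M, the motif designation
    protein: str  # S, the protein designation
    start: int    # P1 (assumed 1-based, inclusive)
    end: int      # P2 (assumed inclusive)

    def __post_init__(self):
        # Reject coordinates that cannot describe a region of a sequence.
        if self.start < 1 or self.end < self.start:
            raise ValueError("expected 1 <= start <= end")

    def length(self) -> int:
        """Number of residues covered by the match."""
        return self.end - self.start + 1


# Placeholder values, not a real result.
hit = Match(motif="M", protein="S", start=3, end=7)
print(hit.length())  # 5
```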
Database codes and identifiers |
- The convention adopted for the designation of proteins and motifs consists of a database code followed by a colon and by an identifier or an accession number.
- For example, sw:HBB_PHORU is the SwissProt entry (sw) that describes the hemoglobin beta chain (HBB) of the greater flamingo (Phoenicopterus ruber, PHORU), and prf:GLOBIN is the Prosite profile (prf) for globins that "hits" the flamingo protein. The general documentation has more details about the available databases.
|
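The naming convention can be handled with a few lines of code. This is a minimal sketch, not part of MyHits: the function name `split_designation` is hypothetical, and it assumes the database code never contains a colon, so the designation is split at the first one.

```python
from typing import Tuple


def split_designation(designation: str) -> Tuple[str, str]:
    """Split 'database:identifier' into (database code, identifier).

    Assumption: the first colon separates the two parts.
    """
    db, sep, ident = designation.partition(":")
    if not sep or not db or not ident:
        raise ValueError(f"expected 'database:identifier', got {designation!r}")
    return db, ident


print(split_designation("sw:HBB_PHORU"))  # ('sw', 'HBB_PHORU')
print(split_designation("prf:GLOBIN"))    # ('prf', 'GLOBIN')
```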
Hubs |
- A special web page used to transfer results of one tool to the query form of another tool or viewer.
|
Tools |
- Any web page or service that takes a sequence or a motif as the input data and produces results that can be forwarded to a Hub. E.g., BLAST, MAFFT.
|
Viewer |
- Any software or web page used to display results in a user-friendly manner (very often graphically). E.g., SEView, Dotlet, Jalview.
|
Similarity among proteins |
- Hits deals with similarity between proteins at the motif level rather than at the residue level (as BLAST or FastA do). The advantage is that homology can be detected even if similarity is low; the drawback is that no homology can be detected between sequences that do not contain motifs. In this case you must use a tool to define your own motifs (e.g., PSI-BLAST). Fortunately, the number of known motifs is already large and is growing.
|
Redundancy among motifs |
- The same protein domain is often recognized by more than one motif; for example, the SH2 domain is usually "hit" by both prf:SH2 and pfam:SH2. Such motifs are generally not equivalent: a few proteins contain only one of them. This is because the different motif databases were built using different approaches, for both technical and biological reasons.
|
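One way to see that two redundant motifs are not equivalent is to compare the sets of proteins each one hits. The sketch below uses made-up data: the protein names are placeholders rather than real MyHits results, and the helper `proteins_by_motif` is hypothetical.

```python
# Made-up (motif, protein) pairs; the protein names are placeholders.
matches = [
    ("prf:SH2", "protein_A"),
    ("prf:SH2", "protein_B"),
    ("pfam:SH2", "protein_B"),
    ("pfam:SH2", "protein_C"),
]


def proteins_by_motif(pairs):
    """Group protein names by the motif that hits them."""
    table = {}
    for motif, protein in pairs:
        table.setdefault(motif, set()).add(protein)
    return table


by_motif = proteins_by_motif(matches)
prf, pfam = by_motif["prf:SH2"], by_motif["pfam:SH2"]

print(sorted(prf & pfam))  # hit by both motifs: ['protein_B']
print(sorted(prf - pfam))  # hit only by prf:SH2: ['protein_A']
print(sorted(pfam - prf))  # hit only by pfam:SH2: ['protein_C']
```

Non-empty differences in either direction are what "not equivalent" means here.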