Part 1: Finding your Gene
- In your web browser, go to http://www.wormbase.org/. You may find it convenient to open WormBase in one window and this tutorial in another window.
- Near the top of the page, you should see a search box called "Find". Select "Any Gene" in the dropdown, and type daf-7 in the search box. Hit "Search".
- You should be presented with a page of information regarding the gene. This page often includes most of what an investigator will want to know about a gene.
- Scroll down the page and look at the various categories on the left side in yellow. We will start with the identification category. We provide an annotated version of the screen here to help you find some of the key features.
- The identification category provides a bit of history on the gene and some of its synonymous names. Since gene names are often where confusion arises, we need to spend some time here.
The main name daf-7 is based on the mutant phenotype. This name indicates that the gene was identified because mutations resulted in abnormal dauer larva formation. The sequence name B0412.2 refers to the corresponding DNA sequence. This name indicates that the gene was found on cosmid B0412 when the genome was being assembled and sequenced. Furthermore, this was the second predicted gene on that cosmid, hence B0412.2. The numbers after the decimal point were assigned serially, but refinements to gene annotations mean that they may not always correspond to the left-to-right (or right-to-left) order on the cosmid. daf-7 has both gene names, so we know both a mutant phenotype and a DNA sequence for this gene.
Not all genes identified by mutant phenotype have been connected to a DNA sequence, so not all genes have a corresponding sequence name yet. Similarly, many genes identified by DNA sequence do not have a gene name based on a mutant phenotype, so it is helpful to know both the gene name and the sequence name. The final identification field, WB Gene ID, is the unique identifier used by WormBase and is not widely needed for most genetics applications.
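The structure of a cosmid-based sequence name can be illustrated with a short Python sketch. This is only an illustration of the naming pattern described above (cosmid name, a decimal point, then a serial gene number); the helper `parse_sequence_name` and its regular expression are hypothetical and are not part of WormBase, and real sequence names may include variants that this pattern does not cover.

```python
import re

# Assumed pattern: <cosmid name>.<serial gene number>, e.g. "B0412.2".
# This is a simplification for illustration, not an official WormBase rule.
SEQUENCE_NAME = re.compile(r"^(?P<cosmid>[A-Za-z0-9]+)\.(?P<number>\d+)$")


def parse_sequence_name(name):
    """Split a sequence name into (cosmid, serial gene number)."""
    match = SEQUENCE_NAME.match(name)
    if match is None:
        raise ValueError(f"not a cosmid.number sequence name: {name!r}")
    return match.group("cosmid"), int(match.group("number"))


cosmid, number = parse_sequence_name("B0412.2")
print(cosmid, number)  # B0412 2
```

A phenotype-based name such as daf-7 does not fit this pattern, so the sketch raises a `ValueError` for it, which mirrors the point that the two kinds of name carry different information.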
- The identification field also includes a concise description of the function of the gene. For daf-7, this includes information on the protein product (a member of the TGF-β superfamily), the biological function (dauer formation), and even where the gene is expressed (ASI neurons). A lot of experimental results are captured in this one sentence. The NCBI KOG is the group of genes from other eukaryotes that are the most likely orthologues of daf-7, as determined computationally. (The asterisk next to NCBI KOG leads to a more complete definition of the KOG project.) Click on some of the other links to see what additional information can be easily accessed from this page.
- The gene model provides information on the structure of the gene. The predicted structure of the daf-7 gene was confirmed from cDNAs; the gene is 1937 base pairs long and encodes a transcript of 1053 nucleotides. The name of the protein is given, as well as its predicted size, 350 amino acids. All of these are links to other information in the database. Notice also that the Gene Model identifier has a footnote. Click on the small white cross next to the word "Footnotes" to see what the footnote refers to.
As it turns out, the DAF-7 protein is similar to other transforming growth factors, as identified by the Pfam project, which describes and characterizes protein families. Similarly, the History field can be expanded to show any changes that have occurred as this gene has been further studied, for example, changes to the predicted structure or name of the gene. The structure of daf-7 has not been changed since 2004, which is when it was entered into WormBase from prior databases. This structure has been well-established for some time.
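The transcript length and protein size quoted in the gene model can be related with a little arithmetic. The sketch below assumes that the 1053-nucleotide figure corresponds to the coding sequence including one stop codon; that interpretation is an assumption made for illustration, and the function name `coding_length` is hypothetical.

```python
PROTEIN_LENGTH_AA = 350      # predicted protein size from the gene model
TRANSCRIPT_LENGTH_NT = 1053  # transcript length from the gene model


def coding_length(protein_length_aa, include_stop=True):
    """Nucleotides needed to encode a protein: three per codon.

    Assumption: one extra codon is counted for the stop codon.
    """
    codons = protein_length_aa + (1 if include_stop else 0)
    return codons * 3


expected = coding_length(PROTEIN_LENGTH_AA)
print(expected, expected == TRANSCRIPT_LENGTH_NT)  # 1053 True
```

Under that assumption the two numbers agree: 350 codons plus a stop codon make 351 codons, or 1053 nucleotides.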
- Other categories in yellow on the left side of the gene summary page can be explored in this same way. Scroll down the page and follow along as we describe each one. The other categories include:
- the location, both the genetic location based on mapping data (linkage group III at position -25.85) and the physical location based on the genome sequence (chromosome III, nucleotides 811821 to 813757). The genetic location indicates that the gene maps 25.85 map units to the left of an arbitrarily assigned reference point on the chromosome. (The minus sign tells us that the gene maps to the left of the reference point, rather than to the right.)
- the expression pattern, based on a GFP reporter gene, as well as the time of transcription.
- the function. This is an extensive set of information on experiments that indicate the known or putative function of the gene. Each of these is a link that leads to an extensive body of other research on the function of daf-7. Note that it includes information on mutant phenotypes, on RNAi experiments, on both genetic and physical interactions, on microarray expression data, and so on. All of the information in this category is worth taking some time to explore on your own.
- gene ontology, describing the known or suspected biological and cellular process. The Gene Ontology project is described in more detail on pages 156 and 338 in the book.
- genetics, including the different alleles for the gene, the sequence information on known mutations, and any naturally occurring polymorphisms. This also includes what genetic strains with the mutated gene are available from the Caenorhabditis Genetics Center (CGC).
- homology, based on sequence comparisons with genes in other nematodes and other organisms. These searches have been done by the scientists at WormBase and identify the most likely orthologous genes in other nematode species, making it possible to perform a quick evolutionary comparison without having to log on to the BLAST server yourself.
- reagents, which are some of the experimental tools that are available to study the gene, including transgenic strains and PCR primers. This is very handy since it tells the investigator what genetic strains are available and even what primers to use to amplify the gene using PCR.
- bibliography, providing links to all references to daf-7 in published journals and meeting abstracts.
Don't be too worried if you don't understand all of the terms. Most of them have explanatory links which you can follow, and many of them are described in other chapters in the textbook. daf-7 has been extensively studied for a long time, so there is a lot of information that can be accessed. Not all of these fields are this complete for other genes.
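The location numbers quoted for daf-7 can be cross-checked with a short calculation. The sketch below assumes that the physical coordinates are 1-based and inclusive at both ends, which is an assumption made for this illustration; the helper `describe_map_position` is hypothetical and simply restates the sign convention for genetic map positions described in the location category.

```python
# Physical location on chromosome III, as quoted in the location category.
START, END = 811821, 813757

# Assumption: coordinates are inclusive at both ends, hence the +1.
span = END - START + 1
print(span)  # 1937, matching the gene length given in the gene model


def describe_map_position(position):
    """Turn a genetic map position into words.

    Negative values lie to the left of the reference point,
    positive values to the right.
    """
    if position == 0:
        return "at the reference point"
    side = "left" if position < 0 else "right"
    return f"{abs(position):.2f} map units to the {side} of the reference point"


print(describe_map_position(-25.85))
# 25.85 map units to the left of the reference point
```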