How does the search work?

Simply, this resource is a single table of mouse-derived gene homology associations taken from cancer screens with some meta data for context. The effort put into building a tool in this format is entirely meant to aid the process of accessing a set of potential genes of interest from such findings.

Filters

There are 4 parameters to filter the dataset by. All of these filters brought together will perform a join of the full table to produce a single, filtered set of genes.

  • Species
    • This table contains gene homology derived from mouse genes for several species. Mouse and Human are shown by default. You can search for genes from others, and the equivalent homologs will still appear.
    • Important!!: This filter determines which set of gene data the Genes input searches for! It will search within a specific species model even if it isn’t visible in the table.
    • It’s worth repeating - all gene associations are derived from data developed in Mice. The homology shown in the table is only intended to draw linkage to possible candidates in other species models.
    • Options: Mouse, Human, Rat, Drosophila, Zebra Fish, S. Cerevisiae
  • Study
    • Each study is given a brief description of its relevance and interpretation on the References page.
    • The link for a particular study in the table will point to that description with a link to the PubMed manuscript entry.
  • Cancer
    • Narrow results according to the reported disease model.
  • Genes
    • Filter using NCBI gene names or IDs. Case sensitivity isn’t an issue, and names can be comma or line separated.
    • Important!!: Again, the species filter determines which set of gene data the Genes input searches for! It will search within a specific species model even if it isn’t visible in the table.

Visibility

  • To account for screen dimensions, we don’t list every table column by default. Instead, we offer the option to select hidden columns, which is done with the Column visibility button.
  • All columns are included in the download regardless of selection.
  • Other species gene homologs are included as hidden columns as well.

How do I export data from the CCGD?

The whole table can be downloaded as a csv file directly at the link below or on the search page by clicking the Download button without any filters. Alternatively, any filters applied on the search page will be reflected in the export.

All columns are included in the export regardless of which are visible.

What do the fields in this database mean?

SpeciesName

Official NCBI gene symbol identifying the candidate cancer gene. Official symbols are assigned by the MGI group and maintained by the Jackson Labs. The complete set of gene symbols can be downloaded from the MGI ftp site. If the study reported the gene using an unofficial symbol/alias, the gene name was converted to the MGI official gene symbol.

SpeciesId

The official NCBI Entrez GeneID identifying the mouse candidate cancer gene.

homologId

Official NCBI HomologeneID identifying the candidate cancer gene

CISAddress

Mouse genome coordinates for the CIS in the format: chromosome:start address - end address. All genome coordinates have been mapped to the most current genome build. If coordinates were originally published using an earlier genome build, the coordinates have been converted using the UCSC utility LiftOver. If the candidate cancer gene was identified using a gene-centric statistical analysis, and no CIS genome coordinates were reported, the start and end genome coordinates are based on the start and end coordinates of the gene. If the CIS was identified using a method that identifies a peak location, then this location will be designated as both the start and end location of the CIS region.

Study

Citation for the study reporting the forward genetic screen. First author, year, and number (if multiple studies were published by the same author in the same year).

Cancer

Cancer type is the tissue of origin of the cancer as reported by the study. These cancers originated in mice and there is not always a one-to-one correlation with human disease.

Rank

Relative rank is a letter grade (A, B, C or D) based on the relative rank of the CIS in the study. Rank is generally based on the number of insertions in a given CIS or the p-value assigned to the CIS. The letter grades are as follows: A = Top 10%, B = 11 to 25%, C = 26 to 50% and D = Bottom 50%. For example, if a study identified 100 CISs, the first 10, based on the study’s method of ranking, will get an A. CISs identified in screens that did not include insertion numbers or p-values are denoted as Not Ranked.

Effect

The predicted effect is either Gain, Loss, or N/A. Predicted effect is based on what is reported in the study. Different studies may use different methods for predicting effects, and some studies make no predictions regarding gain or loss of function.

Studies

The number of screens in the CCGD database that have identified this gene as a candidate cancer gene.

COSMIC

True indicates there is a reported somatic mutation in this gene in the COSMIC database. False indicates there are no mutations reported in COSMIC.

CGC

True indicates the gene is listed on the Sanger Institute’s Cancer Gene Census. False indicates the gene is not listed.

How is the relative rank value calculated?

Relative rank is a letter grade (A, B, C or D) based on the relative rank of the CIS in the study. Rank is generally based on the number of insertions in a given CIS or the p-value assigned to the CIS. The letter grades are as follows: A = Top 10%, B = 11 to 25%, C = 26 to 50% and D = Bottom 50%.

For example, if a study identified 100 CISs, the first 10, based on the study’s method of ranking, will get an A. CISs identified in screens that did not include insertion numbers or p-values are denoted as Not Ranked.

Where can I find official NCBI gene identifiers or symbols?

See the NCBI Genome site and the NCBI HomoloGene site.

Why isn’t the search page responding?

This application is made with a server-side processing library which allows for queries that outstrip the size of our current data set. If you have submitted a query and some aspect of the app is not responding, it is likely an issue on our end that we would like to know about. Please contact us at .

How often does the CCGD update?

The data file for the CCGD is rebuilt daily with source pulls from NCBI for the most updated version of gene data. References to Sanger data sources are updated weekly.

Study content is updated periodically as new findings are manually curated and uploaded.

Can I see the source code for the CCGD?

Yes.

This application underwent a complete rebuild in 2019 where git version control was utilized. You can find the content of the source code at the link below.

The updated version of the CCGD was developed almost entirely with R, Markdown, and Shiny. This website is the product of work by Christopher Tastad, Ken Abbott, Eric Nyre, and Juan Abrahante - members of the Starr Lab at the University of Minnesota. All considerations for ownership follow the discretion of Tim Starr and the Starr Lab at the extension of the policies of the University of Minnesota.


MIT License

Copyright (c) 2019 University of Minnesota

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the “Software”), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

Can I submit my study for inclusion in the database?

Yes!

Send your information and publication citation to . We will be happy to add your study to the database.

Do you have a bibliography of studies referenced?

A list of all publications referenced in the content of this database and a description of their relevant findings are below.

In addition, there is a description of relevance and interpretation for the included studies on the references page.

How should I cite this work in publication?

Please use the following when citing the CCGD:

Kenneth L. Abbott, Erik T. Nyre, Juan Abrahante, Yen-Yi Ho, Rachel Isaksson Vogel, Timothy K. Starr, The Candidate Cancer Gene Database: a database of cancer driver genes from forward genetic screens in mice, Nucleic Acids Research, Volume 43, Issue D1, 28 January 2015, Pages D844–D848, https://doi.org/10.1093/nar/gku770

Did you URL address change?

With our overhaul in 2019 we also moved to a new hosting arrangement. OIT policies require that we have a redirect in place for the original site address. As a result, there are several site addresses that can be used to navigate here. Do not be alarmed if the address in the browser bar changes as navigation takes place.

Possible URLs:

All 3 of these will take you to the same place.

I have other questions.

Please feel free to contact us at .