How does the search work?
In short, this resource is a single table of mouse-derived gene homology associations taken from cancer screens, with some metadata for context. The tool is built in this format to make it easier to retrieve a set of potential genes of interest from those findings.
The dataset can be filtered by 4 parameters. When several filters are set, they are applied together against the full table to produce a single, filtered set of genes.
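As a rough illustration of how several filters can narrow one table, here is a minimal Python sketch. It assumes the filters combine as a logical AND, which is one reading of the paragraph above and is not taken from the CCGD source code. The column names (`gene`, `study`, `cancer_type`) and the rows are made-up placeholders, not real CCGD fields or data.

```python
from typing import Dict, Iterable, List, Set

Row = Dict[str, str]


def apply_filters(rows: Iterable[Row], filters: Dict[str, Set[str]]) -> List[Row]:
    """Keep only the rows that satisfy every active filter (logical AND).

    A filter with an empty set of allowed values is treated as "not set"
    and is ignored. This is an assumption made for the sketch.
    """
    active = {column: allowed for column, allowed in filters.items() if allowed}
    return [
        row
        for row in rows
        if all(row.get(column) in allowed for column, allowed in active.items())
    ]


# Placeholder rows; gene, study, and cancer type values are invented.
rows = [
    {"gene": "GeneA", "study": "Study 1", "cancer_type": "Liver"},
    {"gene": "GeneA", "study": "Study 2", "cancer_type": "Blood"},
    {"gene": "GeneB", "study": "Study 1", "cancer_type": "Liver"},
]

filtered = apply_filters(
    rows,
    {"gene": {"GeneA"}, "cancer_type": {"Liver"}, "study": set()},
)
print(filtered)
```

Only the row that matches both the gene filter and the cancer type filter is kept; the empty `study` filter has no effect.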
The Genes input searches within a specific species model even if that species isn’t visible in the table.
Visible columns are controlled with the Column visibility button.
How do I export data from the CCGD?
The whole table can be downloaded as a CSV file directly at the link below, or on the search page by clicking the Download button without any filters applied. Alternatively, any filters applied on the search page will be reflected in the export.
All columns are included in the export regardless of which are visible.
What do the fields in this database mean?
Official NCBI gene symbol identifying the candidate cancer gene. Official symbols are assigned by the MGI group and maintained by the Jackson Laboratory. The complete set of gene symbols can be downloaded from the MGI FTP site. If the study reported the gene using an unofficial symbol or alias, the gene name was converted to the MGI official gene symbol.
The official NCBI Entrez GeneID identifying the mouse candidate cancer gene.
The official NCBI HomologeneID identifying the candidate cancer gene.
Mouse genome coordinates for the CIS in the format: chromosome:start address - end address. All genome coordinates have been mapped to the most current genome build. If coordinates were originally published using an earlier genome build, the coordinates have been converted using the UCSC utility LiftOver. If the candidate cancer gene was identified using a gene-centric statistical analysis, and no CIS genome coordinates were reported, the start and end genome coordinates are based on the start and end coordinates of the gene. If the CIS was identified using a method that identifies a peak location, then this location will be designated as both the start and end location of the CIS region.
Citation for the study reporting the forward genetic screen. First author, year, and number (if multiple studies were published by the same author in the same year).
Cancer type is the tissue of origin of the cancer as reported by the study. These cancers originated in mice and there is not always a one-to-one correlation with human disease.
Relative rank is a letter grade (A, B, C or D) based on the relative rank of the CIS in the study. Rank is generally based on the number of insertions in a given CIS or the p-value assigned to the CIS. The letter grades are as follows: A = Top 10%, B = 11 to 25%, C = 26 to 50% and D = Bottom 50%. For example, if a study identified 100 CISs, the first 10, based on the study’s method of ranking, will get an A. CISs identified in screens that did not include insertion numbers or p-values are denoted as Not Ranked.
The predicted effect is either Gain, Loss, or N/A. Predicted effect is based on what is reported in the study. Different studies may use different methods for predicting effects, and some studies make no predictions regarding gain or loss of function.
The number of screens in the CCGD database that have identified this gene as a candidate cancer gene.
True indicates there is a reported somatic mutation in this gene in the COSMIC database. False indicates there are no mutations reported in COSMIC.
True indicates the gene is listed on the Sanger Institute’s Cancer Gene Census. False indicates the gene is not listed.
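The genome coordinate field described above uses the format chromosome:start address - end address, and a CIS identified by a peak location has the same start and end. The following Python sketch shows one way to parse such a string from an exported file. The exact spacing and chromosome naming in the export are assumptions here, so the pattern tolerates optional whitespace and accepts any chromosome label; `Region`, `parse_region`, and `is_peak` are hypothetical helper names, and the sample strings are invented.

```python
import re
from typing import NamedTuple


class Region(NamedTuple):
    chromosome: str
    start: int
    end: int


# chromosome : start - end, with optional whitespace around ":" and "-".
_REGION_PATTERN = re.compile(r"^\s*([^:\s]+)\s*:\s*(\d+)\s*-\s*(\d+)\s*$")


def parse_region(text: str) -> Region:
    """Parse a 'chromosome:start - end' string into a Region."""
    match = _REGION_PATTERN.match(text)
    if match is None:
        raise ValueError(f"not a chromosome:start-end string: {text!r}")
    chromosome, start, end = match.groups()
    region = Region(chromosome, int(start), int(end))
    if region.start > region.end:
        raise ValueError(f"start is after end in {text!r}")
    return region


def is_peak(region: Region) -> bool:
    """True when start and end coincide, as for a CIS reported as a peak."""
    return region.start == region.end


# Invented example values, not real CCGD coordinates.
interval = parse_region("chr7:100 - 250")
peak = parse_region("chr2:5000-5000")
print(interval, is_peak(interval))
print(peak, is_peak(peak))
```

Checking for equal start and end is only a heuristic for spotting peak-based entries; it relies on the convention stated in the field description.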
How is the relative rank value calculated?
Relative rank is a letter grade (A, B, C or D) based on the relative rank of the CIS in the study. Rank is generally based on the number of insertions in a given CIS or the p-value assigned to the CIS. The letter grades are as follows: A = Top 10%, B = 11 to 25%, C = 26 to 50% and D = Bottom 50%.
For example, if a study identified 100 CISs, the first 10, based on the study’s method of ranking, will get an A. CISs identified in screens that did not include insertion numbers or p-values are denoted as Not Ranked.
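The grading rule above can be sketched in Python as follows. This is an illustration under stated assumptions, not the CCGD's own implementation: `relative_rank` is a hypothetical function, `position` is the 1-based rank of the CIS within its study, and the handling of study sizes that do not divide evenly into the percentage bands is a guess.

```python
from typing import Optional


def relative_rank(position: Optional[int], total: int) -> str:
    """Return the letter grade for a CIS ranked `position` out of `total`.

    `position` is None when the study reported no insertion numbers or
    p-values, in which case the CIS is "Not Ranked".
    """
    if position is None:
        return "Not Ranked"
    if total < 1 or not 1 <= position <= total:
        raise ValueError("position must be between 1 and total")
    # Integer comparisons avoid floating-point rounding at the band edges.
    if 100 * position <= 10 * total:
        return "A"  # top 10%
    if 100 * position <= 25 * total:
        return "B"  # 11 to 25%
    if 100 * position <= 50 * total:
        return "C"  # 26 to 50%
    return "D"      # bottom 50%


# With 100 CISs in a study, the first 10 get an A.
for position in (1, 10, 11, 25, 26, 50, 51, None):
    print(position, relative_rank(position, 100))
```

With a study of 100 CISs, positions 1 through 10 grade as A, 11 through 25 as B, 26 through 50 as C, and the rest as D, matching the example in the text.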
Where can I find official NCBI gene identifiers or symbols?
See the NCBI Genome site and the NCBI HomoloGene site.
Why isn’t the search page responding?
This application is made with a server-side processing library which allows for queries that outstrip the size of our current data set. If you have submitted a query and some aspect of the app is not responding, it is likely an issue on our end that we would like to know about. Please contact us at ccgd@umn.edu.
How often does the CCGD update?
The data file for the CCGD is rebuilt daily with source pulls from NCBI for the most up-to-date version of gene data. References to Sanger data sources are updated weekly.
Study content is updated periodically as new findings are manually curated and uploaded.
Can I see the source code for the CCGD?
Yes.
This application was completely rebuilt in 2019 under git version control. You can find the source code at the link below.
The updated version of the CCGD was developed almost entirely with R, Markdown, and Shiny. This website is the product of work by Christopher Tastad, Ken Abbott, Erik Nyre, and Juan Abrahante, members of the Starr Lab at the University of Minnesota. Questions of ownership are at the discretion of Tim Starr and the Starr Lab, under the policies of the University of Minnesota.
MIT License
Copyright (c) 2019 University of Minnesota
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the “Software”), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
Can I submit my study for inclusion in the database?
Yes!
Send your information and publication citation to ccgd@umn.edu. We will be happy to add your study to the database.
Do you have a bibliography of studies referenced?
A list of all publications referenced in the content of this database and a description of their relevant findings are below.
In addition, there is a description of relevance and interpretation for the included studies on the references page.
How should I cite this work in publication?
Please use the following when citing the CCGD:
Kenneth L. Abbott, Erik T. Nyre, Juan Abrahante, Yen-Yi Ho, Rachel Isaksson Vogel, Timothy K. Starr, The Candidate Cancer Gene Database: a database of cancer driver genes from forward genetic screens in mice, Nucleic Acids Research, Volume 43, Issue D1, 28 January 2015, Pages D844–D848, https://doi.org/10.1093/nar/gku770
Did your URL address change?
With our overhaul in 2019 we also moved to a new hosting arrangement. OIT policies require that we have a redirect in place for the original site address. As a result, there are several site addresses that can be used to navigate here. Do not be alarmed if the address in the browser bar changes as navigation takes place.
Possible URLs:
All 3 of these will take you to the same place.
I have other questions.
Please feel free to contact us at ccgd@umn.edu.