Research Guides: Biology (New York/Shanghai/Abu Dhabi): Resource Identification

Uniquely Identifying Resources

Resource identification refers to the unambiguous reporting of research resources such as genes, organisms, tools, and reagents. These resources should be reported within publications with enough information that reviewers and subsequent researchers can identify the exact strain or reagent used. Preferably, authors should provide the full, descriptive name of the resource, its source and a unique identifier. Doing so allows for:

Better evaluation of the methods and interpretation of results
Reproducibility of the research
Machine readability and potential text mining applications

Many reporting standards will include guidance on how to identify such resources. This page provides further information and sources for unambiguous identifiers.

Use RRIDs when available! Research Resource Identifiers (RRIDs) are persistent and unique identifiers for organisms, cell lines, antibodies, and software tools available through The Resource Identification Initiative. They can be found through the portal below.

Resource Identification Portal
RRIDs for organisms, cell lines, antibodies and tools.
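
Because RRIDs follow a predictable written form, they lend themselves to the text mining applications mentioned above. The sketch below pulls RRID-style strings out of a methods paragraph. It assumes the common citation form "RRID:" followed by a registry prefix and a local ID. The pattern is deliberately permissive, the function name extract_rrids is hypothetical, and the identifiers in the sample text are placeholders rather than real records.

```python
import re

# Assumed citation form: "RRID:" + registry prefix + "_" or ":" + local ID.
# This is a permissive pattern for illustration, not an official grammar.
RRID_PATTERN = re.compile(r"RRID:\s?([A-Za-z]+[_:][A-Za-z0-9_:\-]+)")


def extract_rrids(text: str) -> list[str]:
    """Return RRID-style strings in order of first appearance, without duplicates."""
    found = []
    for match in RRID_PATTERN.finditer(text):
        rrid = "RRID:" + match.group(1)
        if rrid not in found:
            found.append(rrid)
    return found


# Placeholder identifiers only; look up real RRIDs in the portal.
methods = (
    "Sections were stained with an example antibody "
    "(Example Vendor, Cat# EX-0001, RRID:AB_0000001) and analyzed with "
    "an example tool (RRID:SCR_0000002). The antibody (RRID:AB_0000001) "
    "was diluted 1:500."
)

print(extract_rrids(methods))
```

A match only shows that a string looks like an RRID. Whether the identifier exists still has to be checked against the Resource Identification Portal.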

Organism and Strain Identification

Report:

For transgenic animals: source, species, strain, sex, age, husbandry, inbred status, and strain characteristics.
For identifying model organisms: species IDs can be found in NCBI Taxonomy. Strain information can be found in model organism databases and reported using their unique identifiers. For example, for a strain of C. elegans, use the WormBase ID.

Alternatively, RRIDs are available for some species.

RRIDs: Organisms
Use this site to locate Research Resource IDs (RRIDs) for organisms.

Species Identifiers

NCBI Entrez Taxonomy
Provides a unique taxonomy ID for major species. Use the taxonomy ID. For example, Bactrian camel: NCBI:txid9837
Integrated Taxonomic Information System
Taxonomic information on plants, animals, fungi, and microbes of North America and the world. Use the taxonomic serial number. For example, Bactrian camel: ITIS Taxonomic Serial No.: 625026
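
If species identifiers are kept in a spreadsheet or script, a small helper can keep the written form consistent. This is a minimal sketch that assumes the "NCBI:txid<number>" style shown in the example above. The function names are hypothetical and are not part of any NCBI tool.

```python
import re


def format_ncbi_taxid(taxid: int) -> str:
    """Write a taxonomy ID in the 'NCBI:txid<number>' style used in this guide."""
    if taxid <= 0:
        raise ValueError("taxonomy IDs are positive integers")
    return f"NCBI:txid{taxid}"


def parse_ncbi_taxid(label: str) -> int:
    """Recover the numeric taxonomy ID from a 'NCBI:txid<number>' label."""
    match = re.fullmatch(r"NCBI:txid(\d+)", label.strip())
    if match is None:
        raise ValueError(f"not in NCBI:txid form: {label!r}")
    return int(match.group(1))


print(format_ncbi_taxid(9837))            # NCBI:txid9837
print(parse_ncbi_taxid("NCBI:txid9837"))  # 9837
```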

Model Organism Databases (for strain info)

International Mouse Strain Resource
Searchable online database of mouse strains, stocks, and mutant ES cell lines available worldwide, including inbred, mutant, and genetically engineered strains. Use the official IMSR name.
Mouse Genome Informatics (MGI)
Rat Genome Database (RGD)
WormBase
Zebrafish Model Organism Database (ZFIN)
FlyBase
Plant Model Organism Databases
Collected by Plant Metabolic Network

Reagent & Cell Line Identification

Report:

Full, descriptive name of the resource, including host species, if relevant
Source of the resource, as in the vendor or lab
Unique ID (could be a catalog number, accession number, CAS ID).
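
The three items above can be treated as a simple record, which makes it easy to spot a missing field before a manuscript is submitted. The sketch below is one possible way to do this. The class ReagentReport, its field names, and the example values are all hypothetical placeholders.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ReagentReport:
    """One reagent or cell line as it would be reported in a methods section."""
    name: str                   # full, descriptive name, including host species if relevant
    source: str                 # vendor or lab
    unique_id: str              # catalog number, accession number, CAS ID, ...
    rrid: Optional[str] = None  # RRID, when one is available

    def missing_fields(self) -> list[str]:
        """List the required fields that were left blank."""
        required = {
            "name": self.name,
            "source": self.source,
            "unique_id": self.unique_id,
        }
        return [field for field, value in required.items() if not value.strip()]

    def citation(self) -> str:
        """Format the record as 'name (source, unique ID[, RRID])'."""
        parts = [self.source, self.unique_id]
        if self.rrid:
            parts.append(self.rrid)
        return f"{self.name} ({', '.join(parts)})"


# Placeholder values for illustration only.
antibody = ReagentReport(
    name="rabbit anti-ExampleProtein polyclonal antibody",
    source="Example Vendor",
    unique_id="Cat# EX-0001",
    rrid="RRID:AB_0000001",
)
print(antibody.citation())
print(ReagentReport(name="example cell line", source="", unique_id="").missing_fields())
```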

Again, RRIDs are available for many antibodies and cell lines and should be used when available.

RRIDs: Antibodies
RRIDs: Cell Lines
Antibody Registry
Integrates information from the Resource Identification Portal with Journal of Comparative Neurology's Antibody Database. Another way to search for antibodies and find RRIDs.
Eagle-i Repository
Biomedical resources available at universities. Includes reagents, cell lines, organisms and more.

Sequence & Variant Identification

Genes discussed in the literature often have multiple names, symbols, and IDs associated with them. This can cause confusion when reporting on genes or sequences and also when searching for them. Gene nomenclature committees and molecular sequence databases provide standardized names and identifiers as well as known synonyms.

Approved gene symbol: There are species-specific rules about naming genes. Nomenclature committees that assign gene names and symbols exist for a variety of organisms (see below). If you are publishing a report on a gene that does not have an assigned symbol, you can contact these committees to request that they assign one prior to publication. Some journals will require this step.

Accession IDs: Molecular sequence databases assign unique accession numbers to sequences. The International Nucleotide Sequence Database Collaboration (DDBJ/EMBL-EBI/NCBI) assigns accessions in a specific format. In NCBI, for example, you may find a record with an accession like EF212037.2. Here, "EF212037" is the GenBank (direct submission) accession number, and ".2" is the version number, which indicates that this is the second version of the record. It is therefore important to include the dot and version number when reporting accession IDs from these sources.
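
The accession.version structure can be checked mechanically, for example to flag accessions in a draft that lack a version number. The sketch below assumes a simplified pattern (one to six capital letters, an optional underscore, digits, and an optional ".version" suffix). It does not cover every accession format issued by the databases, and the function name split_accession is hypothetical.

```python
import re
from typing import Optional, Tuple

# Simplified, assumed pattern: letters, optional underscore, digits, optional ".version".
ACCESSION_PATTERN = re.compile(r"^([A-Z]{1,6}_?[0-9]+)(?:\.([0-9]+))?$")


def split_accession(accession: str) -> Tuple[str, Optional[int]]:
    """Split 'EF212037.2' into ('EF212037', 2); the version is None if absent."""
    match = ACCESSION_PATTERN.match(accession.strip())
    if match is None:
        raise ValueError(f"unrecognized accession format: {accession!r}")
    base, version = match.group(1), match.group(2)
    return base, (int(version) if version is not None else None)


for acc in ["EF212037.2", "EF212037"]:
    base, version = split_accession(acc)
    if version is None:
        print(f"{acc}: no version number, so the exact record is ambiguous")
    else:
        print(f"{acc}: accession {base}, version {version}")
```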

GenBank Accession Number Reference Sheet (includes RefSeq)
More about types of reference sequences

Reporting variants: HGVS is the standard for reporting human gene variants. Model organism nomenclature committees include species-specific rules for unambiguously naming variants.

The Human Genome Variation Society (HGVS) provides detailed recommendations for the unambiguous naming of sequence variants. HGVS notation includes a reference sequence, type of reference sequence (DNA, RNA, protein...), nucleotide/residue number, and type of variation (substitution, deletion, duplication...). The same variant can be named in various ways depending on the reference sequence selected. For example, allele 17, the "ultra-rapid metabolizer," of the CYP2C19 gene may be named:

NG_008384.2:g.4195C>T
NM_000769.2:c.-806C>T
NC_000010.11:g.94761900C>T (GRCh38)
NC_000010.10:g.96521657C>T (GRCh37)

You may also see the allele name used, CYP2C19*17, or the rs number from dbSNP, rs12248560.
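
To make the parts of an HGVS name concrete, the sketch below splits a simple DNA substitution into its reference sequence, sequence type, position, and alleles. It is an illustration under narrow assumptions: it handles only single-nucleotide substitutions on g., c., n., or m. coordinates with a plain (optionally negative) position, and it ignores intronic offsets, 3' UTR positions, deletions, duplications, and protein-level names. The names Substitution and parse_substitution are hypothetical, and the example variant is a made-up placeholder. For real work, use a dedicated checker such as Mutalyzer (listed below).

```python
import re
from typing import NamedTuple


class Substitution(NamedTuple):
    reference: str  # versioned reference sequence, e.g. an NM_ or NC_ accession
    seq_type: str   # "g", "c", "n", or "m"
    position: int   # negative for positions upstream of the start codon (c. only)
    ref: str        # reference nucleotide
    alt: str        # variant nucleotide


# Deliberately narrow pattern: RefSeq-style accession with version, then
# "<type>.<position><ref>><alt>". Anything else is rejected.
HGVS_SUBSTITUTION = re.compile(
    r"^(?P<reference>[A-Z]{2}_[0-9]+\.[0-9]+):"
    r"(?P<seq_type>[gcnm])\."
    r"(?P<position>-?[0-9]+)"
    r"(?P<ref>[ACGT])>(?P<alt>[ACGT])$"
)


def parse_substitution(name: str) -> Substitution:
    """Parse a simple HGVS DNA substitution or raise ValueError."""
    match = HGVS_SUBSTITUTION.match(name.strip())
    if match is None:
        raise ValueError(f"not a simple DNA substitution: {name!r}")
    return Substitution(
        reference=match["reference"],
        seq_type=match["seq_type"],
        position=int(match["position"]),
        ref=match["ref"],
        alt=match["alt"],
    )


# Made-up placeholder variant, not a real record.
print(parse_substitution("NM_000000.1:c.-100C>T"))
```

Parsing names this way also shows why the reference sequence matters: the same variant gets a different position on each reference, so the position is meaningless without the accession and version in front of it.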

Abbreviation meanings in HGVS

Humans

HGNC GeneNames
Human. Curated online repository of HGNC-approved gene nomenclature, gene families, and associated resources. Currently, over 39,000 gene symbols are included.
HGVS Guidelines
Human Genome Variation Society guidelines for naming sequence variants.
Mutalyzer
Suite of tools to help create and check sequence variant names in HGVS format.
Locus Reference Genomic
LRG sequences provide a stable genomic DNA framework for reporting human variants with a permanent ID and core content that never changes.

Other Organisms

Converting Identifiers

bioDBnet: Database to Database Conversions
Allows for conversions of identifiers from one database to other database identifiers or annotations.

More Information

For more information on how to report research resources unambiguously, particularly when no unique identifier is available, see the following articles and sites, which provide guidance and some examples.

On the reproducibility of science: unique identification of research resources in the biomedical literature
Vasilevsky NA, Brush MH, Paddock H, Ponting L, Tripathy SJ et al. (2013) On the reproducibility of science: unique identification of research resources in the biomedical literature. PeerJ 1:e148 http://dx.doi.org/10.7717/peerj.148
Reporting research antibody use: how to increase experimental reproducibility
Helsby MA, Fenn JR, Chalmers AD. (2013) Reporting research antibody use: how to increase experimental reproducibility [v2; ref status: indexed, http://f1000r.es/1np] F1000Research 2013, 2:153 (doi: 10.12688/f1000research.2-153.v2)
Journal of Comparative Neurology Author Guidelines
JCN requires more precise reporting of resources than many other journals. See the section for Materials and Methods.