TCGA collects clinical and biospecimen information for all qualified patients participating in the study. The TCGA Biospecimen Core Resource (BCR) submits this information in clinical and biospecimen XML files for each patient. The XML files are then converted into tab-delimited text files, or "biotabs". Whereas the XML files are discrete per patient, the biotabs collate information across patients and biospecimens. Each biotab archive, and the files within it, is organized by cancer type. The following describes the types of biotab files generated from the individual patient XML files.
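As a rough illustration of what "collated" means here, the sketch below turns several per-patient XML documents into one tab-delimited table with one row per patient. It is not the BCR's actual conversion process: the XML snippets, the element names (`barcode`, `gender`, `vital_status`), and the helper functions `collate` and `to_tab_delimited` are all hypothetical, and real TCGA XML files are considerably more complex.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical, heavily simplified per-patient XML documents.
# Real TCGA XML uses different, namespaced, and nested elements.
PATIENT_XML = [
    "<patient><barcode>PATIENT-0001</barcode><gender>FEMALE</gender></patient>",
    "<patient><barcode>PATIENT-0002</barcode>"
    "<vital_status>Alive</vital_status></patient>",
]


def collate(xml_documents):
    """Return (columns, records) with one record per patient document."""
    columns = []
    records = []
    for document in xml_documents:
        root = ET.fromstring(document)
        # Flatten the direct children of the root into column -> value.
        record = {child.tag: (child.text or "").strip() for child in root}
        for name in record:
            if name not in columns:
                columns.append(name)  # keep first-seen column order
        records.append(record)
    return columns, records


def to_tab_delimited(columns, records):
    """Render the collated records as tab-delimited text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=columns,
        delimiter="\t",
        restval="",  # leave the cell empty when a patient lacks an element
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()
```

Calling `to_tab_delimited(*collate(PATIENT_XML))` yields a single table in which each patient occupies one row and elements missing for a patient become empty cells.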

Descriptions of the clinical data elements can be obtained at: https://tcga-data.nci.nih.gov/docs/dictionary/TCGA_BCR_DataDictionary.xml.

Enrollment forms can be obtained at the BCR website: http://www.nationwidechildrens.org/biospecimen-core-resource-for-the-cancer-genome-atlas.

Biospecimen (patient sample information):

Clinical (patient information):

Important note

Follow-up data (in biotab format) for TCGA patients are contained in the 'clinical_follow_up' files for each cancer type. The different versions of the follow-up files represent changes or new data added to follow-up forms over time. Multiple follow-up files for a single patient often represent a series of follow-ups over a period of time. However, multiple instances of the same follow-up file can also represent multiple new tumor events within the same time period. To obtain all available disease progression information, please use ALL of the follow_up files in your analyses, not just the latest version.
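The recommendation to use all follow-up files can be sketched as follows. This is a minimal example under stated assumptions, not an official TCGA tool: it assumes the follow-up biotabs sit in one directory, that their names contain `clinical_follow_up` and end in `.txt`, that the first row of each file holds the column names, and that patients are identified by a column named `bcr_patient_barcode`. Check all of these against the files you actually downloaded (for example, files with extra header rows would need those rows skipped). The function names are illustrative.

```python
import csv
import glob
import os
from collections import defaultdict


def find_follow_up_files(directory):
    """List every follow-up biotab in `directory` (assumed naming pattern)."""
    return glob.glob(os.path.join(directory, "*clinical_follow_up*.txt"))


def load_follow_up_rows(paths):
    """Read all given follow-up files and return their rows as dicts.

    Each row is tagged with the file it came from, so the follow-up
    version stays traceable after the files are combined.
    """
    rows = []
    for path in sorted(paths):
        with open(path, newline="", encoding="utf-8") as handle:
            reader = csv.DictReader(handle, delimiter="\t")
            for row in reader:
                row["source_file"] = os.path.basename(path)
                rows.append(row)
    return rows


def group_by_patient(rows, id_column="bcr_patient_barcode"):
    """Group combined rows by patient; `id_column` is an assumed column name."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row[id_column]].append(row)
    return dict(grouped)
```

Rows are kept as they are rather than de-duplicated, because repeated entries for one patient may be separate follow-ups or separate new tumor events. Different file versions may also carry different columns, so each row only contains the columns of its own file.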