The current web application only accepts processed and normalized PCL files.

Preprocessing method


For microarray data, we recommend the default MAS5.0 normalization steps with the Entrez BrainArray Custom CDF, followed by a final log transformation.


For RNA-seq data, we recommend aligning reads with Bowtie and TopHat against NCBI's transcript reference.

Quantile transformation

We apply a quantile transformation to compute hgu133plus2-like expression values. The hgu133plus2 reference was constructed from 1000 random samples. This step is performed automatically after submission.
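
As a rough illustration of what a quantile transformation does, the Python sketch below maps each value in a submitted sample to the reference value at the same rank-based quantile. It is not the URSA(HD) implementation: the function names build_reference and quantile_transform are hypothetical, and building the reference by averaging sorted samples is an assumption, since this page states only that the reference was constructed from 1000 random samples.

```python
def build_reference(samples):
    """Average the sorted values across samples (assumed construction).

    Every sample must contain the same number of values.
    """
    sorted_samples = [sorted(s) for s in samples]
    length = len(sorted_samples[0])
    if any(len(s) != length for s in sorted_samples):
        raise ValueError("all samples must have the same length")
    # Mean of the smallest values, then of the second smallest, and so on.
    return [sum(column) / len(column) for column in zip(*sorted_samples)]


def quantile_transform(values, reference):
    """Replace each value with the reference value at the same quantile.

    Ranks are converted to quantiles in [0, 1] and linearly interpolated
    into the sorted reference, so the two inputs may differ in length.
    Tied values receive neighbouring ranks in input order.
    """
    if not values or not reference:
        raise ValueError("values and reference must be non-empty")
    ref_sorted = sorted(reference)
    n, m = len(values), len(ref_sorted)
    order = sorted(range(n), key=lambda i: values[i])
    result = [0.0] * n
    for rank, index in enumerate(order):
        quantile = rank / (n - 1) if n > 1 else 0.5
        position = quantile * (m - 1)
        low = int(position)
        high = min(low + 1, m - 1)
        fraction = position - low
        result[index] = (ref_sorted[low] * (1 - fraction)
                         + ref_sorted[high] * fraction)
    return result
```

After the transformation, the sample's values follow the reference distribution while keeping their original ordering, which is what makes values from other platforms comparable to hgu133plus2 values.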

Submission PCL file format

URSA(HD) expects a two-column text file in which the first column contains Entrez IDs or HGNC gene names and the second column contains the corresponding quantified expression values. Refer to the example files:

HG-U133 plus 2.0 example: GSM100888
HG-U133A example: GSM74404
Illumina HiSeq 2000 example: ERX011182
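
Before submitting, it can help to check that a file matches the two-column layout described above. The Python sketch below is one way to do that. It assumes tab-separated columns and no header line, neither of which is stated on this page, and the function name parse_two_column is hypothetical.

```python
import io


def parse_two_column(handle):
    """Read gene identifier / expression value pairs from a text handle.

    Assumes tab-separated columns and no header line. Blank lines are
    skipped; any other malformed line raises ValueError.
    """
    expression = {}
    for line_no, line in enumerate(handle, start=1):
        line = line.rstrip("\r\n")
        if not line.strip():
            continue
        fields = line.split("\t")
        if len(fields) != 2:
            raise ValueError(
                f"line {line_no}: expected 2 columns, found {len(fields)}"
            )
        gene = fields[0].strip()
        # float() raises ValueError if the value is not numeric.
        expression[gene] = float(fields[1].strip())
    return expression


# Illustrative content only; the expression values are made up.
example = io.StringIO("7157\t8.25\nTP53\t3.5\n")
parsed = parse_two_column(example)
```

For a real file, pass an open file handle instead of the in-memory example, for instance `with open(path) as handle: parse_two_column(handle)`.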

Manual Curation Annotation

To make use of tissue relationships, gene expression experiments were annotated to one or more terms in the BRENDA Tissue Ontology. After an initial substring-based text mining of sample descriptions in GEO, term-to-experiment pairs were manually verified against the sample descriptions to exclude incorrect or ambiguous pairs; the associated publication (original paper) was examined only when the sample descriptions were ambiguous. Sample annotations were then propagated based on the tissue ontology. Note that experiments were not necessarily annotated to their most specific term in the ontology, although such attempts were made.
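
The propagation step can be pictured with a short Python sketch. It assumes that propagation means each experiment also receives every ancestor of its directly annotated terms, which is the usual convention for ontology annotations but is not spelled out here. The term names and the functions ancestors and propagate are hypothetical and are not taken from the BRENDA Tissue Ontology.

```python
def ancestors(term, parents):
    """Return every term reachable by following parent links upward."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        for parent in parents.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def propagate(direct, parents):
    """Extend each experiment's terms with all of their ancestors."""
    propagated = {}
    for experiment, terms in direct.items():
        full = set(terms)
        for term in terms:
            full |= ancestors(term, parents)
        propagated[experiment] = full
    return propagated


# Toy hierarchy for illustration only (child -> list of parents).
toy_parents = {
    "hepatocyte": ["liver"],
    "liver": ["digestive system"],
}
toy_direct = {
    "experiment_1": {"hepatocyte"},
    "experiment_2": {"liver"},
}
toy_result = propagate(toy_direct, toy_parents)
```

In this toy case, experiment_1 ends up annotated to hepatocyte, liver, and digestive system, so it can serve as an example for the broader terms as well as the specific one.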

Manual tissue annotations are available here: manual_annotations_ursa.csv