Main help page on how to search RNA Editing sites in REDIportal and use the embedded JBrowse.
Searching RNA Editing sites
Searching RNA Editing per sample
Browsing RNA Editing sites per gene
Browsing RNA Editing sites at genomic level
Searching Edited dsRNAs
Searching CLAIRE (Cell Line A-to-I Rna Editing database)
Searching REDIportal is straightforward, and even users with no bioinformatics skills can perform accurate searches across the database. RNA editing sites are stored according to their genomic positions and can be retrieved by providing either a genomic locus (“Genomic Region” field) or a known gene symbol (“Gene Name” field). The two fields are mutually exclusive. Genomic loci can be queried by entering chromosome coordinates in the format Chr:start-end (for example chr4:158101247-158308846).
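For users who prepare region strings programmatically before pasting them into the “Genomic Region” field, the following minimal sketch checks that a string matches the Chr:start-end format described above. The function name `parse_region` and the accepted chromosome naming pattern are assumptions made for illustration; they are not part of REDIportal.

```python
import re
from typing import NamedTuple


class GenomicRegion(NamedTuple):
    chrom: str
    start: int
    end: int


# Pattern for the "Chr:start-end" format; the chromosome naming
# convention (a "chr" prefix followed by letters/digits) is an assumption.
_REGION_RE = re.compile(r"^(chr[0-9A-Za-z_]+):(\d+)-(\d+)$")


def parse_region(text: str) -> GenomicRegion:
    """Parse a 'Chr:start-end' string into chromosome, start and end."""
    match = _REGION_RE.match(text.strip())
    if match is None:
        raise ValueError(f"not in Chr:start-end format: {text!r}")
    chrom = match.group(1)
    start, end = int(match.group(2)), int(match.group(3))
    if start > end:
        raise ValueError(f"start {start} is greater than end {end}")
    return GenomicRegion(chrom, start, end)


region = parse_region("chr4:158101247-158308846")
print(region.chrom, region.start, region.end)
```

A string that fails the check raises `ValueError`, which makes it easy to catch malformed coordinates before submitting a search.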
RNA editing events in known genes can be retrieved by entering the gene symbol in the “Gene Name” field. To exclude editing sites in the intergenic regions surrounding the entered gene, the “Exact Match” check box must be selected. The “Gene Name” field supports autocomplete to help select the right gene.
Users can select the Organism Name using the dropdown menu.
The Genome Assembly Version for each organism can be selected using another dedicated dropdown menu.
Once the genomic region or gene name has been entered (together with the Organism and Genome Assembly), the search can be refined using additional select menus. The following options are available:
Menu Name | Option |
---|---|
Location | Location menu allows the selection of RNA editing sites residing in Alu elements (ALU) or repetitive elements non-Alu (REP) or non repetitive regions (NONREP). |
Genic Region | This menu allows the selection of RNA editing sites residing in specific genic regions such as: untranslated regions (UTR) or intronic regions or coding/non-coding exons or intergenic regions. Reported classification has been carried out by ANNOVAR. |
AA Change | This menu allows the selection of RNA editing sites residing in protein coding regions and affecting codon integrity. Reported classification has been carried out by ANNOVAR. |
Tissue | This menu allows the selection of RNA editing sites residing in specific human tissues. More than one tissue can be selected per each search. Tissue names are according to GTEx. |
Body Site | This menu allows the selection of RNA editing sites residing in specific human/mouse body sites. More than one body site can be selected per each search. Body site names are according to GTEx. |
TCGA Study | This menu allows the selection of RNA editing sites residing in specific human TCGA studies. More than one study can be selected per each search. TCGA study names are according to the TCGA project. |
An example search can be run by clicking the "Example" button. All other searches are started by clicking the "Search" button. The search form can be reset by clicking the "Clean" button.
Once a search has been performed, results will be displayed in a sortable table. For human RNA editing sites, the output table will include the following columns:
Column Name | Meaning |
---|---|
Accession | A unique Accession Number |
Chr | Chromosome Name |
Position | Chromosome Coordinate |
Ref | Reference Nucleotide |
Ed | Edited Nucleotide |
Strand | Strand (+ or -) |
Location | Location of RNA Editing in repetitive or non-repetitive regions. |
Repeats | Class and family of repeat including the RNA editing position. |
Gene | Gene Symbol linked to GeneCards |
Region | Genic Region according to ANNOVAR |
EditedIn | The number of samples in which the specific position appears to be edited. It is shown by two progress bars indicating GTEx and TCGA samples, respectively. The percentage of edited samples for each project is shown on mouse-over. |
FurtherAnn | A colored flag with two values, coding and non-coding. The Coding flag, if activated, opens a pop-up with external links to UniProt accessions. The Non-Coding flag, instead, if activated, opens a pop-up with external links to RNAcentral accessions. |
ExFun | Exonic function limited to synonymous and non-synonymous positions. A colored flag indicates whether a site is synonymous (green) or non-synonymous (red). Click on the flag to open a pop-up with details. |
ProtSupp | A colored flag indicating support by proteomic data from the PRIDE database. Click on the flag to open external links to PRIDE. |
Pvalue | A colored flag reporting the REDInet p-value, which indicates the mean editing probability according to a deep learning model. Click on the flag to open a pop-up with prediction p-values for each body site. |
For each human position, REDIportal provides additional information when the blue arrow in the first column is clicked. This opens five tabs. The first tab, named "GTEx Heat-Map/BoxPlot", displays an RNA editing heat-map reporting the mean editing level per GTEx body site, as well as a box plot of RNA editing levels for each GTEx body site. Mouse over a body site to open a tooltip showing the average editing level or other relevant box plot values.
Three additional heat-maps showing ADAR, ADARB1 and ADARB2 expression values from GTEx body sites are reported.
The second tab, named "TCGA Heat-Map/BoxPlot", displays an RNA editing heat-map reporting the mean editing level per TCGA study, as well as a box plot of RNA editing levels for each TCGA body site. Mouse over a body site to open a tooltip showing the average editing level or other relevant box plot values.
Three additional heat-maps showing ADAR, ADARB1 and ADARB2 expression values from TCGA studies are reported.
The third tab named "Other Info" displays a table with additional annotations from Gencode, PhastCons and dbSNP.
The fourth tab named "Alternative Annotations" displays a table with gene/transcript annotations from RefSeq database and UCSC KnownGene table.
The last tab, named "Editing Details", displays two tables reporting: 1) the number of GTEx samples, GTEx tissues and GTEx body sites in which the position appears edited; 2) the number of TCGA samples, TCGA studies and TCGA diseases in which the position appears edited. Clicking the "View [GTEx or TCGA] Editing Details" button opens a new window with a table of editing levels for each experiment.
For RNA editing sites from non-human organisms, the output table will include the following columns:
Column Name | Meaning |
---|---|
Chr | Chromosome Name |
Position | Chromosome Coordinate |
Ref | Reference Nucleotide |
Ed | Edited Nucleotide |
Strand | Strand (+ or -) |
dbSNP | A colored flag indicating the presence of a SNP in dbSNP. Only SNPs classified as "genomic" are taken into account. A green flag indicates a match with dbSNP and also provides an external link to NCBI. |
Location | Location of RNA Editing in repetitive or non-repetitive regions. |
Repeats | Class and family of repeat including the RNA editing position. |
Gene | Gene Symbol |
Region | Genic Region according to ANNOVAR |
EditedIn | The number of samples in which the specific position appears to be edited. It is shown by a progress bar. |
ExFun | Exonic function limited to synonymous and non-synonymous positions. A colored flag indicates whether a site is synonymous (green) or non-synonymous (red). Click on the flag to open a pop-up with details. |
Phast | PhastCons conservation scores calculated for multiple alignments of 45 vertebrate genomes to the human genome. It ranges from 0 (no conservation) to 1000 (max conservation). Values derive from UCSC phastCons46way table. |
KnownIn | A colored flag indicating the presence of a site in other available databases (A: ATLAS, R: RADAR, D: DARNED). Click on R or D to open an external link to the RADAR or DARNED database, respectively. |
For each non-human position, REDIportal provides additional information when the blue arrow in the first column is clicked. This opens four tabs. The first tab, named "Heat-Map", displays an RNA editing heat-map reporting the mean editing level per body site. Mouse over a body site to open a tooltip showing the average editing level.
The second tab named "Box Plot" displays RNA Editing levels per each body site by means of box plots. Relevant values are available by mousing over each box plot.
The third tab named "Alternative Annotations" displays a table with gene/transcript annotations from RefSeq database and UCSC KnownGene table.
The last tab, named "Editing Details", displays the number of samples, tissues and body sites in which the position appears to be edited. Clicking the "View Editing Details" button opens a new window with a table of editing levels for each experiment.
The "View Editing Details" button opens a new window containing the editing information described in the table below. The layout of this window is the same for human and non-human organisms.
Column Name | Meaning |
---|---|
RNAseq Run | RNAseq Run accession number according to SRA database. |
WGS Run | Whole Genome Sequencing Run accession number according to SRA database. |
Tissue | Tissue Name according to GTEx project. |
BodySite | Body Site Name according to GTEx project. |
n.As | Number of RNAseq reads supporting Adenosine |
n.Gs | Number of RNAseq reads supporting Guanosine |
EditingFreq | RNA Editing Frequency |
gCoverage | Number of supporting genomic reads |
gFreq | Max Frequency of AG change at genomic level. |
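The relationship between the n.As, n.Gs and EditingFreq columns can be illustrated with a short sketch. It assumes the common definition of an A-to-I editing level, namely the fraction of G-supporting reads among all reads supporting A or G; this page does not state the exact formula used by REDIportal, so treat the function below as an illustration rather than the portal's implementation.

```python
def editing_frequency(n_a: int, n_g: int) -> float:
    """Fraction of A/G-supporting reads that carry G (assumed definition)."""
    if n_a < 0 or n_g < 0:
        raise ValueError("read counts must be non-negative")
    total = n_a + n_g
    if total == 0:
        raise ValueError("no reads support A or G at this position")
    return n_g / total


# Hypothetical read counts, for illustration only.
print(f"{editing_frequency(n_a=90, n_g=10):.2f}")  # prints 0.10
```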
Individual runs, tissues or body sites can be selected using the "Select" button below each column. Numerical columns can be sorted by clicking on the column title.
The result table can be downloaded and exported in Excel or PDF format for further analyses.
The result table can also be filtered by clicking the "Filter Editing Levels" button. This opens a pop-up in which the user can enter numeric values to filter RNA editing levels as well as the number of reads supporting adenosines or guanosines.
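The same kind of filtering can be reproduced offline on an exported table. The sketch below assumes that the entered values act as minimum thresholds and that each row is a dictionary keyed by the column names listed above; the function name, the rows and the run labels are all hypothetical.

```python
def filter_editing_rows(rows, min_freq=0.0, min_as=0, min_gs=0):
    """Keep rows whose values meet every minimum threshold."""
    return [
        row for row in rows
        if row["EditingFreq"] >= min_freq
        and row["n.As"] >= min_as
        and row["n.Gs"] >= min_gs
    ]


# Hypothetical rows; run labels and counts are made up for illustration.
rows = [
    {"RNAseq Run": "RUN_A", "n.As": 90, "n.Gs": 10, "EditingFreq": 0.10},
    {"RNAseq Run": "RUN_B", "n.As": 4, "n.Gs": 1, "EditingFreq": 0.20},
    {"RNAseq Run": "RUN_C", "n.As": 50, "n.Gs": 50, "EditingFreq": 0.50},
]
kept = filter_editing_rows(rows, min_freq=0.15, min_gs=5)
print([row["RNAseq Run"] for row in kept])  # prints ['RUN_C']
```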
Specific columns of the result table can be hidden by clicking the "Column visibility" button.
Users can increase the number of visible rows by using the "Show" button.
In the main result table, too, specific columns can be hidden by clicking the "Column visibility" button.
Search results can be downloaded using the "Download" button. This opens a pop-up in which users can select the columns to download.
Columns of each result table can be swapped or moved to customize the layout and column order.
Columns with gray arrows are sortable in ascending or descending order.
REDIportal also allows RNA editing searches at the sample level. Users can browse the RNA editing statistics detected in each RNAseq experiment by selecting the “Search Sample” page from the main menu and providing specific options. The search can be done by typing the GTEx run accession number or the TCGA aliquot ID in the “Sample name” field (for example SRR1069188 for GTEx or TCGA-OR-A5J1-01A-11R-A29S-07 for TCGA). Samples can also be selected by Data Source (GTEx or TCGA), Data Status (Normal or Tumor), Data Type (bulk tissue or single cell), GTEx tissue or body site, TCGA study or TCGA disease type. In addition, the expression of ADAR genes or Alu Editing Index values can be used to further select samples.
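The two kinds of sample names can be told apart with a small sketch. It relies on the standard TCGA aliquot barcode layout (project, tissue source site, participant, sample type and vial, portion and analyte, plate, center), in which sample type codes 01 to 09 denote tumor and 10 to 19 denote normal samples. Treating any SRA-style run accession as a GTEx run is an assumption based on the example above, and the function name is hypothetical; this is not how REDIportal itself is known to classify samples.

```python
import re

# Standard TCGA aliquot barcode, e.g. TCGA-OR-A5J1-01A-11R-A29S-07.
_TCGA_RE = re.compile(
    r"^TCGA-(?P<tss>[A-Z0-9]{2})-(?P<participant>[A-Z0-9]{4})"
    r"-(?P<sample>\d{2})(?P<vial>[A-Z])"
    r"-(?P<portion>\d{2})(?P<analyte>[A-Z])"
    r"-(?P<plate>[A-Z0-9]{4})-(?P<center>\d{2})$"
)
# SRA-style run accession, e.g. SRR1069188.
_SRA_RUN_RE = re.compile(r"^[SED]RR\d+$")


def classify_sample_name(name: str) -> dict:
    """Return the assumed data source and, for TCGA, the barcode fields."""
    name = name.strip()
    if _SRA_RUN_RE.match(name):
        # Assumption: run accessions entered here refer to GTEx runs.
        return {"source": "GTEx", "run": name}
    match = _TCGA_RE.match(name)
    if match is None:
        raise ValueError(f"unrecognized sample name: {name!r}")
    code = int(match.group("sample"))
    if 1 <= code <= 9:
        status = "Tumor"
    elif 10 <= code <= 19:
        status = "Normal"
    else:
        status = "Other"
    return {"source": "TCGA", "status": status, **match.groupdict()}


print(classify_sample_name("SRR1069188")["source"])
print(classify_sample_name("TCGA-OR-A5J1-01A-11R-A29S-07")["status"])
```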
Once a search has been performed, results will be displayed in a table including the following columns:
Column Name | Meaning |
---|---|
Sample | Sample Name (RNAseq accession number) |
WGS/WES | WGS/WES Name (DNAseq accession number) from the same individual, if available |
Source | Project source name |
Organism | Organism name |
Events | Number of RNA editing events detected in the sample |
Hyper | Number of hyper-edited events detected in the sample |
Body Site | Name of the body site |
Status | Disease Status |
Type | Tissue type: bulk or single cell |
AEI | Alu Editing Index |
REI | Recoding Editing Index |
ADAR | Expression of ADAR gene (in TPM) |
ADARB1 | Expression of ADARB1 gene (in TPM) |
ADARB2 | Expression of ADARB2 gene (in TPM) |
For each sample, REDIportal provides additional information when the blue arrow in the first column is clicked. This opens five tabs:
Tab name | Content |
---|---|
Genomics Facts | Main statistics about the genomics location of detected RNA editing events |
Base Distribution | Distribution of the variants detected by our HPC REDItools pipeline |
RNA Editing Indices | Box plots of the AEI and REI indices for the specific body site. Details of recoding events per sample are available by clicking on "REI details" |
RNA Editing Levels | Distribution of RNA editing levels from detected sites |
Transcriptome Coverage | Fraction of edited genes over the entire annotation. |
RNA editing events stored in REDIportal are also visible in their genic context through our novel Gene View functionality. Users can explore known events per gene by selecting the "Gene View" page from the Search Menu and providing the name of the gene of interest (according to the available organisms and genome assemblies).
Once a gene has been selected, the user will be able to see the structure of the gene locus organised in transcripts and a panel containing all known editing events for the specific locus. Users can also zoom on specific gene locations.
All RNA editing events stored in REDIportal are visible in their genomic context through JBrowse, a fast genome browser based on JavaScript and HTML5. It is embedded in REDIportal by default, allowing the browsing of basic tracks such as individual RNA editing sites, SNPs, RefSeq gene annotations, Alu elements and LINEs.
Genomic intervals can be inspected by entering chromosome coordinates in the JBrowse search box (inside the red oval below) using the format Chr:start..end or Chr:start-end (for example chr4:158101247..158308846). Commas can be used as thousands separators (as in UCSC) in the start and end positions, but they are not required.
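The accepted location formats can be summarized in a short sketch that normalizes either separator, with or without thousands separators, to the Chr:start..end form. The function names are hypothetical, and the sketch only mirrors the formats described above rather than JBrowse's own parser.

```python
import re

# Accepts "Chr:start..end" or "Chr:start-end", with optional commas in numbers.
_LOCATION_RE = re.compile(
    r"^\s*([^:\s]+)\s*:\s*([\d,]+)\s*(?:\.\.|-)\s*([\d,]+)\s*$"
)


def parse_location(text: str) -> tuple:
    """Return (chrom, start, end) from a JBrowse-style location string."""
    match = _LOCATION_RE.match(text)
    if match is None:
        raise ValueError(f"unrecognized location: {text!r}")
    chrom = match.group(1)
    # int() raises ValueError if a number is empty after removing commas.
    start = int(match.group(2).replace(",", ""))
    end = int(match.group(3).replace(",", ""))
    if start > end:
        raise ValueError(f"start {start} is greater than end {end}")
    return chrom, start, end


def format_location(chrom: str, start: int, end: int) -> str:
    """Format a location using the 'Chr:start..end' convention."""
    return f"{chrom}:{start}..{end}"


for text in ("chr4:158101247..158308846", "chr4:158,101,247-158,308,846"):
    print(format_location(*parse_location(text)))
```

Both inputs normalize to the same string, so either form can be used interchangeably when building location strings.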
Alternatively, the JBrowse search box accepts gene symbols and offers autocomplete suggestions for gene names while typing (an example is shown in the red circle below).
If the gene symbol is present in multiple JBrowse tracks implemented in REDIportal, a dialog window opens to allow the selection of the correct track. In the example below, the GRIA1 gene is present in both the Gencode Basic V19 track and the RefSeq track. The dialog window allows the selection of the required track using the "Go" buttons.
Navigation buttons are located to the left of the search box in the consolidated header region. The arrow buttons move the view about the distance of one screen left or right. The larger zoom buttons zoom in or out about twice as far as the smaller buttons.
In addition to these buttons, JBrowse supports click and drag selection of regions in both the chromosome-level and detail-level position bars.
Each JBrowse data track has a context-specific menu, hidden by default. The down arrow on the title bar of the track allows the visualization of the following options:
About this track: provides some additional information about a particular track such as the track type, category and legend.
Pin to top: causes that track to always be displayed directly beneath the header area at the top of the browser window.
Edit Config: allows the user to directly edit the configuration script for a particular track, even though it is not recommended for most users.
Delete track: turns off individual tracks.
Save Track Data: allows the viewing and saving of track data in gff3, bed or sequin format. Reference sequences can be exported in fasta format.
Display mode: offers three display modes: 1) "Normal" view; 2) "Compact", which reduces the height of each object in the track; and 3) "Collapse" (default for RNA editing and SNP tracks), which moves all objects onto a single line of the track.
Show labels: displays labels when the view is zoomed in sufficiently. "Show labels" box is turned off by default in RNA editing and SNP tracks.
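When working with track data saved through "Save Track Data", it helps to remember that BED and GFF3 use different coordinate conventions: BED intervals are 0-based with an exclusive end, whereas GFF3 intervals are 1-based and inclusive. The sketch below writes one single-nucleotide site in both formats, assuming the input position is 1-based. The function names, the `source` value and the position are hypothetical, and the output is not claimed to match JBrowse's export byte for byte.

```python
def site_to_bed(chrom: str, pos: int, name: str = ".", strand: str = "+") -> str:
    """Format a 1-based position as a BED line (0-based start, exclusive end)."""
    if pos < 1:
        raise ValueError("position must be 1-based and positive")
    return "\t".join([chrom, str(pos - 1), str(pos), name, "0", strand])


def site_to_gff3(chrom: str, pos: int, strand: str = "+",
                 source: str = "example",
                 feature_type: str = "sequence_feature") -> str:
    """Format a 1-based position as a GFF3 line (1-based, inclusive)."""
    if pos < 1:
        raise ValueError("position must be 1-based and positive")
    # Columns: seqid, source, type, start, end, score, strand, phase, attributes
    return "\t".join(
        [chrom, source, feature_type, str(pos), str(pos), ".", strand, ".", "."]
    )


# Illustrative position only; not a claim about a real editing site.
print(site_to_bed("chr4", 158101247))
print(site_to_gff3("chr4", 158101247))
```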
Details of each track can be explored by clicking on an annotation, which opens a dedicated pop-up window. In addition, left-clicking on a feature opens an embedded pop-up window showing further options:
Zoom: allows the zoom on the specific feature;
Highlight: enables the highlighting of a feature (this behaviour can be disabled clicking the highlight button on navigation bar);
Link to a specific web page to retrieve additional information. Gene annotations are linked to the GeneCards database. SNPs are linked to NCBI dbSNP, while RNA editing sites are linked to REDIportal details, including information about edited tissues, body sites and samples with the corresponding frequency values.
View details: features with no specific links have a “View details” option to open a pop-up window as above.
The JBrowse instance embedded in REDIportal also includes further tracks, available as a list in the left side bar, which is visible only in full-screen mode. The full-screen view can be enabled by clicking the “Full-screen view” link in the upper right corner.
REDIportal also allows the browsing of edited dsRNAs. Users can look at putative dsRNAs per gene by selecting the organism name and the gene of interest. Since not all genes include dsRNAs, an autocomplete function has been activated for the Gene Name field.
Once a gene has been selected, results will be displayed in a new page including:
1) a graphical overview of the gene locus, in which only the longest transcript is taken into account, showing putative dsRNAs as arcs. The transcript structure is rendered by Gosling, a grammar-based toolkit for scalable and interactive genomics data visualization. UTRs, CDS and intronic regions are shown in different colors.
2) a table reporting the genomic coordinates of putative dsRNAs, including their length and percent identity. Using the blue arrow, users can display for each dsRNA a heat-map with the corresponding RNA editing values per GTEx body site. RNA editing per dsRNA is calculated as an index analogous to the Alu editing index. This dsRNA index is defined as the weighted mean of the RNA editing levels occurring in a given dsRNA. The dsRNA index is also calculated for the entire transcript and shown as a heat-map in the last row of the table.
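A weighted mean of editing levels can be sketched as follows. The page does not specify the weights, so this example assumes that each site is weighted by its A+G read coverage, by analogy with the Alu editing index; under that assumption the index reduces to the total number of G reads divided by the total number of A and G reads. The function name and read counts are hypothetical.

```python
import math


def weighted_editing_index(sites):
    """Coverage-weighted mean editing level over (n_a, n_g) read-count pairs."""
    sites = list(sites)
    total_g = sum(n_g for _, n_g in sites)
    total_reads = sum(n_a + n_g for n_a, n_g in sites)
    if total_reads == 0:
        raise ValueError("no reads support A or G at any site")
    return total_g / total_reads


# Hypothetical (n_a, n_g) counts for two sites in one dsRNA.
sites = [(90, 10), (10, 10)]
index = weighted_editing_index(sites)

# Same value computed explicitly as a weighted mean of per-site levels,
# using each site's coverage (n_a + n_g) as its weight.
levels = [n_g / (n_a + n_g) for n_a, n_g in sites]
weights = [n_a + n_g for n_a, n_g in sites]
explicit = sum(l * w for l, w in zip(levels, weights)) / sum(weights)

print(math.isclose(index, explicit))  # prints True
```

Weighting by coverage keeps poorly covered sites from dominating the index, which a plain average of per-site levels would not do.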
Users can now perform searches across the CLAIRE database (Cell Line A-to-I Rna Editing) enabling the identification of cell lines suitable for investigating specific RNA editing sites. Cell lines can be retrieved according to the AEI (Alu Editing Index) and the expression levels of ADAR and ADARB1 genes.
Up to 5 target sites (in hg19 coordinates) can be selected.
For each site, specific search parameters can be tuned such as the expression of the target gene, the coverage depth (number of reads supporting the site) and the RNA editing level.
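The constraints above can be captured in a small data structure when preparing a list of target sites before entering them in the form. Everything in this sketch is hypothetical: the class and field names, the interpretation of the tunable parameters as minimum thresholds, and the assumption that editing levels lie between 0 and 1. Only the limit of five sites and the use of hg19 coordinates come from this page.

```python
from dataclasses import dataclass

MAX_TARGET_SITES = 5  # limit stated on this page


@dataclass(frozen=True)
class TargetSite:
    chrom: str
    position: int                    # hg19 coordinate
    min_expression_tpm: float = 0.0  # assumed minimum target gene expression
    min_coverage: int = 0            # assumed minimum number of supporting reads
    min_editing_level: float = 0.0   # assumed minimum editing level in [0, 1]


def validate_targets(sites):
    """Check a list of target sites against the constraints described above."""
    sites = list(sites)
    if not sites:
        raise ValueError("at least one target site is required")
    if len(sites) > MAX_TARGET_SITES:
        raise ValueError(f"at most {MAX_TARGET_SITES} target sites are allowed")
    for site in sites:
        if site.position < 1:
            raise ValueError(f"invalid position for {site.chrom}: {site.position}")
        if site.min_coverage < 0 or site.min_expression_tpm < 0:
            raise ValueError("thresholds must be non-negative")
        if not 0.0 <= site.min_editing_level <= 1.0:
            raise ValueError("editing level threshold must be between 0 and 1")
    return sites


# Illustrative coordinates only; not a claim about real editing sites.
targets = validate_targets([
    TargetSite("chr4", 158101247, min_coverage=10, min_editing_level=0.1),
    TargetSite("chr1", 1000000),
])
print(len(targets))  # prints 2
```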
Once a search has been performed, results will be displayed in a table including the following columns:
Column Name | Meaning |
---|---|
Cell Line | Cell Line name |
Sample | Sample name |
AEI | Alu Editing Index |
ADAR expr | ADAR expression (TPM) |
ADARB1 expr | ADARB1 expression (TPM) |
Tissue | Human tissue of origin |
Cell lines, samples or tissues can be selected using the "Select" button below each column, and all columns can be sorted (in ascending or descending order) by clicking on the column title.
Results can be downloaded and exported in Excel or PDF format for further analyses.
For each cell line, additional information is provided in tabular format when the blue arrow in the first column is clicked. It includes the expression of the target gene (in TPM), the coverage depth (number of reads) of the specific target site and its editing level.