Overview
RNAPeaks is an R package for producing publication-quality visualizations of RNA-binding protein (RBP) peaks overlaid on annotated gene structures. Starting from a BED file of protein binding sites, RNAPeaks draws each peak as a rectangle above the corresponding gene model — showing exons, UTRs, and introns at accurate genomic scale.
The package provides four analysis modes:
| Mode | Function | Input |
|---|---|---|
| Single-gene peak plot | PlotGene() |
BED + gene ID |
| Genomic region peak plot | PlotRegion() |
BED + coordinates |
| Splicing map (SE) | createSplicingMap() |
BED + SE.MATS |
| Splicing map (RI) | createRetainedIntronSplicingMap() |
BED + RI.MATS |
| Sequence motif map (SE) | createSequenceMap() |
SE.MATS + motif |
| Sequence motif map (RI) | createRetainedIntronSequenceMap() |
RI.MATS + motif |
Installation
# Bioconductor dependencies
if (!require("BiocManager", quietly = TRUE))
install.packages("BiocManager")
BiocManager::install(c(
"GenomicRanges", "IRanges", "S4Vectors", "BiocGenerics",
"GenomeInfoDb", "AnnotationHub", "Biostrings", "BSgenome"
))
# Install RNAPeaks from GitHub
devtools::install_github("Krushna-B/RNAPeaks")Loading GTF annotation
Gene structure diagrams are built from an Ensembl GTF annotation.
Load it once per session with LoadGTF() and optionally
cache it to disk to avoid re-downloading on subsequent sessions.
# First call downloads from AnnotationHub (~1–2 min)
gtf <- LoadGTF(species = "Human")
# Cache locally for instant loading next time
saveRDS(gtf, "human_gtf.rds")
# The package ships with a pre-loaded GTF — no download required
data(gtf_human)
gtf <- gtf_humanSupported species: "Human" (Ensembl GRCh38, AH110867)
and "Mouse" (Ensembl GRCm38, AH47076).
Preparing a BED file
Use checkBed() to validate and normalize any BED-format
data frame before passing it to a plotting function. Columns are mapped
by position:
| Position | Name | Default |
|---|---|---|
| 1 | chr |
required |
| 2 | start |
required |
| 3 | end |
required |
| 4 | tag |
"peak" |
| 5 | score |
0 |
| 6 | strand |
"+" |
my_bed <- read.table("my_peaks.bed", header = FALSE, sep = "\t")
my_bed <- checkBed(my_bed)The package includes a ready-to-use sample dataset:
head(sample_bed)
#> chr start end tag score strand
#> 1 21 8401778 8401840 AARS_K562_IDR 1000 +
#> 2 19 4035770 4035869 AARS_K562_IDR 1000 +
#> 3 11 93721690 93721719 AARS_K562_IDR 1000 +
#> 4 19 2270224 2270300 AARS_K562_IDR 1000 +
#> 5 m 12167 12208 AARS_K562_IDR 1000 +
#> 6 16 2760062 2760143 AARS_K562_IDR 1000 +Gene-level visualization
PlotGene() plots all RBP peaks that overlap a single
gene of interest.
result <- PlotGene(
bed = sample_bed,
geneID = "GAPDH",
gtf = gtf
)
result$plot # display the plot
result$csv # filtered peaks used in the figureKey parameters:
| Parameter | Default | Description |
|---|---|---|
geneID |
— | Gene symbol or Ensembl ID ("ENSG...") |
TxID |
NA |
Specific transcript isoform; defaults to longest |
species |
"Human" |
"Human" or "Mouse"
|
order_by |
"Count" |
Peak track ordering: "Count", "Target", or
"Region"
|
merge |
0 |
Merge peaks within this many bp of each other |
five_to_three |
FALSE |
Orient plot 5′ → 3′ regardless of strand |
Region-level visualization
PlotRegion() displays all genes and their overlapping
peaks within a specified genomic window.
result <- PlotRegion(
bed = sample_bed,
gtf = gtf,
Chr = "12",
Start = 56000000,
End = 56050000,
Strand = "+"
)
result$plotSplicing map analysis
createSplicingMap() quantifies RBP binding frequency at
every position across a four-region window spanning exon-intron
junctions of skipped-exon (SE) events from rMATS output.
createSplicingMap(
bed_file = sample_bed,
SEMATS = sample_se.mats
)The function classifies events into three groups based on
IncLevelDifference and PValue /
FDR:
-
Retained —
IncLevelDifference < -0.1: exon inclusion significantly higher in condition 1 -
Excluded —
IncLevelDifference > 0.1: exon skipping significantly higher in condition 1 - Control — non-significant, stable inclusion (PValue > 0.95)
A bootstrap procedure samples a matched number of control events
across control_iterations replicates to build a null
distribution. Positions where Retained or Excluded frequencies exceed
the control z-score threshold are highlighted as significant
regions.
createSplicingMap(
bed_file = sample_bed,
SEMATS = sample_se.mats,
control_iterations = 20, # bootstrap replicates
show_significance = TRUE, # overlay significance bars
WidthIntoExon = 50, # bp into flanking exons
WidthIntoIntron = 300 # bp into flanking introns
)Sequence motif analysis
createSequenceMap() works identically to
createSplicingMap() but instead of protein binding, it
measures the per-position frequency of a target sequence motif. Full
IUPAC ambiguity codes are supported.
library(BSgenome.Hsapiens.UCSC.hg38)
# YCAY motif (Y = C or T)
createSequenceMap(
SEMATS = sample_se.mats,
sequence = "YCAY"
)Common IUPAC codes:
| Code | Bases |
|---|---|
R |
A or G |
Y |
C or T |
S |
G or C |
W |
A or T |
N |
any base |
Retained intron splicing map
createRetainedIntronSplicingMap() quantifies RBP binding
frequency around retained-intron junctions: the upstream exon/intron
boundary (UE-RI5) and the downstream intron/exon boundary (RI3-DE).
createRetainedIntronSplicingMap(
bed_file = sample_bed,
RIMATS = sample_se.mats
)Retained intron sequence map
createRetainedIntronSequenceMap() measures per-position
motif frequency at retained intron junctions using the same approach as
createSequenceMap().
library(BSgenome.Hsapiens.UCSC.hg38)
createRetainedIntronSequenceMap(
RIMATS = sample_se.mats,
sequence = "YCAY"
)