Overview
createSequenceMap() measures per-position sequence motif
frequency across the same four-region window used by
createSplicingMap(), applied to skipped-exon (SE) events.
Full IUPAC ambiguity codes are supported, enabling analysis of
degenerate RBP binding motifs.
The analysis logic, event classification, bootstrap procedure, and
significance testing are identical to createSplicingMap().
See the createSplicingMap() article for
details on the shared parameters.
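To make "per-position motif frequency" concrete, here is a minimal base R sketch on toy data. It is not the package's implementation: the sequences, the variable names, and the use of a regular expression in place of a genome lookup are all assumptions made for illustration. For each start position it reports the fraction of sequences whose 4-mer at that position matches YCAY.

```r
# Toy sequences standing in for one aligned region (hypothetical data,
# not taken from sample_se.mats).
seqs <- c("ATCATGG", "GGTCACA", "TCATTCA", "AAAAAAA")

# YCAY written out as a regular expression (Y = C or T).
motif_regex <- "^[CT]CA[CT]$"
k <- 4  # motif length

# Every start position at which a full k-mer fits.
positions <- seq_len(nchar(seqs[1]) - k + 1)

# For each position: fraction of sequences whose k-mer matches the motif.
motif_frequency <- vapply(positions, function(p) {
  kmers <- substring(seqs, p, p + k - 1)
  mean(grepl(motif_regex, kmers))
}, numeric(1))

motif_frequency
```

Plotting a vector like this against position, separately for each event class, gives the general shape of a sequence map.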
Basic usage
library(BSgenome.Hsapiens.UCSC.hg38)
createSequenceMap(
SEMATS = sample_se.mats,
sequence = "YCAY" # NOVA binding motif; Y = C or T
)

IUPAC ambiguity codes
Any standard IUPAC code is supported:
| Code | Matches |
|---|---|
| R | A, G |
| Y | C, T |
| S | G, C |
| W | A, T |
| K | G, T |
| M | A, C |
| B | C, G, T |
| D | A, G, T |
| H | A, C, T |
| V | A, C, G |
| N | A, C, G, T |
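One way to think about how an ambiguity code matches is to expand each code into a character class. The sketch below does this in base R. The helper name iupac_to_regex is hypothetical and not part of the package, and createSequenceMap() may well match motifs differently internally.

```r
# IUPAC codes and the bases they stand for, copied from the table above,
# plus the four unambiguous bases.
iupac <- c(
  A = "A", C = "C", G = "G", T = "T",
  R = "AG", Y = "CT", S = "GC", W = "AT", K = "GT", M = "AC",
  B = "CGT", D = "AGT", H = "ACT", V = "ACG", N = "ACGT"
)

# Hypothetical helper: turn an IUPAC motif into a regular expression
# with one character class per motif position.
iupac_to_regex <- function(motif) {
  codes <- strsplit(toupper(motif), "")[[1]]
  stopifnot(all(codes %in% names(iupac)))  # reject non-IUPAC characters
  paste0("[", iupac[codes], "]", collapse = "")
}

ycay_regex <- iupac_to_regex("YCAY")

# Does each toy sequence contain a YGCY site anywhere?
has_ygcy <- grepl(iupac_to_regex("YGCY"), c("ATGCTA", "ATGCAA"))
```

Here ycay_regex is "[CT][C][A][CT]", and only the first toy sequence contains a YGCY site (TGCT).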
# YGCY motif (MBNL binding site)
createSequenceMap(SEMATS = sample_se.mats, sequence = "YGCY")
# Poly-C run
createSequenceMap(SEMATS = sample_se.mats, sequence = "CCCC")

Multiple motifs
Pass a character vector to sequence to generate one plot
per motif:
plots <- createSequenceMap(
SEMATS = sample_se.mats,
sequence = c("YCAY", "YGCY", "CCCC")
)
plots[["YCAY"]]

Using a custom genome
By default createSequenceMap() uses
BSgenome.Hsapiens.UCSC.hg38. Pass any installed BSgenome
object via genome to use a different assembly:
library(BSgenome.Mmusculus.UCSC.mm10)
createSequenceMap(
SEMATS = sample_se.mats,
sequence = "YCAY",
genome = BSgenome.Mmusculus.UCSC.mm10
)

Returning data
Set return_data = TRUE to return the underlying frequency data
rather than a plot:
freq_df <- createSequenceMap(
SEMATS = sample_se.mats,
sequence = "YCAY",
return_data = TRUE
)
head(freq_df)