strsplit_ctl {fansi} | R Documentation |
A drop-in replacement for base::strsplit. It will be noticeably slower, but should otherwise behave the same way except for Control Sequence awareness.
strsplit_ctl(
  x, split, fixed = FALSE, perl = FALSE, useBytes = FALSE,
  warn = getOption("fansi.warn"),
  term.cap = getOption("fansi.term.cap"), ctl = "all"
)

strsplit_sgr(
  x, split, fixed = FALSE, perl = FALSE, useBytes = FALSE,
  warn = getOption("fansi.warn"),
  term.cap = getOption("fansi.term.cap")
)
x |
a character vector or, unlike base::strsplit, an object that can be coerced to character. |
split |
character vector (or object which can be coerced to such)
containing regular expression(s) (unless |
fixed |
logical. If |
perl |
logical. Should Perl-compatible regexps be used? |
useBytes |
logical. If |
warn |
TRUE (default) or FALSE, whether to warn when potentially
problematic Control Sequences are encountered. These could cause the
assumptions |
term.cap |
character, a vector of the capabilities of the terminal; can
be any combination of "bright" (SGR codes 90-97, 100-107), "256" (SGR codes
starting with "38;5" or "48;5"), and "truecolor" (SGR codes starting with
"38;2" or "48;2"). Changing this parameter changes how |
ctl |
character, which Control Sequences should be treated specially. See the "_ctl vs. _sgr" section for details.
This function works by computing the position of the split points after
removing Control Sequences, and uses those positions in conjunction with
substr_ctl to extract the pieces. This concept is borrowed from
crayon::col_strsplit. An important implication of this is that you cannot
split by Control Sequences that are being treated as Control Sequences.
You can however limit which Control Sequences are treated specially via the
ctl parameter (see examples).
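The position-based approach described above can be sketched in base R. This is only an illustration of the idea, not fansi's implementation: `split_positions_sketch` is a hypothetical helper, it recognizes only SGR sequences via a simplified regular expression, it handles a single string, and it extracts pieces from the stripped text with `substr` rather than with substr_ctl, so no styles are re-applied. Edge cases such as empty matches are not handled.

```r
# Hypothetical helper (not part of fansi): find where the pieces start and
# stop in the text that remains once SGR sequences are removed.
split_positions_sketch <- function(x, split, fixed = FALSE) {
  stopifnot(is.character(x), length(x) == 1L)
  sgr <- "\033\\[[0-9;]*m"            # simplified pattern: ESC [ ... m only
  stripped <- gsub(sgr, "", x)        # text without Control Sequences
  m <- gregexpr(split, stripped, fixed = fixed)[[1L]]
  if (m[1L] == -1L) {                 # no split point found
    return(list(stripped = stripped, start = 1L, stop = nchar(stripped)))
  }
  m.len <- attr(m, "match.length")
  list(
    stripped = stripped,
    start = c(1L, m + m.len),         # each piece starts after a match
    stop = c(m - 1L, nchar(stripped)) # and stops just before the next one
  )
}

pos <- split_positions_sketch("\033[31mhello\033[42m world!", " ")
# Extract from the stripped text; fansi would instead extract from the
# original string so that styles can be carried into each piece.
pieces <- substr(rep(pos$stripped, length(pos$start)), pos$start, pos$stop)
pieces
```

Because the split pattern is matched against the stripped text, a pattern that targets a stripped sequence can never match, which is the limitation noted above.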
A list; see base::strsplit.
The *_ctl versions of the functions treat all Control Sequences specially
by default. Special treatment is context dependent, and may include
detecting them and/or computing their display/character width as zero. For
the SGR subset of the ANSI CSI sequences, fansi will also parse, interpret,
and reapply the text styles they encode if needed. You can modify whether a
Control Sequence is treated specially with the ctl parameter. You can
exclude a type of Control Sequence from special treatment by combining
"all" with that type of sequence (e.g. ctl=c("all", "nl") for special
treatment of all Control Sequences but newlines). The *_sgr versions
only treat ANSI CSI SGR sequences specially, and are equivalent to the
*_ctl versions with the ctl parameter set to "sgr".
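As a quick check of the equivalence stated above, the two calls below can be compared directly. This is a minimal sketch that assumes fansi is installed; the `requireNamespace` guard and the variable names are additions for illustration only. Given the documented equivalence, `same` is expected to be TRUE when fansi is available, and it stays NA otherwise.

```r
x <- "\033[31mhello\033[42m world!"
same <- NA  # remains NA if fansi is not installed

if (requireNamespace("fansi", quietly = TRUE)) {
  via.sgr <- fansi::strsplit_sgr(x, " ")
  via.ctl <- fansi::strsplit_ctl(x, " ", ctl = "sgr")
  same <- identical(via.sgr, via.ctl)
}
same
```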
Non-ASCII strings are converted to and returned in UTF-8 encoding. The
split positions are computed after both x and split are converted to
UTF-8.
fansi for details on how Control Sequences are interpreted, particularly if you are getting unexpected results; base::strsplit for details on the splitting.
strsplit_sgr("\033[31mhello\033[42m world!", " ")

## Next two examples allow splitting by newlines, which
## normally doesn't work as newlines are _Control Sequences_
strsplit_sgr("\033[31mhello\033[42m\nworld!", "\n")
strsplit_ctl("\033[31mhello\033[42m\nworld!", "\n", ctl=c("all", "nl"))