Category: Strings: Regex

What does the regexpi function do in MATLAB / RunMat?

regexpi(text, pattern) evaluates regular expression matches while ignoring case by default. Outputs mirror MATLAB: you can retrieve 1-based match indices, substrings, capture tokens, token extents, named tokens, or the text split around matches. Flags such as 'once', 'tokens', 'match', 'split', 'tokenExtents', 'names', 'emptymatch', and 'forceCellOutput' are supported, together with case toggles ('ignorecase', 'matchcase') and newline behaviour ('dotall', 'dotExceptNewline', 'lineanchors').
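
As an illustration of the newline-related flags, the sketch below uses 'lineanchors' so that ^ anchors at the start of every line rather than only at the start of the text. The sample text and variable names are made up for this example, and the result noted in the comment follows MATLAB's behaviour, which this page describes RunMat as mirroring.

```matlab
% Two-line sample text (hypothetical); sprintf expands \n into a newline.
txt = sprintf('foo one\nFOO two');

% With 'lineanchors', ^ matches at the beginning of each line.
lineStarts = regexpi(txt, '^foo', 'match', 'lineanchors');

% In MATLAB this yields {'foo', 'FOO'}: one match per line, case ignored.
```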

How does the regexpi function behave in MATLAB / RunMat?

  • Case-insensitive matching is the default; include 'matchcase' when you need case-sensitive behaviour.
  • With one output, regexpi returns a numeric row vector of 1-based match start indices.
  • With multiple outputs, the default order is match starts, match ends, matched substrings.
  • When the input is a string array or cell array of character vectors, outputs are cell arrays whose shape matches the input container.
  • 'forceCellOutput' forces cell outputs even for scalar inputs, matching MATLAB semantics.
  • 'once' limits each element to its first match, so every requested output describes only that match.
  • 'emptymatch','allow' keeps zero-length matches; 'emptymatch','remove' is the default filter.
  • Named tokens (using (?<name>...)) return scalar struct values per match when 'names' is requested. Unmatched names resolve to empty strings for MATLAB compatibility.
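
To make the token-extent output from the list above concrete, here is a minimal sketch. The input and pattern are chosen only for illustration, and the commented result is what MATLAB returns for the same call; RunMat is assumed to match it.

```matlab
% In 'ID:AB12', 'A' is character 4 and '2' is character 7.
str = 'ID:AB12';

% Request the start and end index of every capture group.
ext = regexpi(str, '([a-z]+)(\d+)', 'tokenExtents');

% One cell per match; each row is [start end] for one token.
% In MATLAB, ext{1} is [4 5; 6 7].
letterSpan = ext{1}(1, :);
digitSpan  = ext{1}(2, :);
```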

regexpi Function GPU Execution Behaviour

regexpi executes entirely on the CPU. If inputs or previously computed intermediates are resident on the GPU, RunMat gathers the necessary data before evaluation and returns host-side outputs. Acceleration providers do not offer specialised hooks today; computed tensors remain on the host unless explicit GPU transfers are requested later.

Examples of using the regexpi function in MATLAB / RunMat

Finding indices regardless of case

idx = regexpi('Abracadabra', 'a');

Expected output:

idx =
     1     4     6     8    11

Returning matched substrings ignoring case

matches = regexpi('abcXYZ123', '[a-z]{3}', 'match');

Expected output:

matches =
  1×2 cell array
    {'abc'}    {'XYZ'}

Extracting capture tokens case-insensitively

tokens = regexpi('ID:AB12', '(?<prefix>[a-z]+)(?<digits>\d+)', 'tokens');
first = tokens{1}{1};
second = tokens{1}{2};

Expected output:

first =
    'AB'
second =
    '12'

Limiting regexpi to the first match

firstMatch = regexpi('aXaXaX', 'ax', 'match', 'once');

Expected output:

firstMatch =
    'aX'

Splitting a string array without worrying about letter case

parts = regexpi(["Color:Red"; "COLOR:Blue"], 'color:', 'split');

Expected output:

parts =
  2×1 cell array
    {1×2 cell}
    {1×2 cell}

parts{2}{2}
ans =
    'Blue'

Enforcing case-sensitive matches with 'matchcase'

idx = regexpi('CaseTest', 'case', 'matchcase');

Expected output:

idx =
     []

FAQ

How are the outputs ordered when I request several?

If you do not specify explicit output flags, the default order is match starts, match ends, and matched substrings—identical to MATLAB. Providing flags such as 'match' or 'tokens' returns only the requested outputs.
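
A short sketch of selecting outputs with flags. In MATLAB, flagged outputs come back in the order the flags are listed; the example assumes RunMat does the same. The input string and variable names are illustrative.

```matlab
% Two flags, two outputs: matched text first, then capture tokens.
[m, t] = regexpi('ID:AB12', '([a-z]+)(\d+)', 'match', 'tokens');

% MATLAB returns m = {'AB12'} and t = {{'AB', '12'}}.
whole  = m{1};      % full match
prefix = t{1}{1};   % first capture group
digits = t{1}{2};   % second capture group
```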

Can I make regexpi behave like regexp with case sensitivity?

Yes. Include the 'matchcase' flag to disable the default case-insensitive mode. You can also pass 'ignorecase' explicitly to emphasise the default.
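
The comparison below is a small sketch of that equivalence, using an arbitrary sample string. The commented values are what MATLAB produces.

```matlab
str = 'CaseTest';

idxInsensitive = regexpi(str, 't');               % matches 'T' and 't'
idxSensitive   = regexpi(str, 't', 'matchcase');  % matches only lowercase 't'
idxRegexp      = regexp(str, 't');                % regexp is case-sensitive by default

% In MATLAB, idxInsensitive is [5 8] and the other two results are both 8.
sameAsRegexp = isequal(idxSensitive, idxRegexp);
```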

Does regexpi support string arrays and cell arrays?

Yes. Outputs mirror the input container shape, and each element stores results for the corresponding string or character vector.
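
For a cell array input, a minimal sketch looks like this. The labels are invented for the example, and the commented shapes are those MATLAB reports.

```matlab
% Hypothetical 1x2 cell array of character vectors.
labels = {'Foo1', 'bar22'};

% One result cell per input element, in the same 1x2 layout.
nums = regexpi(labels, '\d+', 'match');

% In MATLAB, nums{1} is {'1'} and nums{2} is {'22'}.
secondNumber = nums{2}{1};
```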

How do zero-length matches behave?

By default ('emptymatch','remove'), zero-length matches are omitted. Use 'emptymatch','allow' to keep them, which is helpful when inspecting optional pattern components.
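
The sketch below contrasts the two settings on a pattern that can match zero characters. The 'emptymatch','allow' spelling is taken from this page; MATLAB itself spells the toggle as the single flags 'emptymatch' and 'noemptymatch', so adjust the call if you run it there. No output is shown because the exact count of empty matches is not stated on this page.

```matlab
% 'x*' can match zero characters, so any match in 'abc' is empty.
withoutEmpty = regexpi('abc', 'x*', 'match');
withEmpty    = regexpi('abc', 'x*', 'match', 'emptymatch', 'allow');

% Compare how many matches each call reports.
fprintf('default: %d, allow: %d\n', numel(withoutEmpty), numel(withEmpty));
```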

Does regexpi run on the GPU?

No. All matching occurs on the CPU. RunMat gathers GPU-resident inputs before processing and leaves outputs on the host. Explicit gpuArray calls are required if you want to move the results back to the GPU.
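
A minimal sketch of moving a result back to the device, assuming gpuArray is available in your environment (in MATLAB it requires Parallel Computing Toolbox and a supported GPU):

```matlab
% Matching runs on the CPU and the indices live in host memory.
idx = regexpi('Abracadabra', 'a');

% Assumption: a GPU provider is configured. The transfer is explicit.
idxGpu = gpuArray(idx);
```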

Are named tokens supported?

Yes. Use the (?<name>...) syntax and request the 'names' output flag. Each match produces a scalar struct with fields for every named group.
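
As a brief sketch, the call below reads both named groups from a single match. The input string is illustrative, and the commented values are what MATLAB returns.

```matlab
% Ask for the named-token struct instead of positional tokens.
names = regexpi('ID:AB12', '(?<prefix>[a-z]+)(?<digits>\d+)', 'names');

p = names.prefix;   % 'AB'
d = names.digits;   % '12'
```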

What happens with 'once'?

'once' restricts each input element to the first match. All requested outputs honour that limit and return the value for that single match directly (for example, a character vector for 'match') instead of a per-match cell array.
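
The difference in output type can be sketched as follows; the commented types are those MATLAB reports for the same calls.

```matlab
% Without 'once': a cell array with one entry per match.
allMatches = regexpi('aXaXaX', 'ax', 'match');

% With 'once': the first match itself, as a character vector.
firstMatch = regexpi('aXaXaX', 'ax', 'match', 'once');

isCellWithoutOnce = iscell(allMatches);   % true
isCharWithOnce    = ischar(firstMatch);   % true
```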

Can I keep scalar outputs in cells?

Yes. Pass 'forceCellOutput' to wrap even scalar results in cells, which is useful when writing code that must treat scalar and array inputs uniformly.
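
A minimal sketch of that uniform handling, with an invented input. The loop works unchanged whether the first argument is a single character vector or a cell array of them, because the result is always indexed per input element.

```matlab
% Scalar input, but the result is still wrapped in an outer cell.
m = regexpi('abc123', '\d+', 'match', 'forceCellOutput');

% In MATLAB, m is a 1x1 cell and m{1} is {'123'}.
for k = 1:numel(m)
    fprintf('input %d: %d match(es)\n', k, numel(m{k}));
end
```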

See Also

regexp, regexprep, contains, split, strfind
