
Category: Strings: Transform

What does the split function do in MATLAB / RunMat?

split(text) breaks text into substrings separated by delimiters. The input can be a string scalar, string array, character array, or a cell array of character vectors, and split mirrors MATLAB behaviour across each of these representations. When you omit the delimiter argument, split treats each run of whitespace as a single delimiter and returns the resulting tokens as a string array.

How does the split function behave in MATLAB / RunMat?

  • The default delimiter is whitespace (isspace), and consecutive whitespace is treated as a single separator (equivalent to 'CollapseDelimiters', true).
  • When you supply explicit delimiters, they can be a string scalar, string array, character array (rows), or a cell array of character vectors. Delimiters are matched left to right and the longest delimiter wins when several candidates match at the same position.
  • 'CollapseDelimiters' controls whether consecutive delimiters generate empty substrings. The default is false when you specify explicit delimiters and true when you rely on the whitespace default.
  • 'IncludeDelimiters' inserts the matched delimiters as separate elements in the output string array.
  • Outputs are string arrays. For scalar inputs, the result is a row vector. For string/character arrays, the first dimension matches the number of rows in the input and additional columns are appended to accommodate the longest token list. Rows with fewer tokens are padded with <missing>.
  • Missing string scalars propagate unchanged.
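The longest-match rule in the list above can be sketched with two overlapping delimiters. The input string and variable names here are illustrative only. The delimiters are listed longest first so the result does not depend on argument order, and elements are read with linear indexing so the snippet does not depend on the orientation of the output.

```matlab
% Illustrative input: both "--" and "-" match at the second character.
txt = "a--b-c";

% Under the longest-match rule, "--" is consumed as a single delimiter there,
% so no empty token should appear between "a" and "b".
parts = split(txt, ["--", "-"]);

disp(numel(parts));   % number of tokens produced
disp(parts(2));       % second token
```

If the rule applies as described, this yields the three tokens "a", "b", and "c".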

split Function GPU Execution Behaviour

split executes on the CPU. When the input or delimiter arguments reside on the GPU, RunMat gathers them to host memory before performing the split so the results match MATLAB exactly. Providers do not need to implement custom kernels for this builtin today.

GPU residency in RunMat (Do I need gpuArray?)

String manipulation currently runs on the host. If text data lives on the GPU (for example, as the result of an earlier GPU computation), split automatically gathers it to host memory. You never need to move text explicitly before calling this builtin.

Examples of using the split function in MATLAB / RunMat

Split A String On Whitespace

txt = "RunMat Accelerate Planner";
pieces = split(txt);

Expected output:

pieces = 1×3 string
    "RunMat"    "Accelerate"    "Planner"

Split A String Using A Custom Delimiter

csv = "alpha,beta,gamma";
tokens = split(csv, ",");

Expected output:

tokens = 1×3 string
    "alpha"    "beta"    "gamma"

Include Delimiters In The Output

expr = "A+B-C";
segments = split(expr, ["+", "-"], "IncludeDelimiters", true);

Expected output:

segments = 1×5 string
    "A"    "+"    "B"    "-"    "C"

Preserve Empty Segments When CollapseDelimiters Is False

values = "one,,three,";
parts = split(values, ",", "CollapseDelimiters", false);

Expected output:

parts = 1×4 string
    "one"    ""    "three"    ""

Split Each Row Of A Character Array

rows = char("GPU Accelerate", "Ignition Interpreter");
result = split(rows);

Expected output:

result = 2×2 string
    "GPU"          "Accelerate"
    "Ignition"     "Interpreter"

Split Elements Of A Cell Array

C = {'RunMat Snapshot'; 'Fusion Planner'};
out = split(C, " ");

Expected output:

out = 2×2 string
    "RunMat"    "Snapshot"
    "Fusion"    "Planner"

Handle Missing String Inputs

names = ["RunMat", "<missing>", "Accelerate Engine"];
split_names = split(names);

Expected output:

split_names = 3×2 string
    "RunMat"        "<missing>"
    "<missing>"     "<missing>"
    "Accelerate"    "Engine"

FAQ

What delimiters does split use by default?

When you omit the second argument, split treats any Unicode whitespace as a delimiter and collapses consecutive whitespace runs so they produce a single split point.

How do explicit delimiters change the defaults?

Providing explicit delimiters switches the default for 'CollapseDelimiters' to false, matching MATLAB. You can override that behaviour with a name-value pair.
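A minimal sketch of that override, using an illustrative input with one empty field. The variable names are assumptions made for this example.

```matlab
% Illustrative input with an empty field between the two commas.
values = "one,,three";

% Default with an explicit delimiter: 'CollapseDelimiters' is false,
% so the empty field is kept as "".
kept = split(values, ",");

% Override: treat the run of commas as a single separator.
collapsed = split(values, ",", "CollapseDelimiters", true);

disp(numel(kept));        % token count with the default
disp(numel(collapsed));   % token count after collapsing
```

Going by the defaults described above, kept should hold three tokens (with "" in the middle) and collapsed should hold two.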

What happens when 'IncludeDelimiters' is true?

Matched delimiters are inserted between tokens in the output string array, preserving their original order. Tokens still expand to fill rows and columns, with missing values used for padding.

How is the output sized for string arrays?

The number of rows matches the input. Columns are added to accommodate the longest token list observed across all elements. Shorter rows are padded with <missing>.
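As a sketch of this sizing rule, consider a two-row string array whose rows hold different numbers of tokens. The input values and variable names are illustrative.

```matlab
% Illustrative 2-by-1 string array: three tokens in row 1, two in row 2.
lines = ["a b c"; "d e"];

out = split(lines);

% One row per input element, one column per token of the longest row.
disp(size(out));
disp(out);
```

Under the rule above, out should be 2-by-3, with the last position of the second row filled by <missing>.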

How does split handle missing strings?

Missing string scalars propagate unchanged. When padding is required, <missing> is used so MATLAB and RunMat stay aligned.

Can I provide empty delimiters?

No. Empty delimiters are disallowed, matching MATLAB's input validation. Specify at least one character per delimiter.
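One way to observe this validation is to wrap the call in try/catch. This is a sketch only: the exact error identifier and message text are not specified on this page, so the snippet records only whether the call was rejected.

```matlab
rejected = false;
try
    % Empty delimiter: expected to fail input validation.
    split("abc", "");
catch err
    rejected = true;
    disp(err.message);   % message wording is implementation-defined
end

disp(rejected);
```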

Which argument types are accepted as delimiters?

You may pass string scalars, string arrays, character arrays (each row is a delimiter), or cell arrays containing string scalars or character vectors.
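The snippet below sketches three of these forms with the same pair of single-character delimiters. The input and variable names are illustrative, and the expectation is that all three calls produce the same tokens.

```matlab
txt = "a,b;c";   % illustrative input

% The same two delimiters, expressed three ways.
from_strings = split(txt, [",", ";"]);   % string array
from_cell    = split(txt, {',', ';'});   % cell array of character vectors
from_chars   = split(txt, [','; ';']);   % 2-by-1 char array, one delimiter per row

% Compare the three results element by element.
disp(isequal(from_strings, from_cell) && isequal(from_cell, from_chars));
```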

See Also

strsplit, replace, lower, upper, strip
