Three Different Cuts
In this post, we’ll look at how Rust, Go, and Zig express the signature of the function cut, the power tool of string manipulation.
Cut takes a string and a pattern, and splits the string around the first occurrence of the pattern: cut("life", "if") = ("l", "e").
At a glance, it seems like a non-orthogonal jumbling together of searching and slicing.
However, in practice a lot of ad-hoc string processing can be elegantly expressed via cut.
A lot of things are key=value pairs, and cut fits perfectly there.
What’s more, many more complex sequences, like --arg=key=value, can be viewed as nested pairs.
You can cut around = once to get --arg and key=value, and then cut a second time to separate key from value.
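The nested-pairs idea can be sketched in Rust with the standard `str::split_once` method. This is only an illustration: the helper name `parse_arg` and the choice to return `None` when either `=` is missing are assumptions, not something the post prescribes.

```rust
/// Hypothetical helper: splits `--arg=key=value` into its three parts
/// by cutting around `=` twice. Returns `None` if either `=` is missing.
fn parse_arg(s: &str) -> Option<(&str, &str, &str)> {
    // First cut: "--arg" and "key=value".
    let (arg, rest) = s.split_once('=')?;
    // Second cut: "key" and "value".
    let (key, value) = rest.split_once('=')?;
    Some((arg, key, value))
}
```

Because each cut splits only around the first occurrence, any further `=` characters stay inside the value.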
In Rust, this function looks like this:
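The standard-library method in question is `str::split_once`. Its declaration is quoted in a comment below as it appeared in the documentation around the time the method was stabilized; newer releases spell the lifetimes slightly differently. Since the `Pattern` trait cannot be named in stable code, the runnable part is a simplified, hypothetical wrapper `cut` that accepts only a `&str` separator.

```rust
// Declaration in the standard library (quoted for reference, not compiled;
// the exact lifetime spelling varies between Rust versions):
//
//     pub fn split_once<'a, P: Pattern<'a>>(&'a self, delimiter: P)
//         -> Option<(&'a str, &'a str)>

/// Hypothetical wrapper with a plain string separator.
/// Both halves of the result borrow from `s`, not from `sep`.
fn cut<'a>(s: &'a str, sep: &str) -> Option<(&'a str, &'a str)> {
    s.split_once(sep)
}
```

With this sketch, `cut("life", "if")` yields `Some(("l", "e"))`, and a missing separator yields `None`.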
Rust’s Option is a good fit for the result type: it clearly describes the behavior of the function when the pattern isn’t found in the string at all.
Lifetime 'a expresses the relationship between the result and the input: both pieces of the result are substrings of &'a self, so, as long as they are used, the original string must be kept alive as well.
Finally, the separator isn’t another string, but a generic P: Pattern.
This gives a somewhat crowded signature, but allows using strings, single characters, and even fn(c: char) -> bool functions as patterns.
When using the function, there is a multitude of ways to access the result:
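A minimal sketch of a few such ways, assuming `str::split_once` and a recent stable compiler (`let ... else` needs Rust 1.65 or later). The helper names are hypothetical and exist only to give each form a place to live.

```rust
/// `?` propagates the "not found" case to the caller.
fn value_of(s: &str) -> Option<&str> {
    let (_key, value) = s.split_once('=')?;
    Some(value)
}

/// `let ... else` handles the missing separator up front.
fn describe(s: &str) -> String {
    let Some((key, value)) = s.split_once('=') else {
        return format!("no '=' in {s:?}");
    };
    format!("{key} -> {value}")
}

/// `if let` runs code only when the separator is present.
fn has_empty_value(s: &str) -> bool {
    if let Some((_key, value)) = s.split_once('=') {
        return value.is_empty();
    }
    false
}

/// `unwrap_or` falls back to treating the whole string as the first piece.
fn key_or_whole(s: &str) -> (&str, &str) {
    s.split_once('=').unwrap_or((s, ""))
}
```

Since the result is an ordinary `Option` of a tuple, all the usual `Option` combinators and pattern-matching forms apply without any cut-specific machinery.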
Here’s a Go equivalent, strings.Cut: func Cut(s, sep string) (before, after string, found bool)
It has a better name!
It’s important that frequently used building-block functions have short, memorable names, and “cut” is just perfect for what the function does.
Go doesn’t have an Option, but it allows multiple return values, and any type in Go has a zero value, so a boolean flag can be used to signal None.
Curiously, if sep is not found in s, after is set to "", but before is set to s (that is, the whole string).
This is occasionally useful, and corresponds to the last Rust example.
But it also isn’t something immediately obvious from the signature; it’s an extra detail to keep in mind.
Which might be fine for a foundational function!
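To make the not-found behavior concrete, here is a Rust sketch that mimics the shape of Go's three-value result. The function `cut_go_style` is hypothetical and is built on `str::split_once`; it is a model of the behavior described above, not Go's implementation.

```rust
/// Hypothetical model of Go's result shape: (before, after, found).
fn cut_go_style<'a>(s: &'a str, sep: &str) -> (&'a str, &'a str, bool) {
    match s.split_once(sep) {
        Some((before, after)) => (before, after, true),
        // Not found: `before` is the whole input and `after` is empty.
        None => (s, "", false),
    }
}
```

A caller that ignores the flag still gets a usable `before`, which is what makes the fallback occasionally convenient and also easy to overlook.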
Similarly to Rust, the resulting strings point to the same memory as s.
There are no lifetimes, but there is a potential performance gotcha: if one of the resulting strings is alive, then the entire s can’t be garbage collected.
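There isn’t much variety in how the function is used in Go: you bind all three results, as in before, after, found := strings.Cut(s, sep), and then check found.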
Zig doesn’t yet have an equivalent function in its standard library, but it probably will at some point, and the signature might look like this: fn cut(s: []const u8, sep: []const u8) ?struct { prefix: []const u8, suffix: []const u8 }
Similarly to Rust, Zig can express optional values.
Unlike Rust, the option is a built-in, rather than a user-defined type (Zig can express a generic user-defined option, but chooses not to).
All types in Zig are strictly prefix, so a leading ? concisely signals optionality.
Zig doesn’t have first-class tuple types, but it has a very concise and flexible type declaration syntax, so we can return a named tuple.
Curiously, this anonymous struct is still a nominal, rather than a structural, type!
Similarly to Rust, prefix and suffix borrow the same memory that s does.
Unlike Rust, this isn’t expressed in the signature: while in this case it is obvious that the lifetime would be bound to s, rather than sep, there are no type system guardrails here.
Because ? is a built-in type, we need some amount of special syntax to handle the result, but it curiously feels less special-case and more versatile than the Rust version.
Moral of the story? Work with the grain of the language — expressing the same concept in different languages usually requires a slightly different vocabulary.