Module utf8
This module provides basic utilities for working with Unicode text.
Lily ensures that String values are valid UTF-8, but the String class does not otherwise provide Unicode functionality. That functionality is provided by this module instead.
Notes:
- This module works with codepoints, not graphemes, since the latter are significantly more complicated to handle.
- There are currently no utilities for inspecting properties of codepoints.
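The difference between codepoints and graphemes in the first note can be seen in a small sketch. It is written in Python purely as a model of the idea; it is not Lily code and not part of this module, and the helper name codepoints is hypothetical.

```python
import unicodedata


def codepoints(text: str) -> list:
    """Return the codepoints of text, one integer per codepoint."""
    return [ord(ch) for ch in text]


# "e" followed by a combining acute accent: one visible character (grapheme),
# but two codepoints.
decomposed = "e\u0301"

# The same visible character in precomposed (NFC) form: a single codepoint.
composed = unicodedata.normalize("NFC", decomposed)

decomposed_points = codepoints(decomposed)  # [0x65, 0x301]
composed_points = codepoints(composed)      # [0xE9]
```

A codepoint-based utility would report two elements for the first string and one for the second, even though both display as the same character.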
Functions
define as_list(string: String): List[Integer]
Return a List of all codepoints in string.
define compare(a: String, b: String): Integer
Compare a and b in terms of codepoints.
If a is less than b, -1 is returned. If a is greater than b, 1 is returned. If they are identical, 0 is returned. This is the same return format List.sort uses, so this function can be used as a custom comparator for it.
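As a rough model of the -1 / 0 / 1 convention and its use as a sort comparator, here is a Python sketch. It is not Lily code; compare_codepoints is a hypothetical name, and the rule that a string which is a prefix of another sorts first is an assumption that the text above does not state.

```python
from functools import cmp_to_key


def compare_codepoints(a: str, b: str) -> int:
    """Return -1, 0, or 1 by comparing a and b codepoint by codepoint."""
    points_a = [ord(ch) for ch in a]
    points_b = [ord(ch) for ch in b]
    # Lists compare element by element. Assumption: a shorter string that is
    # a prefix of the other is treated as the lesser one.
    return (points_a > points_b) - (points_a < points_b)


words = ["zebra", "\u00e9clair", "apple", "Zulu"]

# cmp_to_key adapts a -1/0/1 comparator for use with sorted().
ordered = sorted(words, key=cmp_to_key(compare_codepoints))
```

Ordering by codepoint is not the same as alphabetical ordering: uppercase ASCII letters sort before lowercase ones, and accented letters sort after both.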
define each_codepoint(string: String, fn: Function(Integer))
Call fn for each codepoint within string.
define get(string: String, index: Integer): Integer
Return the codepoint at index in string.
If a negative index is given, it is treated as an offset from the end of string, with -1 being considered the last element.
Raises IndexError if index is out of range.
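The indexing rules for get can be modelled in a few lines. The sketch below is Python, not Lily, and get_codepoint is a hypothetical stand-in for the function described above, not its implementation.

```python
def get_codepoint(text: str, index: int) -> int:
    """Return the codepoint at index, counting codepoints rather than bytes."""
    count = len(text)
    # A negative index counts back from the end, so -1 is the last codepoint.
    position = index + count if index < 0 else index
    if position < 0 or position >= count:
        raise IndexError(f"index {index} is out of range")
    return ord(text[position])


word = "na\u00efve"  # "naive" with a diaeresis on the i: 5 codepoints

third = get_codepoint(word, 2)   # 0xEF, the accented letter
last = get_codepoint(word, -1)   # 0x65, the final "e"
```

Valid indices for a five-codepoint string are therefore 0 through 4 and -5 through -1; anything else is out of range.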
define length(string: String): Integer
Return the length of string in codepoints.
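To show why a length in codepoints differs from a length in bytes, here is a short Python illustration (a model only, not Lily code). UTF-8 encodes each codepoint in one to four bytes, so the two counts diverge as soon as the text leaves ASCII.

```python
# "hello" with an accented e, a space, and two CJK characters.
text = "h\u00e9llo \u4e16\u754c"

# Size of the UTF-8 encoding: the accented e takes 2 bytes and each CJK
# character takes 3, so the total is 1 + 2 + 3 + 1 + 3 + 3 = 13.
byte_count = len(text.encode("utf-8"))

# Number of codepoints: each character above is a single codepoint.
codepoint_count = len(text)
```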
define slice(string: String, start: *Integer, stop: *Integer): String
Create a new String copying a section of string from start to stop. Unlike String.slice, the indices refer to codepoints, not bytes.
If a negative index is given, it is treated as an offset from the end of string, with -1 being considered the last element.
On error, this generates an empty String. Error conditions are:
- Either start or stop is out of range.
- The start is larger than the stop (reversed).
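The slicing rules can be sketched as a Python model (not Lily code). slice_codepoints is a hypothetical name, and several details are assumptions that the description above does not settle: the range is treated as half-open (stop is excluded), start defaults to 0, an omitted stop means the end of the string, and an index equal to the length counts as in range.

```python
from typing import Optional


def slice_codepoints(text: str, start: int = 0, stop: Optional[int] = None) -> str:
    """Copy text[start:stop] by codepoint, returning "" on any error."""
    count = len(text)
    if stop is None:
        stop = count  # assumption: an omitted stop means "to the end"
    # Negative indices are offsets from the end of the string.
    if start < 0:
        start += count
    if stop < 0:
        stop += count
    # Error: either index is out of range (assumption: count itself is allowed).
    if not (0 <= start <= count and 0 <= stop <= count):
        return ""
    # Error: the range is reversed.
    if start > stop:
        return ""
    return text[start:stop]


word = "na\u00efve"  # 5 codepoints, 6 bytes in UTF-8

middle = slice_codepoints(word, 2, 4)     # accented i followed by "v"
tail = slice_codepoints(word, -3)         # last three codepoints
reversed_range = slice_codepoints(word, 4, 2)
too_far = slice_codepoints(word, 0, 9)

# For contrast, cutting the UTF-8 bytes at offset 3 splits the two-byte
# accented letter in half and leaves an invalid sequence.
byte_cut = word.encode("utf-8")[:3]
```

Both error cases yield an empty String rather than raising, which matches the behaviour described above.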