struct
String::Grapheme
Overview
Grapheme represents a Unicode grapheme cluster, which describes the smallest
functional unit of a writing system. This is also called a user-perceived character.
In the Latin alphabet, most graphemes consist of a single Unicode codepoint
(equivalent to Char). But a grapheme can also consist of a sequence of codepoints,
which combine into a single unit.
For example, the string "e\u0301" consists of two characters: the Latin small letter e
and the combining acute accent ´. Together, they form a single grapheme: é.
That same grapheme can alternatively be encoded as a single codepoint, \u00E9 (Latin small letter e with acute).
However, the number of possible combinations far exceeds the number of directly
available codepoints.
"e\u0301".size # => 2
"é".size # => 1
"e\u0301".grapheme_size # => 1
"é".grapheme_size # => 1
Combining codepoints in this way is common in some non-Latin scripts. It's also
often used with emojis to create customized combinations. For example, the
thumbs up sign 👍 (U+1F44D) combined with an emoji modifier such as
U+1F3FC assigns a colour to the emoji.
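The emoji case can be checked in the same way as the accent example above. This is a minimal sketch; the variable name thumbs_up is only illustrative, and it assumes the emoji modifier is treated as extending the preceding emoji, as Unicode Standard Annex #29 specifies.

```crystal
# Thumbs up sign (U+1F44D) followed by the emoji modifier U+1F3FC.
thumbs_up = "\u{1F44D}\u{1F3FC}"

thumbs_up.size          # => 2 (two codepoints)
thumbs_up.grapheme_size # => 1 (one user-perceived character)
```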
Instances of this type can be acquired via String#each_grapheme or String#graphemes.
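As a rough illustration of both entry points, the sketch below splits a short string into its grapheme clusters. The names text, clusters and sizes are hypothetical and chosen only for this example.

```crystal
text = "e\u0301a" # "e" + combining acute accent, followed by "a"

# String#graphemes returns all grapheme clusters as an array.
clusters = text.graphemes
clusters.size # => 2

# String#each_grapheme yields each grapheme cluster in turn.
sizes = [] of Int32
text.each_grapheme do |grapheme|
  sizes << grapheme.size # number of characters in this cluster
end
sizes # => [2, 1]
```

String#graphemes is convenient when the clusters are needed as a collection. The block form avoids building an intermediate array.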
The algorithm to determine boundaries between grapheme clusters is specified in the Unicode Standard Annex #29.
EXPERIMENTAL: The grapheme API is still under development. Join the discussion at #11610.
Defined in:
string/grapheme/grapheme.cr
string/grapheme/properties.cr
Instance Method Summary
#==(other : self) : Bool
Returns true if other is equivalent to self.
#bytesize : Int32
Returns the number of bytes in the UTF-8 representation of this grapheme cluster.
#inspect(io : IO) : Nil
Appends a representation of this grapheme cluster to io.
#size : Int32
Returns the number of characters in this grapheme cluster.
#to_s(io : IO) : Nil
Appends the characters in this grapheme cluster to io.
#to_s : String
Returns the characters in this grapheme cluster.
Instance methods inherited from struct Struct
==(other) : Bool, hash(hasher), inspect(io : IO) : Nil, pretty_print(pp) : Nil, to_s(io : IO) : Nil
Instance methods inherited from struct Value
==(other : JSON::Any), ==(other : YAML::Any), ==(other), dup
Instance methods inherited from class Object
! : Bool, !=(other), !~(other), ==(other), ===(other : JSON::Any), ===(other : YAML::Any), ===(other),
=~(other), as(type : Class), as?(type : Class), class, dup, hash(hasher), hash,
in?(collection : Object) : Bool, in?(*values : Object) : Bool, inspect(io : IO) : Nil, inspect : String,
is_a?(type : Class) : Bool, itself, nil? : Bool, not_nil!(message), not_nil!,
pretty_inspect(width = 79, newline = "\n", indent = 0) : String, pretty_print(pp : PrettyPrint) : Nil,
responds_to?(name : Symbol) : Bool, tap(&), to_json(io : IO) : Nil, to_json : String,
to_pretty_json(indent : String = " ") : String, to_pretty_json(io : IO, indent : String = " ") : Nil,
to_s(io : IO) : Nil, to_s : String, to_yaml(io : IO) : Nil, to_yaml : String, try(&),
unsafe_as(type : T.class) forall T
Class methods inherited from class Object
from_json(string_or_io, root : String), from_json(string_or_io), from_yaml(string_or_io : String | IO)
Macros inherited from class Object
class_getter(*names, &block), class_getter!(*names), class_getter?(*names, &block),
class_property(*names, &block), class_property!(*names), class_property?(*names, &block),
class_setter(*names), def_clone, def_equals(*fields), def_equals_and_hash(*fields), def_hash(*fields),
delegate(*methods, to object), forward_missing_to(delegate),
getter(*names, &block), getter!(*names), getter?(*names, &block),
property(*names, &block), property!(*names), property?(*names, &block), setter(*names)
Instance Method Detail
#==(other : self) : Bool
Returns true if other is equivalent to self.
Two graphemes are considered equivalent if they contain the same sequence of codepoints.
#bytesize : Int32
Returns the number of bytes in the UTF-8 representation of this grapheme cluster.
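The following sketch shows how equality and byte size behave for the two encodings of é described in the overview. The variable names decomposed and precomposed are illustrative only. Since equivalence is defined on the sequence of codepoints, the two graphemes compare as different even though they are rendered identically.

```crystal
decomposed = "e\u0301".graphemes.first  # "e" + combining acute accent
precomposed = "\u00E9".graphemes.first  # single codepoint é

# Same rendering, but different codepoint sequences.
decomposed == precomposed               # => false
decomposed == "e\u0301".graphemes.first # => true

# UTF-8 sizes: "e" takes 1 byte, U+0301 takes 2 bytes, U+00E9 takes 2 bytes.
decomposed.bytesize  # => 3
precomposed.bytesize # => 2
```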