Eric-Paul Ickhorn
70c8fc6f72
This commit adds two main utilities: 1. A function for extracting the Unicode rune encoded at a given offset in a UTF-8 encoded string. 2. Functions for checking whether a rune is a Latin letter, whether it is lowercase or uppercase, whether it is a digit, and so on. This is particularly useful for writing the tokenizer for the scripts that have to be parsed (the rule-parser only accepts tokens). |
||
---|---|---|
.. | ||
utf8.c |
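
The contents of `utf8.c` are not shown in this listing. As a rough illustration of the two utilities the commit message describes, the following is a minimal sketch in C. All names here (`utf8_rune_at`, `rune_is_lower`, `rune_is_upper`, `rune_is_letter`, `rune_is_digit`, `RUNE_INVALID`) are hypothetical and chosen for this example; the signatures and the error handling (returning U+FFFD and consuming one byte) are assumptions, not the author's actual API.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sentinel for malformed input (U+FFFD, the replacement character). */
#define RUNE_INVALID 0xFFFDu

/*
 * Decode the rune that starts at byte `offset` of `str` (`len` bytes long).
 * On success, *size receives the number of bytes the rune occupies (1 to 4).
 * On malformed or truncated input, returns RUNE_INVALID and sets *size to 1,
 * so that a caller can skip one byte and continue.
 */
static uint32_t utf8_rune_at(const char *str, size_t len, size_t offset, size_t *size)
{
    /* Smallest code point that legitimately needs 2, 3 or 4 bytes. */
    static const uint32_t min_for_len[5] = { 0, 0, 0x80, 0x800, 0x10000 };

    *size = 1;
    if (offset >= len)
        return RUNE_INVALID;

    const unsigned char *s = (const unsigned char *)str + offset;
    size_t remaining = len - offset;
    uint32_t rune;
    size_t need;

    if (s[0] < 0x80)
        return s[0];                                   /* plain ASCII */
    else if ((s[0] & 0xE0) == 0xC0) { rune = s[0] & 0x1F; need = 2; }
    else if ((s[0] & 0xF0) == 0xE0) { rune = s[0] & 0x0F; need = 3; }
    else if ((s[0] & 0xF8) == 0xF0) { rune = s[0] & 0x07; need = 4; }
    else
        return RUNE_INVALID;                           /* stray continuation or invalid lead byte */

    if (need > remaining)
        return RUNE_INVALID;                           /* sequence is cut off */

    for (size_t i = 1; i < need; i++) {
        if ((s[i] & 0xC0) != 0x80)
            return RUNE_INVALID;                       /* not a continuation byte */
        rune = (rune << 6) | (uint32_t)(s[i] & 0x3F);
    }

    /* Reject overlong encodings, UTF-16 surrogates and values above U+10FFFF. */
    if (rune < min_for_len[need] || rune > 0x10FFFF ||
        (rune >= 0xD800 && rune <= 0xDFFF))
        return RUNE_INVALID;

    *size = need;
    return rune;
}

/* Classification helpers restricted to the basic Latin (ASCII) ranges. */
static bool rune_is_lower(uint32_t r)  { return r >= (uint32_t)'a' && r <= (uint32_t)'z'; }
static bool rune_is_upper(uint32_t r)  { return r >= (uint32_t)'A' && r <= (uint32_t)'Z'; }
static bool rune_is_letter(uint32_t r) { return rune_is_lower(r) || rune_is_upper(r); }
static bool rune_is_digit(uint32_t r)  { return r >= (uint32_t)'0' && r <= (uint32_t)'9'; }
```

A tokenizer built on helpers like these would typically call the decoding function in a loop, advance the offset by the reported size, and use the classification functions to decide where identifiers and numbers begin and end.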