This commit adds two main utilities:
1. It adds a function for extracting the Unicode rune encoded at a
given byte offset in a UTF-8 encoded string.
2. It adds functions for classifying a rune: whether it is a Latin
letter, whether it is lowercase or uppercase, whether it is a digit,
and so on.
This is particularly useful for writing the tokenizer for the scripts
that have to be parsed (the rule parser only accepts tokens).
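
For illustration only, the two utilities described above could look
roughly like the sketch below. The names (decode_rune_at, RUNE_ERROR,
is_latin_letter and friends) are hypothetical and not taken from this
commit, and the sketch assumes that "Latin letter" means the ASCII
ranges A-Z and a-z and that invalid input is reported as U+FFFD.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sentinel: U+FFFD REPLACEMENT CHARACTER for invalid input. */
#define RUNE_ERROR 0xFFFDu

/*
 * Decode the rune that starts at byte offset `off` in the UTF-8 buffer
 * `s` of `len` bytes. Stores the rune in *out and returns the number of
 * bytes consumed. Returns 0 if `off` is out of range. On a malformed
 * sequence, stores RUNE_ERROR and returns 1 so the caller can resync.
 */
static size_t decode_rune_at(const char *s, size_t len, size_t off,
                             uint32_t *out)
{
    /* Smallest code point that legitimately needs 2, 3 or 4 bytes. */
    static const uint32_t min_for_len[] = { 0, 0, 0x80, 0x800, 0x10000 };
    const unsigned char *p;
    uint32_t r;
    size_t need, i;

    *out = RUNE_ERROR;
    if (off >= len)
        return 0;
    p = (const unsigned char *)s + off;

    if (p[0] < 0x80) {                     /* 0xxxxxxx: plain ASCII */
        *out = p[0];
        return 1;
    } else if ((p[0] & 0xE0) == 0xC0) {    /* 110xxxxx: 2-byte lead */
        need = 2; r = p[0] & 0x1F;
    } else if ((p[0] & 0xF0) == 0xE0) {    /* 1110xxxx: 3-byte lead */
        need = 3; r = p[0] & 0x0F;
    } else if ((p[0] & 0xF8) == 0xF0) {    /* 11110xxx: 4-byte lead */
        need = 4; r = p[0] & 0x07;
    } else {                               /* stray continuation byte */
        return 1;
    }

    if (len - off < need)                  /* truncated sequence */
        return 1;
    for (i = 1; i < need; i++) {
        if ((p[i] & 0xC0) != 0x80)         /* must be 10xxxxxx */
            return 1;
        r = (r << 6) | (uint32_t)(p[i] & 0x3F);
    }

    /* Reject overlong forms, UTF-16 surrogates and values past U+10FFFF. */
    if (r < min_for_len[need] || r > 0x10FFFF ||
        (r >= 0xD800 && r <= 0xDFFF))
        return 1;

    *out = r;
    return need;
}

/* Classification helpers, assuming ASCII-only Latin letters and digits. */
static int is_latin_lower(uint32_t r)  { return r >= 'a' && r <= 'z'; }
static int is_latin_upper(uint32_t r)  { return r >= 'A' && r <= 'Z'; }
static int is_latin_letter(uint32_t r) { return is_latin_lower(r) || is_latin_upper(r); }
static int is_digit(uint32_t r)        { return r >= '0' && r <= '9'; }
```

A tokenizer built on helpers like these would typically call the
decoder in a loop, advance the offset by the returned byte count, and
use the classification functions to decide where each token ends.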