Source: http://www.w3.org/International/questions/qa-forms-utf-8
Matches a valid UTF-8 encoded string. Can be used to tell UTF-8 apart from a single-byte encoding such as ISO-8859-*.
Based on the code at http://www.w3.org/International/questions/qa-forms-utf-8 and the UTF-8 syntax defined in RFC 3629 (https://tools.ietf.org/html/rfc3629).
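NOTE: Because this pattern tests for UTF-8 correctness, it does not use the /u
modifier. It has to examine the subject byte by byte, and with /u PCRE rejects an invalid UTF-8 subject before the pattern is even tried.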
If you try it online with non-ASCII input (bytes above 127), you can ignore the warning.
The matched text will also be displayed with some incorrect characters because of the conversion to ISO-8859-1.
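The effect of /u on invalid input can be seen with a small sketch. The helper name describe_u_modifier is hypothetical and only serves the illustration; the calls preg_match and preg_last_error and the constant PREG_BAD_UTF8_ERROR are standard PHP.

```php
<?php
// Hypothetical helper: shows what happens when /u meets an invalid subject.
function describe_u_modifier(string $bytes): string
{
    // With /u, PCRE validates the whole subject as UTF-8 before matching.
    $result = preg_match('/./su', $bytes);
    if ($result === false && preg_last_error() === PREG_BAD_UTF8_ERROR) {
        return 'rejected by PCRE before matching';
    }
    return $result === 1 ? 'matched' : 'no match';
}
```

An ISO-8859-1 string such as "caf\xE9" is rejected outright, so a validation pattern using /u would never get the chance to inspect the bytes itself.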
Sample PHP code:
if (preg_match('/.../', $str)) {
    // Valid UTF-8. It may also be plain 7-bit ASCII, which is a subset of UTF-8.
} else {
    // Neither UTF-8 nor ASCII; probably UTF-16 or ISO-8859-X.
}
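A fuller sketch of the same check, assuming the byte-range pattern published on the W3C page cited as the source; the pattern elided in the sample above may differ from it. The function name is_valid_utf8 is hypothetical, and the non-capturing group with a possessive quantifier is an addition here to limit backtracking on long input.

```php
<?php
// Hypothetical helper; the pattern is assumed to be the one from the W3C page.
function is_valid_utf8(string $str): bool
{
    // No u modifier: the pattern must see raw bytes. x allows layout and comments.
    $pattern = '/\A(?:
          [\x09\x0A\x0D\x20-\x7E]            # ASCII (tab, LF, CR, printable)
        | [\xC2-\xDF][\x80-\xBF]             # non-overlong 2-byte
        | \xE0[\xA0-\xBF][\x80-\xBF]         # 3-byte, excluding overlongs
        | [\xE1-\xEC\xEE\xEF][\x80-\xBF]{2}  # straight 3-byte
        | \xED[\x80-\x9F][\x80-\xBF]         # 3-byte, excluding surrogates
        | \xF0[\x90-\xBF][\x80-\xBF]{2}      # planes 1-3
        | [\xF1-\xF3][\x80-\xBF]{3}          # planes 4-15
        | \xF4[\x80-\x8F][\x80-\xBF]{2}      # plane 16
    )*+\z/x';
    // preg_match returns 1 on a match, 0 on no match, false on error.
    return preg_match($pattern, $str) === 1;
}
```

Note that the ASCII branch of this pattern accepts only tab, LF, CR and the printable range, so a string containing other control characters (NUL, for example) is reported as invalid even though it is well-formed UTF-8.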