The following Unicode character properties are commonly used in regular expressions that need to match Unicode text:
Unicode character class | Meaning |
\p{L} | Match any letter from any language |
\p{Lu} | Match any uppercase letter from any language |
\p{Ll} | Match any lowercase letter from any language |
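\p{N} | Match any kind of numeric character in any script (decimal digits, letter-like numerals such as Roman numerals, fractions); use \p{Nd} for decimal digits only |
\p{P} | Match any punctuation character from any language |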
\p{Z} | Match any kind of whitespace or invisible separator |
\p{C} | Match any invisible control character or unused code point |
\p{Sc} | Match any currency symbol |
\R | Any Unicode linebreak sequence; equivalent to \u000D\u000A|[\u000A\u000B\u000C\u000D\u0085\u2028\u2029]. It is recommended to use \R to match any newline ... |
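
The properties in the table can be tried out with a minimal sketch like the one below. It assumes the table describes Java's java.util.regex package, since the \R definition above matches that library's documentation. The class name UnicodePropertyDemo, the helper findAll, and the sample strings are hypothetical and chosen only for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class UnicodePropertyDemo {

    // Hypothetical helper: collects every non-overlapping match of regex in input.
    public static List<String> findAll(String regex, String input) {
        List<String> matches = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            matches.add(m.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        // Sample text with Latin and Cyrillic letters, a euro sign and a yen sign.
        // Non-ASCII characters are written as Unicode escapes so the source
        // compiles the same way under any file encoding.
        String text = "Z\u00fcrich costs \u20ac25, "
                + "\u041c\u043e\u0441\u043a\u0432\u0430 costs \u00a530!";

        System.out.println(findAll("\\p{L}+", text)); // runs of letters in any script
        System.out.println(findAll("\\p{Lu}", text)); // uppercase letters only
        System.out.println(findAll("\\p{N}+", text)); // runs of numeric characters
        System.out.println(findAll("\\p{Sc}", text)); // currency symbols
        System.out.println(findAll("\\p{P}", text));  // punctuation characters

        // \R treats CR LF as a single line break and also recognizes
        // the Unicode line separator U+2028.
        String[] lines = "one\r\ntwo\u2028three\nfour".split("\\R");
        System.out.println(lines.length);
    }
}
```

Note that each backslash in a pattern is doubled inside a Java string literal, so the property \p{L} is written as "\\p{L}" in source code.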