Unicode

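FLEX is designed to match 8-bit characters, so it cannot match a 16-bit Unicode code point as a single character. For use in the lexical analyzer, I classified the Unicode blocks as two-byte patterns: in each pattern the first character class matches the low byte of the code point and the second matches the high byte.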

  1. ARMENIAN  - [\x30-\x8F][\x05]

  2. ARMENIAN_LIGATURES  - [\x00-\x4F][\xFB]

  3. COPTIC  - [\x80-\xFF][\x2C]

  4. CYRILLIC  - [\x00-\xFF][\x04]

  5. CYRILLIC_SUPPLEMENT  - [\x00-\x2F][\x05]

  6. GEORGIAN  - [\xA0-\xFF][\x10]

  7. GEORGIAN_SUPPLEMENT  - [\x00-\x2F][\x2D]

  8. GREEK  - [\x70-\xFF][\x03]

  9. GREEK_EXTENDED  - [\x00-\xFF][\x1F]

  10. BASIC_LATIN  - [A-Za-z][\x00]

  11. LATIN_1  - [\xC0-\xD6\xD8-\xF6\xF8-\xFF][\x00]

  12. LATIN_EXTENDED_A  - [\x00-\x7F][\x01]

  13. LATIN_EXTENDED_B  - ([\x80-\xFF][\x01]) | ([\x00-\x4F][\x02])

  14. LATIN_EXTENDED_C  - [\x60-\x7F][\x2C]

  15. LATIN_EXTENDED_D  - [\x20-\xFF][\xA7]

  16. LATIN_EXTENDED_ADDITIONAL  - [\x00-\xFF][\x1E]

  17. ETHIOPIC  - ([\x00-\xFF][\x12]) | ([\x00-\x7F][\x13])

  18. ETHIOPIC_SUPPLEMENT  - [\x80-\x9F][\x13]

  19. ETHIOPIC_EXTENDED  - [\x80-\xDF][\x2D]

  20. NKO  - [\xC0-\xFF][\x07]

  21. TIFINAGH  - [\x30-\x7F][\x2D]

  22. ARABIC  - [\x00-\xFF][\x06]

  23. ARABIC_SUPPLEMENT  - [\x50-\x7F][\x07]

  24. ARABIC_PRESENTATION_FORM_A  - ([\x50-\xFF][\xFB]) | ([\x00-\xFF][\xFC-\xFD])

  25. ARABIC_PRESENTATION_FORM_B  - [\x70-\xFF][\xFE]

  26. HEBREW  - [\x90-\xFF][\x05]

  27. SYRIAC  - [\x00-\x4F][\x07]

  28. THAANA  - [\x80-\xBF][\x07]

  29. CANADIAN_SYLLABICS  - ([\x00-\xFF][\x14\x15]) | ([\x00-\x7F][\x16])

  30. CHEROKEE  - [\xA0-\xFF][\x13]

  31. GLAGOLITIC  - [\x00-\x5F][\x2C]

  32. BENGALI  - [\x80-\xFF][\x09]

  33. DEVANAGARI  - [\x00-\x7F][\x09]

  34. GUJARATI  - [\x80-\xFF][\x0A]

  35. GURMUKHI  - [\x00-\x7F][\x0A]

  36. KANNADA  - [\x80-\xFF][\x0C]

  37. LIMBU  - [\x00-\x4F][\x19]

  38. MALAYALAM  - [\x00-\x7F][\x0D]

  39. ORIYA  - [\x00-\x7F][\x0B]

  40. SINHALA  - [\x80-\xFF][\x0D]

  41. SYLOTI_NAGRI  - [\x00-\x2F][\xA8]

  42. TAMIL  - [\x80-\xFF][\x0B]

  43. TELUGU  - [\x00-\x7F][\x0C]

  44. BUHID  - [\x40-\x5F][\x17]

  45. HANUNOO  - [\x20-\x3F][\x17]

  46. TAGALOG  - [\x00-\x1F][\x17]

  47. TAGBANWA  - [\x60-\x7F][\x17]

  48. BUGINESE  - [\x00-\x1F][\x1A]

  49. BALINESE  - [\x00-\x7F][\x1B]

  50. KHMER  - [\x80-\xFF][\x17]

  51. CJK_UNIFIED_IDEOGRAPHS  - ([\x00-\xFF][\x4E-\x9E]) | ([\x00-\xBF][\x9F])

  52. CJK_UNIFIED_IDEOGRAPHS_EXTENSION_A  - ([\x00-\xFF][\x34-\x4C])|([\x00-\xBF][\x4D])

  53. CJK_COMPATIBILITY_IDEOGRAPHS  - ([\x00-\xFF][\xF9\xFA])

  54. KANBUN  - [\x90-\x9F][\x31]

  55. CJK_RADICALS  - [\x80-\xFF][\x2E]

  56. KANGXI_RADICALS  - [\x00-\xDF][\x2F]

  57. CJK_STROKES  - [\xC0-\xEF][\x31]

  58. BOPOMOFO  - [\x00-\x2F][\x31]

  59. BOPOMOFO_EXTENDED  - [\xA0-\xBF][\x31]

  60. HIRAGANA  - [\x40-\x9F][\x30]

  61. KATAKANA  - [\xA0-\xFF][\x30]

  62. KATAKANA_PHONETIC_EXTENSIONS  - [\xF0-\xFF][\x31]

  63. HANGUL_SYLLABLES  - ([\x00-\xFF][\xAC-\xD6])|([\x00-\xAF][\xD7])

  64. HANGUL_JAMO  - [\x00-\xFF][\x11]

  65. HANGUL_COMPATIBILITY_JAMO  - [\x30-\x8F][\x31]

  66. YI_SYLLABLES  - ([\x00-\xFF][\xA0-\xA3])|([\x00-\x8F][\xA4])

  67. YI_RADICALS  - [\x90-\xCF][\xA4]

  68. MONGOLIAN  - [\x00-\xAF][\x18]

  69. PHAGS_PA  - [\x40-\x7F][\xA8]

  70. TIBETAN  - [\x00-\xFF][\x0F]

  71. OGHAM  - [\x80-\x9F][\x16]

  72. RUNIC  - [\xA0-\xFF][\x16]
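
The entries above all follow the same recipe: take the first and last code point of a block, split each into a low byte and a high byte, and write the low-byte class first. The Python sketch below shows one way such patterns could be derived and checked. It is only an illustration, not the tool used to produce this list; the function name block_to_pattern is hypothetical, and the sketch assumes the input is encoded as UTF-16 little-endian, which is what the low-byte-first ordering of the patterns suggests.

```python
import re


def block_to_pattern(first: int, last: int) -> str:
    """Build a two-byte pattern (low byte class, then high byte class)
    covering the code points first..last. Hypothetical helper."""
    if not (0 <= first <= last <= 0xFFFF):
        raise ValueError("only 16-bit code point ranges are supported")

    def cls(a: int, b: int) -> str:
        # One byte or a byte range, written as a character class.
        if a == b:
            return f"[\\x{a:02X}]"
        return f"[\\x{a:02X}-\\x{b:02X}]"

    lo_first, hi_first = first & 0xFF, first >> 8
    lo_last, hi_last = last & 0xFF, last >> 8

    # The whole block shares one high byte: a single alternative is enough.
    if hi_first == hi_last:
        return cls(lo_first, lo_last) + cls(hi_first, hi_first)

    parts = []
    # Partial leading page, e.g. U+0180..U+01FF for LATIN_EXTENDED_B.
    if lo_first != 0x00:
        parts.append(cls(lo_first, 0xFF) + cls(hi_first, hi_first))
        hi_first += 1
    # Partial trailing page, e.g. U+D700..U+D7AF for HANGUL_SYLLABLES.
    tail = None
    if lo_last != 0xFF:
        tail = cls(0x00, lo_last) + cls(hi_last, hi_last)
        hi_last -= 1
    # Any complete 256-code-point pages in between.
    if hi_first <= hi_last:
        parts.append(cls(0x00, 0xFF) + cls(hi_first, hi_last))
    if tail is not None:
        parts.append(tail)

    if len(parts) == 1:
        return parts[0]
    return "|".join(f"({p})" for p in parts)


# ARMENIAN is U+0530..U+058F.
armenian = block_to_pattern(0x0530, 0x058F)
print(armenian)

# HANGUL_SYLLABLES is U+AC00..U+D7AF and spans several high bytes.
print(block_to_pattern(0xAC00, 0xD7AF))

# U+0531 encoded as UTF-16 little-endian is the byte pair 31 05.
# Python's re module happens to accept the same \xHH class syntax
# for bytes patterns, so the generated pattern can be tried directly.
encoded = chr(0x0531).encode("utf-16-le")
print(re.fullmatch(armenian.encode("ascii"), encoded) is not None)
```

For a run of complete pages the sketch emits a high-byte range such as [\x14-\x15], which matches the same bytes as the [\x14\x15] form used in the CANADIAN_SYLLABICS entry.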