Unicode

From Reboil

Unicode is a standard for encoding the characters (graphemes) of the world's writing systems.

Stats

  • Standard: ISO 10646 Information technology — Universal coded character set (UCS)

Combining Marks

Combining Diacritical Marks
Code Point Example Description Comment
U+0300 à Combining Grave Accent
U+0301 á Combining Acute Accent
U+0302 â Combining Circumflex Accent
U+0303 ã Combining Tilde
U+0304 ā Combining Macron
U+0305 A̅B Combining Overline
U+0306 ă Combining Breve
U+0307 ċ Combining Dot Above
U+0308 ä Combining Diaeresis
U+030A å Combining Ring Above
U+030C ǎ Combining Caron
U+0323 Combining Dot Below[1]
U+0327 ç Combining Cedilla
U+0328 ǫ Combining Ogonek
U+0331 Combining Macron Below
U+0332 k̲h Combining Low Line
U+3099 あ゙ Combining Katakana-Hiragana Voiced Sound Mark[2]
U+309A あ゚ Combining Katakana-Hiragana Semi-Voiced Sound Mark[2]
U+0310 Combining Candrabindu
U+035E ͞ab Combining Double Macron
U+035F ͟kha Combining Double Macron Below

Code blocks containing combining marks include (from ref):

Unicode FAQ on combining marks.
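A combining mark is a separate code point that follows its base letter. The sketch below, which uses only Python's standard `unicodedata` module, shows a base letter plus U+0301 being composed into a single code point by NFC normalization. The helper names `describe` and `count_marks` are illustrative and not part of any library.

```python
import unicodedata


def describe(text: str) -> list:
    """Return 'U+XXXX NAME' for each code point in text."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch, '<unnamed>')}" for ch in text]


def count_marks(text: str) -> int:
    """Count code points with a nonzero canonical combining class."""
    return sum(1 for ch in text if unicodedata.combining(ch) != 0)


decomposed = "a\u0301"  # LATIN SMALL LETTER A + COMBINING ACUTE ACCENT
composed = unicodedata.normalize("NFC", decomposed)  # single code point U+00E1

print(describe(decomposed))
print(describe(composed))
print(count_marks(decomposed), count_marks(composed))  # 1 0
```

Sequences with no precomposed form, such as a letter followed by U+0305 Combining Overline, stay as two code points after NFC normalization.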

Combining Diacritical Marks (Devanagari)

Combining Diacritical Marks (Devanagari)
Code Point Example Description Comment
U+093C ख़ DEVANAGARI SIGN NUKTA

My frequently used code points

Commonly used Unicode points
Code Point Example Description
U+00A0   No break space
U+00A9 © Copyright sign
U+00B6 Pilcrow Sign[3]
U+00C6 Æ Latin Capital Letter Ae
U+00D0 Ð Latin Capital Letter Eth
U+00DE Þ Latin Capital Letter Thorn
U+00E6 æ Latin Small Letter Ae
U+00F0 ð Latin Small Letter Eth
U+00FE þ Latin Small Letter Thorn
U+0141 Ł Latin Capital Letter L with Stroke
U+0142 ł Latin Small Letter L with Stroke
U+02BC ʼ Modifier Letter Apostrophe
U+1D2C Modifier Letter Capital A
U+1D2E Modifier Letter Capital B
U+2013 En dash
U+2014 Em dash
U+2018 Left single quotation mark
U+2019 Right single quotation mark
U+201C Left double quotation mark
U+201D Right double quotation mark
U+2026 Horizontal ellipsis
U+2126 Ω Ohm sign
U+2190 Leftwards arrow
U+2191 Upwards arrow
U+2192 Rightwards arrow
U+2193 Downwards arrow
U+2194 Left right arrow
U+2195 Up down arrow
U+2122 Trade Mark Sign
U+21D0 Leftwards Double Arrow
U+21D1 Upwards Double Arrow
U+21D2 Rightwards Double Arrow
U+21D3 Downwards Double Arrow
U+21D4 Left Right Double Arrow
U+25CC Dotted Circle
U+263C White sun with rays[4]
U+2661 White Heart Suit
U+2665 Black Heart Suit
U+2705 White Heavy Check Mark
U+2728 Sparkles
U+274C Cross Mark
U+29B5 Circle with Horizontal Bar[5]
U+30FB Katakana Middle Dot
U+1F38B 🎋 Tanabata Tree


Math-related code points
Code Point Example Description
U+00B0 ° Degree sign
U+00B1 ± Plus-Minus Sign
U+00D7 × Multiplication Sign
U+03B8 θ Greek Small Letter Theta
U+2032 Prime (e.g. to mark derivatives in Calculus)
U+2219 Bullet Operator (e.g. dot product)
U+221A Square Root
U+221D Proportional To
U+221E Infinity
U+222B Integral (i.e. from Calculus)
U+2248 Almost Equal To


Music-related code points
Code Point Example Description
U+2669 Quarter Note
U+266A Eighth Note
U+266B Beamed Eighth Notes


Flag-related code points[cmt 1]
Code Points Region Code Example Description
U+1F1E6 U+1F1F7 AR 🇦🇷 Argentina flag
U+1F1E7 U+1F1F7 BR 🇧🇷 Brazil flag
U+1F1E8 U+1F1F1 CL 🇨🇱 Chile flag
U+1F1E8 U+1F1F3 CN 🇨🇳 China flag
U+1F1E9 U+1F1EA DE 🇩🇪 Germany flag
U+1F1EA U+1F1EC EG 🇪🇬 Egypt flag
U+1F1EA U+1F1F8 ES 🇪🇸 Spain flag
U+1F1EA U+1F1FA EU 🇪🇺 European Union flag
U+1F1EB U+1F1EE FI 🇫🇮 Finland flag
U+1F1EC U+1F1E7 GB 🇬🇧 United Kingdom flag
U+1F1EC U+1F1F7 GR 🇬🇷 Greece flag
U+1F1EC U+1F1F9 GT 🇬🇹 Guatemala flag
U+1F1ED U+1F1F3 HN 🇭🇳 Honduras flag
U+1F1EE U+1F1E9 ID 🇮🇩 Indonesia flag
U+1F1EE U+1F1F1 IL 🇮🇱 Israel flag
U+1F1EE U+1F1F3 IN 🇮🇳 India flag
U+1F1EE U+1F1F8 IS 🇮🇸 Iceland flag
U+1F1EE U+1F1F9 IT 🇮🇹 Italy flag
U+1F1EF U+1F1F5 JP 🇯🇵 Japan flag
U+1F1F0 U+1F1F7 KR 🇰🇷 South Korea flag
U+1F1F2 U+1F1FD MX 🇲🇽 Mexico flag
U+1F1F3 U+1F1EC NG 🇳🇬 Nigeria flag
U+1F1F3 U+1F1EE NI 🇳🇮 Nicaragua flag
U+1F1F3 U+1F1FF NZ 🇳🇿 New Zealand flag
U+1F1F5 U+1F1E6 PA 🇵🇦 Panama flag
U+1F1F5 U+1F1ED PH 🇵🇭 Philippines flag
U+1F1F5 U+1F1F8 PS 🇵🇸 Palestine flag
U+1F1F5 U+1F1F9 PT 🇵🇹 Portugal flag
U+1F1F7 U+1F1FA RU 🇷🇺 Russia flag
U+1F1F8 U+1F1E6 SA 🇸🇦 Saudi Arabia flag
U+1F1F8 U+1F1EA SE 🇸🇪 Sweden flag
U+1F1F8 U+1F1FB SV 🇸🇻 El Salvador flag
U+1F1F9 U+1F1ED TH 🇹🇭 Thailand flag
U+1F1F9 U+1F1FC TW 🇹🇼 Taiwan flag
U+1F1FA U+1F1E6 UA 🇺🇦 Ukraine flag
U+1F1FA U+1F1F8 US 🇺🇸 United States flag
U+1F1FB U+1F1EA VE 🇻🇪 Venezuela flag
U+1F1FC U+1F1F8 WS 🇼🇸 Samoa flag
U+1F1FD U+1F1F0 XK 🇽🇰 Kosovo flag
U+1F1FE U+1F1EA YE 🇾🇪 Yemen flag
U+1F1FF U+1F1E6 ZA 🇿🇦 South Africa flag
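
Each flag in the table is a pair of Regional Indicator Symbol code points, one per letter of the two-letter region code, where the letter A maps to U+1F1E6 and the rest of the alphabet follows in order. A minimal Python sketch of that mapping is below. The function names `flag_code_points` and `flag` are hypothetical, and whether a given pair actually renders as a flag depends on the font and platform.

```python
REGIONAL_INDICATOR_A = 0x1F1E6  # REGIONAL INDICATOR SYMBOL LETTER A


def _indicator(letter: str) -> int:
    """Map an ASCII letter to its regional indicator code point."""
    return REGIONAL_INDICATOR_A + ord(letter.upper()) - ord("A")


def flag_code_points(region: str) -> list:
    """Return the two code points for a two-letter region code, e.g. 'JP'."""
    if len(region) != 2 or not region.isascii() or not region.isalpha():
        raise ValueError(f"expected a two-letter ASCII region code, got {region!r}")
    return [f"U+{_indicator(ch):X}" for ch in region]


def flag(region: str) -> str:
    """Return the flag sequence (two regional indicator characters)."""
    return "".join(chr(int(cp[2:], 16)) for cp in flag_code_points(region))


print(flag_code_points("JP"))  # ['U+1F1EF', 'U+1F1F5']
print(flag("JP"))
```

Running `flag_code_points` over the region codes in the table is a quick way to check each row's code points against its region code.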

Usage

My language notes pages (e.g. Navajo notes) use combining diacritics extensively.

In TeXmacs, Unicode code points can be entered manually in the Emacs look-and-feel mode by typing C-q, then a hash symbol, then the hexadecimal code point number (e.g. #29B5 yields ⦵). In the default look-and-feel, Esc-q works in place of C-q.
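
Outside TeXmacs, the same hexadecimal notation can be turned into a character programmatically. The Python sketch below is only an illustration: `from_hex` is a hypothetical helper that accepts the `#29B5` form used above as well as the `U+29B5` form used in the tables.

```python
import unicodedata


def from_hex(code_point: str) -> str:
    """Convert '#29B5', 'U+29B5', or '29B5' to the corresponding character."""
    digits = code_point.strip().lstrip("#")
    if digits.upper().startswith("U+"):
        digits = digits[2:]
    return chr(int(digits, 16))


ch = from_hex("#29B5")
print(ch, unicodedata.name(ch))  # ⦵ CIRCLE WITH HORIZONTAL BAR
```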

Encoding

An explanation of how UTF-8 encodes Unicode code points as multi-byte sequences can be found here.
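
As a rough illustration of those mechanics, the Python sketch below encodes a single code point by hand. The leading byte marks the sequence length and each continuation byte carries six payload bits behind a `10` prefix. The function name `utf8_encode` is hypothetical; in practice `str.encode("utf-8")` does the same job, and the sketch can be checked against it.

```python
def utf8_encode(code_point: int) -> bytes:
    """Encode one Unicode code point as UTF-8 (1 to 4 bytes)."""
    if not 0 <= code_point <= 0x10FFFF or 0xD800 <= code_point <= 0xDFFF:
        raise ValueError(f"not an encodable code point: {code_point:#x}")
    if code_point < 0x80:  # 0xxxxxxx
        return bytes([code_point])
    if code_point < 0x800:  # 110xxxxx 10xxxxxx
        return bytes([0xC0 | (code_point >> 6),
                      0x80 | (code_point & 0x3F)])
    if code_point < 0x10000:  # 1110xxxx 10xxxxxx 10xxxxxx
        return bytes([0xE0 | (code_point >> 12),
                      0x80 | ((code_point >> 6) & 0x3F),
                      0x80 | (code_point & 0x3F)])
    # 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    return bytes([0xF0 | (code_point >> 18),
                  0x80 | ((code_point >> 12) & 0x3F),
                  0x80 | ((code_point >> 6) & 0x3F),
                  0x80 | (code_point & 0x3F)])


print(utf8_encode(0x29B5).hex())  # e2a6b5
```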

History

See also


External links

References

  1. Baltakatei: 2023-09-11: Use U+093C DEVANAGARI SIGN NUKTA with Devanagari (e.g. Hindi).
  2. “Dakuten and handakuten”. (2023-09-07). Wikipedia. Accessed 2023-09-07.
  3. Baltakatei: a.k.a. “paragraph sign”.
  4. Baltakatei: 2024-01-18: Dwarf Fortress glyph for “Dwarfbucks”.
  5. Baltakatei: 2023-11-13: Also known as a “plimsoll symbol” which IUPAC recommends in chemistry contexts to indicate a standard state.

Footnotes


Comments