chmap (formerly lcharmap)

A CLI utility to get information for Unicode characters.

What?

chmap is a little utility to get information on Unicode characters, such as their descriptions, hexadecimal/octal representations, Unicode categories, UTF-8 encodings, and more.

$ chmap -r 9-10,13935,255-258
codepoint  glyph  encoded     case   description
        9         9           other  character tabulation
       10         A           other  line feed (lf)
    13935         E3 99 AF    other  modifier letter chinese tone yin ping
      255  ÿ      C3 BF       lower  latin small letter y with diaeresis
      256  Ā      C4 80       upper  latin capital letter a with macron
      257  ā      C4 81       lower  latin small letter a with macron
      258  Ă      C4 82       upper  latin capital letter a with breve
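The encoded column holds the UTF-8 bytes of each codepoint. As a rough illustration of where those bytes come from, here is a Python sketch that performs the encoding by hand. It is not chmap's actual implementation; `utf8_encode` and `hex_bytes` are hypothetical names, and the sketch zero-pads every byte to two hex digits, whereas the table above prints 9 and A for codepoints 9 and 10.

```python
def utf8_encode(cp: int) -> bytes:
    """Encode a single Unicode codepoint as UTF-8 (illustrative sketch)."""
    if not 0 <= cp <= 0x10FFFF or 0xD800 <= cp <= 0xDFFF:
        raise ValueError(f"not a Unicode scalar value: {cp}")
    if cp < 0x80:
        # 1 byte:  0xxxxxxx
        return bytes([cp])
    if cp < 0x800:
        # 2 bytes: 110xxxxx 10xxxxxx
        return bytes([0xC0 | (cp >> 6), 0x80 | (cp & 0x3F)])
    if cp < 0x10000:
        # 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        return bytes([
            0xE0 | (cp >> 12),
            0x80 | ((cp >> 6) & 0x3F),
            0x80 | (cp & 0x3F),
        ])
    # 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    return bytes([
        0xF0 | (cp >> 18),
        0x80 | ((cp >> 12) & 0x3F),
        0x80 | ((cp >> 6) & 0x3F),
        0x80 | (cp & 0x3F),
    ])


def hex_bytes(data: bytes) -> str:
    """Format bytes as space-separated uppercase hex pairs."""
    return " ".join(f"{b:02X}" for b in data)


for cp in (255, 258, 13935):
    print(cp, hex_bytes(utf8_encode(cp)))
```

For the codepoints 255, 258, and 13935 this prints C3 BF, C4 82, and E3 99 AF, the same bytes as in the table.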

chmap was inspired by the charmap.exe tool present in Windows XP and later.

Where?

Tarballs are available from GitHub releases for Linux-x86_64 and Linux-armv6l. Extract the tarball, cd into the extracted directory, and run:

$ sudo ./install.sh /usr/local

Building from Source

Build dependencies

  • a C99 compiler, GNU Make
  • scdoc (manpage)

Download the latest source tarball from GitHub releases (chmap-v$VERSION.tar.xz), extract, and build:

# make clean install

How?

Run chmap with the -r flag (for range) to get info for a range of Unicode runes:

$ chmap -r 0-5
codepoint  glyph  encoded      case   description
        0                      other  null
        1         1            other  start of heading
        2         2            other  start of text
        3         3            other  end of text
        4         4            other  end of transmission
        5         5            other  enquiry

A range can be a single codepoint (34), a span (0-1, 355-369), or a comma-separated mix of both (189-192,12,45-49).
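To make the range syntax concrete, here is a small Python sketch that expands such a specification into a list of codepoints. It assumes decimal numbers and inclusive spans, which is what the 0-5 example above suggests. `parse_ranges` is a hypothetical helper, not part of chmap, and how chmap itself treats empty segments or reversed spans is not covered here.

```python
from typing import List


def parse_ranges(spec: str) -> List[int]:
    """Expand a spec like '189-192,12,45-49' into codepoints, in order."""
    codepoints: List[int] = []
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue  # assumption: silently skip empty segments
        first, dash, last = part.partition("-")
        start = int(first)
        end = int(last) if dash else start  # a lone number is a 1-item span
        if end < start:
            raise ValueError(f"reversed range: {part!r}")
        codepoints.extend(range(start, end + 1))  # spans are inclusive
    return codepoints


print(parse_ranges("189-192,12,45-49"))
```

The order of the segments is preserved, so `9-10,13935,255-258` yields the codepoints in the order they were written.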

You can also get info for each character in a given string with the -c flag:

$ chmap -c Lovecraft
codepoint  glyph  encoded      case   description
       76  L      4C           upper  latin capital letter l
      111  o      6F           lower  latin small letter o
      118  v      76           lower  latin small letter v
      101  e      65           lower  latin small letter e
       99  c      63           lower  latin small letter c
      114  r      72           lower  latin small letter r
       97  a      61           lower  latin small letter a
      102  f      66           lower  latin small letter f
      116  t      74           lower  latin small letter t
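Conceptually, -c walks the string one character at a time and emits one row per character. A minimal Python sketch of that idea follows. It is not chmap's code: `case_of` and `rows` are made-up helper names, and the upper/lower/other rule relies on Python's `str.isupper` and `str.islower`, which may not match chmap's classification for every character.

```python
from typing import Iterator, Tuple


def case_of(ch: str) -> str:
    """Classify a single character as 'upper', 'lower', or 'other'."""
    if ch.isupper():
        return "upper"
    if ch.islower():
        return "lower"
    return "other"


def rows(text: str) -> Iterator[Tuple[int, str, str, str]]:
    """Yield (codepoint, glyph, UTF-8 hex bytes, case) for each character."""
    for ch in text:
        encoded = " ".join(f"{b:02X}" for b in ch.encode("utf-8"))
        yield ord(ch), ch, encoded, case_of(ch)


for codepoint, glyph, encoded, case in rows("Lovecraft"):
    print(f"{codepoint:>9}  {glyph}      {encoded:<11}  {case}")
```

The first row produced for "Lovecraft" is 76, L, 4C, upper, in line with the table above.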

You can also search for characters whose descriptions match a regex with the -s flag:

$ chmap -s '^latin.*capital letter z'
codepoint  glyph  encoded      case   description
       90  Z      5A           upper  latin capital letter z
      377  Ź      C5 B9        upper  latin capital letter z with acute
      379  Ż      C5 BB        upper  latin capital letter z with dot above
      381  Ž      C5 BD        upper  latin capital letter z with caron
      437  Ƶ      C6 B5        upper  latin capital letter z with stroke
      548  Ȥ      C8 A4        upper  latin capital letter z with hook
     7824  Ẑ      E1 BA 90     upper  latin capital letter z with circumflex
     7826  Ẓ      E1 BA 92     upper  latin capital letter z with dot below
     7828  Ẕ      E1 BA 94     upper  latin capital letter z with line below
    11371  Ⱬ      E2 B1 AB     upper  latin capital letter z with descender
    11391  Ɀ      E2 B1 BF     upper  latin capital letter z with swash tail
    42950  Ᶎ      EA 9F 86     upper  latin capital letter z with palatal hook
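A search like this amounts to matching a regex against every character's lower-cased name. The following Python sketch does that with the standard `unicodedata` module. It is an approximation, not chmap's implementation: `search` and its `limit` parameter are hypothetical, results depend on the Unicode version bundled with your Python, and `unicodedata` has no names for control characters, so entries such as "null" or "line feed (lf)" will not be found this way.

```python
import re
import unicodedata
from typing import List, Tuple


def search(pattern: str, limit: int = 0x110000) -> List[Tuple[int, str]]:
    """Return (codepoint, name) pairs whose lower-cased name matches pattern.

    `limit` caps the scan (exclusive upper bound) to keep it fast.
    """
    regex = re.compile(pattern)
    hits: List[Tuple[int, str]] = []
    for cp in range(limit):
        # Unnamed codepoints (controls, surrogates, unassigned) yield "".
        name = unicodedata.name(chr(cp), "").lower()
        if name and regex.search(name):
            hits.append((cp, name))
    return hits


# Scan only up to U+1FFF here; drop `limit` to scan all of Unicode.
for cp, name in search(r"^latin.*capital letter z", limit=0x2000):
    print(f"{cp:>9}  {name}")
```

Scanning the full codepoint space this way takes noticeably longer than the capped scan, which is why the example limits itself to the first 0x2000 codepoints.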

Adding the -l flag causes chmap to print more information:

$ chmap -l -r 13000
codepoint    13000 0x32C8  0o31310
encoding     UTF8(E3 8B 88)
glyph        ㋈ (2 columns)
description  ideographic telegraph symbol for september
case         other
category     Symbol (other)
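Most of the fields in the long listing can be derived from the codepoint alone. The Python sketch below reproduces them with the standard `unicodedata` module, under some loud assumptions: `describe` is a hypothetical function, the column count is a crude guess based on East Asian Width (wide or fullwidth means 2, everything else 1), and the category is Python's two-letter code (for example So) rather than chmap's spelled-out "Symbol (other)".

```python
import unicodedata
from typing import Dict


def describe(cp: int) -> Dict[str, str]:
    """Collect long-format style details for one codepoint (sketch only)."""
    ch = chr(cp)
    utf8 = " ".join(f"{b:02X}" for b in ch.encode("utf-8"))
    # Assumption: W (wide) and F (fullwidth) take two terminal columns.
    wide = unicodedata.east_asian_width(ch) in ("W", "F")
    return {
        "codepoint": f"{cp} 0x{cp:X} 0o{cp:o}",  # decimal, hex, octal
        "encoding": f"UTF8({utf8})",
        "columns": "2" if wide else "1",
        "description": unicodedata.name(ch, "").lower(),
        "category": unicodedata.category(ch),
    }


for field, value in describe(13000).items():
    print(f"{field:<12} {value}")
```

For codepoint 13000 the hex, octal, UTF-8, and description fields agree with the chmap output above.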

See man chmap for more.

Why?

I miss some Windows utilities.

I found chars, but it lacked several features that I wanted:

  • A table-like output mode (which chmap has by default). This makes
    viewing many entries more comfortable in a short terminal.
  • A category field to specify the Unicode category that the rune falls
    under.

Inspiration

License

This lame little utility is licensed under the MIT License. See the LICENSE.md file for more information.