This module supplements the unicodedata
standard library module with the ability
to look up and work with Unicode blocks.
Version of the module.
The version of Unicode database used in this module.
Normalized name of the block.
The first codepoint mapped by the block (inclusive).
The last codepoint mapped by the block (inclusive).
Checks whether a character is in this block.
Count of codepoints mapped by the block.
Checks whether other.start is lower than self.start and other.end is lower than self.end.
Checks whether other.start is greater than self.start and other.end is greater than self.end.
Checks whether other.start equals self.start and other.end equals self.end.
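The attribute and method descriptions above can be illustrated with a small stand-in class. This is only a sketch of the documented semantics, not the module's implementation: the class name SketchBlock and the method names lies_before, lies_after and same_range are hypothetical, and the comparisons assume a pairwise reading (start against start, end against end).

```python
class SketchBlock:
    """Hypothetical stand-in that mirrors the documented Block semantics."""

    def __init__(self, name, start, end):
        self.name = name    # normalized block name
        self.start = start  # first codepoint, inclusive
        self.end = end      # last codepoint, inclusive

    def __contains__(self, char):
        # A character is in the block if its codepoint lies in [start, end].
        return self.start <= ord(char) <= self.end

    def __len__(self):
        # Both ends are inclusive, hence the +1.
        return self.end - self.start + 1

    def lies_after(self, other):
        # Assumed reading: other's start and end are both lower than ours.
        return other.start < self.start and other.end < self.end

    def lies_before(self, other):
        # Assumed reading: other's start and end are both greater than ours.
        return other.start > self.start and other.end > self.end

    def same_range(self, other):
        # Equality compares the range only, as described above.
        return other.start == self.start and other.end == self.end


basic_latin = SketchBlock("Basic Latin", 0x0, 0x7F)
hiragana = SketchBlock("Hiragana", 0x3040, 0x309F)
```

With these two instances, `'か' in hiragana` is true and `len(basic_latin)` counts both endpoints of the range.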
Returns the Block
that maps the codepoint of chr, or None
if no block maps that codepoint.
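One way such a lookup could work is a binary search over block start codepoints. The sketch below is an assumption about a possible approach, not the module's actual code: sketch_blockof and the three-entry table are hypothetical, and the table holds only the ranges that appear in this document's examples, so most characters fall outside it.

```python
import bisect

# Hypothetical, deliberately truncated table: (name, start, end), sorted by start.
# A real table would list every block in the Unicode database.
_RANGES = [
    ("Basic Latin", 0x0000, 0x007F),
    ("Hiragana", 0x3040, 0x309F),
    ("CJK Unified Ideographs", 0x4E00, 0x9FFF),
]
_STARTS = [start for _, start, _ in _RANGES]


def sketch_blockof(char):
    """Return (name, start, end) for the block mapping char, or None."""
    codepoint = ord(char)
    # Find the last block whose start is <= codepoint.
    index = bisect.bisect_right(_STARTS, codepoint) - 1
    if index < 0:
        return None
    name, start, end = _RANGES[index]
    # The codepoint may fall in a gap after that block's end.
    if codepoint <= end:
        return (name, start, end)
    return None
```

In this sketch `sketch_blockof('é')` returns None only because the table is truncated; with a complete table, None would be returned just for codepoints that no block maps.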
A dictionary-like collection of all blocks defined by Unicode.
Returns a list of the names of the blocks in the dictionary. Use this instead of .keys() if you want names that are presentable to the user.
>>> unicodeblocks.blockof('-')
Block('Basic Latin', 0x0, 0x7f)
>>> unicodeblocks.blockof('か')
Block('Hiragana', 0x3040, 0x309f)
>>> unicodeblocks.blockof('日')
Block('CJK Unified Ideographs', 0x4e00, 0x9fff)
>>> import itertools
>>> len(list(itertools.chain(*unicodeblocks.blocks.values())))
256336
The module doesn't check whether codepoints within a block are assigned.
For example, see \u038D, which is unassigned.
If you care about that, you should
try to obtain the character's name with the unicodedata
module.
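A minimal sketch of the name check suggested above, using only the standard library. The helper name has_unicode_name is hypothetical. Note that the result depends on the Unicode version bundled with your Python, and that control characters also have no name, so a missing name does not always mean the codepoint is unassigned.

```python
import unicodedata


def has_unicode_name(char):
    """Return True if unicodedata knows a name for char."""
    # Passing a default makes name() return it instead of raising ValueError.
    return unicodedata.name(char, None) is not None


# U+038C is GREEK CAPITAL LETTER OMICRON WITH TONOS; U+038D has no name.
named = has_unicode_name("\u038c")
unnamed = has_unicode_name("\u038d")
```

If control characters matter for your use case, comparing `unicodedata.category(char)` against `'Cn'` (unassigned) may be a more precise test.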