Metadata-Version: 2.1
Name: tangled_up_in_unicode
Version: 0.2.0
Summary: Access to the Unicode Character Database (UCD)
Home-page: https://github.com/dylan-profiler/tangled-up-in-unicode
License: BSD License
Description: # Tangled up in Unicode
        
        This module provides access to the properties of all Unicode characters, as defined in the Unicode Character Database (UCD).
        It is an alternative to Python's standard library [`unicodedata`](https://docs.python.org/3/library/unicodedata.html).
        `Tangled up in Unicode` provides four main benefits compared to the standard library:
        - The [latest version](http://www.unicode.org/versions/latest/) of the Unicode database is used.
        - Human-readable class names (property value aliases) are added.
        - Additional properties expose more of the database.
        - The UCD version is independent of the Python version (Python 3.6 ships UCD 9.0.0, 3.7 ships 11.0.0, 3.8 ships 12.1.0 and 3.9 ships 13.0.0).
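        
        To see which UCD version a given interpreter bundles, the standard library exposes `unicodedata.unidata_version`. A minimal sketch using only the standard library (the printed value depends on the Python version in use):
        
        ```python
        import sys
        import unicodedata  # standard library module
        
        # UCD version compiled into this interpreter, e.g. "13.0.0" on Python 3.9
        bundled_ucd = unicodedata.unidata_version
        python_version = f"{sys.version_info.major}.{sys.version_info.minor}"
        print(f"Python {python_version} bundles UCD {bundled_ucd}")
        ```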
        
        Note that Python 3 added Unicode support, but this is distinct from the UCD.
        Unicode support covers storing and manipulating Unicode strings, while this package provides the properties of individual characters.
        
        <!-- Please read the [docs](#) for details.-->
        
        ## Example
        
        The default lookup in `unicodedata` for `$`:
        
        | Property 					| Value 		 	|
        |---------------------------|-------------------|
        | Name	   					| Dollar Sign 		|
        | Category (Short)			| Sc 		 		|
        | Bidirectional (Short) 	| ET 				|
        | Combining					| 0					|
        | Mirrored					| 0					|
        | East Asian Width (Short)	| Na				|
        | Decomposition				| 					|
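        
        The values in this table can be reproduced with the standard library alone. A minimal sketch (the `dollar_properties` name is chosen here for illustration):
        
        ```python
        import unicodedata  # standard library module
        
        char = "$"
        
        # One entry per row of the table above
        dollar_properties = {
            "Name": unicodedata.name(char),
            "Category (Short)": unicodedata.category(char),
            "Bidirectional (Short)": unicodedata.bidirectional(char),
            "Combining": unicodedata.combining(char),
            "Mirrored": unicodedata.mirrored(char),
            "East Asian Width (Short)": unicodedata.east_asian_width(char),
            "Decomposition": unicodedata.decomposition(char),
        }
        
        for prop, value in dollar_properties.items():
            print(f"{prop}: {value!r}")
        ```
        
        Note that `unicodedata.name` returns the name in upper case (`DOLLAR SIGN`), and that the decomposition of `$` is the empty string.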
        
        Extra information provided by this package:
        
        | Property 						| Value 		 		|
        |-------------------------------|-----------------------|
        | Category Alias (Long)			| Currency_Symbol		|
        | Bidirectional Alias (Long)	| European_Terminator	|
        | East Asian Width Alias (Long)	| Narrow				|
        | Script (Long)					| Common				|
        | Script (Short)				| Zyyy					|
        | Block (Long)					| Basic_Latin			|
        | Block (Short)					| ASCII					|
        | PropList						| Pattern_Syntax		|
        | Uppercase Character			|						|
        | Lowercase Character			|						|
        | Titlecase Character			|						|
        
        
        ## Properties comparison
        
        | Property					| `tangled-up-in-unicode`			| `unicodedata` 		|
        |---------------------------|-------------------------------|-----------------------|
        | Name						| &#9745;						| &#9745;  				|
        | Decimal					| &#9745;						| &#9745;  				|
        | Digit						| &#9745;						| &#9745;  				|
        | Numeric					| &#9745;						| &#9745;  				|
        | Combining           		| &#9745; + alias				| &#9745;  				|
        | Mirrored           		| &#9745;						| &#9745;  				|
        | Decomposition        		| &#9745;						| &#9745;  				|
        | Category					| &#9745; + alias				| &#9745;  				|
        | Bidirectional				| &#9745; + alias				| &#9745;  				|
        | East Asian Width			| &#9745; + alias				| &#9745;  				|
        | Script					| &#9745; + alias				| -  					|
        | Block						| &#9745; + alias				| -  					|
        | Age						| &#9745; + alias				| -  					|
        | Binary Property Values 	| &#9745;						| -  					|
        | Version					| 14.0.0 ([latest](http://www.unicode.org/versions/latest/))				| 12.1.0 (Python 3.8)	|
        
        _Table 1: presence of properties is denoted by &#9745; (Unicode Character 'BALLOT BOX WITH CHECK' (U+2611))._		
        
        ## Usage
        
        ```python
        import tangled_up_in_unicode as unicodedata
        ```
        
        The package can be installed via pip:
        
        ```
        pip install tangled-up-in-unicode
        ```
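        
        If the package may not be installed in every environment, one option is to fall back to the standard library. This is only a sketch: it assumes that the functions both modules share (such as `category`, listed in the properties comparison above) accept the same arguments as their standard library counterparts.
        
        ```python
        try:
            # Preferred: UCD data independent of the Python version
            import tangled_up_in_unicode as unicodedata
        except ImportError:
            # Fallback: UCD data bundled with the interpreter
            import unicodedata
        
        # Assumed to behave identically in both modules
        dollar_category = unicodedata.category("$")
        print(dollar_category)
        ```
        
        With the fallback in place, only properties available in both modules should be used.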
        
        ## Performance
        
        The module is written in Python. 
        It can be compiled with Cython to gain [competitive performance](# "Meaning the null hypothesis of the two libraries having the same average runtime could not be rejected.") with the native library.
        
        ## Unsupported features
        
        Some features of `unicodedata` are not supported:
        
        | Feature				| `tangled-up-in-unicode`		| `unicodedata` 		|
        |-----------------------|-------------------------------|-----------------------|
        | lookup	           	| -								| &#9745;  				|
        | normalize           	| -								| &#9745;  				|
        | ucd_3_2_0      		| -								| &#9745;  				|
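        
        Code that needs these features can keep using the standard library for them. A minimal sketch, importing the standard module under a separate name (`stdlib_unicodedata` is an illustrative alias) so it does not clash with the alias from the Usage section:
        
        ```python
        import unicodedata as stdlib_unicodedata  # standard library module
        
        # lookup: character name -> character
        dollar = stdlib_unicodedata.lookup("DOLLAR SIGN")
        
        # normalize: compose "e" + COMBINING ACUTE ACCENT into a single code point
        composed = stdlib_unicodedata.normalize("NFC", "e\u0301")
        
        print(dollar, composed, len(composed))
        ```
        
        Keep in mind that these calls use the UCD version bundled with the interpreter, which may differ from the version shipped with this package.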
        
        ## Acknowledgements
        Where possible, code and documentation from the standard library `unicodedata` module are reused.
        This repository is part of the Dylan Profiling project.
        
Platform: UNKNOWN
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.6
Classifier: Programming Language :: Python :: 3.7
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Operating System :: OS Independent
Classifier: Topic :: Database
Classifier: Topic :: Scientific/Engineering :: Information Analysis
Classifier: Topic :: Text Processing
Classifier: Topic :: Text Processing :: General
Classifier: Topic :: Utilities
Requires-Python: >=3.6
Description-Content-Type: text/markdown
