DAWG is a Python module based on the dawgdic C++ library. It provides dictionary-like, read-only objects backed by a DAWG (directed acyclic word graph) for Python 2.x and 3.x.
pip install DAWG
There are several DAWG classes in this package:
- dawg.DAWG - basic DAWG wrapper; it can store unicode keys and do exact lookups;
- dawg.CompletionDAWG - dawg.DAWG subclass that supports key completion and prefix lookups (but requires more memory);
- dawg.BytesDAWG - dawg.CompletionDAWG subclass that maps unicode keys to lists of bytes objects;
- dawg.RecordDAWG - dawg.BytesDAWG subclass that maps unicode keys to lists of data tuples; all tuples must have the same format (the data is packed using the Python struct module);
- dawg.IntDAWG - dawg.DAWG subclass that maps unicode keys to integer values.
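The same-format requirement for dawg.RecordDAWG follows from how the struct module works: one format string describes every record. The sketch below uses only the standard library and does not call DAWG itself; the format string ">3H" and the helper names pack_record and unpack_record are hypothetical, chosen only for illustration.

```python
import struct

# Hypothetical record layout: three unsigned 16-bit big-endian integers.
RECORD_FORMAT = ">3H"


def pack_record(values):
    """Pack one data tuple into bytes using the shared format."""
    return struct.pack(RECORD_FORMAT, *values)


def unpack_record(payload):
    """Restore a data tuple from bytes packed with the shared format."""
    return struct.unpack(RECORD_FORMAT, payload)


packed = pack_record((1, 2, 3))
restored = unpack_record(packed)

# A tuple with a different shape cannot be packed with the same format,
# which is why every tuple has to match it.
try:
    pack_record((1, 2))
    mismatch_rejected = False
except struct.error:
    mismatch_rejected = True
```

Because the format is fixed once, every value stored under it has the same packed size (6 bytes here), and a tuple that does not match is rejected at packing time.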
Current limitations:
· the library is not tested under Windows;
· IntDAWG is currently a subclass of DAWG, so it doesn't support the keys() and items() methods;
· the read() method reads the whole stream, so a DAWG must be the last or the only item in a stream when it is loaded with read(); pickling doesn't have this limitation;
· DAWGs loaded with read() and unpickled DAWGs use 3x-4x as much memory as DAWGs loaded with the load() method;
· iterator versions of methods are not always implemented;
· there are keys() and items() methods but no values() method;
· the prefixes() method for getting all prefixes of a given word is not implemented yet.
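Until prefixes() is available, one possible workaround is to test each prefix of the word with an exact lookup. The sketch below is a minimal illustration, not part of the library: the helper name prefixes is hypothetical, and a plain set stands in for the DAWG, on the assumption that the DAWG object supports the same "key in container" membership test.

```python
def prefixes(word, keys):
    """Return every prefix of word (including word itself) found in keys.

    keys can be any container that supports the "in" operator.
    """
    return [
        word[:end]
        for end in range(1, len(word) + 1)
        if word[:end] in keys
    ]


# A plain set is used here as a stand-in for a DAWG of keys.
keys = {u"f", u"foo", u"foobar", u"bar"}
found = prefixes(u"foobar", keys)
missing = prefixes(u"xyz", keys)
```

This performs one exact lookup per prefix, so it is slower than a single traversal of the graph would be, but it relies only on exact lookups.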