This is a memory-efficient dictionary tree with support for wildcards. Source is 11K, header is 2.5K (mostly documentation), which compiles down to about 4.5K of object code. Included in the tarball is a testing program (check the source to see what it does). This version supports substrings and removes the implicit wildcard at the end of each string.
In terms of memory usage, it is very easy to have it actually use less memory than just a flat file:
|1409379||ASCII flat file|
|1032633||s without http:// with bytecount|
|1032652||s with http:// with bytecount|
|866443||s with http:// with bytecount|
|960120||s with http:// with TNSIZE()|
Speed appears to be quite good. I tested it by having it match its original input data file, once per line, and it took 4.32 usec per match on average. There is a read-write lock on each tree (currently just emulated with an int) so that many readers can traverse the tree, but allow for a writer to change it while new readers wait. The #defines in tree.c should be replaced with the appropriate calls to your thread library's read-write lock functions.
There is now support for storing an arbitrary non-structure data type along with each string, which is returned by tree_match (0 is returned on no match, so you are not allowed to add anything with a value of 0).
In addition to all the stuff above, there is a version of tree which uses an mmap'd file instead of malloc(), which allows for persistent storage and multi-process update. Synchronization is handled via flock(). Differences of ftree from tree:
The dumpftree utility will dump a node-by-node listing of an ftree, so that you can get an idea as to how the file is layed out, and how segmented it is. I may make an optimizeftree utility at some point, if there's enough demand for it.
constin all applicable places
Questions? Concerns? Bugfixes? Please submit feedback to firstname.lastname@example.org.