XML::ParseDTD parses an XML DTD and provides methods to access the information stored in the DTD.
$dtd = XML::ParseDTD->new($dtd);
$bool = $dtd->child_allowed($tag, $childtag);
$bool = $dtd->child_list_allowed($tag, @childtags);
$bool = $dtd->attr_allowed($tag, $attribute);
$bool = $dtd->attr_list_allowed($tag, @attributes);
$bool = $dtd->is_empty($tag);
$bool = $dtd->is_defined($tag);
$bool = $dtd->is_fixed($tag, $attribute);
$bool = $dtd->attr_value_allowed($tag, $attribute, $value);
$bool = $dtd->attr_list_value_allowed($tag, %attribute_value);
@tags = $dtd->get_document_tags();
$regexp = $dtd->get_child_regexp($tag);
@attributes = $dtd->get_attributes($tag);
@req_attributes = $dtd->get_req_attributes($tag);
$value = $dtd->get_allowed_attr_values($tag, $attribute);
$default_value = $dtd->get_attr_def_value($tag, $attribute);
$errormessage = $dtd->errstr;
$errornumber = $dtd->err;
This module tells you which tags are known to the DTD, which child tags a given tag may have, which tags are defined as empty, which attributes a given tag may have, which values are allowed for a given attribute, which attributes are required, which attributes are fixed, and which default value each attribute has. In short, it exposes everything defined in the DTD except the entity definitions, which are on the ToDo list. (At least everything I know of; I am not an expert on the topic, so please let me know if I missed something.) All this information can be accessed in two different ways: 1. you can simply retrieve it, or 2. you can pass in certain data and the module tells you whether it is allowed.
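The two access styles can be combined in one small script. The following is a minimal sketch that calls only methods listed in the synopsis; the helper name report_tag and the command-line handling are illustrative assumptions, not part of the module.

```perl
use strict;
use warnings;

# Build a short report for one tag. $dtd is expected to be an
# XML::ParseDTD object; only methods from the synopsis are called.
# report_tag itself is a hypothetical helper, not part of the module.
sub report_tag {
    my ($dtd, $tag) = @_;

    # Style 2: pass data in and get a yes/no answer back.
    return "$tag: not defined in this DTD\n" unless $dtd->is_defined($tag);

    # Style 1: simply retrieve the stored information.
    my @attributes = $dtd->get_attributes($tag);
    my @required   = $dtd->get_req_attributes($tag);

    my $kind   = $dtd->is_empty($tag) ? "empty tag" : "may have content";
    my $report = "$tag: $kind\n";
    $report   .= "  attributes: @attributes\n";
    $report   .= "  required:   @required\n";
    return $report;
}

# Usage: perl report.pl <dtd location> <tag> [<tag> ...]
if (@ARGV >= 2) {
    require XML::ParseDTD;
    my ($location, @tags) = @ARGV;
    my $dtd = XML::ParseDTD->new($location);
    print report_tag($dtd, $_) for @tags;
}
```

If something goes wrong, $dtd->errstr and $dtd->err from the synopsis should say what.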
This package uses Cache::SharedMemoryCache to cache every parsed DTD, so the next time it is needed the data structure representing the DTD can be taken straight out of memory. The DTD is then neither refetched nor parsed again, which saves quite some time and work. You can easily modify the module so that it uses Cache::FileCache if you prefer, but I think shared memory is faster.
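The caching idea can be sketched with the generic Cache::Cache get/set interface. This is not the module's actual code: cached_parse, the namespace and the $parse callback (a stand-in for the real DTD parser) are assumptions made for illustration.

```perl
use strict;
use warnings;

# Return the parsed structure for $location, parsing only on a cache miss.
# $cache is any object with the Cache::Cache get/set interface; $parse is
# a hypothetical callback standing in for the module's real DTD parser.
sub cached_parse {
    my ($cache, $location, $parse) = @_;

    my $data = $cache->get($location);
    return $data if defined $data;    # cache hit: no fetch, no parse

    $data = $parse->($location);      # cache miss: do the expensive work once
    $cache->set($location, $data);
    return $data;
}

# Usage with a file-backed cache instead of shared memory:
#   perl cache.pl <dtd location>
if (@ARGV) {
    require Cache::FileCache;
    my $cache = Cache::FileCache->new({ namespace => 'ParseDTD-example' });
    my $data  = cached_parse($cache, $ARGV[0], sub { return +{ source => $_[0] } });
    print "cached entry for $data->{source}\n";
}
```

Because the helper only relies on get and set, swapping Cache::FileCache for Cache::SharedMemoryCache should not require any other change.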
Every time the constructor is called, it first checks whether the given DTD is already in memory. If so, it compares the DTD's last-modified date with the date stored in memory and then decides whether to refetch it. If the DTD lies on the local filesystem, this check adds no noticeable overhead, but if the DTD is fetched from the internet it may make sense not to check the Last-Modified header every time. You can configure how often it is checked; by default it is checked on average every third time. Since most DTDs don't change, it is mostly safe not to check at all.
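One way to get "on average every third time" is a random draw on each call. The sketch below shows that idea for a local DTD using core Perl only. The names should_check, is_stale and $check_every are hypothetical, and the module's real option name and logic may differ. A remote DTD would need an HTTP request for the Last-Modified header instead, which is not shown.

```perl
use strict;
use warnings;

# Decide whether to look at the last-modified date on this call.
# $check_every is a hypothetical setting: 1 = always, 3 = on average
# every third call, 0 = never.
sub should_check {
    my ($check_every) = @_;
    return 0 unless $check_every;
    return rand() < 1 / $check_every ? 1 : 0;
}

# For a local DTD the last-modified date is the file's mtime.
# A file that cannot be stat'ed, or a missing cached date, counts as stale.
sub is_stale {
    my ($path, $cached_mtime) = @_;
    my $mtime = (stat $path)[9];
    return 1 unless defined $mtime && defined $cached_mtime;
    return $mtime > $cached_mtime ? 1 : 0;
}

# Usage: perl stale.pl <local dtd path> <cached mtime as epoch seconds>
if (@ARGV == 2) {
    my ($path, $cached_mtime) = @ARGV;
    if (should_check(3) && is_stale($path, $cached_mtime)) {
        print "refetch and reparse $path\n";
    } else {
        print "use the cached copy of $path\n";
    }
}
```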
Internally the parsed DTD data is simply stored in six hash structures. Because of this, and because of the caching, the module should be very fast.
· Perl 5.004
· Carp 1.01
· LWP::UserAgent 0.01
· Cache::Cache 1.02