June 2002
get_token
get_token( )
Returns the next token found in the HTML document, or
undef if there are no more tokens. Each token is returned as
an array reference. The first element of the array is a label
that identifies the token type. The remaining elements depend
on that type and hold items such as the tag name, its attributes,
and the original text. Tokens are reported for start tags, end
tags, text, comments, declarations, and processing instructions.
get_token uses the
following labels for the tokens:
S   Start tag
E   End tag
T   Text
C   Comment
D   Declaration
PI  Processing instruction
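The labels above can be used to decide how to handle each token. The following is a minimal sketch, assuming a small made-up HTML string and a hypothetical @summary array used only to collect the results. It is one possible way to branch on the first element, not the only one:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTML::TokeParser;

# Hypothetical sample input, chosen to produce several token types
my $html = '<!-- note --><p class="x">Hi</p>';
my $p = HTML::TokeParser->new(\$html) or die "Cannot create parser";

my @summary;    # illustrative name, not part of the module
while (my $token = $p->get_token) {
    my $type = $token->[0];    # the label: S, E, T, C, D, or PI
    if    ($type eq 'S') { push @summary, 'start tag ' . $token->[1] }
    elsif ($type eq 'E') { push @summary, 'end tag ' . $token->[1] }
    elsif ($type eq 'T') { push @summary, 'text ' . $token->[1] }
    elsif ($type eq 'C') { push @summary, 'comment' }
    else                 { push @summary, 'other ' . $type }
}
print "$_\n" for @summary;
```

For start and end tokens the second element is the tag name, and for text tokens it is the text itself, which is why the sketch reads $token->[1] in those branches.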
Consider the following code:
    #!/usr/local/bin/perl -w
    require HTML::TokeParser;
    my $html = '<a href="http://blah">My name is Nate!</a></p>';
    my $p = HTML::TokeParser->new(\$html);
    # Print every element of each token's array
    while (my $token = $p->get_token) {
        my $i = 0;
        foreach my $tk (@{$token}) {
            print "token[$i]: $tk\n";
            $i++;
        }
    }
The items in each token (in the HTML) are displayed as follows:
    token[0]: S
    token[1]: a
    token[2]: HASH(0x8146d3c)
    token[3]: ARRAY(0x814a380)
    token[4]: <a href="http://blah">
    token[0]: T
    token[1]: My name is Nate!
    token[2]:
    token[0]: E
    token[1]: a
    token[2]: </a>
    token[0]: E
    token[1]: p
    token[2]: </p>
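The HASH(...) and ARRAY(...) values in the start-tag token are references: the hash maps attribute names to values, and the array lists the attribute names in the order they appeared. The sketch below shows one way to dereference them. The title attribute and the @links array are assumptions added for illustration:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTML::TokeParser;

# Same link as the example above, with a hypothetical title attribute added
my $html = '<a href="http://blah" title="home">My name is Nate!</a>';
my $p = HTML::TokeParser->new(\$html) or die "Cannot create parser";

my @links;    # illustrative name, not part of the module
while (my $token = $p->get_token) {
    # Only start tags for <a> carry the attributes of interest here
    next unless $token->[0] eq 'S' && $token->[1] eq 'a';

    # Element 2 is the attribute hash, element 3 the attribute order
    my ($attr, $attrseq) = @{$token}[2, 3];
    for my $name (@$attrseq) {
        push @links, "$name=$attr->{$name}";
    }
}
print "$_\n" for @links;
```

Walking the array rather than calling keys on the hash keeps the attributes in their original document order.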