
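#!/usr/local/gnu/bin/perl
use strict;
use warnings;
use CGI qw(:standard);
use Encode;

binmode STDOUT, ":utf8";    # write the response as UTF-8
print header(-charset => 'utf-8');
print start_html(-title => 'Collecting words', -encoding => 'utf-8'),
      h1('Collecting words');
if (param()) {
    # Form data was submitted: append the word to the file as UTF-8.
    if (open(my $out, ">>:utf8", "words.txt")) {
        # The parameter arrives as octets; decode it into characters.
        my $word = Encode::decode_utf8(scalar param('word'));
        print $out "$word\n";
        close($out);
        # Escape the word so that markup in user input is not echoed as HTML.
        print p("Thank you for \x{201c}" . CGI::escapeHTML($word) . "\x{201d}!");
    }
    else {
        print p("Internal error, sorry!"), end_html;
        exit(0);
    }
}
else {
    # No form data yet: show the form.
    print start_form,
          "Some word(s): ", textfield('word'),
          submit(-name => 'Submit'),
          end_form;
}
print end_html;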
Submitting a File
When you use a form with a file input field (<input type="file">), the browser creates
a special input widget that lets the user pick a file from the local system. The contents
of the file are included in the form data as one of the parts of a multipart message.
The part has headers of its own, where the encoding could be specified. In practice,
however, the browser just copies the contents of the file octet by octet and inserts
a header that specifies the media type of the data according to file system properties.
For example, if the filename suffix is .txt, the browser includes a header that
specifies the media type as text/plain, without any charset indication.
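A receiving script can check whether the part's Content-Type header carries a charset parameter and, when it does not, fall back to guessing. The following is only a minimal sketch: the helper names charset_of and decode_upload are hypothetical, and the fallback order (strict UTF-8 first, then windows-1252) is an assumption that suits Western European text, not a general rule.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper: extract the charset parameter from the value of a
# part's Content-Type header; returns undef when no charset is given.
sub charset_of {
    my ($content_type) = @_;
    return $content_type =~ /;\s*charset\s*=\s*"?([^\s";]+)"?/i ? lc($1) : undef;
}

# Hypothetical helper: turn uploaded octets into a character string.
# A declared charset is trusted; otherwise the octets are tried as strict
# UTF-8 and, if that fails, interpreted as windows-1252 (an assumption).
sub decode_upload {
    my ($content_type, $octets) = @_;
    my $charset = charset_of($content_type);
    return decode($charset, $octets) if defined $charset;
    my $text = eval {
        decode('UTF-8', $octets, Encode::FB_CROAK() | Encode::LEAVE_SRC());
    };
    return defined $text ? $text : decode('cp1252', $octets);
}
```

Trying strict UTF-8 first is a common choice because text in an 8-bit encoding that contains non-ASCII characters is rarely well-formed UTF-8 by accident. The windows-1252 fallback never fails, so it can silently produce wrong characters for data in some other 8-bit encoding. Note also that decode dies if a declared charset names an encoding that Encode does not know.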
The conclusion is that the encoding and even the media type of a submitted file remain
unknown. Human intervention or application-related heuristics are needed to ...