Re: Displaying 'umlaut' character

From: Ben Morrow <ben_at_morrow.me.uk>
Date: Wed, 22 Sep 2010 09:09:02 +0100
Message-ID: <u5qom7-io5.ln1_at_osiris.mauzo.dyndns.org>


Quoth jt_at_toerring.de (Jens Thoms Toerring):
> In comp.lang.perl.misc dn.perl_at_gmail.com <dn.perl_at_gmail.com> wrote:
>
> > My aim is to display the ‘special’ (non-ASCII) German character/
> > diacritic umlaut or diaeresis correctly on a browser. The browser calls
> > a cgi perl-script which resides on a linux server. The browser which
> > calls the perl-script displays Vietnamese characters correctly (but
> > not the umlaut) without any special setting.
>
> Stop right here. If by "browser" you mean something like
> Firefox, Internet Explorer etc., then there's some
> misunderstanding here. The browser does not "call" a cgi-script. The
> browser just sends a request to the server which in turn may
> call a cgi-script (that may be written in Perl) and then sends
> the results back to the browser. And a web server normally
> sends a HTML header with the page that may contain a line
> like
>
> <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />

I think you mean 'the page may contain a <head> section which may contain a line like...'. Also, it's pretty much always better to put the charset in the HTTP Content-Type header instead.
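
For illustration, here is a minimal sketch of a CGI script that declares the charset in the HTTP header and encodes its output to match. The helper name build_response and the sample text are made up for this example, and it assumes the script source itself is saved as UTF-8.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use utf8;                  # assumption: this source file is saved as UTF-8
use Encode qw(encode);

# Hypothetical helper: build the complete CGI response as a byte string.
sub build_response {
    my ($text) = @_;

    # The charset goes in the HTTP header, followed by a blank line.
    my $header = "Content-Type: text/html; charset=utf-8\r\n\r\n";
    my $body   = "<html><body><p>$text</p></body></html>\n";

    # Encode the character string to UTF-8 bytes to match the header.
    return $header . encode('UTF-8', $body);
}

binmode STDOUT;            # emit raw bytes; the encoding is done above
print build_response("Grüße aus München");
```

With the header and the actual output encoding in agreement, the browser should not need a <meta> element to guess the charset.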

<snip>
> Now, getting a Perl script to deal correctly with UTF-8 is still
> another thing. If it takes input from files etc. it may have to
> indicate that it expects UTF-8 from them in the call of open(),
> e.g. by using
>
> open my $f, '<:utf8', $filename;

Don't do that. If the file contains invalid UTF-8, you will get strange behaviour up to and including perl segfaults. Use :encoding(utf8) instead: it's a little slower, but much safer. (This doesn't, in general, apply to filehandles open for output only.)
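
A small sketch of the safer form follows. It writes a throwaway sample file first so that it runs on its own; the file contents and variable names are assumptions for the example only.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Create a small UTF-8 sample file (made-up data for illustration).
my ($tmp, $filename) = tempfile(UNLINK => 1);
binmode $tmp, ':raw';
print {$tmp} "M\xC3\xBCnchen\n";          # "München" as raw UTF-8 bytes
close $tmp or die "Cannot close $filename: $!";

# Read it back through the checking :encoding layer rather than :utf8.
open my $f, '<:encoding(utf8)', $filename
    or die "Cannot open $filename: $!";
my $line = <$f>;
close $f;
chomp $line;

# The two bytes of the umlaut have been decoded into one character.
printf "%d characters\n", length $line;
```

Encode also accepts the spelling :encoding(UTF-8), which selects its strict UTF-8 variant, whereas utf8 is the laxer one.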

Ben