
pack "U0C*", unpack "C*" hack

This is a side note to the article "Unicode-processing issues in Perl and how to cope with it".

One way to tell perl that $ustring2 contains Unicode data (UTF-8 encoded bytes that should be treated as characters) is:

#!/usr/bin/perl

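my $ustring1 = "Hello \x{263A}!\n";   # character string; U+263A written as an escape
my $ustring2 = <DATA>;                # raw UTF-8 bytes read from the DATA section
# "C*" unpacks the bytes one by one; "U0" repacks them as a UTF-8 character string
$ustring2 = pack "U0C*", unpack "C*", $ustring2;

print $ustring1, $ustring2;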
__DATA__
Hello ☺!


(The "Wide character in print" warning is still emitted.)

Whether or not you see the smiling face character depends on your terminal environment.  But at least perl now prints two identical lines, which is the correct result.
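The effect of the hack can also be checked without relying on the terminal. The following is only a minimal sketch: it assumes the input arrives as the UTF-8 bytes E2 98 BA, written here as escapes instead of being read from DATA, and the variable names $bytes, $chars and $hacked are made up for the illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Assumption: the smiley arrives as raw UTF-8 bytes (E2 98 BA),
# as it would when read from a file without any decoding layer.
my $bytes = "Hello \xE2\x98\xBA!\n";

# The same text written as a character string.
my $chars = "Hello \x{263A}!\n";

# The hack: unpack the bytes one by one, then repack them as UTF-8 characters.
my $hacked = pack "U0C*", unpack "C*", $bytes;

# Compare lengths and contents instead of printing wide characters.
print "bytes:  ", length($bytes),  "\n";
print "hacked: ", length($hacked), "\n";
print $hacked eq $chars ? "same\n" : "different\n";
```

Before the hack the smiley counts as three separate characters; afterwards it counts as one, and the string compares equal to the one built with \x{263A}.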

This hack is known to work in perl 5.6.1 and newer.