Notes on PERL

This webpage is a result of my disappointment at just how little material for beginner-intermediate level PERL scripting is available online.

Contributing

If I've made a mistake somewhere, or haven't covered something that you think should be covered, please mail me about it.

Why use PERL?

When you need to write a program in a hurry. When it doesn't have to be super-efficient. When it involves a lot of text-matching.

PERL is great for small housekeeping tools, cron jobs, and generally automating things on the command line. A five-line PERL script could take over a hundred lines of code to write in C with the same level of error-checking and robustness.

PERL under Windows

As far as I'm aware, there are two major PERL distributions for Windows: Cygwin PERL and ActiveState PERL. I haven't used either extensively.

Hello world

Pull up a text editor and paste the following code into a file:

#!/usr/bin/perl

print "Hello world!\n";

You'll probably want to make the file executable (chmod +x). Also, PERL might live in a different directory on your system, so you might need to change the top line to point to a valid PERL interpreter. Most flavours of Linux and BSD put it in /usr/bin.

# vi hello.pl
# chmod +x hello.pl
# ./hello.pl
Hello world!
#

Variables

In PERL, a scalar variable is denoted by a $. Arrays use @ and hashes use %. (Hashes are covered in more detail below.) Variables don't have to be declared before you can use them.

$a = "moo";

However, if you "use strict;" you do have to declare variables. (The reasons for wanting "use strict;" are covered under Enforcing good habits below.) You can do this with the "my" keyword:

use strict;
my $a = "moo";

Scalars are typeless: PERL converts between strings (text) and numbers as needed, so the two are interchangeable.

$a = 2;
$b = "4";
$c = $a + $b;  # 6

String concatenation

To join strings, use the dot (.) operator, not the plus operator. Plus is only used to add numerical values.

$a = "Hello";
$b = "world";
$c = $a.", ".$b;  # "Hello, world"

You can put a variable name into a string literal and PERL will expand it for you:

$a = "Hello";
$b = "world";
$c = "$a, $b";  # "Hello, world"

You can suppress this behaviour by escaping (prefixing with a backslash) the dollar sign:

$a = "\$a, \$b";  # "$a, $b"

When using expansion in a string literal, you sometimes need to protect the variable name (using braces) from the rest of the string:

$a = "He";
$b = "$allo!";    # "!"
$c = "${a}llo!";  # "Hello!"

Counting occurrences of a character in a string

$string = "aXbXXcX";
$count = ($string =~ tr/X//);  # 4

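Despite the =~ binding, tr is the transliteration operator rather than a regex. With an empty replacement list it only counts the matching characters and does not modify $string.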
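tr only counts single characters. For a multi-character substring, one common idiom (a sketch, not the only way) is a global match assigned through an empty list, which yields the number of non-overlapping matches:

```perl
use strict;
use warnings;

my $string = "aXbXXcX";

# Count single characters with tr, as above.
my $chars = ($string =~ tr/X//);

# Count non-overlapping occurrences of a substring:
# the empty list forces list context, and assigning that
# list assignment to a scalar gives the number of matches.
my $pairs = () = $string =~ /XX/g;

print "$chars $pairs\n";   # 4 1
```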

Hashes

A hash (denoted with a %) is like an array, but it can take arbitrary (not necessarily numerical) indices. When referring to a hash, use %. When referring to an element of a hash, use $. For example:

my %hash;
$hash{"cow"} = "moo";
$hash{"dog"} = "woof";
print "$hash{cow}\n";  # moo

In the above example, "cow" and "dog" are the keys of the hash, and "moo" and "woof" are the values associated with them. An alternative way to initialise the same hash:

my %hash = ("cow"=>"moo", "dog"=>"woof");
print "$hash{cow}\n";  # moo

Adding to a hash:

$hash{"newkey"} = "newvalue";

Deleting from a hash:

delete $hash{"somekey"};

Heads up, PHPers! In PERL, to access a hash you must use braces { }, not square brackets [ ]. Only arrays are indexed using square brackets.

Enforcing good habits

If you make a typo in a variable name, PERL won't complain - it will treat the typo as a different, most likely undefined, variable. For example:

#!/usr/bin/perl

$msg = "Hello world!\n";
print $mesg;              # prints nothing

If you put "use strict;" at the top of your PERL script, PERL will demand that you declare all variables before using them, which helps you catch errors like misspelled variable names:

#!/usr/bin/perl
use strict;

my $msg = "Hello world!\n";  # declare using 'my'
print $mesg;

# ./hello2.pl
Global symbol "$mesg" requires explicit package name at hello2.pl line 5.
Execution of hello2.pl aborted due to compilation errors.
#

Reading a file into an array

Each line in the file will become a separate element in the array. The newline present at the end of every line will be retained.

open(my $fh, "<", "whatever") or die $!;
my @array = <$fh>;
close $fh;
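If the trailing newlines are unwanted, chomp can strip them from every element in one call. A minimal sketch, assuming a hypothetical file name (demo.txt, written first only so the example runs on its own) and using a lexical filehandle with three-argument open:

```perl
use strict;
use warnings;

# Hypothetical file name; created here so the example is self-contained.
my $path = "demo.txt";
open(my $out, '>', $path) or die "Cannot write $path: $!";
print $out "first\nsecond\nthird\n";
close $out;

open(my $in, '<', $path) or die "Cannot read $path: $!";
my @lines = <$in>;   # each element still ends in "\n"
close $in;

chomp @lines;        # strips the trailing newline from every element
print scalar(@lines), " lines, first is '$lines[0]'\n";

unlink $path;        # tidy up the demo file
```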

Reading a file into a [scalar] variable

Like the array example, but we use the join function to collapse the array into a scalar variable.

open(my $fh, "<", "whatever") or die $!;
my $var = join "", <$fh>;
close $fh;
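Another common way to do this (a different technique from the join shown above) is to undefine the input record separator $/ locally, so that a single read returns the whole file. A sketch, again assuming a hypothetical demo.txt created just for the example:

```perl
use strict;
use warnings;

# Hypothetical file name; created here so the example is self-contained.
my $path = "demo.txt";
open(my $out, '>', $path) or die "Cannot write $path: $!";
print $out "one\ntwo\n";
close $out;

my $var = do {
    local $/;   # undefine the record separator inside this block only
    open(my $in, '<', $path) or die "Cannot read $path: $!";
    <$in>;      # scalar context with no separator: the whole file
};

unlink $path;   # tidy up the demo file
print length($var), " characters read\n";
```

Because $/ is changed with local, the normal line-by-line behaviour returns as soon as the block ends.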

Finding more help

There's always perldoc. If you need help on a built-in function:

perldoc -f split

You can also use it to search the PERL FAQs:

perldoc -q timestamp

Or retrieve the documentation for a module:

perldoc POSIX

There are some interesting manpages; perl(1) gives a list of them. perlre(1) is recommended reading - it covers regular expressions.

Fun with PERL syntax

PERL has some cute syntactical constructs that you might not be aware of. They can sometimes be used to make code prettier, shorter, or more readable:

Instead of:

foreach $item (@array)
{
 func($item);
}

Try:

func($_) for @array;

The if statement can be written as an afterthought.
Instead of:

if ($this == $that)
{
 die "it's all wrong";
}

Try:

die "it's all wrong" if $this == $that;

Even more fun:

die "wrong" if !$right;

Becomes:

die "wrong" unless $right;

Dealing with time

When dealing with times, use Time::Local. Always store timestamps in UTC. Store them as time_t values (seconds since the epoch) because they are easy to parse and compare; store a human-readable timestamp alongside if you have to. When displaying times, always show the time zone offset.
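As a rough sketch of that advice: timegm from Time::Local turns a broken-down UTC time into a time_t, and strftime from the POSIX module formats it for display. The date used is arbitrary, and %z (numeric zone offset) relies on the platform's strftime supporting it:

```perl
use strict;
use warnings;
use Time::Local qw(timegm);
use POSIX qw(strftime);

# Broken-down UTC time to seconds since the epoch (time_t).
# Arguments: sec, min, hour, day of month, month (0-11), year.
my $epoch = timegm(0, 0, 12, 1, 0, 2000);   # 2000-01-01 12:00:00 UTC

# Display in UTC with an explicit offset.
my $utc = strftime("%Y-%m-%d %H:%M:%S +0000", gmtime($epoch));

# Display in local time; %z prints the local zone offset where supported.
my $local = strftime("%Y-%m-%d %H:%M:%S %z", localtime($epoch));

print "$epoch\n$utc\n$local\n";
```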

Retrieving function parameters

In a function/subroutine, all the parameters passed to it are put into @_. Here are some ways of pulling them out:

sub something
{
 my ($a) = @_;
 my $b = $_[0];
 my $c = shift;
}

Those three are equivalent ways of getting the first parameter, but be aware that shift also removes it from @_. Likewise:

my ($a, $b) = @_;

Is equivalent to:

my $a = shift;
my $b = shift;
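The difference only matters if you look at @_ afterwards. A small sketch with two hypothetical subs (by_copy and by_shift are names made up for this example) that each return how many parameters are left in @_:

```perl
use strict;
use warnings;

# Hypothetical subs, for illustration only.
sub by_copy {
    my ($first, $second) = @_;   # copies the values; @_ is untouched
    return scalar @_;            # number of parameters still in @_
}

sub by_shift {
    my $first  = shift;          # removes the first element of @_
    my $second = shift;          # removes what is now the first element
    return scalar @_;
}

print by_copy("x", "y", "z"), "\n";    # 3
print by_shift("x", "y", "z"), "\n";   # 1
```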

Sorting an array

Use the built-in sort function, which compares elements as strings by default:

my @sorted_array = sort @array;
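Without a comparison block, sort compares elements as strings, which is rarely what you want for numbers. A short sketch of the difference, using made-up sample data:

```perl
use strict;
use warnings;

my @array = (10, 9, 100, 1);

my @as_strings = sort @array;                 # string order:  1 10 100 9
my @as_numbers = sort { $a <=> $b } @array;   # numeric order: 1 9 10 100
my @descending = sort { $b <=> $a } @array;   # largest first: 100 10 9 1

print "@as_strings\n@as_numbers\n@descending\n";
```

$a and $b are the special variables sort uses to hand the two elements being compared to the block.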

Sorting a hash by key

You can use the built-in keys function to obtain an array of all the keys of a given hash:

my @hashkeys = keys %hash;

To write out a hash sorted by key:

my %hash;
$hash{"lave"} = 12;
$hash{"diso"} = 15;
$hash{"leesti"} = 10;

print "$hash{$_}. $_\n" for sort keys %hash;

Sorting a hash by value

sort can be directed to use a comparison function:

my %hash;
$hash{"lave"} = 12;
$hash{"diso"} = 15;
$hash{"leesti"} = 10;

print "$hash{$_}. $_\n" for sort { $hash{$a} <=> $hash{$b} } keys %hash;

If the values are textual rather than numeric, use cmp instead of <=>.
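For instance, a sketch with a made-up hash of textual values, sorted by value using cmp:

```perl
use strict;
use warnings;

my %noise = ("cow" => "moo", "dog" => "woof", "cat" => "meow");

# Order the keys by comparing their values as strings.
my @by_noise = sort { $noise{$a} cmp $noise{$b} } keys %noise;

print "$noise{$_}: $_\n" for @by_noise;   # meow, moo, woof
```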

Multiline strings

Multi-line strings are possible using << :

$row = <<MARKER;
<tr>
 <td>
  <b>ID</b>
 </td>
</tr>
MARKER

Variable expansion works inside:

print <<MARKER;
<table>
 $row
</table>
MARKER

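Make sure the ending marker is flush left against the margin; with plain << it is not allowed to be indented. (PERL 5.26 and later also accept <<~MARKER, which permits an indented body and marker.)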

Heads up, PHPers! In PERL, you need two <'s, and the semicolon goes after the first marker, not the second.
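Quoting the marker changes how the body is treated: a bare or double-quoted marker expands variables, while a single-quoted marker leaves the text untouched. A small sketch ($name is just a sample variable):

```perl
use strict;
use warnings;

my $name = "world";

# Bare marker: behaves like a double-quoted string.
my $expanded = <<MARKER;
Hello, $name
MARKER

# Single-quoted marker: no variable expansion.
my $literal = <<'MARKER';
Hello, $name
MARKER

print $expanded;   # Hello, world
print $literal;    # Hello, $name
```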

References

References in PERL are sort of like pointers. To create a reference to a variable, escape it with a backslash:

my @array = (1,2,4,8,16);
my %hash = ("a" => 1, "b" => 2);

my $ref1 = \@array;
my $ref2 = \%hash;

Alternatively, you can use square brackets to create a reference to an anonymous array, or curly braces for a hash:

my $ref1 = [1,2,4,8,16];
my $ref2 = {"a" => 1, "b" => 2};

To dereference, use two $'s or -> :

print $$ref1[4];  # 16
print $ref1->[4]; # 16

print $$ref2{"b"};  # 2
print $ref2->{"b"}; # 2

See also: the perlref(1) manpage.
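To get at the whole array or hash behind a reference, rather than one element, prefix the reference with the matching sigil. A brief sketch ($aref and $href are names picked for this example):

```perl
use strict;
use warnings;

my $aref = [1, 2, 4, 8, 16];
my $href = {"a" => 1, "b" => 2};

# Prefix the reference with the sigil of the thing it points to.
my $count = scalar @$aref;      # number of elements: 5
my @keys  = sort keys %$href;   # ("a", "b")

print "$_\n" for @$aref;                  # one element per line
print "$_ => $href->{$_}\n" for @keys;    # a => 1, b => 2

# ref() reports what kind of thing a reference points to.
print ref($aref), " ", ref($href), "\n";  # ARRAY HASH
```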

Arrays of Arrays

Use references:

my @a = (2,4,6,8);
my @b = (3,5,7,9);
my @c = (\@a, \@b);

print $c[0]->[0]; # 2
print $c[0]->[1]; # 4
print $c[1]->[3]; # 9
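The same structure can be built without the named arrays, and walked with a loop. A sketch reusing the numbers from the example above:

```perl
use strict;
use warnings;

# Same data as above, built from anonymous arrays.
my @c = ([2, 4, 6, 8], [3, 5, 7, 9]);

# The arrow between adjacent subscripts is optional.
print $c[1][3], "\n";   # 9, same as $c[1]->[3]

# Walk every row, then print the elements of that row.
for my $row (@c) {
    print join(" ", @$row), "\n";
}
```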

Using Data::Dumper

The Data::Dumper module can format data structures into strings:

use Data::Dumper;
my %noises = ("dog" => "woof", "cow" => "moo");
my @primes = (2,3,5,7,11,13);
print Dumper(\%noises, \@primes);

Produces (the order of the hash keys is not guaranteed and may differ on your system):

$VAR1 = {
          'cow' => 'moo',
          'dog' => 'woof'
        };
$VAR2 = [
          2,
          3,
          5,
          7,
          11,
          13
        ];

This becomes useful when we want to look at the internals of a data structure, such as for debugging purposes.
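Hash keys come out in no particular order by default, which makes dumps awkward to compare between runs. Data::Dumper has package variables to adjust this; a small sketch using two of them:

```perl
use strict;
use warnings;
use Data::Dumper;

$Data::Dumper::Sortkeys = 1;   # emit hash keys in sorted order
$Data::Dumper::Indent   = 1;   # use a more compact indentation style

my %noises = ("dog" => "woof", "cow" => "moo");
my $text = Dumper(\%noises);   # Dumper returns the formatted string
print $text;
```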

See also: perldoc Data::Dumper
