Siaris
Simple Things

Building a Distribution with h2xs

Andrew L. Johnson (First published by ItWorld.com 2001-03-29)

I have discussed using h2xs in a previous article (dated April 27, 2000 in the article archives: see link below), but I think it would be good to look at it again in a little more depth in light of the last two articles and because it produces slightly different output in version 5.6.0 than in 5.005.

Let’s say we decide to build our Cool module and we want to be able to package it up and share it (or perhaps upload it to CPAN if you have a CPAN id). We can automate much of the task by using the h2xs utility that comes with perl:

    [jandrew]$ h2xs -XAn Cool
    Writing Cool/Cool.pm
    Writing Cool/Makefile.PL
    Writing Cool/test.pl
    Writing Cool/Changes
    Writing Cool/MANIFEST

We used the -X option because we are not creating an XS module, the -A option because we are not using the AutoLoader facility (both are beyond the scope of this article), and the -n option to specify the name of the module (Cool in this case). You can see from the output that it has created a ‘Cool’ directory and written several files in it for us. The one we are really interested in is the Cool.pm file, which is a skeleton of our module. If you are using version 5.005 then the contents of the module file will begin like:

    package Cool;

    use strict;
    use vars qw($VERSION @ISA @EXPORT @EXPORT_OK);

    require Exporter;

    @ISA = qw(Exporter AutoLoader);

    # Items to export into callers namespace by default. Note: do not
    # export names by default without a very good reason. Use
    # EXPORT_OK instead.  Do not simply export all your public
    # functions/methods/constants.

    @EXPORT = qw(

    );
    $VERSION = '0.01';

    # Preloaded methods go here.

    # Autoload methods go after =cut, and are processed by the
    # autosplit program.

    1;
    __END__

There is also stub POD after the __END__ token (see: perldoc perlpod). Unfortunately, even with the -A option, this skeleton still includes AutoLoader in the @ISA array. You should remove that, and delete the last comment, since we aren’t using autoloaded methods.

As you can see, this is the basic structure of the module we previously built by hand, although there I neglected to ‘use strict’ and declare our globals with ‘use vars’ (an oversight on my part). You’ll also note that although the @EXPORT array is ready for us to fill, the @EXPORT_OK array isn’t set up. I would simply rename @EXPORT to @EXPORT_OK if I wanted to export on demand (which I usually do).

All that’s left for us to do here is put our cool() function into the @EXPORT_OK array and then define that subroutine after the ‘Preloaded methods go here’ comment.

In 5.6.0 the skeleton file begins like:

    package Cool;

    require 5.005_62;
    use strict;
    use warnings;

    require Exporter;

    our @ISA = qw(Exporter);

    # Items to export into callers namespace by default. Note: do not
    # export names by default without a very good reason. Use EXPORT_OK
    # instead.  Do not simply export all your public
    # functions/methods/constants.

    # This allows declaration use Cool ':all'; If you do not need this,
    # moving things directly into @EXPORT or @EXPORT_OK will save memory.

    our %EXPORT_TAGS = ( 'all' => [ qw(

    ) ] );

    our @EXPORT_OK = ( @{ $EXPORT_TAGS{'all'} } );

    our @EXPORT = qw(

    );
    our $VERSION = '0.01';

    # Preloaded methods go here.

    1;
    __END__

The main differences are that the skeleton now requires a minimum perl version and enables warnings, the package variables are declared with our() (which is new in 5.6.0), and we have an %EXPORT_TAGS hash. Using this hash is beyond the scope of this article, and isn’t necessary for our simple module, so I suggest simply ignoring or deleting it and removing the @{ $EXPORT_TAGS{'all'} } expression from the @EXPORT_OK array. As before, all you need to do now is add your function definition and put its name into whichever of the EXPORT arrays you desire.
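As a rough sketch of where that leaves us, here is the 5.6.0 skeleton with %EXPORT_TAGS removed and cool() added to @EXPORT_OK. To keep it runnable as a single file, the module body is wrapped in a BEGIN block and imported by hand with Cool->import(); that wrapper is an assumption made for this example only. In a real distribution the contents of the block would sit in Cool/Cool.pm, which would end with the usual `1;`.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Everything inside this BEGIN block is what Cool.pm would contain
# (minus the POD). It is inlined here only so the sketch runs as one file.
BEGIN {
    package Cool;

    require Exporter;

    our @ISA       = qw(Exporter);
    our @EXPORT_OK = qw(cool);     # exported only when the caller asks
    our @EXPORT    = qw();         # nothing is exported by default
    our $VERSION   = '0.01';

    # Preloaded methods go here.
    sub cool {
        print "Hey, cool!\n";
    }
}

# Stand-in for "use Cool qw(cool);" now that the package is inline.
BEGIN { Cool->import('cool') }

cool();    # prints: Hey, cool!
```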

But there is more to a distribution than simply having a skeleton .pm file built for us. What about those other files? Well, the other files are there to automate building, testing, and installing the module. Once you’ve finished filling in your module, you can install it like so:

    perl Makefile.PL
    make
    make test
    make install

And, if you wanted to install it under a private directory:

    perl Makefile.PL PREFIX=/home/jandrew/perl5lib

When installing, perl will create any needed subdirectories under that directory and install the module appropriately. If you are on Windows, you’ll probably need to get ‘nmake’ and use that rather than make.

To create a distributable package (containing your module and all the ancillary files) you need only do:

    make dist

Doing so in my example module directory creates a ‘Cool-0.01.tar.gz’ which I can now send to a friend or upload to my web-site or otherwise distribute. For further module information please see the following perl docs:

    perldoc perlmod
    perldoc perlmodlib
    perldoc perlmodinstall
    perldoc -f use
    perldoc -f require

*****

Modules, Part 2: Installing a Module

Andrew L. Johnson (First published by ItWorld.com 2001-03-22)

Last week we built a simple module that defined and exported (on demand) one function. We also tested using the module with a test script in the same directory. This week we will look at how to install the module so we can use it from anywhere (well, within reason).

The first thing to know is where perl looks for modules when it wants to load them, and we can find out by looking at the special @INC (include) array. If you type ‘perl -V’ at the command prompt you will see a bunch of configuration information, at the end of which you will see the contents of @INC. You can also just view the @INC array with this command-line invocation (you might have to fiddle with the quoting depending on your shell):

    perl -le 'print join "\n", @INC'

On my machine it says:

    [jandrew]$ perl -le 'print join "\n",@INC'
    /usr/local/lib/perl5/5.6.0/i586-linux
    /usr/local/lib/perl5/5.6.0
    /usr/local/lib/perl5/site_perl/5.6.0/i586-linux
    /usr/local/lib/perl5/site_perl/5.6.0
    /usr/local/lib/perl5/site_perl/5.005/i586-linux
    /usr/local/lib/perl5/site_perl/5.005
    /usr/local/lib/perl5/site_perl
    .

These are the search paths perl uses to find modules. Notice, that last line is just a dot — meaning current directory, which is why our test script was able to work.
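The same information is available from inside a script, along with the %INC hash, which records where each loaded file was actually found. The following is a minimal sketch; File::Spec is used only because it is a core module that is sure to be installed. Note that on Perl 5.26 and later the current directory is no longer part of @INC by default, so the trailing dot will not appear there.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Spec;    # any core module will do; it is loaded only to be looked up

# The directories perl searches, in the order it searches them.
print "Search path:\n";
print "  $_\n" for @INC;

# %INC maps each loaded file (named relative to an @INC entry) to the
# full path it was found at.
my $loaded_from = $INC{'File/Spec.pm'};
print "File::Spec was loaded from: $loaded_from\n";
```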

So, if we want to be able to use our Cool.pm module we can place it in one of those directories, and we usually use the site_perl directory for our current version of Perl. My version here is 5.6.0 so I would place the Cool module at:

    /usr/local/lib/perl5/site_perl/5.6.0/Cool.pm

Now, you may not have permission to install into these directories. In this case you have two options for using a private installation directory. First create a directory where you will install your private modules (I might use /home/jandrew/perl5lib or something) and put your module there. Now we have to tell Perl where to find it (or rather, we have to get this directory into the @INC array). The ‘use lib’ pragma can be used on a script by script basis to install extra directories in @INC:

    #!/usr/bin/perl -w
    use strict;
    use lib '/home/jandrew/perl5lib';
    use Cool;

That ‘use lib’ line tells perl to install that directory at the beginning of the @INC array (so it will be searched first). The problem with this is that we have to use this line in every script that needs to use one of our private modules, and if we share our module and scripts with someone else, they’ll need to install the module in their own private directory and change that line in every script we give them. The alternative is to use the PERL5LIB environment variable. Setting this variable (by whatever means your platform or shell uses) gives us a way to tell Perl where to look without our having to specify it in each script:

    [jandrew]$ export PERL5LIB=/home/jandrew/perl5lib
    [jandrew]$ perl -le 'print join "\n", @INC'
    /home/jandrew/perl5lib
    /usr/local/lib/perl5/5.6.0/i586-linux
    /usr/local/lib/perl5/5.6.0
    /usr/local/lib/perl5/site_perl/5.6.0/i586-linux
    /usr/local/lib/perl5/site_perl/5.6.0
    /usr/local/lib/perl5/site_perl/5.005/i586-linux
    /usr/local/lib/perl5/site_perl/5.005
    /usr/local/lib/perl5/site_perl
    .

Here we see that once the environment variable is set, that directory is then automatically prepended to the @INC array. Now we can decide to change our private directory and move all of our modules and we only have to change this environment variable to point to the new location rather than each of our scripts. Next week we will look at packaging up our module so we can conveniently distribute it to others.
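Both approaches can be checked from a single script. The sketch below assumes a Unix-like system and uses /home/jandrew/perl5lib purely as a hypothetical path; the directory does not have to exist for either mechanism to add it to @INC. The second half starts a child perl (via $^X, the path of the running interpreter) with PERL5LIB set, because the variable is only read when perl starts up.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical private directory; 'use lib' acts at compile time.
use lib '/home/jandrew/perl5lib';

my $private = '/home/jandrew/perl5lib';
my $in_this_script = grep { $_ eq $private } @INC;
print "This script sees $private: ", ($in_this_script ? "yes" : "no"), "\n";
print "First \@INC entry here: $INC[0]\n";

# Run a child perl with PERL5LIB set and capture the @INC it reports.
my @child_inc = do {
    local $ENV{PERL5LIB} = $private;
    open my $fh, '-|', $^X, '-le', 'print for @INC'
        or die "Cannot run $^X: $!";
    my @lines = <$fh>;
    close $fh;
    chomp @lines;
    @lines;
};

my $in_child = grep { $_ eq $private } @child_inc;
print "Child perl sees $private: ", ($in_child ? "yes" : "no"), "\n";
```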

*****

Building a Simple Module by Hand, Part 1

Andrew L. Johnson (First published by ItWorld.com 2001-03-15)

There isn’t really a whole lot involved when building a module, just a few basic steps and a rule about naming your module. For this example we will build a module named Cool.pm which will contain one function called cool().

The basic outline of our module is as follows:

    package Cool;         # your package/module name

    require Exporter;     # setting to use Exporter's import()
    @ISA = qw(Exporter);  # routine

    @EXPORT = qw();       # Defining how and what to export
    @EXPORT_OK = qw(cool);

                          # your code here
    sub cool {
        print "Hey, cool!\n";
    }

    1;                    # end with a true statement
    __END__
                          # your POD documentation

    =head1 NAME

    Cool -- a useless example module

    =head1 SYNOPSIS

        use Cool qw(cool);
        cool();

    =head1 DESCRIPTION

    This example module merely illustrates the basic components
    of a module. If it were a real module we would document it
    properly.

So we have six basic things to do in building a module: 1) declare the package; 2) set up exportation; 3) define what to export, and how; 4) write our actual code; 5) end with a true statement; 6) include our POD documentation for the module. Pretty simple, right? Let’s look at each of the steps in turn.

Step one is declaring our package name. The rule is, our package name will be the same (case sensitive) as the filename we store this in (except that we put a .pm extension on the file). So our module will exist in a file called ‘Cool.pm’. If we build a nested package like Cool::Kewler, we would declare that package name and store it in a file named Kewler.pm under a directory in our @INC path named Cool, so it would be in a file named ‘Cool/Kewler.pm’ (we’ll talk more about where to put module files next week).

Step two is pulling in Perl’s Exporter module via the require() statement, and including it in our @ISA array. What this does is give us a default import() routine (we inherit it from Exporter) so we do not have to build our own (a subject too deep for the present article).

Step three defines what we want to export and how to export it. The @EXPORT array holds whatever functions and variables we want to automatically export into the calling script. The @EXPORT_OK array holds whatever we want to export on demand. What does that mean? Well, if the calling script starts off like:
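The naming rule is mechanical enough to write down as code. Here is a small sketch of the mapping from a package name to the relative path perl looks for under each @INC directory; module_path() is a hypothetical helper written for this illustration, not something perl provides.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Turn a package name into the relative file path perl searches for.
# module_path() is a hypothetical helper for illustration only.
sub module_path {
    my ($package) = @_;
    (my $path = $package) =~ s{::}{/}g;   # each :: becomes a directory level
    return "$path.pm";
}

print module_path('Cool'), "\n";           # prints: Cool.pm
print module_path('Cool::Kewler'), "\n";   # prints: Cool/Kewler.pm
```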

    #!/usr/bin/perl -w
    use strict;
    use Cool;

It will only receive the functions specified in the @EXPORT array. If the caller does this:

    #!/usr/bin/perl -w
    use strict;
    use Cool qw(cool);

It is asking to import the cool() function from the @EXPORT_OK array. Our module is set up to use the @EXPORT_OK array because that is generally nicer: the caller decides what they want to import and doesn’t have to worry about a whole bunch of functions being imported by default.
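To see the difference between the two forms side by side, the following sketch inlines a copy of the module and imports it into two separate packages. The package names Plain and Demand are made up for the example, and calling Cool->import() inside a BEGIN block is only a one-file stand-in for the corresponding ‘use’ line.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Inline stand-in for Cool.pm so the comparison fits in one file.
BEGIN {
    package Cool;
    require Exporter;
    our @ISA       = qw(Exporter);
    our @EXPORT    = qw();
    our @EXPORT_OK = qw(cool);
    sub cool { print "Hey, cool!\n" }
}

# Acts like a script that says:  use Cool;
package Plain;
BEGIN { Cool->import() }

# Acts like a script that says:  use Cool qw(cool);
package Demand;
BEGIN { Cool->import('cool') }

package main;

my $plain_has  = defined &Plain::cool  ? "yes" : "no";
my $demand_has = defined &Demand::cool ? "yes" : "no";
print "Plain imported cool():  $plain_has\n";    # prints: Plain imported cool():  no
print "Demand imported cool(): $demand_has\n";   # prints: Demand imported cool(): yes
```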

Step four defines our actual code — in this case a single function named cool() that merely prints a string.

Step five is very simple, but very important — the last statement evaluated by a module must return a true value as an indicator that all has gone well. Using just a ‘1;’ is the standard way of doing this.

Step six is also very important — if you are going to build a module, you should document it with POD so users of it can find out what it does and how to use it. The perlpod manpage (perldoc perlpod) explains the POD markup language we use for documenting modules and scripts.

OK, if you save the above module in a file named ‘Cool.pm’, you can test it out using the following script (in the same directory):

    #!/usr/bin/perl -w
    use strict;
    use Cool qw(cool);
    cool();
    __END__

And, you may also type ‘perldoc Cool’ to bring up the module’s POD (again, in the same directory).

*****

More on Context: Boolean and Operator Contexts

Andrew L. Johnson (First published by ItWorld.com 2001-03-08)

Perl has a special kind of scalar context called boolean context. This context occurs inside conditional expressions (if and while conditions, for example) and applies to the operands of logical operators. In such a context, an expression is evaluated in scalar context and then a truth value (true or false) is determined from the result of that expression.

So, for example, if we want to exit the program with a usage message if no command line arguments were given:

    die usage("No arguments given") unless @ARGV;

(of course, I am assuming a function named ‘usage’ exists and produces a relevant message). This tests the @ARGV array in a boolean scalar context, and recall that when an array is evaluated in a scalar context it returns its size. So, if @ARGV is empty, the result is 0 which is false and so we die() with an appropriate message. Another example might be to test whether an array (or any list) contains more than a certain number of entries that meet some condition. We can use grep() to do the counting:

    my @list = (12, 14, 101,102,103, 104, 4);
    if( (grep{$_ > 100} @list) > 3) {
        print "More than 3 items greater than 100\n";
    }

Normally, grep() returns the list of successful elements, but in scalar context it returns the number of successful elements. The greater than operator places the grep() into scalar context, and that result is compared to 3. Logical operators also provide a scalar boolean context to their operands:

    @ARGV || die "No arguments";

Here the @ARGV array is again evaluated in scalar boolean context, so if it contains anything the right hand side of the OR is never evaluated (logical short circuit evaluation); otherwise the left hand side is false and the die() is then called.
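The examples above can be pulled together into one short script that shows the same expression yielding different things in list, scalar, and boolean context. This is only a sketch using the sample list from earlier; the variable names are my own.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my @list = (12, 14, 101, 102, 103, 104, 4);

# List context: grep returns the matching elements themselves.
my @big = grep { $_ > 100 } @list;
print "Elements: @big\n";          # prints: Elements: 101 102 103 104

# Scalar context: the same grep returns only how many matched.
my $count = grep { $_ > 100 } @list;
print "Count: $count\n";           # prints: Count: 4

# Boolean context: an empty array is false, a non-empty one is true.
my @empty = ();
print "list is ",  (@list  ? "true" : "false"), "\n";   # prints: list is true
print "empty is ", (@empty ? "true" : "false"), "\n";   # prints: empty is false

# Short circuit: the right hand side is used only when the left is false.
my $size = @empty || "no elements";
print "Size: $size\n";             # prints: Size: no elements
```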

Another, very different, kind of context sometimes causes people problems — that is, operator context. Perl variables are not typed according to what kind of data they hold, but operators are associated with particular types of data (numerical or string operators) and Perl will convert data to the appropriate type for the given operator:

    if(21 <  200) {print "True\n"}  # true
    if(21 lt 200) {print "True\n"}  # false

In the second example above the string operator ‘lt’ (less than) is used, so Perl converts 21 to "21" and 200 to "200" and compares them according to the ASCII collating sequence (asciibetical order). The string "21" is greater than the string "200" because strings are compared character by character, and "1" sorts after "0" in the second position. Such conversion works the other way around as well:

    if( "14" gt "9"){print "True\n"} # false
    if( 14  >  9)   {print "True\n"} # true

As a final warning, converting a number to a string is simple: the string is just the character representation of the number. However, converting strings to numbers isn’t always as clear. For example, how should the string "xyz" be converted? The rule is: if a string begins with something that looks like a number (ignoring any leading whitespace) then convert to the equivalent number, ignoring any non-numeric characters that might follow the number. Otherwise, the numeric value of a string is 0:

    my $a = 14;   # $a is: 14
    $a .= 5;      # $a is: "145"

    my $b = "Hello World";
    $a .= $b;              # $a is: "145Hello World"
    $a += 5;               # $a is: 150
    $a += $b;              # $a is: 150

Here we first set $a to the number 14, then we concatenated it (a string operation) with 5 (so Perl converts the 14 and the 5 to strings and concatenates them into the string "145"). We set $b to just a string and then concatenate it onto $a, giving us a string in $a that starts with a number. When we add 5 to $a, Perl converts the string "145Hello World" into the number 145 and then adds 5 to it. Lastly, when we add the string in $b to $a, Perl treats $b as a number and it evaluates to 0, so $a is left as 150.

If you run that whole last example with warnings turned on (you do run with the -w switch, right?) you’ll notice two warnings about using a string in a numeric context. It is easy to mistakenly use a string comparison operator instead of a numeric one (or vice versa), and using warnings will often tell you when you’ve made such a questionable comparison.
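The conversion rule is easy to try out on a handful of strings. In this sketch, to_number() is a hypothetical helper that forces numeric context by adding 0; the numeric warnings are switched off inside it only so the output stays uncluttered.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Force a string into numeric context. to_number() is a hypothetical
# helper for this demo; the warnings are silenced here on purpose.
sub to_number {
    my ($string) = @_;
    no warnings 'numeric';
    return $string + 0;
}

for my $string ("42", "  12 apples", "3.14abc", "1e3", "xyz", "") {
    my $number = to_number($string);
    print qq{"$string" => $number\n};
}
```

The leading numeric part is kept in each case (42, 12, 3.14 and 1000), while "xyz" and the empty string both come out as 0.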

*****

Context is Everything

Andrew L. Johnson (First published by ItWorld.com 2001-03-01)

There are two main contexts that you really must be aware of when writing Perl programs: Scalar context and List context. Things can behave differently when evaluated in one or the other of these contexts.

Probably the best-known example is assigning an array:

    my @list   = @array; # @array is in list context
    my $scalar = @array; # @array is in scalar context

In the first case above, @array is being assigned to another array, providing list context, so it returns the list of its contents. In the second example, @array is being assigned to a scalar variable thus putting it in scalar context — an array in scalar context returns the size of the array. Now, one twist on the second example above is:

    my ($scalar) = @array; # @array is in list context;

Here the parentheses around the scalar mean that the left hand side of the assignment is a list of scalars (in this case, a list of length one), thus the @array is in list context. Now @array returns the list of its contents and the first element is assigned to $scalar (the rest of the list returned by the array is ignored).
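Here are the three assignments run against a concrete array, as a small sketch with made-up values, so the results can be compared directly:

```perl
#!/usr/bin/perl
use strict;
use warnings;

my @array = (10, 20, 30);

my @list    = @array;   # list context: copies every element
my $scalar  = @array;   # scalar context: the number of elements
my ($first) = @array;   # list context: takes the first element only

print "List:   @list\n";    # prints: List:   10 20 30
print "Scalar: $scalar\n";  # prints: Scalar: 3
print "First:  $first\n";   # prints: First:  10
```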

Another example is using the keys() function of a hash — in list context it returns the list of keys in the given hash, but in scalar context it returns the number of keys in the hash. A hash itself in list context returns the full list of key value pairs (in the same sequence as the keys() function returns the keys), but in scalar context it does something rather different — it returns information about the underlying hash structure:

    my %hash = (one => 1, two => 2);
    print %hash, "\n";        # prints: one1two2
    print scalar %hash, "\n"; # prints: 2/8

The print() function provides a list context (the function expects to receive a list of arguments). The scalar() function can be used to explicitly evaluate an expression in a scalar context (as we do in the last example above). That final version is telling us that our hash currently has 8 buckets allocated and that two are being used. (Since Perl 5.26, a hash in scalar context instead returns the number of keys it holds, so newer perls print 2 here.)
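The keys() behavior described above is easy to confirm. In this sketch the keys are sorted only to make the printed order predictable, since a hash returns them in no particular order.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my %hash = (one => 1, two => 2, three => 3);

my @names = sort keys %hash;   # list context: the keys themselves
my $count = keys %hash;        # scalar context: how many keys there are
my @pairs = %hash;             # list context: keys and values interleaved

print "Keys:  @names\n";                  # prints: Keys:  one three two
print "Count: $count\n";                  # prints: Count: 3
print "Pairs: ", scalar(@pairs), "\n";    # prints: Pairs: 6
```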

Now, we’ve already seen that functions can return different things depending on context with the keys() function. Another example is the localtime() function (see: perldoc -f localtime). Operators can also act context dependently, as with the match operator m// (and that behavior is also modified by the /g modifier). The x operator (string replication) works differently if its left hand side is a scalar or a list.
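A short sketch of each of the three cases just mentioned follows. The localtime() values depend on when and where the script runs, so no output is shown for the formatted string; the sample text and variable names are assumptions of mine.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# localtime(): nine fields in list context, one formatted string in
# scalar context.
my @fields = localtime();
my $stamp  = localtime();
print "Fields: ", scalar(@fields), "\n";   # prints: Fields: 9
print "Stamp:  $stamp\n";

# m//: with /g in list context it returns every match; in scalar
# context it just reports success or failure.
my $text    = "a1b22c333";
my @numbers = $text =~ /(\d+)/g;
my $found   = $text =~ /\d+/;
print "Numbers: @numbers\n";               # prints: Numbers: 1 22 333
print "Found:   $found\n";                 # prints: Found:   1

# x: repeats a string when the left side is a scalar, and repeats a
# list when the left side is a parenthesized list in list context.
my $line  = "ab" x 3;
my @pairs = (1, 2) x 3;
print "Line:  $line\n";                    # prints: Line:  ababab
print "Pairs: @pairs\n";                   # prints: Pairs: 1 2 1 2 1 2
```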

Functions themselves may also supply context to their arguments. Take the join() function for example — the first thing it expects as an argument is a scalar value holding the string to be used as the join separator. If you thought you could just put all the arguments to join into a single array and pass that you wouldn’t get the results you wanted:

    my @args = (":", 1,2,3);
    my $str = join @args;
    print $str;              # $str is empty

Here the join() function expected a scalar as the first argument and so evaluated the @args array in a scalar context to get the separator (in this case the size of the array, which is 4), and there weren’t any following arguments to join together.
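One way around this, sketched below, is to split the separator off from the rest of the list before calling join(). The last two lines show the scalar-context effect itself: with extra arguments supplied, the array's size (4) ends up being used as the separator.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my @args = (":", 1, 2, 3);

# Peel the separator off the front, then join what is left.
my ($separator, @items) = @args;
my $str = join $separator, @items;
print "$str\n";      # prints: 1:2:3

# The array in first position is evaluated in scalar context, so its
# size becomes the separator.
my $odd = join @args, "a", "b";
print "$odd\n";      # prints: a4b
```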

If a function or operator behaves differently depending on scalar or list context it is documented in the relevant documentation pages (see the perlfunc and perlop manpages).

*****