My first Object Oriented Perl Script using Moose
Blogpost on http://www.geschka.com
Tags: Perl, OOP, Moose, Software Tests, Test::More, Command line Parsing, Options, Getopt::Long
Daniel Geschka, 2013/01/13

Contents

1 Motivation
2 Requirements to ffind.pl
  2.1 Support for Windows-/DOS-like Wildcards
  2.2 Supported Actions
  2.3 Easy Extensibility
3 Basic Design
  3.1 UML Class Diagram
    3.1.1 recursiveFileWalk() and the FileWalker-“interface”
    3.1.2 The FWA (File Walk Action)-Objects
  3.2 UML Activity Diagram of the main script
4 Used OOP-features of Moose
  4.1 Building a simple class
  4.2 Defining an object (reference) as a member
  4.3 Inheriting from a Base Class and Implementing interfaces
  4.4 Further Reading
5 Commandline parsing with Getopt::Long
  5.1 Setting up Getopt::Long
  5.2 Accessing a simple switch
  5.3 Accessing a (string) value
  5.4 Splitting a string value by a divider character
  5.5 Alternate (unique) names
  5.6 Printing on screen the Usage-message
  5.7 Further reading
6 Testing modules with Test::More
7 Prerequisites for using ffind.pl on Windows
8 Usage of ffind.pl
  8.1 The command line interface
  8.2 Examples
    8.2.1 Matching and default arguments
    8.2.2 Actions
9 Conclusion

1 Motivation

As my earlier posts show, over the last couple of months I have come into contact with Perl as a scripting language. I used it as a substitute for the old DOS/Windows batch processor for general scripting tasks, and in particular to create an automatic, fully customizable build system for a larger Visual C++ project with a bunch of targets and several resulting distribution packages (installers and zip archives).

For practice I invested some time in investigating the OOP features of Perl. On the internet many people share the opinion that Perl’s default OOP support works well but requires a lot of tedious work from the programmer to set up a class. They recommend giving Moose a try, a relatively new Perl module. It promises to make OOP in Perl easy and is already deployed in production by a number of companies.

To learn Moose by doing, I decided to write a little script which mimics a small subset of the features of the UNIX console program “find” as I know it from my earlier days. On Windows (and of course on other OSes where Perl is available) it should let me do a recursive file search using include and exclude filters and apply actions to the result set, such as “print”, “put into an archive” or “clean” (delete) the matching files.

2 Requirements to ffind.pl

2.1 Support for Windows-/DOS-like Wildcards

One goal was that using wildcards to restrict the results to certain files should be easy and familiar to most users. The script should therefore support the Windows-/DOS-like wildcards: the asterisk (*), matching any substring, and the question mark (?), matching a single arbitrary character.

2.2 Supported Actions

As known from the Unix find program, different actions should be available for the resulting files, like:

• Print results: Print the paths of the matching items on the console.
• Create a zip archive: Create a zip archive containing all matching files. Either the full target path is given (for use in scripts), or a target directory is given in which the archive is created with an automatically generated file name based on the current date and time, for fast and easy snapshot backups.
• Clean / delete the matched items: Remove the matching items from the disk. An optional parameter should suppress the delete confirmation for use in scripts.
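The automatically generated archive name could be built roughly as follows. This is only a sketch: the helper name snapshotArchivePath is made up for illustration and is not taken from ffind.pl, and the YYYY-MM-DD-hh-mm-ss.zip format is the one the usage message in section 8.1 describes.

```perl
use strict;
use warnings;
use POSIX qw(strftime);
use File::Spec;

# Hypothetical helper (not from ffind.pl): build
# "<targetDir>/YYYY-MM-DD-hh-mm-ss.zip" for the given point in time.
sub snapshotArchivePath {
    my ($targetDir, $epoch) = @_;
    $epoch = time unless defined $epoch;    # default: now
    my $name = strftime("%Y-%m-%d-%H-%M-%S", localtime($epoch)) . ".zip";
    return File::Spec->catfile($targetDir, $name);
}

print snapshotArchivePath("backups"), "\n";
```

Passing the time as an optional argument keeps the helper easy to test, because a fixed time always produces the same name.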

2.3 Easy Extensibility

From a software design point of view it should be relatively easy to extend the script and add new features, especially new actions. That’s where the OOP-thingy comes into play.


3 Basic Design

The basic design is illustrated in the following two UML diagrams:

3.1 UML Class Diagram

3.1.1 recursiveFileWalk() and the FileWalker-“interface”

The main function, which walks down the directory tree, is recursiveFileWalk(). It expects the directory to walk and a reference to an already set up FileWalker object. recursiveFileWalk() knows nothing about the object except that it has the functions recursiveFileWalk() needs to call when it meets a file or a directory inside the given directory:

• OnDir(path: string, name: string): bool: Passes the full path and the name of the directory to the object to let it do things with it, and evaluates the Boolean result to find out whether the directory should be followed or not.
• OnFile(path: string, dir: string, fileName: string): Just passes the full path, the directory and the name of the file to the object to let it do things with it.
• OnWalkFinished(): Notifies the object that the (recursive) file walk has ended.

Because recursiveFileWalk() only knows the FileWalker interface of the given object, it is relatively easy to add new actions to the script or change existing ones: no changes to this function are required in those cases, so the function is closed to changes. The interface FileWalker is nowhere explicitly defined. It is implicitly defined through the signatures of the calls recursiveFileWalk() makes on the reference. If the given object does not implement one of these functions, a runtime error occurs. This behavior is known as “duck typing”: something that supports quack() is a duck, no additional declaration required.
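The idea can be sketched in a few lines of plain Perl. This is not the code of ffind.pl: the helper walkDir, the class CountingWalker and the temporary test tree are assumptions made for illustration, and Moose is left out so that the sketch runs with core modules only.

```perl
use strict;
use warnings;
use File::Spec;
use File::Temp qw(tempdir);

# Walks $dir recursively and reports every entry to $walker. The only thing
# assumed about $walker is that it answers OnDir, OnFile and OnWalkFinished.
sub recursiveFileWalk {
    my ($dir, $walker) = @_;
    walkDir($dir, $walker);
    $walker->OnWalkFinished();
    return;
}

# Hypothetical helper doing the actual recursion.
sub walkDir {
    my ($dir, $walker) = @_;
    opendir(my $dh, $dir) or die "Cannot open '$dir': $!";
    my @names = sort grep { $_ ne '.' && $_ ne '..' } readdir($dh);
    closedir($dh);
    for my $name (@names) {
        my $path = File::Spec->catfile($dir, $name);
        if (-d $path && !(-l $path)) {
            # A false return value from OnDir prunes this subtree.
            walkDir($path, $walker) if $walker->OnDir($path, $name);
        }
        else {
            $walker->OnFile($path, $dir, $name);
        }
    }
    return;
}

# Hypothetical action: counts what it sees. It declares no interface,
# it simply has the three methods.
package CountingWalker {
    sub new    { my $class = shift; return bless { files => 0, dirs => 0 }, $class; }
    sub OnDir  { my $self = shift; $self->{dirs}++;  return 1; }
    sub OnFile { my $self = shift; $self->{files}++; return; }
    sub OnWalkFinished {
        my $self = shift;
        print "$self->{files} file(s), $self->{dirs} dir(s)\n";
        return;
    }
}

# Demo on a small temporary tree: a.txt and sub/b.txt.
my $root = tempdir(CLEANUP => 1);
mkdir File::Spec->catdir($root, "sub") or die "Cannot create directory: $!";
for my $file (File::Spec->catfile($root, "a.txt"),
              File::Spec->catfile($root, "sub", "b.txt")) {
    open(my $fh, '>', $file) or die "Cannot create '$file': $!";
    close($fh);
}
recursiveFileWalk($root, CountingWalker->new);   # prints "2 file(s), 1 dir(s)"
```

Handing the function an object that lacks one of the three methods makes Perl die at the first such call with a "Can't locate object method" error, which is the runtime error mentioned above.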

3.1.2 The FWA (File Walk Action)-Objects

Each File-Walk-Action-Object implements one of the possible actions (print, zip, clean, future actions …). As already mentioned, each FWA-Object implicitly implements the duck-typed FileWalker-“interface”. After being set up, the objects do their work only when their FileWalker-“interface” functions are called by recursiveFileWalk() on its way down the directory tree. Each FWA-Object inherits from the base class FWABase via the Moose keyword “extends”. FWABase contains the common data each FWA-Object needs: configuration flags, pattern matching objects, and functions to find out whether a file system item matches or whether a directory should be followed.

3.2 UML Activity Diagram of the main script

In the first phase of the script the command line arguments and options are evaluated using Perl’s Getopt::Long module. This produces a lot of configuration data, which is used in the second phase of the script to determine the File Walk Object to use and to configure its special parameters. Afterwards the parameters common to all FWA-objects are applied to it. Finally recursiveFileWalk() is called with the fully set up File Walk Object and the directory to scan as parameters.

4 Used OOP-features of Moose

4.1 Building a simple class

The following code snippet shows a real-world Moose class used in the script. It encapsulates a Windows-like wildcard pattern match, which it implements with Perl’s regular expressions. Read the comments to find out how Moose is used to build this simple class. It has only one string member (the pattern) and four functions: two declared ones and two that Moose generates automatically for you:

    # Means: A Moose class.
    # Is: A base class for our file glob-style pattern match:
    # Pattern can contain special wild cards '?' and '*' as
    # known from windows and perls glob. As it is intended
    # for windows matches are case insensitive.
    package FileGlobMatch {
        use Moose; # Means: This package/class uses Moose :-)

        # Means: This is a string member named pattern. Its value can be
        # read and written. Moose generates automatically a setter
        # (example call: $obj->pattern(".+?\.txt")) and a getter
        # (example call: my $patternCopy = $obj->pattern()).
        has 'pattern', is => 'rw', isa => 'Str';

        # Means: This is a method which gets a string, does some preparation
        # treatment on it and then assigns it to the member 'pattern'
        # using the Moose-generated setter mentioned above. To tell it
        # apart from the setter, the first letter 'P' is capitalized.
        # Does: Replace '?' and '*' with the corresponding regex and set
        # the pattern member:
        sub Pattern {
            my $self = shift;    # First argument is always a reference to
                                 # the instance of the object => hence the
                                 # name 'self'.
            my $pattern = shift; # The first 'real' argument.
            $pattern =~ s/\./\\\./g;    # Quote '.'-chars to '\.'
            $pattern =~ s/\?/\.\{1\}/g; # ? => .{1} (exactly one arbitrary character).
            $pattern =~ s/\*/\.\*\?/g;  # * => .*? (any character 0 or more times,
                                        # non-greedy)
            $pattern = "\\A".$pattern."\\z"; # Add anchors for string-begin and
                                             # string-end
            $self->pattern($pattern);        # Set the pattern
        }

        # The Match method. Gets as argument the string to match against the
        # 'pattern'-member and returns true on match, else false.
        sub Match {
            (my $self, my $subject) = (@_);
            # Doing the match with a direct call to $self->pattern() inside
            # the regex does not work (seen with sb-perl v5.16.1
            # MSWin32-x64-multi-thread): Perl interpolates variables into a
            # regex, but not method calls. Hence the copy into a scalar.
            my $pattern_healer = $self->pattern();
            if ($subject =~ m/$pattern_healer/i){
                return 1;
            } else {
                return 0;
            }
        }
    }
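The Pattern method above escapes only the '.' character, so a wildcard containing other regex metacharacters such as '+' or '(' would still be read as regex syntax. A possible variation, sketched here as a plain function rather than a Moose method, escapes everything with Perl's built-in quotemeta first and then turns the two wildcards back into regex constructs. The name globToRegex is made up for this sketch and is not part of ffind.pl.

```perl
use strict;
use warnings;

# Hypothetical helper (not from ffind.pl): translate a DOS-style wildcard
# into an anchored, case-insensitive regex.
sub globToRegex {
    my ($glob) = @_;
    my $re = quotemeta($glob);   # escape every non-word character
    $re =~ s/\\\?/./g;           # escaped '?' => exactly one character
    $re =~ s/\\\*/.*?/g;         # escaped '*' => any run of characters
    return qr/\A$re\z/i;
}

my $re = globToRegex("????-??-??-hello word.txt");
print "match\n"    if "2012-01-01-hello word.txt"  =~ $re;   # prints "match"
print "no match\n" if "2012-01-01-hello word.txt_" !~ $re;   # prints "no match"
```

Returning a compiled qr// object also sidesteps the interpolation problem described in the Match method, because the caller can match against the returned value directly.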

4.2 Defining an object (reference) as a member

The following snippet gives a class an object reference as a member. The member is allocated and initialized on the fly by an anonymous sub given as default value, which itself calls the constructor of the target class:

    # class FWABase: FWA = File Walk Action.
    package FWABase {
        use Moose;
        …
        # Means: Name is globPatternsDirsFollowNo. Is read / write. Is a ref
        # to an instance of the Moose-class FileGlobMatchList. At construction
        # time an instance of the class FileGlobMatchList is immediately
        # created, initializing the emptyListDefaultMatchReturnValue member
        # with zero. It’s not required that the ref is assigned. In this case
        # it’s nevertheless always created, as mentioned just before.
        has 'globPatternsDirsFollowNo' => (
            is => 'rw',
            isa => 'FileGlobMatchList',
            default => sub {
                FileGlobMatchList->new('emptyListDefaultMatchReturnValue' => 0);
            },
            required => 0
        );

4.3 Inheriting from a Base Class and Implementing interfaces

The following snippet demonstrates Moose inheritance and the implementation of a duck-typed interface:

    # Is: The class representing the File Walk Action 'Print'.
    # It inherits, as all FWA-Objects do, from FWABase.
    # It implements for the recursiveFileWalk-function the
    # implicit duck typing FileWalker-“interface”:
    # + OnDir(path: string, name: string): bool
    # + OnFile(path: string, dir: string, fileName: string): void
    # + OnWalkFinished(): void
    package FWAPrint {
        use Moose;         # Uses Moose OOP-features.
        extends 'FWABase'; # inherits from ...

        sub OnDir { # Impl. 'FileWalker' interface's OnDir
            (my $self, my $path, my $name) = (@_);
            if(!$self->ShouldDirBeFollowed($name)){
                return 0; # Don't go further in that directory.
            }
            if($self->IsItemInActionFilter($name,$path,1)){
                print "$path [d]\n";
            }
            return 1; # Go further in that directory.
        }

        sub OnFile { # Impl. 'FileWalker' interface's OnFile
            (my $self, my $path, my $dir, my $name) = (@_);
            if($self->IsItemInActionFilter($name,$path,0)){
                print "$path [f]\n";
            }
        }

        sub OnWalkFinished { # Impl. 'FileWalker' interface's OnWalkFinished
            # Does nothing special if file walk is finished.
        }
    }

4.4 Further Reading

The book Modern Perl contains a good introduction to Moose in chapter 7, Objects. The book’s URL: http://onyxneon.com/books/modern_perl. You can also read or download the free version on the internet: HTML: http://modernperlbooks.com/books/modern_perl/chapter_00.html, PDF: http://onyxneon.com/books/modern_perl/modern_perl_a4.pdf.


5 Commandline parsing with Getopt::Long

Supplying your script with a standards-conforming and convenient command line interface is easy in Perl once you know how to do it. There is no need to parse the arguments yourself: Getopt::Long is a powerful, easy to use and mature Perl module which does the trick.

5.1 Setting up Getopt::Long

First you have to import the module at the top of your script. Here you can additionally pass config switches to the module. We don’t want to use the auto abbreviation feature, as we handle the option names, synonyms and abbreviations completely ourselves:

    ...
    use Getopt::Long qw(:config no_auto_abbrev);
    ...

Second, inside your script you declare the Perl hash in which Getopt::Long should store the parsed values. Third, you define your command line options, giving a long name, synonyms, abbreviations and optionally some additional type info:

    # Somewhere in the main section of the script define, parse and get the
    # command line options. Getopt::Long also handles unnamed arguments:
    my %options = ();           # Hash getting the values depending on the type
    my $result = GetOptions(    # Returns false on error (not if an option isn't set ;-)
        \%options,              # A ref to the hash where the values should be stored.
        "print",                # Define option 'print' as a simple switch (set or not set)
        "clean",
        "fclean",
        "zip=s",                # '=s': Should be treated as string
        "name-excl|ne=s",       # '|ne': Alternate short name. '=s': See line above.
        "name-incl|ni=s",
        "path-excl|pe=s",
        "path-incl|pi=s",
        "follow-incl|fi=s",
        "follow-excl|fe=s",
        "no-dirs|nd",
        "no-files|nf",
        "seperator|s=s",
        "usage|h|help|?|version|v", # All synonyms for the first name 'usage'
        "debug"
    );

5.2 Accessing a simple switch

You can now test against the defined hash whether a simple switch is set, using the unique name of the option. Here: test if the debug flag is set:

    if($options{"debug"}){
        ... # set
    } else {
        ... # not set
    }

5.3 Accessing a (string) value

Example for accessing a value, here a string. Note that the hash key must be spelled exactly like the option name in the GetOptions call, which is “seperator”:

    if($options{"seperator"}){
        print "Seperator is: ".$options{"seperator"}."\n";
    }

See Getopt::Long’s manual for the other supported types and their abbreviation strings for indicating the type inside the GetOptions function.

5.4 Splitting a string value by a divider character

This example shows how to split an option’s string value into multiple substrings using a separator character, which is a very common task in command line option handling:

    # Option 'name-incl' should support something like this:
    # "*.jpg,*.png,*.gif" => "*.jpg" + "*.png" + "*.gif"
    if($options{"name-incl"}){       # Test if option "name-incl" is set
        my @files_array =            # List/array getting the substrings
            split(                   # The split-function
                ',',                 # The split character
                $options{"name-incl"} # The string which should be split
            );
        foreach (@files_array){      # Iterate over the found substrings
            $fileWalker->globPatternsNamesIncl->AddPattern($_); # Do sth. w. the substring
        }
    }

5.5 Alternate (unique) names

You can define alternate names for an option. Nevertheless you only have to test for the option, and can only access it, under its one ‘main’ name:

    my %options = ();
    my $result = GetOptions(
        \%options,
        ...
        "usage|h|help|?|version|v", # All synonyms for the first name 'usage'
        ...
    );
    ...
    # Does the user want the usage? The user can give one of these ...
    # -usage -h -help -? -version -v
    # ... for the following option switch to be set (=true):
    if($options{"usage"}){
        printUsage();
        exit 0; # windows return code, no perl bool
    }
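The pieces of sections 5.1 to 5.5 can be tried out together in one small, self-contained script. The following is only a sketch: parseArgs is a hypothetical helper, the option list is shortened, and GetOptionsFromArray (a function Getopt::Long provides next to GetOptions) is used so that the parser can be fed a test argument list instead of @ARGV.

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray :config no_auto_abbrev);

# Hypothetical helper (not from ffind.pl): parse an argument list and return
# the options hash, the split wildcard list and the remaining arguments.
sub parseArgs {
    my @args = @_;
    my %options;
    GetOptionsFromArray(
        \@args,
        \%options,
        "debug",            # simple switch
        "name-incl|ni=s",   # string value with a short alias
        "seperator|s=s",    # spelled as in the option list of section 5.1
        "usage|h|help|?",   # synonyms, all stored under 'usage'
    ) or die "Invalid command line\n";

    # Split the wildcard list by the configured separator (default: comma).
    my $sep = defined $options{"seperator"} ? $options{"seperator"} : ",";
    my @patterns = defined $options{"name-incl"}
        ? split(/\Q$sep\E/, $options{"name-incl"})
        : ();
    return (\%options, \@patterns, \@args);
}

my ($options, $patterns, $rest) =
    parseArgs("C:\\Temp", "-ni", "*.jpg;*.png", "-s", ";", "-debug");
print "debug is set\n" if $options->{"debug"};
print "pattern: $_\n" for @$patterns;
print "directory: $rest->[0]\n";
```

Unnamed arguments such as the directory stay behind in the argument array after parsing, which is how a script can pick up its [dir] argument.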

5.6 Printing on screen the Usage-message

A nice feature of Perl is the possibility to print out multiline preformatted text using a text label (a here-document). This is very handy for giving the user a nicely readable usage message on the command line. Without that feature you would end up making a lot of print calls with tedious formatting and indentation counting. With that feature it’s really a ‘what you see is what you get’ approach. In the example the name of the text label is ‘EndOfText’, which mustn’t be present in the text itself. I also noted that the @ character, which Perl reserves for arrays, must be escaped when used in the text, e.g. in an email address:

    # sub print usage
    sub printUsage{
        # --- MULTILINE TEXT SECTION: BEGIN ---
        print …

[The rest of this listing, section 5.7 “Further reading” and the beginning of section 6 are missing from the source.]

6 Testing modules with Test::More

    …
        # Set here the test pattern. Is nothing from Test::More ;-)
        $obj->Pattern("????-??-??-hello word.txt");
        # Do a simple test which should pass. If not you get an error.
        ok ($obj->Match("2012-01-01-hello word.txt"));
        # This match should fail (return 0). If it does not you get an error.
        is ($obj->Match("2012-01-01-hello word.txt_"), 0, "should fail");
        # Define more tests here ...
        ...
    }

    # Call the test function
    test();
    done_testing; # Notify Test::More that the test for this test-module is finished.
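A complete test file in this style could look as follows. This is a sketch, not the original test: the function hasExtension is a made-up stand-in for the class under test, so that the file runs without Moose and without the modules of ffind.pl.

```perl
use strict;
use warnings;
use Test::More;

# Hypothetical stand-in for the code under test: returns 1 if $name ends
# with the given extension (case-insensitive), else 0.
sub hasExtension {
    my ($name, $ext) = @_;
    return $name =~ /\.\Q$ext\E\z/i ? 1 : 0;
}

# ok() passes if its first argument is true.
ok(hasExtension("notes.TXT", "txt"), "extension match ignores case");

# is() passes if the first two arguments are equal.
is(hasExtension("notes.txt_", "txt"), 0, "trailing characters prevent a match");
is(hasExtension("notes.txt", "zip"), 0, "a different extension does not match");

done_testing;   # tells Test::More that no more tests will follow
```

Such a file can be run directly with perl or through the prove tool that ships with Perl; each test reports one "ok" or "not ok" line.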


7 Prerequisites for using ffind.pl on Windows

To use ffind.pl on Windows you need an installed Perl distribution and the module Moose. The latest ActiveState Perl and Strawberry Perl distributions for Windows both come with Moose already included (as of February 2013). You can download the script from http://www.geschka.com: enter ‘ffind.pl’ in the search field of the homepage.

8 Usage of ffind.pl

8.1 The command line interface

If you call …

    perl ffind.pl -help

… you get the following output describing the command line usage of ffind.pl:

    Perl script to search a directory tree recursively using wildcard filters
    and apply actions on the matching files and directories.

    Usage: perl ffind.pl [dir] [options]

    [dir]:
      (Target) dir: If you omit it the current working dir will be the
      target dir.

    [options]:
      -print: Just print the matched files and directories. This is the
          default action.
      -zip: Arg = String: Create zip-archive containing the matched files.
          The arg can be the complete path of the archive. If arg is an
          existing directory the archive will be put in this directory
          with the auto generated file-name: YYYY-MM-DD-hh-mm-ss.zip.
      -clean: Delete matching files and directories. You will be prompted
          for confirmation.
      -fclean: Like clean but with no confirmation prompt. Useful for
          scripts. But beware ... (!)
      -name-excl (ne): Arg = String: When set: Exclude dirs and files whose
          name matches the given wildcard string. If wildcard string
          contains spaces use quotes. Separate more than one possible
          wildcard strings with a comma (',').
      -name-incl (ni): Arg = String: When set: Include only dirs and files
          whose name matches the given wildcard string. If wildcard string
          contains spaces use quotes. Separate more than one possible
          wildcard strings with a comma (',').
      -path-excl (pe): Arg = String: When set: Exclude dirs and files whose
          path matches the given wildcard string. If wildcard string
          contains spaces use quotes. Separate more than one possible
          wildcard strings with a comma (',').
      -path-incl (pi): Arg = String: When set: Include only dirs and files
          whose path matches the given wildcard string. If wildcard string
          contains spaces use quotes. Separate more than one possible
          wildcard strings with a comma (',').
      -follow-incl (fi): Arg = String: When set: Follow only dirs whose name
          matches the given wildcard string. Does not affect whether a
          matching dir will be included in the current action. If wildcard
          string contains spaces use quotes. Separate more than one
          possible wildcard strings with a comma (',').
      -follow-excl (fe): Arg = String: When set: Follow only dirs whose name
          does not (!) match the given wildcard string. Does not affect
          whether a non matching dir will be included in the current
          action. If wildcard string contains spaces use quotes. Separate
          more than one possible wildcard strings with a comma (',').
      -no-dirs (nd): Flag, that no directories should be matched, only files.
      -no-files (nf): Flag, that no files should be matched, only dirs.
      -seperator: The default separator for the wildcard list is the comma.
          If the comma must be part of your wildcard string indicate here
          a different separator character.
      -usage or -h or -help or -? or -version or -v: Print this usage.

    Wild cards:
      *: Matches any character any number of times, as few as possible.
      ?: Matches any character exactly one time.

    Priority:
      Excluding options have priority over including options.

    (c) 2013 by Daniel Geschka - http://www.geschka.com
    Bug reports and comments: ffind.pl@email.de
    Version: 1.0
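A usage text like this one is the typical job for the here-document technique of section 5.6. The following sketch shows the mechanism on a shortened text. The function name usageText and the $version parameter are assumptions for illustration; the original printUsage may look different.

```perl
use strict;
use warnings;

# Hypothetical, shortened variant of a usage function (not the original).
sub usageText {
    my ($version) = @_;
    # Interpolating here-document: everything up to the line consisting only
    # of the label 'EndOfText' is taken literally, line breaks included.
    # $version is expanded; the '@' of the e-mail address is escaped so that
    # Perl does not try to interpolate an array named @email.
    return <<"EndOfText";
Usage: perl ffind.pl [dir] [options]
  -print: Just print the matched files and directories.
Bug reports and comments: ffind.pl\@email.de
Version: $version
EndOfText
}

print usageText("1.0");
```

With single quotes around the label (<<'EndOfText') nothing is interpolated at all, so the @ would need no escaping, but variables such as $version would then be printed literally.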

8.2 Examples

8.2.1 Matching and default arguments

If no action is explicitly given, the default action is to print out the matches. If no directory is given, the default directory is the current working directory. The following will print out all files and dirs recursively in the current working directory:

    perl ffind.pl

Match in the user’s temp dir all jpgs:

    perl ffind.pl C:\Users\daniel\Temp -ni "*.jpg,*.jpeg"

Match in the user’s temp dir all items whose name contains the substring ‘pic’, except the gifs and pngs:

    perl ffind.pl C:\Users\daniel\Temp -ni "*pic*" -ne "*.gif,*.png"

Same as last plus: the path of the item must additionally contain the substring ‘yellow/bmw’:

    perl ffind.pl C:\Users\daniel\Temp -ni "*pic*" -ne "*.gif,*.png" -pi "*yellow/bmw*"

Match in the user’s temp dir all text files. Don’t go into dirs with the name ‘_reserved’. This speeds up the script, as these dirs are not merely left unmatched but are not scanned at all (recursively!):

    perl ffind.pl C:\Users\daniel\Temp -ni "*.txt" -fe "_reserved"

Match in the user’s temp dir all dirs whose name contains the substring ‘green’. The flag -nf excludes all files => only dirs:

    perl ffind.pl C:\Users\daniel\Temp -ni "*green*" -nf

8.2.2 Actions

Print the files and folders of the current working dir:

    perl ffind.pl ./ -print

Delete / clean all object files. Don’t display the file deletion warning / confirmation:

    perl ffind.pl ./ -ni "*.o,*.obj" -fclean

Create a zip-archive of all text files with an automatic current-timestamp name in dir ‘c:\users\daniel\backups’:

    perl ffind.pl ./ -ni "*.txt" -zip "c:\users\daniel\backups"

Create a zip-archive of all text files with the full path given as ‘c:\users\daniel\backups\all_txts_backup.zip’:

    perl ffind.pl ./ -ni "*.txt" -zip "c:\users\daniel\backups\all_txts_backup.zip"

9 Conclusion

All in all, Moose is a nice addition to Perl: it lets you leverage the powers of OOP more easily once you have managed to overcome the first obstacles. In detail, when you have made a mistake it is not easy to work out the real cause from the error message. But so far I guess that’s more a general problem of Perl, not of Moose ;-) So if you have the choice you might want to give Python a try, as it is object oriented by design and a lot easier to learn without being less powerful than Perl.
