Spydr: Perl Mechanize

WWW-Mechanize

I used Perl's WWW::Mechanize to test my web application and found it awesome. If you plan to test your web app, give Mechanize a try. It will make your job easier.
Thanks to Andy Lester. Download WWW::Mechanize from CPAN.

Let's see a small example with WWW::Mechanize.

Load necessary modules.
#!/usr/bin/perl -w
use strict;
use WWW::Mechanize;

Create agent
# create agent
my $mech = WWW::Mechanize->new();
$mech->get("http://search.cpan.org");
if (! $mech->success) {
   print "Failed to load\n";
}
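
One caveat about the success check: recent versions of WWW::Mechanize enable autocheck by default, so a failed get() dies before the check is reached. The sketch below turns autocheck off so the manual check does the reporting. The file:// path is a made-up location that is assumed not to exist, used only to provoke a failure without touching the network.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use WWW::Mechanize;

# autocheck => 0 makes get() return normally on failure instead of dying,
# so the explicit success check below is what reports the error.
my $mech = WWW::Mechanize->new( autocheck => 0 );

# Hypothetical path, assumed not to exist on this machine.
my $missing = 'file:///nonexistent/spydr-example.html';
$mech->get($missing);

my $ok = $mech->success ? 1 : 0;
if ( !$ok ) {
    print "Failed to load $missing: ", $mech->response->status_line, "\n";
}
```

If you would rather have the script stop on the first failed request, leave autocheck at its default and drop the manual check.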

Get the list of module category URLs

# Get all the links from the page that matches the given pattern.
my @links = $mech->find_all_links(url_regex => qr/modlist\/(.*)/);
foreach my $modlist (@links) {
   print("The URL is " . $modlist->url_abs . "\n");
}

The output of the above will look like this:

The URL is http://search.cpan.org/modlist/Archiving_Compression_Conversion
The URL is http://search.cpan.org/modlist/File_Name_System_Locking
The URL is http://search.cpan.org/modlist/Option_Parameter_Config_Processing
The URL is http://search.cpan.org/modlist/Bundles
The URL is http://search.cpan.org/modlist/Graphics
The URL is http://search.cpan.org/modlist/Perl6
The URL is http://search.cpan.org/modlist/Commercial_Software_Interfaces
The URL is http://search.cpan.org/modlist/Internationalization_Locale
The URL is http://search.cpan.org/modlist/Pragmas
The URL is http://search.cpan.org/modlist/Control_Flow_Utilities
The URL is http://search.cpan.org/modlist/Language_Extensions
The URL is http://search.cpan.org/modlist/Security
The URL is http://search.cpan.org/modlist/Data_and_Data_Types
The URL is http://search.cpan.org/modlist/Language_Interfaces
The URL is http://search.cpan.org/modlist/Server_Daemon_Utilities
The URL is http://search.cpan.org/modlist/Database_Interfaces
The URL is http://search.cpan.org/modlist/Mail_and_Usenet_News
The URL is http://search.cpan.org/modlist/String_Language_Text_Processing
The URL is http://search.cpan.org/modlist/Development_Support
The URL is http://search.cpan.org/modlist/Miscellaneous
The URL is http://search.cpan.org/modlist/User_Interfaces
The URL is http://search.cpan.org/modlist/Documentation
The URL is http://search.cpan.org/modlist/Networking_Devices_IPC
The URL is http://search.cpan.org/modlist/World_Wide_Web
The URL is http://search.cpan.org/modlist/File_Handle_Input_Output
The URL is http://search.cpan.org/modlist/Operating_System_Interfaces
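
To try the url_regex filter without hitting the live site, you can point Mechanize at a local file. This is only a sketch: the HTML below is a hand-written stand-in for the real page, and the /recent link is a hypothetical non-matching entry added to show that the filter drops it.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use URI::file;
use WWW::Mechanize;

# Write a tiny stand-in page to a temporary file (hypothetical markup).
my ( $fh, $path ) = tempfile( SUFFIX => '.html', UNLINK => 1 );
print {$fh} <<'HTML';
<html><body>
<a href="http://search.cpan.org/modlist/Graphics">Graphics</a>
<a href="http://search.cpan.org/modlist/Security">Security</a>
<a href="http://search.cpan.org/recent">Recent uploads</a>
</body></html>
HTML
close $fh or die "cannot close $path: $!";

# Load the local file through a file:// URL.
my $mech = WWW::Mechanize->new();
$mech->get( URI::file->new_abs($path)->as_string );

# url_regex keeps only the links whose URL matches the pattern.
my @links = $mech->find_all_links( url_regex => qr{modlist/} );
foreach my $link (@links) {
    print "The URL is " . $link->url_abs . " (" . $link->text . ")\n";
}
```

Each element is a WWW::Mechanize::Link object, so besides url_abs you can also read the link text with text.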

Now let's see how to use forms. At the top of the page you can see the search form. Let's search for mechanize. To do that:

# search something
$mech->submit_form(
   fields => {
      query => 'mechanize'
   }
);

Here query is the name of the form field, as it appears in the page's HTML:

<input type="text" name="query" value="mechanize" size="35">
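
If you are not sure which field names a form uses, you can inspect the form before submitting it. The sketch below does that against a local stand-in page. The form markup and its action URL are assumptions for illustration, not copied from the real site, and nothing is actually submitted.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use URI::file;
use WWW::Mechanize;

# A hand-written page with one search form (hypothetical markup and action).
my ( $fh, $path ) = tempfile( SUFFIX => '.html', UNLINK => 1 );
print {$fh} <<'HTML';
<html><body>
<form action="http://search.cpan.org/search" method="get">
<input type="text" name="query" value="" size="35">
<input type="submit" value="Search">
</form>
</body></html>
HTML
close $fh or die "cannot close $path: $!";

my $mech = WWW::Mechanize->new();
$mech->get( URI::file->new_abs($path)->as_string );

# Select the form that contains a field named "query".
my $form = $mech->form_with_fields('query')
    or die "no form with a query field";

# Fill the field in and read the value back.
$mech->field( query => 'mechanize' );
my $value = $mech->value('query');
print "query is set to '$value'; the form submits to ", $form->action, "\n";
```

submit_form does the selecting, filling and submitting in one call, which is why the example in the post only needs the fields hash.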

Fetch all the URLs from the search output

my @search_links = $mech->find_all_links();

Get all the URLs that end with the .pm extension

foreach my $search_list (@search_links) {
   # Escape the dot and anchor the match so only URLs ending in .pm qualify.
   if ($search_list->url_abs =~ /(.*)\.pm$/) {
      print "I found mechanize module and the URL is $1 \n";
   }
}

Note: @search_links holds only the URLs from the first page. The actual search output spans more than 6-7 pages.

Using Fork
If the test takes too long, you can use fork to create a child process and share the work.

Example to create a child process:

# Declare the variable first so the snippet also works under "use strict".
my $child_pid = fork();
if ( !defined $child_pid ) {
   die "cannot fork: $!";
}
elsif ($child_pid == 0) {
   # I'm the child
}
else {
   # I'm the parent
}

If you have an array with a huge number of URLs, you can split it into two and process one half in the child and the other half in the parent. This can make the test finish sooner.
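
A minimal sketch of that split follows. The URL list and the process_urls helper are hypothetical placeholders; in a real test the list would come from find_all_links and the helper would do the actual requests and checks.

```perl
#!/usr/bin/perl
use strict;
use warnings;

$| = 1;    # unbuffered output, so parent and child lines are not duplicated

# Hypothetical list of URLs standing in for the real ones.
my @urls = map { "http://example.com/page$_" } 1 .. 6;

# Split the list into two halves.
my $mid    = int( @urls / 2 );
my @first  = @urls[ 0 .. $mid - 1 ];
my @second = @urls[ $mid .. $#urls ];

# Placeholder for the real per-URL work (for example a get() plus checks).
sub process_urls {
    my ( $who, @list ) = @_;
    print "$who ($$) processing $_\n" for @list;
    return scalar @list;
}

my $child_pid = fork();
die "cannot fork: $!" unless defined $child_pid;

if ( $child_pid == 0 ) {
    # Child: handle the second half, then exit so it never runs the parent's code.
    process_urls( 'child', @second );
    exit 0;
}

# Parent: handle the first half, then wait for the child to finish.
process_urls( 'parent', @first );
waitpid( $child_pid, 0 );
my $child_status = $? >> 8;
print "child exited with status $child_status\n";
```

The waitpid call matters: without it the parent may exit first and leave the child running unattended. Note that the two processes do not share variables, so any results the child collects have to be printed or written somewhere the parent can read.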

Source code:

#!/usr/bin/perl -w

use strict;
use WWW::Mechanize;

# create agent
my $mech = WWW::Mechanize->new();
$mech->get("http://search.cpan.org");
if(! $mech->success) {
   print "Failed to load\n";
}

# Get all the links from the page that matches the given pattern.
my @links = $mech->find_all_links(url_regex => qr/modlist\/(.*)/);

foreach my $modlist (@links) {
   print("The URL is " . $modlist->url_abs . "\n");
}

# search something
$mech->submit_form(
   fields => {
      query => 'mechanize'
   }
);

my @search_links = $mech->find_all_links();
foreach my $search_list (@search_links) {
   # Escape the dot and anchor the match so only URLs ending in .pm qualify.
   if ($search_list->url_abs =~ /(.*)\.pm$/) {
      print "I found mechanize module and the URL is $1 \n";
   }
}

See also:
Test::WWW::Mechanize
Test::WWW::Mechanize::Catalyst
