ablog

Notes from a clumsy, restless engineer

Trying out WWW::Mechanize::AutoPager

I tried out WWW::Mechanize::AutoPager.
The code is copied almost verbatim from
WWW::Mechanize::AutoPager - Automatic Pagination using AutoPagerize - metacpan.org.

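#!/usr/bin/perl

use strict;
use warnings;
use WWW::Mechanize;
use WWW::Mechanize::AutoPager;
use Data::Dumper;

my $mech = WWW::Mechanize->new;

# Register the pagination rules for tumblr.com blogs by hand.
# url is treated as a regex, so the literal dots are escaped.
$mech->autopager->add_site(
	url         => 'http://.+\.tumblr\.com/',
	nextLink    => '//div[@id="content" or @id="container"]/div[last()]/a[last()]',
	pageElement => '//div[@id="content" or @id="container"]/div[@class!="footer" or @class!="navigation"]',
);

$mech->get('http://otsune.tumblr.com/');

# Follow the "next" link until AutoPager no longer finds one.
# With autocheck on (the default), get() dies on an HTTP error,
# which ends the script.
while ( defined( my $next = $mech->next_link ) ) {
	print Dumper $next;
	$mech->get($next);
}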
  • Output
$ ./autopager_tumblr.pl
$VAR1 = bless( do{\(my $o = 'http://otsune.tumblr.com/page/2')}, 'URI::http' );
$VAR1 = bless( do{\(my $o = 'http://otsune.tumblr.com/page/3')}, 'URI::http' );
$VAR1 = bless( do{\(my $o = 'http://otsune.tumblr.com/page/4')}, 'URI::http' );

...

$VAR1 = bless( do{\(my $o = 'http://otsune.tumblr.com/page/86')}, 'URI::http' );
$VAR1 = bless( do{\(my $o = 'http://otsune.tumblr.com/page/87')}, 'URI::http' );
$VAR1 = bless( do{\(my $o = 'http://otsune.tumblr.com/page/88')}, 'URI::http' );