I tried out WWW::Mechanize::AutoPager.
The code is almost a straight copy-paste from WWW::Mechanize::AutoPager - Automatic Pagination using AutoPagerize - metacpan.org.
- autopager_tumblr.pl
#!/usr/bin/perl
use strict;
use warnings;
use WWW::Mechanize;
use WWW::Mechanize::AutoPager;
use Data::Dumper;

my $mech = WWW::Mechanize->new;

# Register the pagination rule by hand (url is a regex, the other two are XPath).
$mech->autopager->add_site(
    url         => 'http://.+.tumblr.com/',
    nextLink    => '//div[@id="content" or @id="container"]/div[last()]/a[last()]',
    pageElement => '//div[@id="content" or @id="container"]/div[@class!="footer" or @class!="navigation"]',
);

$mech->get('http://otsune.tumblr.com/');

# Follow "next" links until there are none left.
while ( defined( my $link = $mech->next_link ) ) {
    print Dumper $link;
    # get() can die on an HTTP error, so trap it; otherwise $@ is never set.
    eval { $mech->get($link) };
    last if $@;
}
- Output
$ ./autopager_tumblr.pl
$VAR1 = bless( do{\(my $o = 'http://otsune.tumblr.com/page/2')}, 'URI::http' );
$VAR1 = bless( do{\(my $o = 'http://otsune.tumblr.com/page/3')}, 'URI::http' );
$VAR1 = bless( do{\(my $o = 'http://otsune.tumblr.com/page/4')}, 'URI::http' );
...
$VAR1 = bless( do{\(my $o = 'http://otsune.tumblr.com/page/86')}, 'URI::http' );
$VAR1 = bless( do{\(my $o = 'http://otsune.tumblr.com/page/87')}, 'URI::http' );
$VAR1 = bless( do{\(my $o = 'http://otsune.tumblr.com/page/88')}, 'URI::http' );
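One thing worth noting about the `url` value passed to `add_site`: in AutoPagerize-style site info it is a regular expression, and the dots in `http://.+.tumblr.com/` are unescaped, so each one matches any character. Below is a small sketch in plain Perl (no network, no Mechanize) comparing that pattern with a stricter one. The stricter pattern `$strict` and the sample URLs are my own illustration, not something from the module, and I am assuming the pattern is applied unanchored to the page URL.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# The pattern exactly as passed to add_site in the script.
my $loose = 'http://.+.tumblr.com/';

# Hypothetical stricter variant (not from the module docs):
# literal dots, anchored at the start, host part may not contain "/".
my $strict = '^http://[^/]+\.tumblr\.com/';

# Sample URLs made up for this comparison.
my @urls = (
    'http://otsune.tumblr.com/page/2',
    'http://example.com/?u=http://a.tumblr.com/',
    'http://otsune-tumblr-com/',
);

for my $url (@urls) {
    printf "%-45s loose=%d strict=%d\n", $url,
        ( $url =~ /$loose/  ? 1 : 0 ),
        ( $url =~ /$strict/ ? 1 : 0 );
}
```

Only the first URL matches both patterns; the other two match the loose pattern alone. For a one-off script against a single blog this hardly matters, but it may be worth tightening if the same rule sits alongside rules for other sites.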