Frequently Asked Questions 
3rd-Jun-2012 01:41 pm - [sticky post]
What is fetch_story and WWW::FetchStory?

WWW::FetchStory is a Perl module with an associated script, 'fetch_story', which fetches a story from a fiction website. It knows how to deal with the content of various fiction websites such as fanfiction.net: it strips the extras from the HTML (such as navbars and JavaScript) and saves the result, so that all you get is the story text and its formatting. If a story is deduced to be a multi-chapter story, all of its chapters are downloaded and saved to separate files.

It can also convert the downloaded HTML files into one EPUB file.

The module can be downloaded from CPAN at http://search.cpan.org/dist/WWW-FetchStory/

What sites does fetch_story support?

The fetch_story script can download and parse stories from the following sites:

AO3: (http://www.archiveofourown.org) General fanfic archive.
Ashwinder: (http://ashwinder.sycophanthex.com/) A Severus Snape/Hermione Granger HP fiction archive.
DigitalQuill: (http://www.digital-quill.org/) A Harry Potter fiction archive.
DracoAndGinny: (http://www.dracoandginny.com) A Draco Malfoy/Ginny Weasley HP fiction archive.
Dreamwidth: (http://www.dreamwidth.org) Journalling site where some people post their fiction.
FanfictionNet: (http://www.fanfiction.net/) Huge fan fiction archive.
FictionAlley: (http://www.fictionalley.org/) A Harry Potter fiction archive.
Gutenberg: (http://www.gutenberg.org) Project Gutenberg; public-domain works.
HPAdultFanfiction: (http://hp.adultfanfiction.net) An adult Harry Potter fiction archive.
LiveJournal: (http://www.livejournal.com/) Journalling site where some people post their fiction.
Owl: (http://owl.tauri.org/) A Harry Potter fiction archive (but this site is DEAD).
PetulantPoetess: (http://www.thepetulantpoetess.com/) A Harry Potter fiction archive.
PotionsAndSnitches: (http://www.potionsandsnitches.net) A Severus Snape + Harry Potter gen fiction archive.
PotterPlace: (http://www.potterplacearchives.com) A Harry Potter fiction archive.
RestrictedSection: (http://restrictedsection.org) An adult Harry Potter fiction archive.
SSHGExchange: (http://community.livejournal.com/sshg_exchange/) Severus Snape/Hermione Granger fiction exchange comm.
TardisBigBang3: (http://www.tardisbigbang.com/Round3/) Round 3 of the TARDIS BigBang challenge.
Teaspoon: (http://www.whofic.com) A Teaspoon And An Open Mind; a Doctor Who fiction archive.
TwistingHellmouth: (http://www.tthfanfic.org) Twisting The Hellmouth; Buffy The Vampire Slayer crossovers.

What is this "Perl" of which you speak? What's a perl module? What's a perl script? I don't understand all this computer stuff!

Perl is a scripting language; people write scripts in it. Basically, the difference between a script and a program is that a program runs by itself, while a script needs another program to run it - in this case, the program is Perl.

What this boils down to is that before you can use "fetch_story", you have to install perl, and then install fetch_story.
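To make the difference concrete, here is a small sketch that can be typed into a terminal. It writes a one-line Perl script to a file and then asks the perl program to run it. The file name hello.pl and the folder it goes in are just examples chosen for illustration; they have nothing to do with fetch_story itself.

```shell
# Where to put the example script (a temporary folder; the name is arbitrary).
script_file="${TMPDIR:-/tmp}/hello.pl"

# The script itself is just a text file containing Perl instructions.
printf '%s\n' 'print "Hello from a Perl script\n";' > "$script_file"

# A script cannot run by itself: the perl program reads the file and carries it out.
if command -v perl >/dev/null 2>&1; then
    output=$(perl "$script_file")
else
    output="(perl is not installed yet)"
fi
echo "$output"
```

If Perl is not installed, the last step has nothing to run the script with, which is exactly why Perl has to be installed first.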

So how do I install Perl?


Linux:

Perl is probably already installed. If not, use your favourite package manager to install it.
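If you are not sure whether Perl is already there, a quick check along these lines should tell you. This is only a sketch; the variable name perl_status is made up for the example.

```shell
# Look for a perl program on the PATH; if one is found, print its version.
if command -v perl >/dev/null 2>&1; then
    perl_status="installed"
    perl -v
else
    perl_status="missing"
    echo "Perl was not found; install it with your package manager."
fi
echo "Perl is $perl_status"
```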


MS-Windows:

The most commonly used version of Perl on MS-Windows is ActiveState Perl: http://www.activestate.com/activeperl/downloads
There should be installation instructions on that site.

Apple Mac:

(Sorry, I have no data)

How do I install fetch_story?

You need to install the WWW-FetchStory module from CPAN (the Comprehensive Perl Archive Network).


Linux:

In a terminal, type the following:

sudo cpan WWW::FetchStory

That will install fetch_story and all the other modules it depends upon.
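To see whether the installation worked, something like the following can be run in the same terminal. It is a rough check rather than an official procedure: it asks perl to load the module, then looks for the fetch_story script on the PATH. The variable names module_status and script_status are made up for the example.

```shell
# Can perl load the module? "-MWWW::FetchStory" loads it; "-e 1" then does nothing else.
if command -v perl >/dev/null 2>&1 && perl -MWWW::FetchStory -e 1 >/dev/null 2>&1; then
    module_status="installed"
else
    module_status="missing"
fi
echo "WWW::FetchStory module: $module_status"

# Is the fetch_story script somewhere the terminal can find it?
if command -v fetch_story >/dev/null 2>&1; then
    script_status="found"
else
    script_status="not found"
fi
echo "fetch_story script: $script_status"
```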


MS-Windows:

This article tells you how to install CPAN modules: http://www.activestate.com/blog/2010/10/how-install-cpan-modules-activeperl

Apple Mac:

(Sorry, I have no data)

How do I use fetch_story? When I click on it, nothing happens!

The fetch_story script doesn't have a graphical interface, sorry. You have to run it from the command line, inside a terminal. Typing

fetch_story --help

will give you a short help message.

perldoc fetch_story

will give you the full manual for the script.

You can also look at http://search.cpan.org/dist/WWW-FetchStory/scripts/fetch_story for the manual.
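As a rough sketch of what a session in the terminal might look like: the example below makes a folder for downloaded stories and runs fetch_story there. It assumes the script is given the address of the story as its argument; check the help message or the manual for the exact usage. The address and the folder name are hypothetical placeholders.

```shell
# Hypothetical placeholder; replace it with the address of a real story.
story_url="http://www.example.com/story/12345"

# Chapters are saved as separate files, so give them a folder of their own.
story_dir="${HOME:-.}/stories"
mkdir -p "$story_dir"

if command -v fetch_story >/dev/null 2>&1; then
    # Run inside the folder so the downloaded files end up there.
    (cd "$story_dir" && fetch_story "$story_url")
else
    echo "fetch_story is not installed yet."
fi
```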

It's not working! Help!

Er, you'll have to be more specific than that. Post a question here on the community and hopefully someone will be able to help you. But before you do, check out what other people have posted; they may have run into the same problem already.

For some examples (from before this comm was started) see:

I am not a geek! How can I contribute?

Help answer questions.
Write how-tos.
Give tips, not just about fetch_story, but about related things like EPUB readers.
Report bugs.
Report errors in the documentation.

Enjoy reading the fic you've downloaded!

I am a geek! How can I contribute?

The code for WWW::FetchStory is available from a Git repository at GitHub: https://github.com/rubykat/WWW-FetchStory

Please join in; you are welcome!

It would be especially helpful if you could write additional Fetcher Plugins for the module.

What are Fetcher Plugins?

In order to tidy the HTML and parse the pages for data about the story, site-specific "Fetcher" plugins have been written for various sites such as fanfiction.net, LiveJournal and others. These plugins can scrape meta-information about the story from the given page, including the URLs of all the chapters of a multi-chapter story.

Of course, if the site in question alters its page format, then the Fetcher for that site will break.
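The idea can be illustrated with a toy example. This is not the module's code (the real Fetchers are Perl modules); it is only a sketch of the kind of scraping involved, done with standard terminal tools on a made-up page. The page contents, the class names "title" and "chapter", and the addresses are all invented for the illustration.

```shell
# A made-up page; real sites lay out their pages differently.
page='<html><head><script>track();</script></head>
<body><div id="nav">Home | Login</div>
<h1 class="title">A Sample Story</h1>
<a class="chapter" href="http://www.example.com/story/1/ch1">Chapter 1</a>
<a class="chapter" href="http://www.example.com/story/1/ch2">Chapter 2</a>
</body></html>'

# Title: the text inside the <h1 class="title"> element.
title=$(printf '%s\n' "$page" | sed -n 's/.*<h1 class="title">\(.*\)<\/h1>.*/\1/p')

# Chapter addresses: the href of every link marked class="chapter".
chapters=$(printf '%s\n' "$page" | sed -n 's/.*<a class="chapter" href="\([^"]*\)".*/\1/p')

echo "Title: $title"
echo "Chapters:"
echo "$chapters"
```

The patterns only work because this page marks its title and chapter links in a particular way. If the site renamed class="chapter" to something else, the second pattern would find nothing, which is the sort of breakage described above.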
