This is part 1 of a series of writeups on my progress with my NaNoGenMo project, Markov's Fanfiction.
Gathering the stories is the easy part. It was already one of the core pieces of my previous Markov-based fanfic generator. Gathering story IDs, on the other hand, is another story entirely.
Specifically, I need a large collection of stories to draw from, or my fanfic will be too specific. For example, my previous generator read from two stories: Terrifying Renegade, a Steven Universe fanfic that describes itself as "a darker, slow burning story about corrupted gems and finding hope", and Come with Me, a Steven Universe fanfic set in an AU where the gems are human and Lapis and Peridot fall in love. The results were... Let's just call them "less than stellar".
She sank to her mother, who even now she could make it two. She listened to the sign up sheets of each other their shoulders. Plus, Lapis imagined, Peridot just liked being able to catch Pearl, who had no idea who this Steven person was. In all honesty, it sounded robotic, did it on as Peridot just liked being able to run through her. And maybe she just wanted to float, like all the time, anyway. Even Peridot, the most unlikely of recruits, had taken to referring to her and be her friend. Use the door, as it tended to have its own ideas about when he realized the gems could keep up. Then the screen went black and white keys delicately, barely pressing them down as she slunk back out to the kids. The only immediately viable choice was to be on the clothe to rid herself of the entire time. She looked from Connie to Steven, bending a bit when she could muster.
-- Em Adn, a fanfic from the previous generator
While some of those sentences came out well-formed, like "Plus, Lapis imagined, Peridot just liked being able to catch Pearl, who had no idea who this Steven person was", most of them are gibberish. The vocabulary is also narrow, since everything comes from just two stories.
I created a bookmarklet that, when run on a Fanfiction.net search page, copies the IDs and titles from all the story links to your clipboard. Using this on the first two pages of Steven Universe fanfiction, I gathered plenty of stories.
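For anyone who'd rather do this offline, here's a rough Python sketch of the same idea, working on a saved copy of a search page. This is not the bookmarklet itself. It assumes story links look like /s/<numeric id>/<chapter>/<title-slug>, and the names StoryLinkParser and extract_stories are made up for this example.

```python
import re
from html.parser import HTMLParser

# Assumption: story links have the shape /s/<numeric id>/<chapter>/<title-slug>.
STORY_HREF = re.compile(r"^/s/(\d+)/\d+/")


class StoryLinkParser(HTMLParser):
    """Collects {story_id: title} from anchor tags that look like story links."""

    def __init__(self):
        super().__init__()
        self.stories = {}        # story id -> link text (the title)
        self._current_id = None  # id of the story link we are currently inside
        self._text = []          # text fragments seen inside that link

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href") or ""
        match = STORY_HREF.match(href)
        if match:
            self._current_id = match.group(1)
            self._text = []

    def handle_data(self, data):
        if self._current_id is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._current_id is not None:
            title = "".join(self._text).strip()
            # Keep the first non-empty title seen for each id.
            if title and self._current_id not in self.stories:
                self.stories[self._current_id] = title
            self._current_id = None


def extract_stories(html):
    """Return a dict mapping story IDs (as strings) to titles."""
    parser = StoryLinkParser()
    parser.feed(html)
    parser.close()
    return parser.stories
```

Feeding it the HTML of a saved search page should give back one entry per story, and links that don't match the assumed pattern (author pages, review links) are ignored.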
Now, the hard work begins. I have to figure out how to make the Markov chains behave. With this much text to draw from, I need a way to fine-tune how closely the Markov output resembles the original corpus.
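To give a rough idea of the kind of knob involved, here's a minimal word-level Markov chain sketch in Python. It isn't the generator's actual code. The order parameter (how many previous words decide the next one) is one common way to trade novelty for similarity: a higher order tends to stick closer to the source text, a lower one wanders more. The function names build_chain and generate are just for this example.

```python
import random
from collections import defaultdict


def build_chain(text, order=2):
    """Map each run of `order` consecutive words to the words that follow it."""
    words = text.split()
    chain = defaultdict(list)
    for i in range(len(words) - order):
        state = tuple(words[i:i + order])
        chain[state].append(words[i + order])
    return chain


def generate(chain, length=50, seed=None):
    """Walk the chain from a random starting state, up to `length` words."""
    rng = random.Random(seed)
    if not chain:
        return ""
    state = rng.choice(list(chain.keys()))
    output = list(state)
    while len(output) < length:
        followers = chain.get(state)
        if not followers:
            break  # dead end: this state only appeared at the very end of the text
        output.append(rng.choice(followers))
        state = tuple(output[-len(state):])
    return " ".join(output[:length])
```

With only two stories as input, most states have very few followers, so the walk either copies long stretches verbatim or jumps awkwardly between the two sources. A bigger corpus gives each state more options to choose from.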
But that's for another part. I'll keep you all posted!