I was browsing Reddit and stumbled upon a small subreddit at /r/EarthPorn. This little corner of Reddit had hi-res pictures of earth scenery. From there I was directed to /r/CityPorn, /r/SpacePorn, /r/MachinePorn, /r/AnimalPorn and /r/BotanicalPorn, all full of HD pics. This got me thinking: since I am lazy as fuck, I did not want to spend every day going through six subreddits downloading and saving images, so the best thing to do was to automate the process.
In order to download the images, the first thing that needed to be done was to harvest the URLs from the subreddits' home pages. I could have screen-scraped with Beautiful Soup, but Reddit provides this nifty feature whereby if you append .json to the end of a URL, e.g. http://www.reddit.com/r/earthporn/.json, it returns the JSON for the corresponding page, with posts, URLs and other data (likewise, appending .xml returns the page's data as XML). This lets me skip all the dirty cruft of parsing the HTML of a constantly changing page. Below is version 0.0.1 of the URL harvester code.
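The core of the trick can be sketched in a few lines. This is a minimal illustration, assuming Python 3's standard urllib and json modules; the harvest_urls and extract_urls names are mine, not from the actual harvester, and the listing layout shown is Reddit's usual data -> children -> data nesting.

```python
import json
import urllib.request


def extract_urls(listing):
    """Pull the post URLs out of a Reddit listing dict.

    Reddit's .json pages nest each post under
    listing["data"]["children"][i]["data"].
    """
    return [child["data"]["url"] for child in listing["data"]["children"]]


def harvest_urls(subreddit):
    """Fetch /r/<subreddit>/.json and return the post URLs on its front page."""
    url = "http://www.reddit.com/r/%s/.json" % subreddit
    # Reddit tends to reject the default urllib user agent, so set our own.
    req = urllib.request.Request(url, headers={"User-Agent": "url-harvester/0.0.1"})
    with urllib.request.urlopen(req) as resp:
        listing = json.load(resp)
    return extract_urls(listing)
```

Splitting the parsing into extract_urls keeps the JSON-digging testable without hitting the network; harvest_urls is just fetch-then-extract.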