Scrapy newbie: tutorial. Python cannot find the spider
I'm trying to follow the Scrapy tutorial, but I'm stuck at one of the first
steps. I think I have created the spider correctly:
    from scrapy.spider import BaseSpider

    class dmoz(BaseSpider):
        name = "dmoz"
        allowed_domains = ["dmoz.org"]
        start_urls = [
            "http://www.dmoz.org/Computers/Programming/Languages/Python/Books/",
            "http://www.dmoz.org/Computers/Programming/Languages/Python/Resources/",
        ]

        def parse(self, response):
            filename = response.url.split("/")[-2]
            open(filename, 'wb').write(response.body)
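In case it matters, I checked the filename logic in parse() separately: it just takes the second-to-last "/"-separated segment of the URL, which works fine in plain Python (no Scrapy needed):

```python
# The same expression parse() uses, tried on one of the start URLs:
# the trailing slash makes the last segment empty, so [-2] is the
# last real path component.
url = "http://www.dmoz.org/Computers/Programming/Languages/Python/Books/"
filename = url.split("/")[-2]
print(filename)  # -> Books
```

So the parse method itself seems fine; the problem appears before it is ever called.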
I saved the file as dmoz_spider.py from the IDLE shell (typing the .py
extension myself) into a folder that matches the working directory of my
terminal window.
However, when I type scrapy crawl dmoz I get this:
    2013-08-09 19:18:06+0200 [scrapy] INFO: Scrapy 0.16.5 started (bot: dmoz)
    2013-08-09 19:18:07+0200 [scrapy] DEBUG: Enabled extensions: LogStats, TelnetConsole, CloseSpider, WebService, CoreStats, SpiderState
    2013-08-09 19:18:08+0200 [scrapy] DEBUG: Enabled downloader middlewares: HttpAuthMiddleware, DownloadTimeoutMiddleware, UserAgentMiddleware, RetryMiddleware, DefaultHeadersMiddleware, RedirectMiddleware, CookiesMiddleware, HttpCompressionMiddleware, ChunkedTransferMiddleware, DownloaderStats
    2013-08-09 19:18:08+0200 [scrapy] DEBUG: Enabled spider middlewares: HttpErrorMiddleware, OffsiteMiddleware, RefererMiddleware, UrlLengthMiddleware, DepthMiddleware
    2013-08-09 19:18:08+0200 [scrapy] DEBUG: Enabled item pipelines:
    Traceback (most recent call last):
      File "/Library/Frameworks/Python.framework/Versions/2.7/bin/scrapy", line 5, in <module>
        pkg_resources.run_script('Scrapy==0.16.5', 'scrapy')
      File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/pkg_resources.py", line 499, in run_script
        self.require(requires)[0].run_script(script_name, ns)
      File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/pkg_resources.py", line 1235, in run_script
        execfile(script_filename, namespace, namespace)
      File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/Scrapy-0.16.5-py2.7.egg/EGG-INFO/scripts/scrapy", line 4, in <module>
        execute()
      File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/Scrapy-0.16.5-py2.7.egg/scrapy/cmdline.py", line 131, in execute
        _run_print_help(parser, _run_command, cmd, args, opts)
      File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/Scrapy-0.16.5-py2.7.egg/scrapy/cmdline.py", line 76, in _run_print_help
        func(*a, **kw)
      File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/Scrapy-0.16.5-py2.7.egg/scrapy/cmdline.py", line 138, in _run_command
        cmd.run(args, opts)
      File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/Scrapy-0.16.5-py2.7.egg/scrapy/commands/crawl.py", line 43, in run
        spider = self.crawler.spiders.create(spname, **opts.spargs)
      File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/Scrapy-0.16.5-py2.7.egg/scrapy/spidermanager.py", line 43, in create
        raise KeyError("Spider not found: %s" % spider_name)
    KeyError: 'Spider not found: dmoz'
I cannot figure out what is wrong, but since I'm quite new to programming,
it might be something very simple. Thank you in advance!