walbuc/Django-Scrapy
Django and Scrapy

An example of how to use the Django ORM to store data obtained by a Scrapy spider in a database, and then expose that data through a REST API.

As an example, I set up this project to scrape Rolling Stone lists/rankings and store them in a relational database with proper data models.
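Persisting scraped items through the Django ORM is typically done in a Scrapy item pipeline. Below is a minimal, dependency-free sketch of that pattern; the pipeline class, the injected model, and the `title`/`rank` fields are all assumptions for illustration, not this repo's actual code:

```python
# Hypothetical sketch of a Scrapy item pipeline that saves items via the
# Django ORM. `model` is assumed to be a Django model class such as a
# RankingEntry with `title` and `rank` fields.
class DjangoWriterPipeline(object):
    def __init__(self, model):
        self.model = model  # Django model class used for persistence

    def process_item(self, item, spider):
        # get_or_create keeps re-runs of the spider from inserting duplicates
        self.model.objects.get_or_create(
            title=item["title"],
            defaults={"rank": item["rank"]},
        )
        # Scrapy expects the item back so later pipelines can process it too
        return item
```

Scrapy calls `process_item` once for every item the spider yields, so all database writes funnel through this one place.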

Non pip requirements

  • Python 2.7
  • pip
  • virtualenv
  • A broker compatible with Celery; I use Redis
  • A database compatible with Django; I use SQLite 3 in development and PostgreSQL or MongoDB in production. If you are not familiar with how Django manages databases, see the Django database documentation

Installation

Clone the project and install the requirements in a virtualenv:

# install fabric in the global Python environment
pip install fabric
# clone repo
git clone git://github.com/drkloc/rstone_scrapper.git
cd rstone_scrapper
# setup app
fab DEV setup

For OSX users only

You need to install lxml with static deps before running pip against the requirements file:

STATIC_DEPS=true pip install lxml

Settings override

Any settings overrides (database config, broker config, etc.) are conveniently made inside settings_local.py. Just copy the demo file:

cp settings_local_demo.py settings_local.py

and start customizing whatever you want/need.
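For instance, a development settings_local.py might point Django at SQLite and Celery at a local Redis broker. The exact setting names depend on the demo file; this is an assumed sketch, not the repo's actual file:

```python
# Hypothetical settings_local.py overrides for development:
# SQLite as the Django database, Redis as the Celery broker.
import os

BASE_DIR = os.path.dirname(os.path.abspath(__file__))

DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.sqlite3",
        "NAME": os.path.join(BASE_DIR, "dev.sqlite3"),
    }
}

# django-celery era broker setting; matches `python manage.py celeryd` below
BROKER_URL = "redis://localhost:6379/0"
```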

Start redis-server and the Celery daemon

redis-server
python manage.py celeryd

Initialization

scrapy runspider scrap.py
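The spider's parse callback boils down to turning ranked-list text into structured items. A dependency-free sketch of that extraction step, assuming entries shaped like "1. Song Title" (the function and field names are illustrative, not the spider's actual code):

```python
# Hypothetical sketch of the extraction a ranking spider's parse() might do:
# convert "<rank>. <title>" lines into item dicts.
import re

RANK_RE = re.compile(r"^\s*(\d+)\.\s+(.*\S)\s*$")

def parse_ranking_lines(lines):
    """Yield {"rank": int, "title": str} dicts, skipping non-matching lines."""
    for line in lines:
        m = RANK_RE.match(line)
        if m:
            yield {"rank": int(m.group(1)), "title": m.group(2)}
```

In the real spider these dicts would be Scrapy items, which the ORM pipeline then writes to the database.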

Running the server

python manage.py runserver
