Metadata-Version: 2.1
Name: board-game-scraper
Version: 2.11.2
Summary: Board games data scraping and processing from BoardGameGeek and more!
Home-page: https://recommend.games/
Author: Markus Shepherd
Author-email: markus@recommend.games
License: MIT
Project-URL: Documentation, https://gitlab.com/recommend.games/board-game-scraper/blob/master/README.md
Project-URL: Funding, https://paypal.me/mschepke
Project-URL: Say Thanks!, https://saythanks.io/to/mk.schepke%40gmail.com
Project-URL: Source, https://gitlab.com/recommend.games/board-game-scraper
Project-URL: Tracker, https://gitlab.com/recommend.games/board-game-scraper/issues
Project-URL: Twitter, https://twitter.com/recommend_games
Description: 
        # board-game-scraper
        
        Scraping data about board games from the web. View the data live at
        [Recommend.Games](https://recommend.games/)! Install via
        
        ```bash
        pip install board-game-scraper
        ```
        
        ## Sources
        
        * [Board Game Atlas](https://www.boardgameatlas.com/) (`bga`)
        * [BoardGameGeek](https://boardgamegeek.com/) (`bgg`)
        * [DBpedia](https://wiki.dbpedia.org/) (`dbpedia`)
        * [Luding.org](https://luding.org/) (`luding`)
        * [Spielen.de](https://gesellschaftsspiele.spielen.de/) (`spielen`)
        * [Wikidata](https://www.wikidata.org/) (`wikidata`)
        
        ## Run scrapers
        
        Requires [Python 3](https://pythonclock.org/), version 3.6 or later. Make sure
        [Pipenv](https://docs.pipenv.org/) is installed and create the virtual
        environment:
        
        ```bash
        python3 -m pip install --upgrade pipenv
        pipenv install --dev
        pipenv shell
        ```
        
        Run a spider like so:
        
        ```bash
        JOBDIR="jobs/${SPIDER}/$(date -u +'%Y-%m-%dT%H-%M-%S')"
        scrapy crawl "${SPIDER}" \
            --output 'feeds/%(name)s/%(time)s/%(class)s.csv' \
            --set "JOBDIR=${JOBDIR}"
        ```
        
        where `$SPIDER` is one of the IDs above.
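        
        If you crawl several sources regularly, the command above can be wrapped in
        small shell functions. This is only a sketch: `job_dir` and `crawl_spider` are
        hypothetical helpers, not part of this repository, and they assume the same
        `jobs/` and `feeds/` layout as the command above.
        
        ```bash
        #!/usr/bin/env bash
        
        # Hypothetical helper: print a fresh, timestamped job directory for a spider.
        job_dir() {
            local spider="$1"
            # `date -u` prints UTC on both GNU and BSD/macOS `date`.
            printf 'jobs/%s/%s\n' "${spider}" "$(date -u +'%Y-%m-%dT%H-%M-%S')"
        }
        
        # Hypothetical helper: crawl one spider with the options shown above.
        crawl_spider() {
            local spider="$1"
            scrapy crawl "${spider}" \
                --output 'feeds/%(name)s/%(time)s/%(class)s.csv' \
                --set "JOBDIR=$(job_dir "${spider}")"
        }
        ```
        
        With these defined, `crawl_spider bgg` would start the BoardGameGeek spider
        with a new job directory.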
        
        Run all the spiders with the [`run_scrapers.sh`](run_scrapers.sh) script. Get a
        list of the running scrapers' PIDs with the [`processes.sh`](processes.sh)
        script. You can stop all the running scrapers via
        
        ```bash
        ./processes.sh stop
        ```
        
        and resume them later.
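        
        Scrapy can only resume a crawl if it is started again with the same `JOBDIR`
        it was using when it stopped. How the scripts in this repository pick that
        directory is not shown here; the sketch below simply assumes the
        `jobs/<spider>/<timestamp>` layout from the command above and looks up the
        most recent directory. `latest_job_dir` is a hypothetical helper.
        
        ```bash
        #!/usr/bin/env bash
        
        # Hypothetical helper: print the newest job directory of a spider.
        # Returns a non-zero status if the spider has no job directory yet.
        latest_job_dir() {
            local spider="$1"
            local dir latest=''
            # Glob results are sorted, and the timestamp format sorts
            # chronologically, so the last match is the most recent one.
            for dir in "jobs/${spider}"/*/; do
                [ -d "${dir}" ] || continue
                latest="${dir%/}"
            done
            [ -n "${latest}" ] && printf '%s\n' "${latest}"
        }
        ```
        
        A stopped spider could then be resumed by passing
        `--set "JOBDIR=$(latest_job_dir "${SPIDER}")"` to `scrapy crawl` instead of a
        fresh timestamp.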
        
        ## Tests
        
        You can run `scrapy check` to perform contract tests for all spiders, or
        `scrapy check $SPIDER` to test one particular spider. If tests fail, the
        website has most likely changed and the spider needs updating.
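        
        To see at a glance which sources need attention, the per-spider check can be
        run in a loop. This is a minimal sketch: `check_spiders` is a hypothetical
        helper, and it assumes that `scrapy check` exits with a non-zero status when a
        contract fails.
        
        ```bash
        #!/usr/bin/env bash
        
        # Hypothetical helper: run the contract tests for each given spider ID
        # and report the ones that fail. Returns non-zero if any check failed.
        check_spiders() {
            local spider failed=0
            for spider in "$@"; do
                if ! scrapy check "${spider}"; then
                    echo "contract check failed: ${spider}" >&2
                    failed=1
                fi
            done
            return "${failed}"
        }
        ```
        
        For example, `check_spiders bga bgg dbpedia luding spielen wikidata` would
        check every source listed above and name the failing ones on standard error.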
        
        ## Board game datasets
        
        If you are interested in using any of the datasets produced by this scraper,
        take a look at the
        [BoardGameGeek guild](https://boardgamegeek.com/thread/2287371/boardgamegeek-games-and-ratings-datasets).
        A subset of the data can also be found on [Kaggle](https://www.kaggle.com/mshepherd/board-games).
        
        ## Links
        
        * [board-game-scraper](https://gitlab.com/recommend.games/board-game-scraper):
         This repository
        * [Recommend.Games](https://recommend.games/): board game recommender using the
         scraped data
        * [recommend-games-server](https://gitlab.com/recommend.games/recommend-games-server):
         Server code for [Recommend.Games](https://recommend.games/)
        * [board-game-recommender](https://gitlab.com/recommend.games/board-game-recommender):
         Recommender code for [Recommend.Games](https://recommend.games/)
        
Keywords: board games,tabletop games,data,datasets,scraper,scrapy,spider,boardgamegeek,bgg,ludoj,ludoj-scraper
Platform: UNKNOWN
Classifier: Framework :: Scrapy
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python
Classifier: Programming Language :: Python :: 3.6
Classifier: Programming Language :: Python :: 3.7
Classifier: Programming Language :: Python :: 3.8
Classifier: Topic :: Games/Entertainment :: Board Games
Requires-Python: >=3.6.0
Description-Content-Type: text/markdown
Provides-Extra: cloud
