Sunday, April 15, 2018

Reading Wikipedia Tables

This module, wikiHelp.py, is a small group of functions for loading a table from a Wikipedia page into a pandas dataframe. The motivation is to have a tool that gives consistent results when loading small dataframes for practice exercises. Once the module is imported, the Nth table on a Wikipedia page can be read in as a dataframe as follows:
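from wikiHelp import WIKI, getWtable

# Article title, split across two strings only to keep the lines short
CONTINENT = "_".join(["List_of_sovereign_states",
                      "and_dependent_territories_by_continent"])
# tabNum selects which table on the page to load
africaDF = getWtable("%s%s" % (WIKI, CONTINENT), tabNum=0)
print(africaDF.shape)  # (number of rows, number of columns)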
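
The source of wikiHelp.py is not shown in this post, so the following is only a rough sketch of what "read the Nth table" could look like, not the module's actual code. It uses just the Python 3 standard library; the names TableCollector and nth_table are made up for illustration, and the sketch assumes the page HTML closes its tr, td and th tags, as rendered Wikipedia pages generally do.

```python
from html.parser import HTMLParser


class TableCollector(HTMLParser):
    """Hypothetical helper: collect each top-level <table> as rows of cell text."""

    def __init__(self):
        super().__init__()
        self.tables = []   # one list of rows per top-level table, in page order
        self._depth = 0    # current nesting depth of <table> tags
        self._row = None   # cells of the row being read, or None
        self._cell = None  # text fragments of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self._depth += 1
            if self._depth == 1:
                self.tables.append([])
        elif self._depth == 1 and tag == "tr":
            self._row = []
        elif self._depth == 1 and tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_endtag(self, tag):
        if tag == "table":
            self._depth = max(0, self._depth - 1)
        elif self._depth == 1 and tag in ("td", "th") and self._cell is not None:
            # Join the fragments and collapse runs of whitespace
            self._row.append(" ".join("".join(self._cell).split()))
            self._cell = None
        elif self._depth == 1 and tag == "tr" and self._row is not None:
            self.tables[-1].append(self._row)
            self._row = None

    def handle_data(self, data):
        # Text inside a nested table is folded into the enclosing cell
        if self._cell is not None:
            self._cell.append(data)


def nth_table(html, tab_num=0):
    """Return the rows of the tab_num-th top-level table (zero-based)."""
    collector = TableCollector()
    collector.feed(html)
    return collector.tables[tab_num]


# Tiny inline page standing in for a downloaded Wikipedia article
PAGE = (
    "<table><tr><th>Country</th><th>Capital</th></tr>"
    "<tr><td>Kenya</td><td>Nairobi</td></tr></table>"
    "<table><tr><td>other</td></tr></table>"
)
rows = nth_table(PAGE, tab_num=0)
print(rows)  # [['Country', 'Capital'], ['Kenya', 'Nairobi']]
```

The rows can then be handed to pandas.DataFrame, for example with the first row as the column names. If pandas and an HTML parser such as lxml are installed, pandas.read_html is a shorter route: it returns a list of dataframes, one per table it finds, so indexing that list does the same job as tab_num here.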
