Web Scraping With Python: Scrapy, SQL, Matplotlib To Gain Web Data Insights

Now I’m going to show you a comprehensive example how you can make raw web data useful and interesting using Scrapy, SQL and Matplotlib. It’s really supposed to be just an example because there are so many types of data out there and there are so many ways to analyze them and it really comes down to what is the best for you and your business.

Scraping And Analyzing Soccer Data

Briefly, this is the process I’m going to be using now to create this example project:

  • Task Zero: Requirements Of Reports

Figuring out what is really needed to be done. What are our (business) goals and what reports should we create? What would a proper analysis look like?

  • Task One: Data Fields And Source Of Data

Planning ahead what data fields and attributes we’ll need to satisfy the requirements. Also, looking for websites where I can get data from.

  • Task Two: Scrapy Spiders

Creating scrapers for the website(s) that we’ve chosen in the previous task.

  • Task Three: Process Data

Cleaning, standardizing, normalizing, structuring and storing data into a database.

  • Task Four: Analyze Data

Creating reports that help you make decisions or help you understand data more.

  • Task Five: Conclusions

Draw conclusions based on analysis. Understand data.

Storytime is over. Start working!