Now I’m going to show you a comprehensive example of how you can make raw web data useful and interesting using Scrapy, SQL and Matplotlib. It’s meant to be just an example: there are many types of data out there and many ways to analyze them, so it really comes down to what works best for you and your business.
Scraping And Analyzing Soccer Data
Briefly, this is the process I’m going to follow to build this example project:
- Task Zero: Requirements Of Reports
Figuring out what really needs to be done. What are our (business) goals and what reports should we create? What would a proper analysis look like?
- Task One: Data Fields And Source Of Data
Planning ahead which data fields and attributes we’ll need to satisfy the requirements. Also, looking for websites we can get the data from.
- Task Two: Scrapy Spiders
Creating scrapers for the website(s) that we’ve chosen in the previous task.
- Task Three: Process Data
Cleaning, standardizing, normalizing, structuring and storing data into a database.
- Task Four: Analyze Data
Creating reports that help you make decisions or understand the data better.
- Task Five: Conclusions
Drawing conclusions based on the analysis. Understanding the data.
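To make Tasks Three and Four a bit more concrete, here is a minimal sketch of cleaning scraped items, storing them in a database and running one SQL report. It is only an illustration, not the project’s actual code: the field names (`home`, `away`, `home_goals`, `away_goals`), the `matches` table, the helper functions and the sample records are all assumptions made up for this example, and it uses Python’s built-in `sqlite3` module with an in-memory database so it runs without any setup.

```python
import sqlite3

# Hypothetical raw items, shaped like what a spider might yield.
# The values are made-up sample data, not real match results.
raw_items = [
    {"home": " Team A ", "away": "Team B", "home_goals": "2", "away_goals": "1"},
    {"home": "Team B", "away": "team a", "home_goals": "0", "away_goals": "0"},
    {"home": "Team A", "away": "Team C", "home_goals": "3", "away_goals": ""},  # incomplete
]


def clean(item):
    """Standardize team names and cast goals to int; return None if the item is unusable."""
    try:
        return (
            item["home"].strip().title(),
            item["away"].strip().title(),
            int(item["home_goals"]),
            int(item["away_goals"]),
        )
    except (KeyError, ValueError):
        return None


def load(conn, items):
    """Create the (hypothetical) matches table and insert every item that survives cleaning."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS matches "
        "(home TEXT, away TEXT, home_goals INTEGER, away_goals INTEGER)"
    )
    rows = [row for row in map(clean, items) if row is not None]
    conn.executemany("INSERT INTO matches VALUES (?, ?, ?, ?)", rows)
    conn.commit()
    return len(rows)


def goals_per_team(conn):
    """Example report: total goals scored by each team, home and away combined."""
    query = """
        SELECT team, SUM(goals) AS total
        FROM (
            SELECT home AS team, home_goals AS goals FROM matches
            UNION ALL
            SELECT away AS team, away_goals AS goals FROM matches
        ) AS t
        GROUP BY team
        ORDER BY total DESC, team
    """
    return conn.execute(query).fetchall()


conn = sqlite3.connect(":memory:")  # in-memory database, nothing is written to disk
stored = load(conn, raw_items)
report = goals_per_team(conn)
print(stored, report)
```

The incomplete third item is dropped during cleaning rather than stored with a guessed value. From here, the rows in `report` could be handed to Matplotlib (for example `plt.bar`) to turn the query result into a chart.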
Storytime is over. Start working!