{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Plotting Google Trends in Python" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "[Google Trends](https://trends.google.com/trends/) shows the search frequency of a specific term relative to the total search volume. \n", "\n", "As of now, there is [no official API for Google Trends](https://en.wikipedia.org/wiki/Google_Trends#Google_Trends_API). There are, however, some unofficial packages for accessing Google Trends. One of them is [pytrends](https://github.com/GeneralMills/pytrends), which we will use here to get the relative search-term frequency.\n", "" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "pytrends is available from PyPI. To install it, use pip from the command line: `pip install pytrends`. For the analysis we will use [pandas](http://pandas.pydata.org/) and [matplotlib](https://matplotlib.org/), which also need to be installed on your system." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "First, we import the required packages:" ] }, { "cell_type": "code", "execution_count": 26, "metadata": {}, "outputs": [], "source": [ "from pytrends.request import TrendReq\n", "import matplotlib.pyplot as plt\n", "import pandas as pd" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Google Trends currently allows you to compare up to five keywords at once. 
\n", "Here, we want to compare the relative search-term frequency of \"global warming\" and \"climate change\" between January 2012 and August 2020 ([for further options, see the pytrends README](https://github.com/GeneralMills/pytrends)):" ] }, { "cell_type": "code", "execution_count": 27, "metadata": {}, "outputs": [], "source": [ "search_terms = [\"climate change\", \"global warming\"]\n", "timeframe = \"2012-01-01 2020-08-01\"" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To issue a request, we first have to initialize a `TrendReq` object:" ] }, { "cell_type": "code", "execution_count": 28, "metadata": {}, "outputs": [], "source": [ "pytrends = TrendReq(hl='en-US', tz=360)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can then build the payload for the request and ask for the interest in these search terms over time:" ] }, { "cell_type": "code", "execution_count": 29, "metadata": {}, "outputs": [], "source": [ "pytrends.build_payload(search_terms, timeframe=timeframe)\n", "trends = pytrends.interest_over_time()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "pytrends returns a pandas DataFrame, which we can inspect:" ] }, { "cell_type": "code", "execution_count": 30, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "| date | climate change | global warming | isPartial |\n",
"|---|---|---|---|\n",
"| 2020-04-01 | 45 | 23 | False |\n",
"| 2020-05-01 | 41 | 21 | False |\n",
"| 2020-06-01 | 33 | 17 | False |\n",
"| 2020-07-01 | 25 | 13 | False |\n",
"| 2020-08-01 | 28 | 14 | False |\n", "