{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Building a Waffle Chart\n", "---" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Let's import the libraries first" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "import numpy as np\n", "import pandas as pd\n", "import matplotlib as mpl\n", "import matplotlib.pyplot as plt\n", "import matplotlib.patches as mpatches # needed for waffle charts\n", "%matplotlib inline\n", "mpl.style.use('ggplot') # optional: for ggplot-like style\n", "\n", "# check the installed version of Matplotlib\n", "print ('Matplotlib version: ', mpl.__version__) # >= 2.0.0" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Downloading and Prepping Data" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Data downloaded and read into a dataframe!\n" ] } ], "source": [ "df_can = pd.read_excel('https://s3-api.us-geo.objectstorage.softlayer.net/cf-courses-data/CognitiveClass/DV0101EN/labs/Data_Files/Canada.xlsx',\n", "                       sheet_name='Canada by Citizenship',\n", "                       skiprows=range(20),\n", "                       skipfooter=2)\n", "\n", "print('Data downloaded and read into a dataframe!')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Dataset: Immigration to Canada from 1980 to 2013 - [International migration flows to and from selected countries - The 2015 revision](http://www.un.org/en/development/desa/population/migration/data/empirical2/migrationflows.shtml) from the United Nations website\n", "\n", "The dataset contains annual data on the flows of international migrants as recorded by the countries of destination. The data presents both inflows and outflows according to the place of birth, citizenship or place of previous / next residence both for foreigners and nationals. 
" ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "scrolled": true }, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " Type Coverage OdName AREA AreaName REG \\\n", "0 Immigrants Foreigners Afghanistan 935 Asia 5501 \n", "1 Immigrants Foreigners Albania 908 Europe 925 \n", "2 Immigrants Foreigners Algeria 903 Africa 912 \n", "3 Immigrants Foreigners American Samoa 909 Oceania 957 \n", "4 Immigrants Foreigners Andorra 908 Europe 925 \n", "\n", " RegName DEV DevName 1980 ... 2004 2005 2006 \\\n", "0 Southern Asia 902 Developing regions 16 ... 2978 3436 3009 \n", "1 Southern Europe 901 Developed regions 1 ... 1450 1223 856 \n", "2 Northern Africa 902 Developing regions 80 ... 3616 3626 4807 \n", "3 Polynesia 902 Developing regions 0 ... 0 0 1 \n", "4 Southern Europe 901 Developed regions 0 ... 0 0 1 \n", "\n", " 2007 2008 2009 2010 2011 2012 2013 \n", "0 2652 2111 1746 1758 2203 2635 2004 \n", "1 702 560 716 561 539 620 603 \n", "2 3623 4005 5393 4752 4325 3774 4331 \n", "3 0 0 0 0 0 0 0 \n", "4 1 0 0 0 0 1 1 \n", "\n", "[5 rows x 43 columns]" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df_can.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Clean up data. We will make some modifications to the original dataset to make it easier to create our visualization." ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "data dimensions: (195, 38)\n" ] } ], "source": [ "# clean up the dataset to remove unnecessary columns (eg. 
REG) \n", "df_can.drop(['AREA','REG','DEV','Type','Coverage'], axis = 1, inplace = True)\n", "\n", "# let's rename the columns so that they make sense\n", "df_can.rename (columns = {'OdName':'Country', 'AreaName':'Continent','RegName':'Region'}, inplace = True)\n", "\n", "# for the sake of consistency, let's also make all column labels of type string\n", "df_can.columns = list(map(str, df_can.columns))\n", "\n", "# set the country name as index - useful for quickly looking up countries using the .loc method\n", "df_can.set_index('Country', inplace = True)\n", "\n", "# add total column (sum only the numeric year columns; recent pandas versions raise an error on the text columns otherwise)\n", "df_can['Total'] = df_can.sum(axis=1, numeric_only=True)\n", "\n", "# years that we will be using in this lesson - useful for plotting later on\n", "years = list(map(str, range(1980, 2014)))\n", "print ('data dimensions:', df_can.shape)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Waffle Charts \n", "\n", "\n", "A `waffle chart` is an interesting visualization that is normally created to display progress toward goals. It is often an effective option when you want to add visual interest to a display that consists mainly of cells, such as an Excel dashboard." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's revisit the previous case study about Denmark, Norway, and Sweden." ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " Continent Region DevName 1980 1981 1982 1983 \\\n", "Country \n", "Denmark Europe Northern Europe Developed regions 272 293 299 106 \n", "Norway Europe Northern Europe Developed regions 116 77 106 51 \n", "Sweden Europe Northern Europe Developed regions 281 308 222 176 \n", "\n", " 1984 1985 1986 ... 2005 2006 2007 2008 2009 2010 2011 \\\n", "Country ... \n", "Denmark 93 73 93 ... 62 101 97 108 81 92 93 \n", "Norway 31 54 56 ... 57 53 73 66 75 46 49 \n", "Sweden 128 158 187 ... 205 139 193 165 167 159 134 \n", "\n", " 2012 2013 Total \n", "Country \n", "Denmark 94 81 3901 \n", "Norway 53 59 2327 \n", "Sweden 140 140 5866 \n", "\n", "[3 rows x 38 columns]" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# let's create a new dataframe for these three countries \n", "df_dsn = df_can.loc[['Denmark', 'Norway', 'Sweden'], :]\n", "\n", "# let's take a look at our dataframe\n", "df_dsn" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Unfortunately, unlike in R, `waffle` charts are not built into any of the Python visualization libraries. Therefore, we will learn how to create them from scratch." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Step 1.** The first step in creating a waffle chart is determining the proportion of each category with respect to the total." 
] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Denmark: 0.32255663965602777\n", "Norway: 0.1924094592359848\n", "Sweden: 0.48503390110798744\n" ] } ], "source": [ "# compute the proportion of each category with respect to the total\n", "total_values = sum(df_dsn['Total'])\n", "category_proportions = [(float(value) / total_values) for value in df_dsn['Total']]\n", "\n", "# print out proportions\n", "for i, proportion in enumerate(category_proportions):\n", "    print (df_dsn.index.values[i] + ': ' + str(proportion))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Step 2.** The second step is defining the overall size of the `waffle` chart." ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Total number of tiles is 400\n" ] } ], "source": [ "width = 40 # width of chart\n", "height = 10 # height of chart\n", "\n", "total_num_tiles = width * height # total number of tiles\n", "\n", "print ('Total number of tiles is', total_num_tiles)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Step 3.** The third step is using the proportion of each category to determine its respective number of tiles." ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Denmark: 129\n", "Norway: 77\n", "Sweden: 194\n" ] } ], "source": [ "# compute the number of tiles for each category\n", "tiles_per_category = [round(proportion * total_num_tiles) for proportion in category_proportions]\n", "\n", "# print out number of tiles per category\n", "for i, tiles in enumerate(tiles_per_category):\n", "    print (df_dsn.index.values[i] + ': ' + str(tiles))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Based on the calculated proportions, Denmark will occupy 129 tiles of the `waffle` chart, Norway will occupy 77 tiles, and Sweden will occupy 194 tiles." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Step 4.** The fourth step is creating a matrix that resembles the `waffle` chart and populating it." ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Waffle chart populated!\n" ] } ], "source": [ "# initialize the waffle chart as an empty matrix\n", "waffle_chart = np.zeros((height, width))\n", "\n", "# define indices to loop through waffle chart\n", "category_index = 0\n", "tile_index = 0\n", "\n", "# populate the waffle chart column by column\n", "for col in range(width):\n", "    for row in range(height):\n", "        tile_index += 1\n", "\n", "        # once the running tile count exceeds the tiles allocated to the categories so far...\n", "        if tile_index > sum(tiles_per_category[0:category_index]):\n", "            # ...proceed to the next category\n", "            category_index += 1\n", "\n", "        # set the class value to an integer, which increases with class\n", "        waffle_chart[row, col] = category_index\n", "\n", "print ('Waffle chart populated!')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's take a peek at what the matrix looks like." 
] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([[1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 2., 2., 2.,\n", " 2., 2., 2., 2., 2., 3., 3., 3., 3., 3., 3., 3., 3., 3., 3., 3.,\n", " 3., 3., 3., 3., 3., 3., 3., 3.],\n", " [1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 2., 2., 2.,\n", " 2., 2., 2., 2., 2., 3., 3., 3., 3., 3., 3., 3., 3., 3., 3., 3.,\n", " 3., 3., 3., 3., 3., 3., 3., 3.],\n", " [1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 2., 2., 2.,\n", " 2., 2., 2., 2., 2., 3., 3., 3., 3., 3., 3., 3., 3., 3., 3., 3.,\n", " 3., 3., 3., 3., 3., 3., 3., 3.],\n", " [1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 2., 2., 2.,\n", " 2., 2., 2., 2., 2., 3., 3., 3., 3., 3., 3., 3., 3., 3., 3., 3.,\n", " 3., 3., 3., 3., 3., 3., 3., 3.],\n", " [1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 2., 2., 2.,\n", " 2., 2., 2., 2., 2., 3., 3., 3., 3., 3., 3., 3., 3., 3., 3., 3.,\n", " 3., 3., 3., 3., 3., 3., 3., 3.],\n", " [1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 2., 2., 2.,\n", " 2., 2., 2., 2., 2., 3., 3., 3., 3., 3., 3., 3., 3., 3., 3., 3.,\n", " 3., 3., 3., 3., 3., 3., 3., 3.],\n", " [1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 2., 2., 2.,\n", " 2., 2., 2., 2., 3., 3., 3., 3., 3., 3., 3., 3., 3., 3., 3., 3.,\n", " 3., 3., 3., 3., 3., 3., 3., 3.],\n", " [1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 2., 2., 2.,\n", " 2., 2., 2., 2., 3., 3., 3., 3., 3., 3., 3., 3., 3., 3., 3., 3.,\n", " 3., 3., 3., 3., 3., 3., 3., 3.],\n", " [1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 2., 2., 2.,\n", " 2., 2., 2., 2., 3., 3., 3., 3., 3., 3., 3., 3., 3., 3., 3., 3.,\n", " 3., 3., 3., 3., 3., 3., 3., 3.],\n", " [1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 2., 2., 2., 2.,\n", " 2., 2., 2., 2., 3., 3., 3., 3., 3., 3., 3., 3., 3., 3., 3., 3.,\n", " 3., 3., 3., 3., 3., 3., 3., 3.]])" ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], 
"source": [ "waffle_chart" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As expected, the matrix consists of three categories and the total number of each category's instances matches the total number of tiles allocated to each category." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Step 5.** Map the `waffle` chart matrix into a visual." ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 12, "metadata": {}, "output_type": "execute_result" }, { "data": { "text/plain": [ "
" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "# instantiate a new figure object\n", "fig = plt.figure()\n", "\n", "# use matshow to display the waffle chart\n", "colormap = plt.cm.coolwarm\n", "plt.matshow(waffle_chart, cmap=colormap)\n", "plt.colorbar()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Step 6.** Prettify the chart." ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "([], )" ] }, "execution_count": 13, "metadata": {}, "output_type": "execute_result" }, { "data": { "text/plain": [ "
" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "# instantiate a new figure object\n", "fig = plt.figure()\n", "\n", "# use matshow to display the waffle chart\n", "colormap = plt.cm.coolwarm\n", "plt.matshow(waffle_chart, cmap=colormap)\n", "plt.colorbar()\n", "\n", "# get the axis\n", "ax = plt.gca()\n", "\n", "# set minor ticks\n", "ax.set_xticks(np.arange(-.5, (width), 1), minor=True)\n", "ax.set_yticks(np.arange(-.5, (height), 1), minor=True)\n", " \n", "# add gridlines based on minor ticks\n", "ax.grid(which='minor', color='w', linestyle='-', linewidth=2)\n", "\n", "plt.xticks([])\n", "plt.yticks([])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Step 7.** Create a legend and add it to chart." ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 14, "metadata": {}, "output_type": "execute_result" }, { "data": { "text/plain": [ "
" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "# instantiate a new figure object\n", "fig = plt.figure()\n", "\n", "# use matshow to display the waffle chart\n", "colormap = plt.cm.coolwarm\n", "plt.matshow(waffle_chart, cmap=colormap)\n", "plt.colorbar()\n", "\n", "# get the axis\n", "ax = plt.gca()\n", "\n", "# set minor ticks\n", "ax.set_xticks(np.arange(-.5, (width), 1), minor=True)\n", "ax.set_yticks(np.arange(-.5, (height), 1), minor=True)\n", " \n", "# add gridlines based on minor ticks\n", "ax.grid(which='minor', color='w', linestyle='-', linewidth=2)\n", "\n", "plt.xticks([])\n", "plt.yticks([])\n", "\n", "# compute cumulative sum of individual categories to match color schemes between chart and legend\n", "values_cumsum = np.cumsum(df_dsn['Total'])\n", "total_values = values_cumsum[len(values_cumsum) - 1]\n", "\n", "# create legend\n", "legend_handles = []\n", "for i, category in enumerate(df_dsn.index.values):\n", " label_str = category + ' (' + str(df_dsn['Total'][i]) + ')'\n", " color_val = colormap(float(values_cumsum[i])/total_values)\n", " legend_handles.append(mpatches.Patch(color=color_val, label=label_str))\n", "\n", "# add legend to chart\n", "plt.legend(handles=legend_handles,\n", " loc='lower center', \n", " ncol=len(df_dsn.index.values),\n", " bbox_to_anchor=(0., -0.2, 0.95, .1)\n", " )" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "And there you go! What a good looking *delicious* `waffle` chart, don't you think?" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now it would very inefficient to repeat these seven steps every time we wish to create a `waffle` chart. So let's combine all seven steps into one function called *create_waffle_chart*. This function would take the following parameters as input:\n", "\n", "> 1. **categories**: Unique categories or classes in dataframe.\n", "> 2. **values**: Values corresponding to categories or classes.\n", "> 3. 
**height**: Defined height of waffle chart.\n", "> 4. **width**: Defined width of waffle chart.\n", "> 5. **colormap**: Colormap class\n", "> 6. **value_sign**: In order to make our function more generalizable, we will add this parameter to address signs that could be associated with a value such as %, $, and so on. **value_sign** has a default value of empty string." ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [], "source": [ "def create_waffle_chart(categories, values, height, width, colormap, value_sign=''):\n", "\n", "    # compute the proportion of each category with respect to the total\n", "    total_values = sum(values)\n", "    category_proportions = [(float(value) / total_values) for value in values]\n", "\n", "    # compute the total number of tiles\n", "    total_num_tiles = width * height # total number of tiles\n", "    print ('Total number of tiles is', total_num_tiles)\n", "\n", "    # compute the number of tiles for each category\n", "    tiles_per_category = [round(proportion * total_num_tiles) for proportion in category_proportions]\n", "\n", "    # print out number of tiles per category (use the categories argument, not the global dataframe)\n", "    for i, tiles in enumerate(tiles_per_category):\n", "        print (categories[i] + ': ' + str(tiles))\n", "\n", "    # initialize the waffle chart as an empty matrix\n", "    waffle_chart = np.zeros((height, width))\n", "\n", "    # define indices to loop through waffle chart\n", "    category_index = 0\n", "    tile_index = 0\n", "\n", "    # populate the waffle chart column by column\n", "    for col in range(width):\n", "        for row in range(height):\n", "            tile_index += 1\n", "\n", "            # once the running tile count exceeds the tiles\n", "            # allocated to the categories so far...\n", "            if tile_index > sum(tiles_per_category[0:category_index]):\n", "                # ...proceed to the next category\n", "                category_index += 1\n", "\n", "            # set the class value to an integer, which increases with class\n", "            waffle_chart[row, col] = category_index\n", "\n", "    # instantiate a new figure object\n", "    fig = plt.figure()\n", "\n", "    # use matshow to display the waffle chart with the colormap passed in by the caller\n", "    plt.matshow(waffle_chart, cmap=colormap)\n", "    plt.colorbar()\n", "\n", "    # get the axis\n", "    ax = plt.gca()\n", "\n", "    # set minor ticks\n", "    ax.set_xticks(np.arange(-.5, (width), 1), minor=True)\n", "    ax.set_yticks(np.arange(-.5, (height), 1), minor=True)\n", "\n", "    # add gridlines based on minor ticks\n", "    ax.grid(which='minor', color='w', linestyle='-', linewidth=2)\n", "\n", "    plt.xticks([])\n", "    plt.yticks([])\n", "\n", "    # compute cumulative sum of individual categories to match color schemes between chart and legend\n", "    # (converted to a NumPy array so that positional indexing is unambiguous)\n", "    values_array = np.asarray(values)\n", "    values_cumsum = np.cumsum(values_array)\n", "    total_values = values_cumsum[-1]\n", "\n", "    # create legend\n", "    legend_handles = []\n", "    for i, category in enumerate(categories):\n", "        if value_sign == '%':\n", "            label_str = category + ' (' + str(values_array[i]) + value_sign + ')'\n", "        else:\n", "            label_str = category + ' (' + value_sign + str(values_array[i]) + ')'\n", "\n", "        color_val = colormap(float(values_cumsum[i])/total_values)\n", "        legend_handles.append(mpatches.Patch(color=color_val, label=label_str))\n", "\n", "    # add legend to chart\n", "    plt.legend(\n", "        handles=legend_handles,\n", "        loc='lower center',\n", "        ncol=len(categories),\n", "        bbox_to_anchor=(0., -0.2, 0.95, .1)\n", "    )" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now to create a `waffle` chart, all we have to do is call the function `create_waffle_chart`. 
Let's define the input parameters:" ] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [], "source": [ "width = 40 # width of chart\n", "height = 10 # height of chart\n", "\n", "categories = df_dsn.index.values # categories\n", "values = df_dsn['Total'] # corresponding values of categories\n", "\n", "colormap = plt.cm.coolwarm # color map class" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "And now let's call our function to create a `waffle` chart." ] }, { "cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Total number of tiles is 400\n", "Denmark: 129\n", "Norway: 77\n", "Sweden: 194\n" ] }, { "data": { "text/plain": [ "
" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "create_waffle_chart(categories, values, height, width, colormap)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "There seems to be a new Python package for generating `waffle charts` called [PyWaffle](https://github.com/ligyxy/PyWaffle), although its repository appears to still be under development. Feel free to check it out and play with it." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Thanks for reading :)\n", "Created by [Alex Aklson](https://www.linkedin.com/in/aklson/) and modified by [Tarun kamboj](https://www.linkedin.com/in/kambojtarun/)" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.6" } }, "nbformat": 4, "nbformat_minor": 4 }