{ "cells": [ { "cell_type": "markdown", "metadata": { "button": false, "new_sheet": false, "run_control": { "read_only": false } }, "source": [ "# Bubble Plot\n", "---" ] }, { "cell_type": "markdown", "metadata": { "button": false, "new_sheet": false, "run_control": { "read_only": false } }, "source": [ "Import primary modules." ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "button": false, "new_sheet": false, "run_control": { "read_only": false } }, "outputs": [], "source": [ "import pandas as pd\n", "import matplotlib.pyplot as plt\n", "%matplotlib inline\n", "plt.style.use('ggplot')" ] }, { "cell_type": "markdown", "metadata": { "button": false, "new_sheet": false, "run_control": { "read_only": false } }, "source": [ "Dataset: Immigration to Canada from 1980 to 2013 - [International migration flows to and from selected countries - The 2015 revision](http://www.un.org/en/development/desa/population/migration/data/empirical2/migrationflows.shtml) from the United Nations' website.\n", "\n", "The dataset contains annual data on the flows of international migrants as recorded by the countries of destination. The data present both inflows and outflows according to the place of birth, citizenship, or place of previous / next residence, for both foreigners and nationals. In this notebook, we will focus on the Canadian immigration data." ] }, { "cell_type": "markdown", "metadata": { "button": false, "new_sheet": false, "run_control": { "read_only": false } }, "source": [ "Let's download and import our primary Canadian Immigration dataset using the *pandas* `read_excel()` method. If the `xlrd` module is not installed, you may need to run the following command first.\n", "```\n", "!conda install -c anaconda xlrd --yes\n", "```" ] }, { "cell_type": "markdown", "metadata": { "button": false, "new_sheet": false, "run_control": { "read_only": false } }, "source": [ "Download the dataset and read it into a *pandas* dataframe." 
] }, { "cell_type": "code", "execution_count": 2, "metadata": { "button": false, "new_sheet": false, "run_control": { "read_only": false } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Data downloaded and read into a dataframe!\n" ] } ], "source": [ "# Skip the first 20 rows and the last 2 rows of the sheet\n", "df_can = pd.read_excel('https://s3-api.us-geo.objectstorage.softlayer.net/cf-courses-data/CognitiveClass/DV0101EN/labs/Data_Files/Canada.xlsx',\n", " sheet_name='Canada by Citizenship', skiprows=range(20), skipfooter=2)\n", "print('Data downloaded and read into a dataframe!')" ] }, { "cell_type": "markdown", "metadata": { "button": false, "new_sheet": false, "run_control": { "read_only": false } }, "source": [ "Clean up the data. We will make some modifications to the original dataset to make it easier to create our visualizations. You don't need to follow the details of this step." ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "button": false, "new_sheet": false, "run_control": { "read_only": false }, "scrolled": true }, "outputs": [ { "data": { "text/html": [ "
Country | Year | Afghanistan | Albania | Algeria | American Samoa | Andorra | Angola | Antigua and Barbuda | Argentina | Armenia | ... | United States of America | Uruguay | Uzbekistan | Vanuatu | Venezuela (Bolivarian Republic of) | Viet Nam | Western Sahara | Yemen | Zambia | Zimbabwe |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 1980 | 16 | 1 | 80 | 0 | 0 | 1 | 0 | 368 | 0 | ... | 9378 | 128 | 0 | 0 | 103 | 1191 | 0 | 1 | 11 | 72 |
1 | 1981 | 39 | 0 | 67 | 1 | 0 | 3 | 0 | 426 | 0 | ... | 10030 | 132 | 0 | 0 | 117 | 1829 | 0 | 2 | 17 | 114 |
2 | 1982 | 39 | 0 | 71 | 0 | 0 | 6 | 0 | 626 | 0 | ... | 9074 | 146 | 0 | 0 | 174 | 2162 | 0 | 1 | 11 | 102 |
3 | 1983 | 47 | 0 | 69 | 0 | 0 | 6 | 0 | 241 | 0 | ... | 7100 | 105 | 0 | 0 | 124 | 3404 | 0 | 6 | 7 | 44 |
4 | 1984 | 71 | 0 | 63 | 0 | 0 | 4 | 42 | 237 | 0 | ... | 6661 | 90 | 0 | 0 | 142 | 7583 | 0 | 0 | 16 | 32 |

5 rows × 196 columns
\n", "