{
"nbformat": 4,
"nbformat_minor": 0,
"metadata": {
"colab": {
"provenance": [],
"toc_visible": true,
"authorship_tag": "ABX9TyNqajBRWJaFf5J4ai0CdKY0",
"include_colab_link": true
},
"kernelspec": {
"name": "python3",
"display_name": "Python 3"
},
"language_info": {
"name": "python"
}
},
"cells": [
{
"cell_type": "markdown",
"metadata": {
"id": "view-in-github",
"colab_type": "text"
},
"source": [
""
]
},
{
"cell_type": "code",
"execution_count": 19,
"metadata": {
"id": "BU2N9NpJhUlB"
},
"outputs": [],
"source": [
"from sklearn import datasets\n",
"import pandas as pd\n",
"import seaborn as sns\n",
"import matplotlib.pyplot as plt"
]
},
{
"cell_type": "markdown",
"source": [
"## The iris data set\n",
"\n",
"The object `iris` returned by `load_iris` is a Bunch object, which is similar to a dictionary."
],
"metadata": {
"id": "G5akhpIjlUOn"
}
},
{
"cell_type": "code",
"source": [
"iris = datasets.load_iris()\n",
"iris.keys()"
],
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "225zs5ITlTHf",
"outputId": "0bc09a20-cf5d-44ba-b11a-33305c36a679"
},
"execution_count": 20,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"dict_keys(['data', 'target', 'frame', 'target_names', 'DESCR', 'feature_names', 'filename', 'data_module'])"
]
},
"metadata": {},
"execution_count": 20
}
]
},
{
"cell_type": "code",
"source": [
"iris['feature_names']"
],
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "AVfqGOn5hbHm",
"outputId": "50486748-0763-449a-8106-9ff13853f188"
},
"execution_count": 21,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"['sepal length (cm)',\n",
" 'sepal width (cm)',\n",
" 'petal length (cm)',\n",
" 'petal width (cm)']"
]
},
"metadata": {},
"execution_count": 21
}
]
},
{
"cell_type": "code",
"source": [
"iris['data'][:10,:]"
],
"metadata": {
"id": "9rszMWBihb73",
"colab": {
"base_uri": "https://localhost:8080/"
},
"outputId": "67e69ccb-de71-4a21-ea76-1ecd707e8132"
},
"execution_count": 30,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"array([[5.1, 3.5, 1.4, 0.2],\n",
" [4.9, 3. , 1.4, 0.2],\n",
" [4.7, 3.2, 1.3, 0.2],\n",
" [4.6, 3.1, 1.5, 0.2],\n",
" [5. , 3.6, 1.4, 0.2],\n",
" [5.4, 3.9, 1.7, 0.4],\n",
" [4.6, 3.4, 1.4, 0.3],\n",
" [5. , 3.4, 1.5, 0.2],\n",
" [4.4, 2.9, 1.4, 0.2],\n",
" [4.9, 3.1, 1.5, 0.1]])"
]
},
"metadata": {},
"execution_count": 30
}
]
},
{
"cell_type": "markdown",
"source": [
"## Regression\n",
"\n",
"The following dataframe has only continuous attributes. One could use it to address a **regression problem**, where the response variable would be, say, the sepal length."
],
"metadata": {
"id": "csoMOfa0oL4c"
}
},
{
"cell_type": "code",
"source": [
"df=pd.DataFrame(iris['data'], columns=iris['feature_names'] )\n",
"df"
],
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 423
},
"id": "ctIK7VBVh7CR",
"outputId": "fbc2597b-1562-451b-a88a-4ab2a920944d"
},
"execution_count": 23,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
" sepal length (cm) sepal width (cm) petal length (cm) petal width (cm)\n",
"0 5.1 3.5 1.4 0.2\n",
"1 4.9 3.0 1.4 0.2\n",
"2 4.7 3.2 1.3 0.2\n",
"3 4.6 3.1 1.5 0.2\n",
"4 5.0 3.6 1.4 0.2\n",
".. ... ... ... ...\n",
"145 6.7 3.0 5.2 2.3\n",
"146 6.3 2.5 5.0 1.9\n",
"147 6.5 3.0 5.2 2.0\n",
"148 6.2 3.4 5.4 2.3\n",
"149 5.9 3.0 5.1 1.8\n",
"\n",
"[150 rows x 4 columns]"
],
"text/html": [
"\n",
"
\n", " | sepal length (cm) | \n", "sepal width (cm) | \n", "petal length (cm) | \n", "petal width (cm) | \n", "
---|---|---|---|---|
0 | \n", "5.1 | \n", "3.5 | \n", "1.4 | \n", "0.2 | \n", "
1 | \n", "4.9 | \n", "3.0 | \n", "1.4 | \n", "0.2 | \n", "
2 | \n", "4.7 | \n", "3.2 | \n", "1.3 | \n", "0.2 | \n", "
3 | \n", "4.6 | \n", "3.1 | \n", "1.5 | \n", "0.2 | \n", "
4 | \n", "5.0 | \n", "3.6 | \n", "1.4 | \n", "0.2 | \n", "
... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "
145 | \n", "6.7 | \n", "3.0 | \n", "5.2 | \n", "2.3 | \n", "
146 | \n", "6.3 | \n", "2.5 | \n", "5.0 | \n", "1.9 | \n", "
147 | \n", "6.5 | \n", "3.0 | \n", "5.2 | \n", "2.0 | \n", "
148 | \n", "6.2 | \n", "3.4 | \n", "5.4 | \n", "2.3 | \n", "
149 | \n", "5.9 | \n", "3.0 | \n", "5.1 | \n", "1.8 | \n", "
150 rows × 4 columns
\n", "\n", " | sepal length (cm) | \n", "sepal width (cm) | \n", "petal length (cm) | \n", "petal width (cm) | \n", "species | \n", "
---|---|---|---|---|---|
0 | \n", "5.1 | \n", "3.5 | \n", "1.4 | \n", "0.2 | \n", "setosa | \n", "
1 | \n", "4.9 | \n", "3.0 | \n", "1.4 | \n", "0.2 | \n", "setosa | \n", "
2 | \n", "4.7 | \n", "3.2 | \n", "1.3 | \n", "0.2 | \n", "setosa | \n", "
3 | \n", "4.6 | \n", "3.1 | \n", "1.5 | \n", "0.2 | \n", "setosa | \n", "
4 | \n", "5.0 | \n", "3.6 | \n", "1.4 | \n", "0.2 | \n", "setosa | \n", "
... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "
145 | \n", "6.7 | \n", "3.0 | \n", "5.2 | \n", "2.3 | \n", "virginica | \n", "
146 | \n", "6.3 | \n", "2.5 | \n", "5.0 | \n", "1.9 | \n", "virginica | \n", "
147 | \n", "6.5 | \n", "3.0 | \n", "5.2 | \n", "2.0 | \n", "virginica | \n", "
148 | \n", "6.2 | \n", "3.4 | \n", "5.4 | \n", "2.3 | \n", "virginica | \n", "
149 | \n", "5.9 | \n", "3.0 | \n", "5.1 | \n", "1.8 | \n", "virginica | \n", "
150 rows × 5 columns
\n", "