3.5. Pandas#
Introducción a Pandas
import pandas as pd
?pd
Estructuras de Datos en Pandas
Series: Creación y operaciones básicas.
# una serie es un array unidimensional con un índice
# similar a un diccionario de python
serie1 = pd.Series(['a', 'b', 'c', 'd', 'e'], name='letras')
serie1
# los atributos principales de una serie son: nombre, índice, un valor y un tipo de dato (dtype)
0 a
1 b
2 c
3 d
4 e
Name: letras, dtype: object
DataFrame: Creación y manipulación.
# Un dataframe es una estructura de datos multidimensional
# similar a una hoja de cálculo de excel
# se podría decir que es un conjunto de series
df1 = pd.DataFrame({'numeros': [1, 2, 3, 4, 5],
'listas' : [['matemática', 'lenguaje'], ['matemática, economía'], ['lenguaje', 'economía', 'matemática'], [], ['matemática']],
'letras' : ['a', 'b', 'c', 'd', 'e']},
index=['alumno 1', 'alumno 2', 'alumno 3', 'alumno 4', 'alumno 5'])
df1
numeros | listas | letras | |
---|---|---|---|
alumno 1 | 1 | [matemática, lenguaje] | a |
alumno 2 | 2 | [matemática, economía] | b |
alumno 3 | 3 | [lenguaje, economía, matemática] | c |
alumno 4 | 4 | [] | d |
alumno 5 | 5 | [matemática] | e |
# los atributos principales de una serie son: index, columns, values y dtypes
df1.index
Index(['alumno 1', 'alumno 2', 'alumno 3', 'alumno 4', 'alumno 5'], dtype='object')
Indexación y selección de datos.
Existen dos formas de acceder a los datos de un dataframe (slicing):
Por índice
Por posición
# seleccionar listas de alumnos 1 y 2
df1.loc[['alumno 1', 'alumno 2'], 'listas']
# loc es para acceder por índice
alumno 1 [matemática, lenguaje]
alumno 2 [matemática, economía]
Name: listas, dtype: object
# seleccionar listas de alumnos 1 y 2
df1.iloc[[0, 1], 1]
# iloc es para acceder por posición
alumno 1 [matemática, lenguaje]
alumno 2 [matemática, economía]
Name: listas, dtype: object
Carga y Almacenamiento de Datos
Importar datos desde archivos CSV, Excel, y bases de datos.
Exportar datos a diferentes formatos.
df = pd.read_csv('https://gist.githubusercontent.com/noamross/e5d3e859aa0c794be10b/raw/b999fb4425b54c63cab088c0ce2c0d6ce961a563/cars.csv')
df.head()
Unnamed: 0 | mpg | cyl | disp | hp | drat | wt | qsec | vs | am | gear | carb | |
---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | Mazda RX4 | 21.0 | 6 | 160.0 | 110 | 3.90 | 2.620 | 16.46 | 0 | 1 | 4 | 4 |
1 | Mazda RX4 Wag | 21.0 | 6 | 160.0 | 110 | 3.90 | 2.875 | 17.02 | 0 | 1 | 4 | 4 |
2 | Datsun 710 | 22.8 | 4 | 108.0 | 93 | 3.85 | 2.320 | 18.61 | 1 | 1 | 4 | 1 |
3 | Hornet 4 Drive | 21.4 | 6 | 258.0 | 110 | 3.08 | 3.215 | 19.44 | 1 | 0 | 3 | 1 |
4 | Hornet Sportabout | 18.7 | 8 | 360.0 | 175 | 3.15 | 3.440 | 17.02 | 0 | 0 | 3 | 2 |
oecd = pd.read_html('https://en.wikipedia.org/wiki/List_of_countries_by_minimum_wage')[2]
oecd = oecd.set_index('Country')
oecd
# las columnas son unidemensionales?
# como eliminaría la última fila?
2018 | 2019 | |||||||||
---|---|---|---|---|---|---|---|---|---|---|
Nominal | PPP | Annual working hours | Nominal | PPP | Annual working hours | |||||
Annual | Hourly | Annual | Hourly | Annual working hours | Annual | Hourly | Annual | Hourly | Annual working hours | |
Country | ||||||||||
(Australia,) | 25970.8 | 13.10 | 24481.2 | 12.4 | 1976 | 26388.5 | 13.4 | 24874.9 | 12.6 | 1976 |
(Belgium,) | 21293.0 | 10.20 | 22746.8 | 10.9 | 2086 | 21410.8 | 10.3 | 22872.6 | 11.0 | 2086 |
(Canada,) | 20552.5 | 9.90 | 20946.0 | 10.1 | 2080 | 20880.7 | 10.0 | 21280.5 | 10.2 | 2080 |
(Chile,) | 4902.5 | 2.10 | 7044.4 | 3.0 | 2346 | 5101.7 | 2.2 | 7330.7 | 3.1 | 2346 |
(Colombia,) | 3451.3 | 1.20 | 7677.4 | 2.6 | 2920 | 3533.7 | 1.2 | 7860.9 | 2.7 | 2920 |
(Czech Republic,) | 6565.8 | 3.30 | 10789.9 | 5.4 | 2000 | 7064.3 | 3.5 | 11609.0 | 5.8 | 2000 |
(Estonia,) | 6869.8 | 3.40 | 9890.2 | 4.9 | 2020 | 7254.2 | 3.6 | 10443.5 | 5.2 | 2019 |
(France,) | 20989.7 | 13.20 | 21860.3 | 12.0 | 2289 | 21889.6 | 11.2 | 21949.0 | 12.1 | 2189 |
(Germany,) | 20414.6 | 10.00 | 23439.6 | 11.5 | 2033 | 20916.3 | 10.3 | 24015.6 | 11.8 | 2033 |
(Greece,) | 9208.7 | 3.70 | 13040.0 | 5.2 | 2507 | 10103.7 | 4.0 | 14307.4 | 5.7 | 2507 |
(Hungary,) | 5887.6 | 2.80 | 11049.2 | 5.2 | 2112 | 6151.5 | 2.9 | 11544.5 | 5.5 | 2086 |
(Ireland,) | 22446.1 | 10.80 | 20585.9 | 9.9 | 2080 | 22819.4 | 11.0 | 20928.3 | 10.1 | 2080 |
(Israel,) | 17992.6 | 8.10 | 15416.5 | 6.9 | 2232 | 17842.5 | 8.0 | 15287.8 | 6.9 | 2232 |
(Japan,) | 16805.5 | 8.10 | 16607.6 | 8.0 | 2080 | - | - | - | - | - |
(South Korea,) | 16265.6 | 6.50 | 19540.8 | 7.8 | 2508 | 17968.1 | 7.2 | 21586.1 | 8.6 | 2508 |
(Latvia,) | 5877.2 | 2.80 | 9168.8 | 4.4 | 2086 | 5776.5 | 2.8 | 9011.7 | 4.3 | 2086 |
(Lithuania,) | 5498.9 | 2.80 | 9694.0 | 4.9 | 1959 | 7455.7 | 3.8 | 13143.5 | 6.7 | 1965 |
(Luxembourg,) | 27601.0 | 13.30 | 25811.0 | 12.4 | 2077 | 28073.1 | 13.4 | 26252.5 | 12.5 | 2095 |
(Mexico,) | 1236.0 | 0.60 | 2238.5 | 1.1 | 2080 | 1386.1 | 0.7 | 2510.3 | 1.2 | 2080 |
(Netherlands,) | 23618.0 | 10.50 | 24859.9 | 11.0 | 2253 | 23586.2 | 10.5 | 24826.5 | 11.0 | 2253 |
(New Zealand,) | 22541.6 | 10.80 | 21716.3 | 10.4 | 2080 | 23330.0 | 11.5 | 22475.8 | 11.0 | 2035 |
(Poland,) | 6709.8 | 3.20 | 13704.8 | 6.6 | 2086 | 7032.4 | 3.4 | 14363.8 | 6.9 | 2086 |
(Portugal,) | 9120.9 | 4.40 | 12720.3 | 6.1 | 2086 | 9403.6 | 4.5 | 13114.5 | 6.3 | 2086 |
(Slovak Republic,) | 6620.0 | 2.10 | 10001.1 | 3.2 | 3145 | 6985.5 | 2.1 | 10553.3 | 3.2 | 3319 |
(Slovenia,) | 11506.4 | 5.50 | 16015.8 | 7.7 | 2086 | 11910.7 | 5.7 | 16578.6 | 7.9 | 2086 |
(Spain,) | 11614.2 | 5.60 | 14731.8 | 7.1 | 2086 | 14105.4 | 6.8 | 17891.7 | 8.6 | 2086 |
(Turkey,) | 4943.8 | 2.40 | 12848.3 | 6.2 | 2086 | 5410.1 | 2.6 | 14060.2 | 6.7 | 2086 |
(United Kingdom,) | 20918.7 | 12.23 | 21132.2 | 10.2 | 2079 | 21544.6 | 10.4 | 21764.4 | 10.5 | 2079 |
(United States,) | 15353.3 | 7.40 | 15353.3 | 7.4 | 2080 | 15080.0 | 7.3 | 15080.0 | 7.3 | 2080 |
(Costa Rica,) | 5927.8 | 2.20 | 9326.5 | 3.4 | 2711 | 5961.6 | 2.2 | 9379.6 | 3.5 | 2711 |
(Brazil,) | 3023.8 | 1.30 | 4925.3 | 2.1 | 2346 | 3049.4 | 1.3 | 4967.0 | 2.1 | 2346 |
(Russian Federation,) | 2053.7 | 1.00 | 4936.5 | 2.4 | 2086 | 2090.9 | 1.0 | 5026.0 | 2.4 | 2086 |
Exportación
oecd.to_csv('oecd.csv')
Exploración de Datos con Pandas
Visualización de los primeros y últimos registros.
Resumen estadístico de los datos.
Manejo de valores faltantes.
mw = pd.read_html('https://en.wikipedia.org/wiki/List_of_countries_by_minimum_wage')[2]
mw
Country | 2018 | 2019 | |||||||||
---|---|---|---|---|---|---|---|---|---|---|---|
Country | Nominal | PPP | Annual working hours | Nominal | PPP | Annual working hours | |||||
Country | Annual | Hourly | Annual | Hourly | Annual working hours | Annual | Hourly | Annual | Hourly | Annual working hours | |
0 | Australia | 25970.8 | 13.10 | 24481.2 | 12.4 | 1976 | 26388.5 | 13.4 | 24874.9 | 12.6 | 1976 |
1 | Belgium | 21293.0 | 10.20 | 22746.8 | 10.9 | 2086 | 21410.8 | 10.3 | 22872.6 | 11.0 | 2086 |
2 | Canada | 20552.5 | 9.90 | 20946.0 | 10.1 | 2080 | 20880.7 | 10.0 | 21280.5 | 10.2 | 2080 |
3 | Chile | 4902.5 | 2.10 | 7044.4 | 3.0 | 2346 | 5101.7 | 2.2 | 7330.7 | 3.1 | 2346 |
4 | Colombia | 3451.3 | 1.20 | 7677.4 | 2.6 | 2920 | 3533.7 | 1.2 | 7860.9 | 2.7 | 2920 |
5 | Czech Republic | 6565.8 | 3.30 | 10789.9 | 5.4 | 2000 | 7064.3 | 3.5 | 11609.0 | 5.8 | 2000 |
6 | Estonia | 6869.8 | 3.40 | 9890.2 | 4.9 | 2020 | 7254.2 | 3.6 | 10443.5 | 5.2 | 2019 |
7 | France | 20989.7 | 13.20 | 21860.3 | 12.0 | 2289 | 21889.6 | 11.2 | 21949.0 | 12.1 | 2189 |
8 | Germany | 20414.6 | 10.00 | 23439.6 | 11.5 | 2033 | 20916.3 | 10.3 | 24015.6 | 11.8 | 2033 |
9 | Greece | 9208.7 | 3.70 | 13040.0 | 5.2 | 2507 | 10103.7 | 4.0 | 14307.4 | 5.7 | 2507 |
10 | Hungary | 5887.6 | 2.80 | 11049.2 | 5.2 | 2112 | 6151.5 | 2.9 | 11544.5 | 5.5 | 2086 |
11 | Ireland | 22446.1 | 10.80 | 20585.9 | 9.9 | 2080 | 22819.4 | 11.0 | 20928.3 | 10.1 | 2080 |
12 | Israel | 17992.6 | 8.10 | 15416.5 | 6.9 | 2232 | 17842.5 | 8.0 | 15287.8 | 6.9 | 2232 |
13 | Japan | 16805.5 | 8.10 | 16607.6 | 8.0 | 2080 | - | - | - | - | - |
14 | South Korea | 16265.6 | 6.50 | 19540.8 | 7.8 | 2508 | 17968.1 | 7.2 | 21586.1 | 8.6 | 2508 |
15 | Latvia | 5877.2 | 2.80 | 9168.8 | 4.4 | 2086 | 5776.5 | 2.8 | 9011.7 | 4.3 | 2086 |
16 | Lithuania | 5498.9 | 2.80 | 9694.0 | 4.9 | 1959 | 7455.7 | 3.8 | 13143.5 | 6.7 | 1965 |
17 | Luxembourg | 27601.0 | 13.30 | 25811.0 | 12.4 | 2077 | 28073.1 | 13.4 | 26252.5 | 12.5 | 2095 |
18 | Mexico | 1236.0 | 0.60 | 2238.5 | 1.1 | 2080 | 1386.1 | 0.7 | 2510.3 | 1.2 | 2080 |
19 | Netherlands | 23618.0 | 10.50 | 24859.9 | 11.0 | 2253 | 23586.2 | 10.5 | 24826.5 | 11.0 | 2253 |
20 | New Zealand | 22541.6 | 10.80 | 21716.3 | 10.4 | 2080 | 23330.0 | 11.5 | 22475.8 | 11.0 | 2035 |
21 | Poland | 6709.8 | 3.20 | 13704.8 | 6.6 | 2086 | 7032.4 | 3.4 | 14363.8 | 6.9 | 2086 |
22 | Portugal | 9120.9 | 4.40 | 12720.3 | 6.1 | 2086 | 9403.6 | 4.5 | 13114.5 | 6.3 | 2086 |
23 | Slovak Republic | 6620.0 | 2.10 | 10001.1 | 3.2 | 3145 | 6985.5 | 2.1 | 10553.3 | 3.2 | 3319 |
24 | Slovenia | 11506.4 | 5.50 | 16015.8 | 7.7 | 2086 | 11910.7 | 5.7 | 16578.6 | 7.9 | 2086 |
25 | Spain | 11614.2 | 5.60 | 14731.8 | 7.1 | 2086 | 14105.4 | 6.8 | 17891.7 | 8.6 | 2086 |
26 | Turkey | 4943.8 | 2.40 | 12848.3 | 6.2 | 2086 | 5410.1 | 2.6 | 14060.2 | 6.7 | 2086 |
27 | United Kingdom | 20918.7 | 12.23 | 21132.2 | 10.2 | 2079 | 21544.6 | 10.4 | 21764.4 | 10.5 | 2079 |
28 | United States | 15353.3 | 7.40 | 15353.3 | 7.4 | 2080 | 15080.0 | 7.3 | 15080.0 | 7.3 | 2080 |
29 | Costa Rica | 5927.8 | 2.20 | 9326.5 | 3.4 | 2711 | 5961.6 | 2.2 | 9379.6 | 3.5 | 2711 |
30 | Brazil | 3023.8 | 1.30 | 4925.3 | 2.1 | 2346 | 3049.4 | 1.3 | 4967.0 | 2.1 | 2346 |
31 | Russian Federation | 2053.7 | 1.00 | 4936.5 | 2.4 | 2086 | 2090.9 | 1.0 | 5026.0 | 2.4 | 2086 |
Manipulación de Datos con Pandas
Filtrado de datos basado en condiciones.
Combinación de DataFrames.
Agregación y agrupación de datos.
mw.describe()
2018 | |||||
---|---|---|---|---|---|
Nominal | PPP | Annual working hours | |||
Annual | Hourly | Annual | Hourly | Annual working hours | |
count | 32.00000 | 32.000000 | 32.000000 | 32.000000 | 32.000000 |
mean | 12618.16250 | 6.079063 | 14821.881250 | 6.950000 | 2208.781250 |
std | 8057.31874 | 4.167241 | 6554.528811 | 3.339886 | 275.330035 |
min | 1236.00000 | 0.600000 | 2238.500000 | 1.100000 | 1959.000000 |
25% | 5885.00000 | 2.700000 | 9841.150000 | 4.775000 | 2080.000000 |
50% | 10357.55000 | 4.950000 | 14218.300000 | 6.750000 | 2086.000000 |
75% | 20644.05000 | 10.050000 | 20992.550000 | 10.125000 | 2262.000000 |
max | 27601.00000 | 13.300000 | 25811.000000 | 12.400000 | 3145.000000 |
sales = pd.read_csv('https://raw.githubusercontent.com/alejo-acosta/curso-python-ciee/main/data/sales.csv')
items = pd.read_csv('https://raw.githubusercontent.com/alejo-acosta/curso-python-ciee/main/data/items.csv')
stores = pd.read_csv('https://raw.githubusercontent.com/alejo-acosta/curso-python-ciee/main/data/stores.csv')
oil = pd.read_csv('https://raw.githubusercontent.com/alejo-acosta/curso-python-ciee/main/data/oil.csv')
/tmp/ipykernel_61426/2817981066.py:1: DtypeWarning: Columns (6) have mixed types. Specify dtype option on import or set low_memory=False.
sales = pd.read_csv('https://raw.githubusercontent.com/alejo-acosta/curso-python-ciee/main/data/sales.csv')
sales['date'].astype('datetime64[s]')
0 2013-01-01
1 2013-01-01
2 2013-01-01
3 2013-01-01
4 2013-01-01
...
1131724 2017-08-15
1131725 2017-08-15
1131726 2017-08-15
1131727 2017-08-15
1131728 2017-08-15
Name: date, Length: 1131729, dtype: datetime64[ns]
sales.shape
(1131729, 7)
oil.dtypes
date object
dcoilwtico float64
dtype: object
oecd
2018 | 2019 | |||||||||
---|---|---|---|---|---|---|---|---|---|---|
Nominal | PPP | Annual working hours | Nominal | PPP | Annual working hours | |||||
Annual | Hourly | Annual | Hourly | Annual working hours | Annual | Hourly | Annual | Hourly | Annual working hours | |
Country | ||||||||||
(Australia,) | 25970.8 | 13.10 | 24481.2 | 12.4 | 1976 | 26388.5 | 13.4 | 24874.9 | 12.6 | 1976 |
(Belgium,) | 21293.0 | 10.20 | 22746.8 | 10.9 | 2086 | 21410.8 | 10.3 | 22872.6 | 11.0 | 2086 |
(Canada,) | 20552.5 | 9.90 | 20946.0 | 10.1 | 2080 | 20880.7 | 10.0 | 21280.5 | 10.2 | 2080 |
(Chile,) | 4902.5 | 2.10 | 7044.4 | 3.0 | 2346 | 5101.7 | 2.2 | 7330.7 | 3.1 | 2346 |
(Colombia,) | 3451.3 | 1.20 | 7677.4 | 2.6 | 2920 | 3533.7 | 1.2 | 7860.9 | 2.7 | 2920 |
(Czech Republic,) | 6565.8 | 3.30 | 10789.9 | 5.4 | 2000 | 7064.3 | 3.5 | 11609.0 | 5.8 | 2000 |
(Estonia,) | 6869.8 | 3.40 | 9890.2 | 4.9 | 2020 | 7254.2 | 3.6 | 10443.5 | 5.2 | 2019 |
(France,) | 20989.7 | 13.20 | 21860.3 | 12.0 | 2289 | 21889.6 | 11.2 | 21949.0 | 12.1 | 2189 |
(Germany,) | 20414.6 | 10.00 | 23439.6 | 11.5 | 2033 | 20916.3 | 10.3 | 24015.6 | 11.8 | 2033 |
(Greece,) | 9208.7 | 3.70 | 13040.0 | 5.2 | 2507 | 10103.7 | 4.0 | 14307.4 | 5.7 | 2507 |
(Hungary,) | 5887.6 | 2.80 | 11049.2 | 5.2 | 2112 | 6151.5 | 2.9 | 11544.5 | 5.5 | 2086 |
(Ireland,) | 22446.1 | 10.80 | 20585.9 | 9.9 | 2080 | 22819.4 | 11.0 | 20928.3 | 10.1 | 2080 |
(Israel,) | 17992.6 | 8.10 | 15416.5 | 6.9 | 2232 | 17842.5 | 8.0 | 15287.8 | 6.9 | 2232 |
(Japan,) | 16805.5 | 8.10 | 16607.6 | 8.0 | 2080 | - | - | - | - | - |
(South Korea,) | 16265.6 | 6.50 | 19540.8 | 7.8 | 2508 | 17968.1 | 7.2 | 21586.1 | 8.6 | 2508 |
(Latvia,) | 5877.2 | 2.80 | 9168.8 | 4.4 | 2086 | 5776.5 | 2.8 | 9011.7 | 4.3 | 2086 |
(Lithuania,) | 5498.9 | 2.80 | 9694.0 | 4.9 | 1959 | 7455.7 | 3.8 | 13143.5 | 6.7 | 1965 |
(Luxembourg,) | 27601.0 | 13.30 | 25811.0 | 12.4 | 2077 | 28073.1 | 13.4 | 26252.5 | 12.5 | 2095 |
(Mexico,) | 1236.0 | 0.60 | 2238.5 | 1.1 | 2080 | 1386.1 | 0.7 | 2510.3 | 1.2 | 2080 |
(Netherlands,) | 23618.0 | 10.50 | 24859.9 | 11.0 | 2253 | 23586.2 | 10.5 | 24826.5 | 11.0 | 2253 |
(New Zealand,) | 22541.6 | 10.80 | 21716.3 | 10.4 | 2080 | 23330.0 | 11.5 | 22475.8 | 11.0 | 2035 |
(Poland,) | 6709.8 | 3.20 | 13704.8 | 6.6 | 2086 | 7032.4 | 3.4 | 14363.8 | 6.9 | 2086 |
(Portugal,) | 9120.9 | 4.40 | 12720.3 | 6.1 | 2086 | 9403.6 | 4.5 | 13114.5 | 6.3 | 2086 |
(Slovak Republic,) | 6620.0 | 2.10 | 10001.1 | 3.2 | 3145 | 6985.5 | 2.1 | 10553.3 | 3.2 | 3319 |
(Slovenia,) | 11506.4 | 5.50 | 16015.8 | 7.7 | 2086 | 11910.7 | 5.7 | 16578.6 | 7.9 | 2086 |
(Spain,) | 11614.2 | 5.60 | 14731.8 | 7.1 | 2086 | 14105.4 | 6.8 | 17891.7 | 8.6 | 2086 |
(Turkey,) | 4943.8 | 2.40 | 12848.3 | 6.2 | 2086 | 5410.1 | 2.6 | 14060.2 | 6.7 | 2086 |
(United Kingdom,) | 20918.7 | 12.23 | 21132.2 | 10.2 | 2079 | 21544.6 | 10.4 | 21764.4 | 10.5 | 2079 |
(United States,) | 15353.3 | 7.40 | 15353.3 | 7.4 | 2080 | 15080.0 | 7.3 | 15080.0 | 7.3 | 2080 |
(Costa Rica,) | 5927.8 | 2.20 | 9326.5 | 3.4 | 2711 | 5961.6 | 2.2 | 9379.6 | 3.5 | 2711 |
(Brazil,) | 3023.8 | 1.30 | 4925.3 | 2.1 | 2346 | 3049.4 | 1.3 | 4967.0 | 2.1 | 2346 |
(Russian Federation,) | 2053.7 | 1.00 | 4936.5 | 2.4 | 2086 | 2090.9 | 1.0 | 5026.0 | 2.4 | 2086 |
sales.head(10)
Unnamed: 0 | id | date | store_nbr | item_nbr | unit_sales | onpromotion | |
---|---|---|---|---|---|---|---|
0 | 137 | 137 | 2013-01-01 | 25 | 329362 | 1.0 | NaN |
1 | 167 | 167 | 2013-01-01 | 25 | 378685 | 8.0 | NaN |
2 | 171 | 171 | 2013-01-01 | 25 | 410257 | 1.0 | NaN |
3 | 201 | 201 | 2013-01-01 | 25 | 464263 | 2.0 | NaN |
4 | 225 | 225 | 2013-01-01 | 25 | 514242 | 4.0 | NaN |
5 | 365 | 365 | 2013-01-01 | 25 | 789905 | 5.0 | NaN |
6 | 528 | 528 | 2013-01-01 | 25 | 1047756 | 8.0 | NaN |
7 | 719 | 719 | 2013-01-02 | 1 | 227728 | 7.0 | NaN |
8 | 823 | 823 | 2013-01-02 | 1 | 329071 | 5.0 | NaN |
9 | 824 | 824 | 2013-01-02 | 1 | 329362 | 5.0 | NaN |
sales.dtypes
Unnamed: 0 int64
id int64
date object
store_nbr int64
item_nbr int64
unit_sales float64
onpromotion object
dtype: object
df = sales.merge(items, on='item_nbr', how='left').merge(stores, on='store_nbr', how='left')
df.dtypes
Unnamed: 0 int64
id int64
date object
store_nbr int64
item_nbr int64
unit_sales float64
onpromotion object
family object
class int64
perishable int64
city object
state object
type object
cluster int64
dtype: object
Operaciones Avanzadas
Aplicación de funciones a los datos.
Operaciones con fechas y tiempos.
Visualización de Datos con Pandas
Gráficos básicos con Pandas.
Uso de bibliotecas de visualización externas (ej. Matplotlib, Seaborn).