Python - R Cheat Sheet

https://pandas.pydata.org/Pandas_Cheat_Sheet.pdf

 

| Task | Example in R | Example in Python |
|---|---|---|
| Load a package | library(tidyverse) | import pandas as pd |
| Read a CSV file into a dataframe | read.csv('./Data/Raw/File.csv') | pd.read_csv('./Data/Raw/File.csv') |
| Reveal the structure of a df | str(df) | df.info() |
| Reveal the dimensions of a df | dim(df) | df.shape |
| Reveal a summary of the df | summary(df) | df.describe() |
| Show the first few rows | head(df) | df.head() |
| Show the data type of column "colX" | class(df$colX) | df['colX'].dtype |
| Convert a string column to a date column | as.Date(df$Date, format='%m/%d/%y') | pd.to_datetime(df['Date'], format='%m/%d/%y') |
| Separate date items into columns | separate(df, dateCol, c("Y", "m", "d")) | dateIdx = pd.DatetimeIndex(df['dateCol']) |
| | | df['Y'] = dateIdx.year |
| | | df['m'] = dateIdx.month |
| | | df['d'] = dateIdx.day |
| Select a slice of columns | select(df, col1:col3) | df.loc[:, 'col1':'col3'] |
| Select a subset of columns | select(df, c(col1, col2, col4)) | df.loc[:, ['col1', 'col2', 'col4']] |
| Create a new column with specified values | mutate(df, col2 = "valueX") | df['col2'] = "valueX" |
| Write df to a CSV file | write.csv(df, './Data/Processed/File.csv') | df.to_csv('./Data/Processed/File.csv') |
| Combine data frames by stacking them vertically (must have common column names) | rbind(df1, df2, df3) | pd.concat((df1, df2, df3), axis='index') |
| Combine data frames by appending them horizontally (must have common row counts) | cbind(df1, df2, df3) | pd.concat((df1, df2, df3), axis='columns') |
| Select rows matching a condition | filter(df, colX == 'MyValue') | via "query": df.query("colX == 'MyValue'") |
| | | via Boolean mask: df[df['colX'] == 'MyValue'] |
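
The Python entries for date handling, column selection, row filtering, and concatenation can be tried together in one short script. This is a minimal sketch: the sample data frame and its values are hypothetical and stand in for a file loaded with pd.read_csv, and the variable names (col_slice, by_mask, and so on) are illustrative only.

```python
import pandas as pd

# Hypothetical sample data standing in for a CSV file read with pd.read_csv.
df = pd.DataFrame({
    "Date": ["01/15/21", "02/20/21", "03/25/22"],
    "col1": [1, 2, 3],
    "col2": ["a", "b", "c"],
    "col3": [1.5, 2.5, 3.5],
    "col4": [True, False, True],
    "colX": ["MyValue", "Other", "MyValue"],
})

# Convert the string column to a date column (month/day/two-digit year).
df["Date"] = pd.to_datetime(df["Date"], format="%m/%d/%y")

# Separate the date items into their own columns.
date_idx = pd.DatetimeIndex(df["Date"])
df["Y"] = date_idx.year
df["m"] = date_idx.month
df["d"] = date_idx.day

# Select a contiguous slice of columns (label slices include both ends).
col_slice = df.loc[:, "col1":"col3"]

# Select a non-contiguous subset of columns by passing a list of labels.
col_subset = df.loc[:, ["col1", "col2", "col4"]]

# Select rows: the query string and the Boolean mask pick the same rows.
by_query = df.query("colX == 'MyValue'")
by_mask = df[df["colX"] == "MyValue"]

# Stack vertically (like rbind) and append horizontally (like cbind).
stacked = pd.concat((df, df), axis="index")
side_by_side = pd.concat((df[["col1"]], df[["col4"]]), axis="columns")

print(df.dtypes)
print(by_mask)
```

Note that pd.concat aligns on index labels when appending horizontally, whereas cbind pairs rows by position, so data frames with different indexes may need reset_index(drop=True) first.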