Python-c02-Getting Started with Pandas

This post walks through a simple example of loading a CSV file with pandas. You can download the files used here:
survey_results_public.csv
survey_results_schema.csv

Notebook

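import pandas as pd

# load the survey responses into a DataFrame
df = pd.read_csv('data/survey_results_public.csv')

# display the DataFrame (the notebook renders a truncated preview)
df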

| | Respondent | Professional | ProgramHobby | Country | University | EmploymentStatus | FormalEducation | MajorUndergrad | HomeRemote | CompanySize | ... | StackOverflowMakeMoney | Gender | HighestEducationParents | Race | SurveyLong | QuestionsInteresting | QuestionsConfusing | InterestedAnswers | Salary | ExpectedSalary |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | Student | Yes, both | United States | No | Not employed, and not looking for work | Secondary school | NaN | NaN | NaN | ... | Strongly disagree | Male | High school | White or of European descent | Strongly disagree | Strongly agree | Disagree | Strongly agree | NaN | NaN |
| 1 | 2 | Student | Yes, both | United Kingdom | Yes, full-time | Employed part-time | Some college/university study without earning ... | Computer science or software engineering | More than half, but not all, the time | 20 to 99 employees | ... | Strongly disagree | Male | A master's degree | White or of European descent | Somewhat agree | Somewhat agree | Disagree | Strongly agree | NaN | 37500.0 |
| 2 | 3 | Professional developer | Yes, both | United Kingdom | No | Employed full-time | Bachelor's degree | Computer science or software engineering | Less than half the time, but at least one day ... | 10,000 or more employees | ... | Disagree | Male | A professional degree | White or of European descent | Somewhat agree | Agree | Disagree | Agree | 113750.0 | NaN |
| 3 | 4 | Professional non-developer who sometimes write... | Yes, both | United States | No | Employed full-time | Doctoral degree | A non-computer-focused engineering discipline | Less than half the time, but at least one day ... | 10,000 or more employees | ... | Disagree | Male | A doctoral degree | White or of European descent | Agree | Agree | Somewhat agree | Strongly agree | NaN | NaN |
| 4 | 5 | Professional developer | Yes, I program as a hobby | Switzerland | No | Employed full-time | Master's degree | Computer science or software engineering | Never | 10 to 19 employees | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 19097 | 19098 | Professional developer | Yes, I program as a hobby | Canada | No | Employed full-time | Bachelor's degree | A business discipline | A few days each month | 10 to 19 employees | ... | Disagree | Male | Some college/university study, no bachelor's d... | White or of European descent | Somewhat agree | Agree | Disagree | Agree | NaN | NaN |
| 19098 | 19099 | Student | Yes, I program as a hobby | India | No | Not employed, and not looking for work | Secondary school | NaN | NaN | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 19099 | 19100 | Professional non-developer who sometimes write... | Yes, I program as a hobby | United Kingdom | No | Independent contractor, freelancer, or self-em... | Bachelor's degree | Computer science or software engineering | Never | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 19100 | 19101 | Professional developer | Yes, I program as a hobby | United States | No | Employed full-time | Some college/university study without earning ... | A humanities discipline | Less than half the time, but at least one day ... | 100 to 499 employees | ... | Disagree | Male | Some college/university study, no bachelor's d... | White or of European descent | Somewhat agree | Somewhat agree | Disagree | Agree | 110000.0 | NaN |
| 19101 | 19102 | Professional developer | Yes, I program as a hobby | France | No | Employed full-time | Master's degree | Computer science or software engineering | All or almost all the time (I'm full-time remote) | 100 to 499 employees | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |

19102 rows × 154 columns
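
The preview above shows only the first and last five rows of the DataFrame. As a minimal sketch, `head()` and `tail()` request those slices explicitly. The inline CSV is a stand-in built from three columns of the first five rows shown above, so the snippet runs without the downloaded file; the variable names are illustrative only.

```python
import io

import pandas as pd

# Stand-in data: three columns from the first five rows of the preview.
# An empty Salary field is read as NaN, as in the real file.
csv_text = """Respondent,Country,Salary
1,United States,
2,United Kingdom,
3,United Kingdom,113750.0
4,United States,
5,Switzerland,
"""

sample = pd.read_csv(io.StringIO(csv_text))

first_rows = sample.head(2)  # first two rows
last_rows = sample.tail(2)   # last two rows

print(first_rows)
print(last_rows)
```

Called without an argument, both methods return five rows.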

# shape gives us the number of rows and columns of the DataFrame
df.shape

(19102, 154)
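
Because `shape` is an attribute holding a plain `(rows, columns)` tuple, it can be unpacked directly. The following is a small sketch on a made-up three-row frame rather than the survey data:

```python
import pandas as pd

# Small illustrative frame; it stands in for the survey DataFrame.
sample = pd.DataFrame({
    "Respondent": [1, 2, 3],
    "Country": ["United States", "United Kingdom", "United Kingdom"],
})

# shape is an attribute (no parentheses) that holds a (rows, columns) tuple
n_rows, n_cols = sample.shape
print(n_rows, n_cols)

# len() on the frame gives the row count; len() on .columns gives the column count
print(len(sample), len(sample.columns))
```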

df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 19102 entries, 0 to 19101
Columns: 154 entries, Respondent to ExpectedSalary
dtypes: float64(6), int64(1), object(147)
memory usage: 22.4+ MB
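
The `+` after the memory figure marks it as a lower bound: by default pandas does not measure the contents of `object` columns, and most columns here are of that type. As a hedged sketch on a small made-up frame (not the survey data), the dtype summary and a deeper memory estimate can be obtained like this:

```python
import pandas as pd

# Small illustrative frame with an integer, a text, and a float column.
sample = pd.DataFrame({
    "Respondent": [1, 2, 3],
    "Country": ["United States", "United Kingdom", "United Kingdom"],
    "Salary": [float("nan"), float("nan"), 113750.0],
})

# Count columns per dtype, similar to the "dtypes:" line printed by info()
dtype_counts = sample.dtypes.astype(str).value_counts()
print(dtype_counts)

# The shallow estimate skips the text payload; deep=True measures it
shallow_bytes = sample.memory_usage().sum()
deep_bytes = sample.memory_usage(deep=True).sum()
print(shallow_bytes, deep_bytes)

# info() can report the deep figure directly
sample.info(memory_usage="deep")
```

On a frame as wide as the survey data, the deep calculation can take noticeably longer than the default.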

# set the maximum number of columns and rows displayed
pd.set_option('display.max_columns', 154)
pd.set_option('display.max_rows', 154)
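
`pd.set_option` changes the setting for the rest of the session. When the wider display is needed for only one cell, `pd.option_context` is one alternative; the sketch below shows the value reverting once the block exits. The variable names are illustrative.

```python
import pandas as pd

before = pd.get_option("display.max_columns")

# Temporarily raise both limits; they revert when the with-block exits
with pd.option_context("display.max_columns", 154, "display.max_rows", 154):
    inside = pd.get_option("display.max_columns")

after = pd.get_option("display.max_columns")
print(before, inside, after)

# reset_option restores the pandas default for a single option
pd.reset_option("display.max_columns")
```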