blog posts David White

Wednesday, January 11, 2023


nyc-storefront-businesses-by-neighborhood

Technical Documentation

Exploratory Data Analysis

Analysis By: David White
Technology Used: Python (NumPy, pandas, Matplotlib, seaborn)
Data Set: Storefronts Reported Vacant or Not
Topic: City Government
Data Source: City of New York

View Source Code | GitHub
Download PDF
View Raw Data
View Data Dictionary

↤ BACK TO: Insights, Made Fresh Daily (main page)

Tuesday, January 10, 2023


2022-civil-service-exam-scores-fdny-candidates

Technical Documentation

Exploratory Data Analysis

Analysis By: David White
Technology Used: Python (NumPy, pandas, Matplotlib, seaborn)
Data Set: Civil Service List (Active)
Topic: City Government
Data Source: City of New York

View Source Code | GitHub
Download PDF
View Raw Data
View Data Dictionary

Monday, January 9, 2023


nyc-swimming-beach-attendance-totals-by-year

Technical Documentation

Exploratory Data Analysis

Analysis By: David White
Technology Used: Python (NumPy, pandas, Matplotlib, seaborn)
Data Set: Swimming Beach Attendance
Topic: City Government
Data Source: City of New York

View Source Code | GitHub
Download PDF
View Raw Data
View Data Dictionary

Sunday, January 8, 2023



Technical Documentation

Exploratory Data Analysis

Analysis By: David White
Technology Used: Python (NumPy, pandas, Matplotlib, seaborn)
Data Set: Donations received by City Agencies
Topic: City Government
Data Source: City of New York

View Source Code | GitHub
Download PDF
View Raw Data
View Data Dictionary

Saturday, January 7, 2023


nyc-cash-assistance-youth-engagements-by-type

Technical Documentation

Exploratory Data Analysis

Analysis By: David White
Technology Used: Python (NumPy, pandas, Matplotlib, seaborn)
Data Set: Youth Engagement by Category
Topic: City Government
Data Source: City of New York

View Source Code | GitHub
Download PDF
View Raw Data
View Data Dictionary

Friday, January 6, 2023



Technical Documentation

Exploratory Data Analysis

Analysis By: David White
Technology Used: Python (NumPy, pandas, Matplotlib, seaborn)
Data Set: DCA Fines and Fees
Topic: City Government
Data Source: City of New York

View Source Code | GitHub
Download PDF
View Raw Data
View Data Dictionary

Thursday, January 5, 2023


nyc-business-inspections-by-year

Technical Documentation

Exploratory Data Analysis

Analysis By: David White
Technology Used: Python (NumPy, pandas, Matplotlib, seaborn)
Data Set: Inspections
Topic: City Government
Data Source: City of New York

View Source Code | GitHub
Download PDF
View Raw Data
View Data Dictionary

Wednesday, January 4, 2023


.nyc-domain-registrations-by-year

Technical Documentation

Exploratory Data Analysis

Analysis By: David White
Technology Used: Python (NumPy, pandas, Matplotlib, seaborn)
Data Set: .nyc Domain Registrations
Topic: City Government
Data Source: City of New York

View Source Code | GitHub
Download PDF
View Raw Data
View Data Dictionary

Tuesday, January 3, 2023


nyc-film-permits-from-2021-to-2022

Technical Documentation

Exploratory Data Analysis

Analysis By: David White
Technology Used: Python (NumPy, pandas, Matplotlib, seaborn)
Data Set: Film Permits
Topic: City Government
Data Source: City of New York

View Source Code | GitHub
Download PDF
View Raw Data
View Data Dictionary

Monday, January 2, 2023



Technical Documentation

Exploratory Data Analysis

Analysis By: David White
Technology Used: Python (NumPy, pandas, Matplotlib, seaborn)
Data Set: Evictions
Topic: City Government
Data Source: City of New York

View Source Code | GitHub
Download PDF
View Raw Data
View Data Dictionary

Sunday, January 1, 2023


nyc-council-complaints-fielded-by-type

Technical Documentation

Exploratory Data Analysis

Analysis By: David White
Technology Used: Python (NumPy, pandas, Matplotlib, seaborn)
Data Set: NYC Council Constituent Services
Topic: City Government
Data Source: City of New York

View Source Code | GitHub
Download PDF
View Raw Data
View Data Dictionary
blog posts, exploratory data analysis David White

Friday, July 23, 2021

Public school data analysis by state (part 1 of 2)

Exploratory Data Analysis by David White | view source code & analysis on Google Colaboratory
Public School Demographics by State - Exploratory Data Analysis (EDA) by David White

 

Insights, Made Fresh Daily

Exploratory Data Analysis

Public school data by state (part 1 of 2)

David White | Friday, July 23, 2021


What is Exploratory Data Analysis?

Exploratory data analysis (EDA) is a technique used by data scientists to inspect, characterize and briefly summarize the contents of a dataset. EDA is often the first step when encountering a new or unfamiliar dataset. EDA helps the data scientist become acquainted with a dataset and test some basic assumptions about the data. By the end of the EDA process, some initial insights can be drawn from the dataset and a framework for further analysis or modeling is established.
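A first pass of the kind described above might look like the following minimal sketch. The `first_look` helper, the toy DataFrame and its column names are made up for illustration; they are not taken from the dataset or the source code behind this post.

```python
import pandas as pd


def first_look(df: pd.DataFrame) -> dict:
    """Collect the basic facts an EDA usually starts with."""
    return {
        "rows": df.shape[0],
        "columns": df.shape[1],
        # Data type of each column, as plain strings
        "dtypes": df.dtypes.astype(str).to_dict(),
        # Number of missing values in each column
        "missing_per_column": df.isna().sum().to_dict(),
        # Rows in which every single value is missing
        "blank_rows": int(df.isna().all(axis=1).sum()),
        # count, mean, std, min, quartiles and max for numeric columns
        "numeric_summary": df.describe(),
    }


# Hypothetical toy data, used only to exercise the helper
toy = pd.DataFrame(
    {
        "state": ["A", "B", "C"],
        "enrollment": [1200, 950, None],
    }
)

summary = first_look(toy)
print(summary["rows"], summary["columns"])
print(summary["missing_per_column"])
print(summary["numeric_summary"])
```

Checks like these are what allow statements such as "51 rows and 42 columns, none of them blank" to be made about a real dataset.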

This exploratory data analysis explores a dataset of information on public schools in the United States. The underlying data was released by the U.S. Department of Education.

See also: Exploratory Data Analysis: Public School Staffing

Here are the takeaways from the dataset:

  • The dataset consists of 51 rows and 42 columns. None of the rows are blank.

  • The dataset covers:

    • student enrollment

    • school staffing

    • student demographic information

  • The dataset contains per-state totals of the number of students in two gender categories and seven race/ethnicity categories.

  • 2018-19 US public school total enrollments by demographic group are as follows:

    • 25.8 million male students

    • 24.4 million female students

    • 473K American Indian/Alaska Native students

    • 2.6 million Asian or Asian/Pacific Islander students

    • 13.7 million Hispanic students

    • 7.6 million Black students

    • 23.7 million White students

    • 176K Native Hawaiian/Pacific Islander students

    • 2 million multiracial students

  • The states with the highest number of Black public school students are: Florida, Georgia and Texas

  • The states with the highest number of Hispanic public school students are: California and Texas

  • The state with the highest number of Asian or Asian/Pacific Islander public school students is California. New York and Texas are a distant second and third.

  • The state with the highest number of American Indian/Alaska Native public school students by far is Oklahoma

  • The state with the highest number of Hawaiian/Pacific Islander public school students by far is Hawaii

  • The states with the highest number of White public school students are: California and Texas

Key Insight:

Populations of White students are mostly in proportion to each state's overall population. However, for other demographic groups, students are more heavily concentrated in just a handful of states.
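One way to put a number on "concentrated in just a handful of states" is to ask what fraction of a group's national total sits in its few largest states. The sketch below is an illustration only: the `top_share` helper, the column names and the figures are all made up, and this is not necessarily how the original analysis measured concentration.

```python
import pandas as pd


def top_share(df: pd.DataFrame, column: str, n: int = 3) -> float:
    """Fraction of the column's total held by its n largest rows."""
    total = df[column].sum()
    return df[column].nlargest(n).sum() / total


# Hypothetical per-state totals for two made-up groups
toy = pd.DataFrame(
    {
        "state": ["A", "B", "C", "D", "E"],
        "group_x": [100, 100, 100, 100, 100],  # spread evenly across states
        "group_y": [460, 20, 10, 5, 5],        # concentrated in one state
    }
).set_index("state")

print(top_share(toy, "group_x", n=2))  # 200 of 500 students
print(top_share(toy, "group_y", n=2))  # 480 of 500 students
print(toy["group_y"].nlargest(1).index[0])  # state with the largest total
```

A share close to 1 for a small `n` signals heavy concentration; a share close to `n` divided by the number of states signals an even spread.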

 
blog posts, data visualization David White

Wednesday, July 14, 2021

Using boxplots to reveal what’s typical and what’s an outlier

Data Visualization by David White | view source code on Google Colaboratory
 

Insights, Made Fresh Daily

Think Inside the Box

Using boxplots to reveal what’s typical and what’s an outlier

David White | Wednesday, July 14, 2021

This collection of boxplots shows the distribution of test scores on the 2019 NYS Common Core Learning Standard Exam. The underlying data was released by the New York State Education Department (NYSED).

For each “box and whisker plot” on the page, the vertical line on the left represents the lowest score that is not an outlier, and the vertical line on the right represents the highest score that is not an outlier. The line down the center of the box represents the median score. The left end of the box marks the 25th percentile score and the right end marks the 75th percentile score, so the box spans the middle half of all scores. The dots off to either side represent outliers.
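The numbers behind a single box can be computed by hand, as in the minimal sketch below. It assumes the common convention (Matplotlib's default) that a score is an outlier when it lies more than 1.5 times the interquartile range beyond the box; the post does not state which rule the original plots used. The `box_stats` helper and the scores are made up for illustration.

```python
import numpy as np


def box_stats(scores):
    """Return the values a box-and-whisker plot is drawn from."""
    scores = np.asarray(scores, dtype=float)
    q1, median, q3 = np.percentile(scores, [25, 50, 75])
    iqr = q3 - q1  # width of the box

    # Assumed outlier rule: beyond 1.5 * IQR from either end of the box
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    inside = scores[(scores >= low_fence) & (scores <= high_fence)]
    outliers = scores[(scores < low_fence) | (scores > high_fence)]

    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        # Whiskers stop at the most extreme scores that are not outliers
        "whisker_low": inside.min(),
        "whisker_high": inside.max(),
        "outliers": sorted(outliers.tolist()),
    }


# Hypothetical scores, not taken from the NYSED data
toy_scores = [10, 12, 13, 14, 15, 16, 17, 18, 40]
stats = box_stats(toy_scores)
print(stats)
```

Here the box runs from 13 to 17 with the median at 15, the whiskers reach 10 and 18, and the score of 40 is drawn as a separate dot. Passing the same list to seaborn as `sns.boxplot(x=toy_scores)` should draw the matching horizontal plot.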

 
blog posts, exploratory data analysis David White

Sunday, July 11, 2021

This exploratory data analysis examines a dataset of information on public schools in the City of New York.

Exploratory Data Analysis by David White | view source code on Google Colaboratory
Student-Teacher Ratios - Exploratory Data Analysis (EDA) by David White

 

Insights, Made Fresh Daily

Exploratory Data Analysis

Student-Teacher Ratios in NYC Public Schools

David White | Sunday, July 11, 2021

What is Exploratory Data Analysis?

Exploratory data analysis (EDA) is a technique used by data scientists to inspect, characterize and briefly summarize the contents of a dataset. EDA is often the first step when encountering a new or unfamiliar dataset. EDA helps the data scientist become acquainted with a dataset and test some basic assumptions about the data. By the end of the EDA process, some initial insights can be drawn from the dataset and a framework for further analysis or modeling is established.

This exploratory data analysis examines a dataset of information on public schools in the City of New York. The underlying data was released by the City of New York.

 