This is a data-driven narrative about how educational inequalities shaped changes and disparities in student learning during the initial years of the COVID-19 pandemic. It was written and developed by UCLA digital humanities students using data from the Stanford Education Data Archive (SEDA).
COVID-ERA EDUCATION is a group project completed by UCLA undergraduate students for an introductory digital humanities course during spring quarter of 2024. This project was done under the instruction of Dr. Ashley Sanders Garcia and Nick Schwieterman.
Students were directed to critically analyze a publicly accessible database using humanities methods. The final project challenged students to construct an argument and research questions that could be explored through their chosen dataset, and to present their findings in a narrative format.
Our group used two datasets from the Stanford Education Data Archive (SEDA) covering student academic achievement, achievement gaps, and demographic and socioeconomic data before and during the pandemic.
How did changes in school structuring, as classes shifted to and from remote format during the initial years of the pandemic, influence student learning?
Is access to school technology and nutrition resources tied to academic performance for marginalized socioeconomic and racial/ethnic groups?
Project Manager and Data Visualization Specialist
Art major & Italian minor
Managed the team, wrote narrative and data critique, created data visualizations.
Data Specialist and Data Visualization Specialist
Cognitive Science major & Data Science Engineering minor
Sourced, cleaned, and analyzed data.
Web Designer
Cognitive Science Major
Led the team creatively, focusing on the website design and layout.
Editor
Communication & Economics Major
Performed quality control on all content and scheduled team meetings.
Content Developer
Computer Science Major
Oversaw the narrative of the investigation and incorporated data.
To investigate the impact of the COVID-19 pandemic on education, we utilized data from the Stanford Education Data Archive (SEDA), which offers detailed information on academic performance, educational opportunities, and demographic data across U.S. school districts. The SEDA dataset was crucial in our analysis as it allowed us to examine standardized test scores, school resources, and demographic variables.
Throughout our sourcing process, we drew on multiple peer-reviewed sources for broader context and a deeper understanding of the educational inequalities exacerbated by the pandemic. For example, Priscilla Blossom’s article on the benefits of free school meals highlighted positive impacts on student health and academic performance.[1] The CDC Museum COVID-19 Timeline also offered a comprehensive overview of key moments and public health responses during the pandemic, providing essential context for our study.[2] We also read articles discussing the rise in educational inequalities due to the pandemic, particularly for students from underrepresented backgrounds.
Michel-Rolph Trouillot’s Silencing the Past guided our sourcing process by helping us identify potential silences and biases in our dataset.[3] Before processing our data, we considered the silences and biases that could arise at various stages, such as variations in data collection methods and financial and institutional biases in data compilation. Using the frameworks from this work, we acknowledged the complexities and limitations of any dataset. Catherine D’Ignazio and Lauren F. Klein’s Data Feminism was crucial in guiding our efforts to use data to empower marginalized communities.[4] By recognizing the subjectivity of data and the societal power dynamics that shape it, we aimed to highlight the disparities in educational opportunities and outcomes exacerbated by the pandemic. More information on our data can be found on our Data Critique page.
By synthesizing data from SEDA with insights from peer-reviewed sources, we were able to examine the impact of the COVID-19 pandemic on educational inequalities and advocate for more equitable educational practices. For more information on our sources, please refer to our Annotated Bibliography.
SEDA houses numerous datasets containing extensive information. Because we did not need most of this data for our project, we had to sort through the sub-datasets inside SEDA to locate the exact variables we would use in our analysis. Identifying the right datasets required a clear idea of what our project would entail. Because we were examining educational opportunities and the effect of COVID-19 on education, we chose to analyze 2019 and 2022 to see what had changed between them. This narrowed down the time period we needed data from, but there were still many datasets for each year. Datasets were categorized by geographic unit: district, county, city, state, and so on. Because we wanted to analyze both national changes between years and how states compared on factors such as socioeconomic status and academic achievement, we included only the state-level datasets for 2019 and 2022. This came to a total of eight data sheets, which we combined in Google Sheets for a simpler presentation.
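The selection and combination step described above was done by hand in Google Sheets. As a rough illustration only, the same idea could be scripted as follows. This is a minimal sketch: the column names (`state`, `year`, `subject`, `mean_score`), the sheet names, and every value are invented placeholders, not SEDA's actual schema or data.

```python
import csv
import io

# Hypothetical stand-ins for two state-level sheets. Column names and values
# are invented for illustration and do not reflect SEDA's real files.
SHEET_A = """state,year,subject,mean_score
CA,2019,math,0.10
CA,2022,math,-0.05
CA,2016,math,0.12
"""
SHEET_B = """state,year,subject,mean_score
TX,2019,math,0.08
TX,2022,math,-0.02
"""

# The two years compared in the project.
YEARS_OF_INTEREST = {"2019", "2022"}


def combine_sheets(sheets, years=YEARS_OF_INTEREST):
    """Stack rows from several CSV sheets, keeping only the chosen years."""
    combined = []
    for name, text in sheets.items():
        for row in csv.DictReader(io.StringIO(text)):
            if row["year"] in years:
                row["source_sheet"] = name  # record which sheet each row came from
                combined.append(row)
    return combined


combined_rows = combine_sheets({"sheet_a": SHEET_A, "sheet_b": SHEET_B})
for row in combined_rows:
    print(row["state"], row["year"], row["mean_score"], row["source_sheet"])
```

Tagging each row with its source sheet makes it possible to trace a combined row back to the file it came from when checking for import errors.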
We then began our data cleaning process. This task was not very challenging because the data came from an already clean dataset: SEDA’s documentation states that the archive performs its own data cleaning. Even so, we went through each sheet and checked it against the dataset it came from to make sure that no variables were missed or imported incorrectly. Some values were missing from the data. This is explained in the documentation, which states that a blank cell represents a value of 0. We therefore inserted a 0 into every blank cell to make it easier to create visualizations and maps. Finally, we made sure that all the data was in the correct format. For example, the value of “ethnicity” should be a string, and the value of “math score” should be a decimal.
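The two cleaning rules described above (blank cells become 0, and each column holds the expected type) can be sketched in a few lines. This is an illustrative sketch rather than the process we actually ran: the column names `ethnicity` and `math_score` mirror the examples in the text, while the sample rows and values are invented placeholders.

```python
import csv
import io

# Hypothetical sample rows. The values are invented for illustration only.
RAW = """state,ethnicity,math_score
CA,Hispanic,0.25
TX,White,
NY,Black,-0.10
"""

# Columns that should hold decimals. Everything else is left as a string.
NUMERIC_COLUMNS = ("math_score",)


def clean_rows(text, numeric_columns=NUMERIC_COLUMNS):
    """Fill blank numeric cells with 0 and convert numeric columns to float."""
    cleaned = []
    for row in csv.DictReader(io.StringIO(text)):
        for column in numeric_columns:
            value = (row[column] or "").strip()
            # Blank cells become 0, following the reading of the
            # documentation described in the text above.
            row[column] = float(value) if value else 0.0
        cleaned.append(row)
    return cleaned


rows = clean_rows(RAW)
for row in rows:
    print(row["state"], row["ethnicity"], row["math_score"])
```

Listing the numeric columns explicitly keeps text fields such as ethnicity untouched, so a string is never accidentally converted or zero-filled.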
The next step was putting our findings into a visually understandable form. What else, if not Tableau, could help us in this step? Using Tableau’s extensive functionality, we generated all of our maps and data visualizations. In choosing which forms of visualization best fit our data and narrative, we took direction from Nathan Yau’s Data Points, in which he discusses different types of data visualizations and their applications.[5] We then identified the variables in our datasets that we wanted to compare and visualized those relationships in bar charts, line charts, and maps, as most of our data was quantitative.
Finally, we had to present our narrative to our audience through a website, which we built with WordPress. We chose a theme that is accessible, color-blind friendly, and reader friendly, and that suits our topic of education during the pandemic. The theme also provides good contrast for our visualizations and narrative.