Becoming a Scientist in America
I wanted to understand not just outcomes but the structure behind them: how people move (or don’t move) through our education and research systems. By tracing pathways across stages and time, I aimed to bring clarity to a process that’s often fragmented, uneven, and hard to see from the ground.
“This project began with a simple but ambitious question: Who becomes a scientist in the United States, and what does their journey look like across time and institutions?”
A Longitudinal, Multi-Stage View of Training, Mobility, and Opportunity
The path to a scientific career isn’t linear, and it certainly isn’t the same for everyone. Some students begin at flagship public universities with deep research infrastructure; others start at community colleges, balancing school with work or caregiving responsibilities. Some are supported by faculty mentors and steady financial aid; others navigate a patchwork of part-time jobs, under-resourced advising, or institutional cultures that were never built with them in mind. These differences matter. They influence who feels a sense of belonging in scientific spaces, who persists through moments of challenge, and who ultimately earns advanced degrees. By stepping back and connecting the dots across multiple stages—K–12, college, graduate school, and beyond—I wanted to better understand the structural factors that shape these varied trajectories: who moves forward, who stalls out, and why.
To do that, I stitched together over a decade of federal data to follow the arc from high school to postdoc. I looked at entry points, like which students enroll in the biological sciences as freshmen, and tracked whether and how those students show up later in graduate programs or research positions. I mapped transitions between stages, identified where funding tends to concentrate, and examined where people land post-PhD, whether in academia, industry, government, or somewhere else entirely. The result is a long-view map of STEMM (science, technology, engineering, mathematics, and medicine) education and training, one that’s not only data-rich but designed to support practical decision-making. It’s a tool for understanding not just individual success stories, but systemic patterns that shape opportunity across time.
🔬 How I Did It
To build a comprehensive picture, I pulled data from a wide range of federal sources—each offering a different angle on the scientific training landscape. Some datasets followed students from high school into college. Others captured snapshots of graduate enrollment, postdoc funding, or employment plans after the PhD. By linking these stages together, I was able to trace patterns across time and institutions.
The raw data came in many forms—cross-sectional surveys, longitudinal panels, institutional reports. I worked with everything from IPEDS enrollment tables to the Survey of Earned Doctorates and the NSF’s postdoc funding data. Each source came with its own structure and limitations, so much of the work involved cleaning, aligning, and harmonizing the variables to tell a continuous story. I created derived variables for institutional type, modeled changes in enrollment over time, and carefully tracked demographic shifts at key inflection points.
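As a rough illustration of that harmonization step, here is a minimal sketch in plain Python. It is not the pipeline used for this project: the source names, column names, and the coarse institutional-type rule are all hypothetical, chosen only to show the pattern of renaming fields to a shared schema, deriving a category, and computing year-over-year change.

```python
# Hypothetical column maps: each source names the same concept differently.
# These source and column names are invented for illustration.
COLUMN_MAP = {
    "source_a": {"unit_id": "institution_id", "yr": "year", "enrolled": "enrollment"},
    "source_b": {"inst": "institution_id", "survey_year": "year", "n_students": "enrollment"},
}


def harmonize(record, source):
    """Rename one raw record's fields to the shared schema and coerce types."""
    mapping = COLUMN_MAP[source]
    out = {target: record[raw] for raw, target in mapping.items()}
    out["year"] = int(out["year"])
    out["enrollment"] = int(out["enrollment"])
    out["source"] = source  # keep provenance so sources can be compared later
    return out


def institution_type(classification_label):
    """Derived variable: collapse a classification label into coarse groups.

    The matching rule is an assumption for this sketch, not an official mapping.
    """
    label = classification_label.lower()
    if "very high research" in label:
        return "R1"
    if "associate" in label:
        return "Community college"
    return "Other"


def yearly_change(rows):
    """Percent change in enrollment between consecutive records, ordered by year."""
    ordered = sorted(rows, key=lambda r: r["year"])
    changes = []
    for prev, curr in zip(ordered, ordered[1:]):
        if prev["enrollment"] > 0:  # skip undefined change from a zero base
            pct = 100 * (curr["enrollment"] - prev["enrollment"]) / prev["enrollment"]
            changes.append((curr["year"], round(pct, 1)))
    return changes
```

Keeping a `source` field on every harmonized record is a deliberate choice in this sketch: when two datasets disagree, it stays possible to see which one a number came from.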
Wherever possible, I disaggregated the data to surface patterns that might otherwise be masked. I was especially interested in moments of transition—between high school and college, college and graduate school, graduate school and postdoc, and postdoc and career. The result is a multi-stage, longitudinal view that connects the dots in a system that’s too often analyzed in silos.
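One way to picture the disaggregated, stage-to-stage view is the sketch below. It assumes aggregate counts per group and stage have already been assembled (the stage labels and the `counts` layout are hypothetical), and it computes simple ratios between adjacent stages. Because the counts may come from different cross-sectional sources, these are ratios rather than true cohort transition rates.

```python
# Hypothetical stage labels, in pathway order.
STAGES = ["high_school", "college", "graduate", "postdoc", "career"]


def stage_ratios(counts):
    """Compute next-stage / previous-stage ratios separately for each group.

    counts: {group: {stage: headcount}}
    returns: {group: {"prev->next": ratio}}
    """
    ratios = {}
    for group, by_stage in counts.items():
        ratios[group] = {}
        for prev, nxt in zip(STAGES, STAGES[1:]):
            base = by_stage.get(prev, 0)
            # Only report a ratio when both stages were observed for this group.
            if base > 0 and nxt in by_stage:
                ratios[group][f"{prev}->{nxt}"] = by_stage[nxt] / base
    return ratios
```

Reporting one ratio per group, instead of a single pooled figure, is what keeps a gap at a specific transition from being averaged away.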
🔍 What the Data Showed
K–12
Student outcomes vary widely by geography—and those differences closely track with family income, parental education, and patterns of racial and socioeconomic segregation. In many districts, the resources available to students are deeply unequal. Students in wealthier or better-resourced areas often have access to experienced teachers, smaller class sizes, and a broad range of STEMM-related coursework. In contrast, students in under-resourced schools may face outdated materials, limited lab experiences, and fewer opportunities for enrichment or acceleration.
These early disparities set the tone for everything that follows. High-achieving students from low-income communities or minoritized backgrounds are less likely to have access to advanced math and science classes, AP coursework, or dual enrollment options that are increasingly seen as gateways into scientific fields. They are also less likely to receive the kind of college and career counseling that encourages STEMM exploration. By the time students enter college, those cumulative differences in preparation and exposure often show up in who selects a science major—and who stays in it.
Undergraduate
Enrollment in the biological sciences has grown steadily over the past decade, with particularly strong gains among women and Latino students. This signals expanding interest in STEMM fields, especially at the entry point. But the story shifts when you look at completion: graduation rates remain highest among men and white students, and the degree gap for Black, Indigenous, and first-generation students persists. These patterns suggest that while more students are entering the field, not all are equally supported to finish.
Geography and institutional type also matter. The South enrolls the largest share of biological science undergraduates, but research-intensive (R1) universities account for the majority of future PhD earners. These schools often have stronger lab infrastructure, better access to research opportunities, and more established faculty networks. That means where a student starts their undergraduate education can significantly shape their long-term trajectory in science—even before graduate school begins.
🧑🏽‍🏫 Summative analysis for K–12 and Undergraduate is available here
Graduate & PhD Training
While many PhD students earned their bachelor's from R1 institutions, a surprising number began at community colleges—demonstrating that scientific careers often start in places not traditionally associated with advanced research. These alternative pathways matter. The transition into graduate study remains a major drop-off point, especially for students from minoritized, first-generation, or lower-income backgrounds. Gaps in mentorship, research opportunities, and financial support can limit who applies, who gets admitted, and who ultimately persists.
Among those who do pursue advanced degrees, academic continuity is not always guaranteed. Some biological science PhDs arrive with a related undergraduate and master’s degree, having followed a more linear path. Others enter graduate school from unrelated disciplines, bringing different skill sets and perspectives. This variation underscores both the flexibility and fragility of the pathway to a PhD—where small differences in preparation, institutional context, or access to support can shape long-term outcomes in profound ways.
💡 Survey analysis for Graduate & PhD Training is here
Postdocs
Postdoctoral positions remain highly concentrated within a narrow segment of the scientific training system. More than half of all postdocs in the biomedical and biological sciences are international scholars on temporary visas, and their positions are overwhelmingly funded by the NIH. These roles are not evenly distributed: most are housed at a small number of elite R1 institutions and medical schools, meaning that access to postdoc opportunities is closely tied to where someone completes their PhD. Despite being a critical stage for launching independent research careers, postdocs often experience limited transparency around pay, mentorship, job stability, and long-term prospects.
To better understand these dynamics, I conducted an in-depth analysis of national postdoc data from 2017 to 2021. The findings underscore how structurally uneven this phase of training really is. The analysis disaggregates trends by funding source, mechanism of support, institutional classification, and gender—highlighting who gets funded, how they’re supported, and where they’re concentrated. It paints a clearer picture of how opportunity, access, and advancement are distributed during one of the most formative—and opaque—stages in a scientific career.
🧪 Postdoctoral Landscape: Funding, Demographics, and Institutional Concentration is here
Plans After Graduation
After the PhD, men are more likely to secure tenure-track faculty positions, while women are disproportionately represented in nonprofit roles, teaching-focused academic appointments, and positions outside the tenure system. This pattern isn’t just about career preference—it reflects longstanding disparities in access to mentorship, hiring networks, and institutional support. While academia remains a key pathway, it’s no longer the dominant one: many PhD holders, regardless of gender, now pursue careers in industry and government, which offer more predictable salaries, clearer advancement structures, and often greater geographic flexibility.
Industry, in particular, offers the highest median income among post-PhD career paths, drawing many graduates into roles that blend research, development, and management. Government jobs offer slightly lower salaries but often come with stability and mission-driven appeal. One striking trend is geographic “stickiness”: many PhD graduates—especially international scholars and women—end up working in the same state where they completed their doctoral training. This suggests that graduate school location not only shapes networks and job prospects, but also deeply influences where scientific labor is retained across the country.
🎓 What Happens After the PhD? Plans, Pay, and Placement
Why This Matters
This project was built to contribute to a broader conversation about who participates in scientific training and how that participation is shaped by policy, geography, and institutional context. There’s no shortage of opinion about where the STEMM workforce is heading, but fewer efforts actually trace the experience longitudinally—across the full education and career spectrum. By putting multiple datasets into dialogue, this project offers a rare view across time, showing not just who enters but who persists, shifts direction, or drops out altogether.
It’s a quantitative effort, but one grounded in practical questions: Where are the inflection points? What types of institutions drive long-term participation? Who gets funded—and who doesn’t? Rather than offer a single answer, this work opens up a set of questions that matter to funders, educators, and institutional leaders alike. It invites decision-makers to consider how structure shapes outcomes, and where change might be most possible.
This isn’t just a data project—it’s a way of seeing the system more clearly. When you look across all the stages together, patterns emerge: where students enter, where they stall, and where they succeed. I created this to help institutions and funders make smarter, more informed decisions about where to invest time, money, and attention.
📂 Project Materials
📚 Data Sources
National Center for Education Statistics (IPEDS, High School & Beyond, Postbaccalaureate Enrollment)
National Science Foundation (NSCG, GSS, SED, SDR)
U.S. Census Bureau
National Institutes of Health (NIH)
Carnegie Classifications of Institutions of Higher Education
Opportunity Atlas (Chetty et al.)
The Educational Opportunity Project (Stanford)
National Teacher and Principal Survey (NTPS)
All data is publicly available. Visualizations, analyses, and interpretations are original.