This article was updated in Winter 2023
What are our pension data best practices? The findings of any analysis are only as strong as the data being used. This is especially important for policy research, as there are incentives for individuals and groups with conflicting interests to selectively filter data to reach preconceived ends. “Data” can be a largely objective source of information, but how figures are used and which data sources are applied in which contexts can have an outsized effect on policy research and the discussions it aims to inform.
The often contentious nature of public finance discussions related to public pensions means this policy area requires particularly transparent and appropriate utilization of financial data when conducting research. Any “public pension” research product should be held to high standards and expectations of data source quality and applicability. As any experienced researcher will already know, the decision of which data source to use for a project will depend on a myriad of factors, but it is important to acknowledge that some data sources are better than others. Some data sources are more appropriate for certain kinds of analysis than for others, and even the best possible data have limitations that may affect whether they are appropriate to use for a given project.
Equable’s primary mission is to provide quality educational information related to public sector retirement systems. Our research team is committed to transparency in data utilization. To meet this commitment, we make the data from our research collection efforts open source and maintain a rigorous focus on applying the right dataset to each analytical task.
How We Evaluate Our Pension Data Sources
Public pension data can be broadly classified into two categories: primary source data and secondary data (typically aggregated by another source, such as a research institution).
Primary source data are based on the documents and data provided by public retirement systems. These documents include actuarial valuation reports, comprehensive annual financial reports (both from the retirement system and the state/municipal government), GASB disclosure reports, and experience studies. These reports offer the audited financials, actuarially calculated funding data, and the direct research findings and statements from a retirement system. As a result, these can be thought of as the most accurate, unvarnished data for any given pension or other retirement plan. However, while these data are the most credible, they are often the most resource intensive to collect.
Secondary data sources offer data that have been collected and processed by a third party into some form that is typically aggregated or otherwise provided across multiple retirement systems. Frequently secondary databases rely on primary source documents, but in some cases institutions will collect secondary data, transform and/or combine them, and release the resulting data based on their own methodology. The main benefit of secondary data sources is that they require far less time and fewer resources to obtain the data. However, secondary data sources require the researcher to accept any limitations that stem from who compiled the data and what methodology they used.
To the extent possible, stakeholders in public sector retirement systems should utilize primary source data. But we recognize that this can be resource intensive and require specific knowledge about how to interpret pension system reports. By contrast, secondary data sources provide opportunities for research and understanding that can inform the discussion around public pensions more broadly.
At Equable, we compile primary source data for our research. We release all of the data associated with each educational research product that we publish. In the future, we plan to make our entire database available for any researchers who are interested in utilizing it as a valuable secondary data source for their own work.
Understanding the Pros and Cons of Commonly Used Secondary Sources for Public Pension Data
Below we review five commonly cited secondary data sources related to public sector retirement systems. For each source we offer a brief overview of what data it provides, its strengths, and its limitations.
The intent of these comments is to show the various trade-offs that come with the data from each source, as well as to provide comments on what appropriate (and inappropriate) utilization is for each source of information. We are not opining on the character of the researchers who have compiled these sources and assume they are acting with the highest standards of integrity in producing their data. Any positive comments are not necessarily an endorsement of the accuracy of the datasets, nor are any negative comments intended to undermine the researchers or institution publishing the data. (We note that some of the limitations applicable to datasets detailed below are applicable to Equable’s database too.)
The Center for Retirement Research at Boston College’s Public Plans Database (PPD) (available here)
Overview: The Center for Retirement Research has compiled arguably the single most reliable source for state and local pension finance data at the “plan-level.” PPD data include a wealth of different funding variables ranging from actuarial and market values of plan assets, to investment returns, to funded ratios and unfunded liabilities covering from 2001 to the present.
PPD data are organized to provide data points for individual defined benefit pension plans. Some of these data can be aggregated up to the “system-level” for researchers who want to organize their data that way. These data do not cover financials at the “tier-level” within plans, such as where a pension plan has different classes of benefits based on entry age. PPD also publishes separate tables with investment portfolio allocations and will in the future expand its coverage to include other kinds of public sector retirement plans.
Trade-offs of PPD Data
- Strength: Data come directly from primary source documents published by the retirement systems.
- Strength: Data are updated frequently, and plans covered have been expanded over time.
- Strength: PDF files of the primary source documents are available through the project website.
- Limitation: Data do not include detailed plan benefits or benefit policies.
- Limitation: Data are plan-level and not separated across tiers of benefits. This means some data elements that depend on tiers, such as contribution rates, may not be accurate or will involve judgment calls by the data collector to provide a weighted average.
- Limitation: Documentation is difficult to work with and incomplete in some cases.
A good use of this source is for an up-to-date measure of nationwide funded status of state and city pension funds. These data could also be used to develop complex, quantitative analysis at the plan level by academic researchers. This is not a good source for research that would examine benefit designs or that would require tier-level data.
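The two plan-level points above (rolling plans up to the system level, and the weighted-average judgment call for tiered contribution rates) can be illustrated with a minimal Python sketch. All names and figures here are hypothetical and are not drawn from PPD. Weighting by covered payroll is only one possible assumption; a data collector could reasonably weight by headcount or another measure instead.

```python
from collections import defaultdict

# Hypothetical plan-level records (illustrative figures, not real PPD values).
plans = [
    {"system": "System A", "plan": "Teachers", "assets": 80.0, "liabilities": 100.0},
    {"system": "System A", "plan": "Police", "assets": 30.0, "liabilities": 50.0},
    {"system": "System B", "plan": "General", "assets": 45.0, "liabilities": 60.0},
]


def system_funded_ratios(records):
    """Sum assets and liabilities by system, then divide assets by liabilities."""
    totals = defaultdict(lambda: [0.0, 0.0])
    for record in records:
        totals[record["system"]][0] += record["assets"]
        totals[record["system"]][1] += record["liabilities"]
    return {name: assets / liabilities for name, (assets, liabilities) in totals.items()}


# Hypothetical tiers within a single plan: (covered payroll, employee contribution rate).
tiers = [(600.0, 0.06), (400.0, 0.09)]


def payroll_weighted_rate(tier_rows):
    """Blend tier-level rates into one plan-level rate, weighting by payroll (an assumption)."""
    total_payroll = sum(payroll for payroll, _ in tier_rows)
    return sum(payroll * rate for payroll, rate in tier_rows) / total_payroll


ratios = system_funded_ratios(plans)
blended_rate = payroll_weighted_rate(tiers)
```

In this sketch the blended rate falls between the two tier rates, so it matches neither tier. That is the practical meaning of the limitation: a single plan-level contribution rate may not describe what any actual member pays.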
National Association of Retirement System Administrators (NASRA) Datasets (available here)
Overview: The National Association of Retirement System Administrators is a professional association for the people that run retirement systems. They frequently publish analyses of pension policies and funding using data provided by their members. Their data are not limited to a specific topic area related to public retirement systems, as their research is quite varied and requires different data for their analyses.
Trade-offs of NASRA Data
- Strength: Data are often self-reported from members, meaning that data are comparable to those reported in primary source documents, but can feature other information not normally included in published reports.
- Strength: NASRA has compiled the most comprehensive data on plan design changes, allowing for comprehensive analyses of the policies related to retirement systems. Many of their datasets include narrative information to provide valuable background to researchers.
- Strength: NASRA conducts research across numerous issues related to pension funding, policies, and benefits. This means data across numerous topic areas are available from NASRA.
- Limitation: There is no central dataset. Rather, there are numerous separate, smaller datasets for each respective analysis. Data are spread out over multiple documents instead of being in one place.
- Limitation: Most datasets lack detailed documentation for confirmation of data points. Raw data are not always provided, as datasets are sometimes presented in the form of charts and summary tables included in reports.
- Limitation: Datasets provided do not have consistent coverage of systems, plans, and tiers, so compiling data across all sources is not always possible or practical. There are sound reasons for this in the way various datasets are presented, but it is a limitation.
A good use of this source is to review specific issues, like current investment assumptions, from the largest state and city pension funds. NASRA data are well suited to examine the histories of changes to retirement systems by state. However, these data are not a good source of unified data across several variables for all states.
Census Bureau’s Annual Survey of Public Pensions (available here)
Overview: The Census Bureau offers a collection of data for all public retirement systems identified as being sponsored or supported by state and local governments. These data offer measures of funding, liabilities, and more from 1993 through 2019.
Trade-offs of Census’ Survey of Public Pensions Data
- Strength: Data are compiled back through 1993 for many variables, and even earlier for others, while other databases only provide coverage from 2001 forward.
- Strength: Data are government-compiled with the same rigor as other Census data, and are updated regularly (including detailed methodologies related to data collection and transformations).
- Limitation: Data are not compiled from typical primary sources, instead relying on a rotating survey of plans. This survey includes an annual sample of plans reporting information and imputation for those plans not sampled. The surveys are sometimes sent to different state and municipal government departments than those that publish actuarial reports. As a result, data points provided by the Census Bureau do not always line up with the reports that states have published themselves (though the variances are rarely so substantive as to render the data source useless).
- Limitation: Data are limited to state-level aggregations in primary data releases, requiring inquiry to Census or use of flat files to access more nuanced, plan-level data.
- Limitation: Census frequently revises its data and methodology, including for pension data. This can result in the misclassification of some data across plans or over time, which researchers must carefully account for by reviewing the provided documentation.
A good use of this source is for high-level analyses of total assets or liabilities of state and city pension funds. However, these data are not a good source for plan-specific analyses or complex quantitative analysis over time because of the inconsistency in data collection methodology.
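One way to gauge how far a secondary figure drifts from the plan-reported value is to compute the relative variance for each plan and flag those above a chosen threshold. The Python sketch below is a minimal illustration only. The plan names, dollar figures, and the 5% tolerance are all hypothetical assumptions, not Census values or a recommended standard.

```python
# Hypothetical figures in $ billions; not actual Census or plan-reported values.
primary = {"Plan X": 52.4, "Plan Y": 18.9, "Plan Z": 7.5}
secondary = {"Plan X": 51.1, "Plan Y": 18.9, "Plan Z": 8.4}


def flag_variances(primary_vals, secondary_vals, tolerance=0.05):
    """Return plans whose secondary value differs from the primary value
    by more than `tolerance`, measured relative to the primary value."""
    flagged = {}
    for plan, primary_value in primary_vals.items():
        secondary_value = secondary_vals.get(plan)
        if secondary_value is None:
            continue  # plan is absent from the secondary source; nothing to compare
        relative_gap = abs(secondary_value - primary_value) / primary_value
        if relative_gap > tolerance:
            flagged[plan] = relative_gap
    return flagged


flagged = flag_variances(primary, secondary)
```

A check like this does not say which source is right. It only identifies the plans where a researcher would want to go back to the primary source documents before relying on the secondary figure.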
Former/Outdated Retirement Datasets
There are at least two organizations that have regularly published datasets related to public retirement plans that, for different reasons, are no longer producing annual updates to their data (as of Winter 2023).
The Pew Charitable Trusts’ Pension Funding Gap Data
Overview: The team at Pew previously published an annual examination of the funded status of public pensions across the country. Such publications have been paused indefinitely as of this updated review in Winter 2023. However, Pew does now periodically publish selective datasets that focus on specific policy topics related to public retirement systems, such as investment reporting or funding policies. Pew’s methodology focuses on using data that each retirement system publishes under common GASB standards (known as GASB 67 and GASB 68 reports).
Trade-offs of Pew Data
- Strength: Data are collected from primary source documents published by retirement systems and are only reported once all retirement systems in their database have reported for a given plan-year, meaning that data are for a complete fiscal year when released.
- Strength: These data are limited to a small number of variables, making them easier to review for journalists, stakeholders, and others interested in pension systems.
- Limitation: The Pew methodology of waiting until every retirement system has published its reports for a given year means that comprehensive data are presented on a two-year lag. For example, Pew’s report from June 2020 publishes data for fiscal year 2018. The benefit is that data are complete for every plan in that fiscal year; the downside is that the data are old, and the majority of 2019 data were already available as of the spring of 2020.
- Limitation: Data are often only offered aggregated at the state level and are spread across each of their publications, as opposed to being offered in a central downloadable database.
- Limitation: Pew’s data are based primarily on GASB 67 reports, which means that for some states they leave out large portions of plan liabilities. For example, CalPERS’s PERF A plan, which carries the largest portion of liabilities in CalPERS, does not have to provide a GASB 67 report because it is an agent multiple-employer plan. The benefit of this methodology is that Pew data consistently draw from a single source. The limitation is that Pew’s data for CalPERS, and for California by extension, do not completely report all liabilities.
A good use of this source is for high-level review by lay analysts or journalists of how state-level unfunded liabilities have trended in recent history. However, the delays involved in Pew’s data collection and quality control processes prevent these data from being a good source for up-to-date funded status information.
Urban Institute’s State and Local Employee Pension Plan Database (SLEPP)
Overview: The Public Pension Project at the Urban Institute has previously compiled a database of benefit-centric information for public retirement systems. Their data include traditional elements of defined benefit pensions such as vesting points, benefit multipliers, retirement eligibility, and final average salary components. This information is now significantly out of date, as of Winter 2023.
Trade-offs of Urban’s SLEPP Data
- Strength: When the SLEPP database was published it provided the most comprehensive collection of benefit design data available.
- Strength: Benefit data are broken down at the tier-level allowing for a more complete reflection of benefits offered by different plans.
- Strength: SLEPP’s data are organized in a manner that renders separate documentation largely unnecessary. Data are sourced clearly, often with links to respective plan websites and documents when available.
- Limitation: The data are only updated sporadically. Urban indicates its data were last updated in 2018 but gives no clear indication that another update is pending. Further, the last update did not include all plan design changes that were adopted in 2017 and 2018.
- Limitation: The data only focus on benefit design, and do not include any robust financial data at the system, plan, or tier-level.
A good use of this source is for a comparison of benefit levels across various pension plans by academic researchers or interested stakeholders. These data are not useful for analyses of pension finances.
From the descriptions above it should be clear that data are readily available for research into public pensions. We recommend that those with the available time and resources reference the primary source documents directly, as those are the most reliable sources for accurate data. However, for those looking for more ready-made data, there are numerous quality options available that can help inform a variety of research questions and policy debates. At Equable we encourage those interested in the issue of public pensions to investigate the data for themselves, to cut through the debate and see where potential challenges for their retirement system may exist. We have detailed the trade-offs across the best data sources available to help identify which data are ideal for different research questions. We acknowledge that this list is not exhaustive, and we invite you to let us know about any others that we might have missed in our outline of our pension data best practices.