Several years ago, Equable published a primer on public pension data with an eye toward helping researchers, legislators, and other public retirement stakeholders identify the best data sources available. Since that primer’s publication, Equable’s research team has sought to compile, in a single place, as much publicly available data as possible on public retirement system finances and benefit provisions. The goal was, and still is, quite simple: provide reliable data so that discussions around public retirement systems can be objective, structured, and well-informed.
Highly accurate, transparent, quality data are essential for policy research, as there are incentives for individuals and groups with conflicting interests to selectively filter data to reach preconceived ends. Although data on their own serve as an objective source of information, how they are used can have an outsized effect on the policy discussions they aim to inform. This is especially applicable for potentially contentious discussions around public retirement systems.
As a result, it is more important than ever that financial data be used transparently and appropriately in research. It is also important to acknowledge that some sources of data are better than others. Furthermore, some data sources are better suited to certain kinds of analyses, and even the best possible data have limitations that may affect whether they are appropriate for a given project.
One of Equable’s primary missions is to provide quality information, data, and education related to public retirement systems. Our research team is committed to transparency in data utilization. To meet this commitment, the data compiled for use in our research are made openly available on our website.
How We Evaluate Pension Data Sources
Public retirement data, like most other data, can be broadly classified into two categories: primary source data and secondary data (typically aggregated by another source, such as a research institution).
Primary Source Data
Primary source data are based on the documents and information provided by public retirement systems. These documents include actuarial valuation reports, comprehensive annual financial reports, Governmental Accounting Standards Board (GASB) disclosure reports, investment reports, supplemental actuarial analyses, financial statements, and experience studies.
These reports offer the audited financials, actuarially calculated funding data, and the direct research findings and statements from retirement systems. As a result, these can be thought of as the most accurate, unvarnished data for any given pension or other retirement plan.
However, while these data are the most credible, they are often the most cumbersome and resource-intensive to collect and use in comparative analyses.
Secondary Data
Secondary data sources offer data that have been collected and processed by a third party. Frequently, secondary databases rely on primary source documents. But, in some cases, institutions will collect primary data, transform and/or combine them, and release those data based on their own methodology.
There are numerous benefits to using secondary data sources. For example, obtaining the data requires far less time and fewer resources. However, secondary data sources require the researcher to accept any limitations that stem from who compiled the data and what methodology they used.
Why We Prefer Primary Source Data
To the extent possible, stakeholders in public sector retirement systems should utilize primary source data. But this can require specific knowledge about how to interpret retirement system reports or significant investments of time and other resources to compile the data needed. Secondary data sources provide opportunities for research and understanding that can inform the discussion around public retirement systems more broadly with a much lower bar for entry.
At Equable, our databases on retirement system finances and benefit provisions utilize primary source data that are then analyzed for our research. We release all data associated with each research product that we publish. And our entire finance and benefit databases are available through our website for those who are interested in utilizing them for their own work.
Understanding Secondary Sources for Public Pension Data
Below, we review secondary data sources related to public sector retirement systems. For each source, we offer a brief overview of what data they provide, their strengths, and their limitations.
The intent of this primer is to show the various trade-offs that come with the data from each source, as well as to provide comments on what appropriate (and inappropriate) utilization is for each source of information.
Equable’s Public Retirement Research Database
Equable has compiled multiple sizable datasets within its Public Retirement Research Database to help inform public retirement researchers, policymakers, and other stakeholders. These data are parsed into several different pieces, detailed below, out of necessity for data organization.
Equable’s Finance Database
This is one of the most reliable sources for state and local pension finance data at the “plan level.” These data include a collection of 110 variables capturing everything from plan assets and liabilities to investment returns, asset allocations, actuarial valuation methods, amortization periods, and more. These data are captured to the extent primary source documents have been made available for more than 260 state and municipally administered plans covering the years 2000 through the present, with updates coming multiple times each year.
Equable’s Benefits Database
This is the single most comprehensive collection of data capturing plan benefit provisions available to date. These data are organized at the “benefit tier” level and include more than 3,000 different combinations of benefits across the more than 260 state and municipal plans in the Equable Finance Database. These data generally do not vary by year; they reflect the majority of benefit offerings that have been provided by public retirement systems from 1965 through the present.
Equable’s Retirement Security Report Datasets
These datasets represent a subset of Equable’s Benefits Database, reflecting the nearly 2,000 tiers of benefits that were analyzed in our Retirement Security Report.
Equable’s Sources of State Unfunded Liabilities Database
This dataset consists of the actuarial “gain/loss” data published by most public pension plans annually as part of their actuarial valuation reports. These data capture the actuarially determined causes of changes to the liabilities and assets of plans for a given year. They are organized at the plan level and cover as many years as possible from 2000 through 2024; the most recent years tend to lag by one to two years, following the publication of plan actuarial valuations. We note, however, that these data are limited to a subset of the plans currently included in Equable’s Finance Database and currently do not include most municipal plans.
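As a rough illustration of how gain/loss data might be read, the sketch below sums a set of gain/loss components for one plan-year and checks how much of the year-over-year change in unfunded liability they explain. The component names and dollar figures are hypothetical and invented for illustration; they are not drawn from Equable's database or from any plan's valuation, and actual valuation reports categorize gains and losses in their own ways.

```python
# Hypothetical gain/loss components for one plan-year, in $ millions.
# Convention assumed here: positive values increase the unfunded
# liability, negative values reduce it.
components = {
    "investment_experience": 120.0,
    "assumption_changes": 45.0,
    "demographic_experience": -15.0,
    "contribution_shortfall": 30.0,
}


def reconcile(prior_unfunded, current_unfunded, components):
    """Return (explained, unexplained) change in unfunded liability."""
    explained = sum(components.values())
    unexplained = (current_unfunded - prior_unfunded) - explained
    return explained, unexplained


# Hypothetical unfunded liabilities at the start and end of the year.
explained, unexplained = reconcile(1000.0, 1180.0, components)

# The component with the largest absolute effect on the unfunded liability.
largest = max(components, key=lambda name: abs(components[name]))

print(f"Explained change: {explained:.1f}")
print(f"Unexplained change: {unexplained:.1f}")
print(f"Largest driver: {largest}")
```

A nonzero unexplained amount would suggest that a component is missing or that the figures come from different valuation dates, which is worth checking against the source document.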
Trade-Offs of Equable Data
Strengths
- Data are compiled directly from primary sources published by public retirement systems.
- Data are updated frequently, and the number of systems covered expands almost every year to include more plans.
- Data are well organized with clear documentation and sourcing, variable guides, and even subsets of data pre-sectioned for easier use.
Limitations
- Data are not currently published via an interactive tool that allows for analysis or visualization on the Equable website.
- Data are currently updated every three to six months, but we plan to offer more regular updates in the coming years.
- Benefit data are organized in a less-than-intuitive way in some cases, making the documentation essential for anyone looking to use them.
We acknowledge our bias toward data that we compile. The simple fact is that our data are publicly available and, in most cases, reflect the raw primary sources as published by the retirement systems. Equable is committed to transparency in both our data and methodology, with detailed documentation of sources drawn upon and any methodological adjustments that have been made.
The Center for Retirement Research at Boston College’s Public Plans Database
The Center for Retirement Research has compiled one of the single most reliable sources for plan-level state and local pension finance data in its Public Plans Database (PPD). PPD data include a wealth of different funding variables, ranging from actuarial and market values of plan assets to investment returns, funded ratios, and unfunded liabilities covering the years from 2001 to the present.
PPD is organized to provide specific data points for individual defined benefit pension plans. Its data can be aggregated to the “system level” but do not cover financials at the “tier level” within plans, such as where a pension plan has different classes of benefits based on entry age. This is often because plans do not provide a breakdown of financial data by benefit tier. PPD also publishes separate tables with investment portfolio allocations and will in the future expand its coverage to include other kinds of public sector retirement plans.
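Aggregating plan-level records to the system level can be sketched as follows. This is a minimal example under stated assumptions: the records, field names, and dollar figures are hypothetical and do not reflect PPD's actual column names or file layout. It sums assets and liabilities within each system and then computes the funded ratio (assets divided by liabilities) and unfunded liability (liabilities minus assets) from those totals.

```python
from collections import defaultdict

# Hypothetical plan-level records, in $ millions (not real PPD data).
plans = [
    {"system": "System A", "plan": "Teachers", "assets": 800.0, "liabilities": 1000.0},
    {"system": "System A", "plan": "Police", "assets": 150.0, "liabilities": 250.0},
    {"system": "System B", "plan": "General", "assets": 450.0, "liabilities": 500.0},
]


def aggregate_by_system(plans):
    """Sum plan assets and liabilities per system, then derive funding measures."""
    totals = defaultdict(lambda: {"assets": 0.0, "liabilities": 0.0})
    for plan in plans:
        totals[plan["system"]]["assets"] += plan["assets"]
        totals[plan["system"]]["liabilities"] += plan["liabilities"]

    result = {}
    for system, t in totals.items():
        result[system] = {
            "assets": t["assets"],
            "liabilities": t["liabilities"],
            "unfunded_liability": t["liabilities"] - t["assets"],
            # Ratio of summed assets to summed liabilities, not an
            # average of the individual plan ratios.
            "funded_ratio": t["assets"] / t["liabilities"],
        }
    return result


systems = aggregate_by_system(plans)
for name, measures in systems.items():
    print(name, f"{measures['funded_ratio']:.1%}", measures["unfunded_liability"])
```

The design choice worth noting is that the system-level funded ratio is computed from summed dollars. Averaging the plan ratios directly would give small plans the same weight as large ones and generally produces a different number.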
PPD data are available in a variety of forms, including a downloadable archive of the primary source documents in PDFs, a data interactive on its website, and downloadable versions of its data for analysis offline.
Trade-Offs of PPD Data
Strengths
- Data are compiled directly from primary source documents published by the retirement systems.
- Data are updated frequently, and plans covered have been expanded over time.
- Data are available in various forms, including through an online interactive, downloadable data files, and even PDFs of the primary source documents.
Limitations
- Data do not capture plan benefit provisions.
- Data are recorded at the plan level and are not separated across tiers of benefits. As a result, some variables that can differ across tiers, such as contribution rates, may report weighted averages that require documentation to properly contextualize.
- Documentation is difficult to work with and incomplete in some cases.
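To see why a plan-level weighted average needs context, consider the sketch below. It assumes, purely for illustration, that tier contribution rates are weighted by covered payroll; the weighting that any given dataset actually uses should be confirmed in its documentation. The tier names, rates, and payroll figures are hypothetical.

```python
# Hypothetical benefit tiers within one plan. Rates are employee
# contribution rates as a share of pay; payroll is in $ millions.
tiers = [
    {"tier": "Tier 1 (legacy)", "rate": 0.08, "payroll": 300.0},
    {"tier": "Tier 2 (new hires)", "rate": 0.06, "payroll": 100.0},
]


def payroll_weighted_rate(tiers):
    """Average contribution rate across tiers, weighted by covered payroll."""
    total_payroll = sum(t["payroll"] for t in tiers)
    weighted_sum = sum(t["rate"] * t["payroll"] for t in tiers)
    return weighted_sum / total_payroll


def simple_average_rate(tiers):
    """Unweighted average of tier rates, shown for comparison."""
    return sum(t["rate"] for t in tiers) / len(tiers)


weighted = payroll_weighted_rate(tiers)
unweighted = simple_average_rate(tiers)

print(f"Payroll-weighted rate: {weighted:.2%}")
print(f"Simple average rate: {unweighted:.2%}")
```

In this made-up case, no member actually contributes at the plan-level average rate, which is why a single reported figure can mislead when tiers differ.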
PPD offers an excellent source for up-to-date measures of nationwide funded status of state and city pension funds. It has been a leader in providing these data for almost 20 years. These data should always be under consideration for complex, quantitative analysis of plan-level finances. However, these data are not appropriate for research that would examine benefit designs or that would require tier-level data.
National Association of Retirement System Administrators Datasets
The National Association of Retirement System Administrators (NASRA) is a professional association for the people who run retirement systems. The NASRA Research Center frequently publishes analyses of pension policies and funding using data provided by its members. Its data are not limited to a specific topic area related to public retirement systems, as its research is quite varied and requires different data for its analyses.
Its data could be used to research a variety of different topics and include numerous different datasets ranging from economic indicators to survey results, industry-specific reports, retirement system policies, assumptions used by plan actuaries, and more.
Trade-Offs of NASRA Data
Strengths
- Data are often self-reported from members, meaning that data are comparable to those reported in primary source documents, but can feature other information not normally included in published reports.
- NASRA has compiled comprehensive data on plan design changes, allowing for analyses of the funding policies related to retirement systems. Many of its datasets include narrative information to provide valuable background to researchers.
- NASRA conducts research across numerous issues related to pension funding, policies, and benefits. This means data across numerous topic areas are available from NASRA.
Limitations
- There is no central dataset. Rather, numerous separate, smaller datasets are spread across multiple documents.
- Most datasets lack detailed documentation for confirmation of data points. Raw sources and data files are not always provided, as datasets are sometimes presented in the form of charts and summary tables included in reports.
- Datasets provided do not have consistent coverage of systems, plans, and tiers, so compiling data across all sources is not always possible. There are sound reasons for this in the way various datasets are presented, but it is a limitation.
An effective use of this source is to review specific issues, like current investment assumptions, from the largest state and city pension funds. NASRA data are well suited to examine the histories of changes to retirement systems by state. However, these data are less useful for analyses across variables for all states.
Census Bureau’s Annual Survey of Public Pensions
The Census Bureau offers a collection of data from its Annual Survey of Public Pensions (ASPP) for all public retirement systems identified as being sponsored or supported by state and local governments. These data offer measures of funding, liabilities, and more from 1993 through 2019.
Trade-Offs of Census Bureau ASPP Data
Strengths
- Data are compiled from 1993 for many variables, and even earlier for others, while other databases only provide coverage from 2001 forward.
- Data include 304 state-administered and another 4,632 municipally administered plans, giving the most complete coverage of plans among all secondary data sources.
- Data are government-compiled with the same rigor as other Census data, are updated regularly, and include detailed methodologies related to data collection and transformations.
Limitations
- Data are not compiled from published plan documents, instead relying on a rotating survey of plans. Each year, a sample of plans reports information, and values are imputed for the plans not sampled. The surveys are sometimes sent to different state and municipal government departments than those that publish actuarial reports. As a result, many of the data points may not line up with the reports that states have published themselves (though the variances are rarely so substantive as to render the data source useless).
- Data are limited to state-level aggregations in primary data releases, requiring inquiry to Census or use of flat files to access more nuanced, plan-level data. This renders much of the Census data inaccessible to most stakeholders.
- Census frequently revises its data and methodology. This can result in the misclassification of some data across plans or over time, which researchers must be careful to account for when reviewing the provided documentation.
An effective use of Census data is for high-level analyses of total assets or liabilities of state and city pension funds. However, these data are not a good source for plan-specific analyses or complex quantitative analysis over time because of the inconsistency in data collection methodology. Data are limited in their accessibility but hold the promise of greater coverage across more plans than other secondary sources.
Conclusion
From the descriptions above, it should be clear that data are readily available for research into public retirement systems. We recommend that those with the available time and resources directly reference the primary source documents, as those are the most reliable sources for accurate data. However, for those looking for more ready-made data, there are numerous quality options available that can help inform a variety of research questions and policy debates.
In this primer, we have detailed the trade-offs across the best data sources available to help identify which data are ideal for different research questions. This list is not exhaustive, as there are several other secondary datasets available from other non-profit organizations, such as the Pew Charitable Trusts, the Reason Foundation, and the Urban Institute. As with the data sources highlighted here, there will be strengths and weaknesses to each of the other data sources we have opted not to detail in this primer. We invite you to let us know about any others that we might have missed in our outline of the available pension data sources.
Most importantly, we encourage those interested in public pensions to investigate the data for themselves, cutting through the debate to see where potential challenges for their retirement systems may exist. While independently analyzing the data can take longer, quality secondary data sources do exist to help make comparative and longitudinal analyses easier for even novice researchers. The availability of these data helps ensure that stakeholders can all have informed discussions regarding their retirement systems, their funded status, benefit provisions, and the policies that make their plans work.
Notes
- We are not opining on the character of the researchers who have compiled these sources and assume they are acting with the highest standards of integrity in producing their data. Any positive comments are not necessarily an endorsement of the accuracy of the datasets, nor are any negative comments intended to undermine the researchers or institution publishing the data.