Announcing Open Dev Data
The most comprehensive open source developer dataset in crypto is now public.
We are thrilled to announce the launch of Open Dev Data.
Open Dev Data is an open source platform providing continuously updated datasets and tools to measure developer activity across crypto and the decentralized web. It gives protocols, foundations, data scientists, and analysts a single source of truth to understand and communicate the traction of their developer ecosystems.
FROM ELECTRIC CAPITAL’S DEVELOPER REPORT TO OPEN PLATFORM
Since 2019, we have tracked open source crypto ecosystems to understand where developers are building. Our data pipelines now follow thousands of ecosystems, millions of developers, and hundreds of millions of commits in near real time. At this scale and complexity, a once-a-year PDF report is no longer the right format.
Starting in 2025, we are retiring the standalone annual Developer Report and replacing it with this continuously updating platform. By open sourcing not only the taxonomy but also the underlying data, we hope to give every ecosystem the tools it needs to understand its developer community and tell its story.
For the last six months, we have been piloting these data pipelines with several leading ecosystems in crypto. We are now opening this developer dataset to the public.
DeveloperReport.com is the first consumer of Open Dev Data, and it will continue to show real-time developer metrics for the top ecosystems. We hope to see many more consumers of the data: internal dashboards, ecosystem-focused reports like the one the Ethereum Foundation produced, and other integrations into data products.
Let’s look into the different areas of the Open Dev Data platform. The platform has two main components: a taxonomy and datasets built on top of it.
The foundation of the platform is the taxonomy, a community-curated repository cataloguing all of the ecosystems and repos in crypto and the decentralized web. What started as a project with a handful of internal committers in 2019 now has over 1,000 open source contributors helping source and validate the data.
In addition to the community contributions, we also run AI agent processes that search the web, git forges like GitHub and GitLab, and social media for new ecosystems and repositories to add to the taxonomy.
The datasets are built on top of the taxonomy by continuous data pipelines that collect repository metrics such as commits, developers, and stars. We perform commit fingerprinting and developer deduplication by examining the contents of each commit to determine its original repository and original authors. From that raw data, we derive growth accounting metrics such as Monthly Active Developers, which appear on DeveloperReport.com and now in the Open Dev Data tables. All datasets are produced in Parquet format, making them directly compatible with DuckDB, Spark, Databricks, or your BI stack for building custom dashboards and models.
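To make the fingerprinting and deduplication idea concrete, here is a minimal Python sketch. It is not the actual pipeline: the record fields (`repo`, `email`, `date`, `diff`), the alias map, and the use of calendar months are all assumptions chosen for illustration. It hashes each commit's author, date, and contents so that a copy of the same commit in a fork is counted once, maps known email aliases to one developer, and then counts distinct developers per month.

```python
import hashlib
from collections import defaultdict
from datetime import date


def fingerprint(email: str, authored_on: date, diff_text: str) -> str:
    """Hash the commit's author, date, and contents (a hypothetical fingerprint)."""
    payload = f"{email}|{authored_on.isoformat()}|{diff_text}".encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def monthly_active_developers(commits, aliases):
    """Count distinct developers per calendar month, skipping duplicate commits."""
    seen_fingerprints = set()
    active = defaultdict(set)
    for commit in commits:
        fp = fingerprint(commit["email"], commit["date"], commit["diff"])
        if fp in seen_fingerprints:
            continue  # same commit already counted, e.g. it also lives in a fork
        seen_fingerprints.add(fp)
        # Map secondary emails to one canonical identity (illustrative alias map).
        developer = aliases.get(commit["email"], commit["email"])
        active[commit["date"].strftime("%Y-%m")].add(developer)
    return {month: len(devs) for month, devs in sorted(active.items())}


# Made-up sample records, for illustration only.
commits = [
    {"repo": "org/core", "email": "ana@example.com", "date": date(2025, 1, 5), "diff": "+a"},
    {"repo": "fork/core", "email": "ana@example.com", "date": date(2025, 1, 5), "diff": "+a"},
    {"repo": "org/core", "email": "ana@work.example", "date": date(2025, 1, 20), "diff": "+b"},
    {"repo": "org/sdk", "email": "bo@example.com", "date": date(2025, 2, 2), "diff": "+c"},
]
aliases = {"ana@work.example": "ana@example.com"}

result = monthly_active_developers(commits, aliases)
print(result)
```

In this toy data, the fork copy of the first commit is skipped and both of Ana's email addresses resolve to one developer, so each month counts a single active developer.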
HOW TO ACCESS THE DATA
You can now find the dataset in Parquet format at https://opendevdata.org. The site documents each table in the dataset and provides download links.
You can ingest the Parquet files directly into tools like DuckDB, Databricks, Spark, or Tableau to build your own dashboards and models.
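As a rough sketch of what ingestion might look like with DuckDB's Python package, the snippet below builds a query over a downloaded Parquet file. The file name `commits.parquet` and the column names `ecosystem` and `committed_at` are hypothetical placeholders; check the table documentation on the site for the real schema before adapting it.

```python
def commits_per_month_sql(parquet_path: str) -> str:
    """Build a DuckDB query counting commits per ecosystem per month.

    The column names (ecosystem, committed_at) are assumptions, not the
    published schema.
    """
    safe_path = parquet_path.replace("'", "''")  # escape single quotes for SQL
    return (
        "SELECT ecosystem, "
        "date_trunc('month', committed_at) AS activity_month, "
        "count(*) AS commit_count "
        f"FROM read_parquet('{safe_path}') "
        "GROUP BY 1, 2 "
        "ORDER BY 1, 2"
    )


def run_query(parquet_path: str):
    """Execute the query with DuckDB and return the rows as tuples."""
    import duckdb  # third-party package: pip install duckdb

    return duckdb.sql(commits_per_month_sql(parquet_path)).fetchall()


# Hypothetical local file name, used here only to show the generated SQL.
print(commits_per_month_sql("commits.parquet"))
```

Calling `run_query("commits.parquet")` would execute the query once a real file with matching columns is in place. Keeping the SQL in its own function makes it easy to reuse the same query text in other Parquet-aware engines.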
Example data explorations could include:
Comparing developer activity across ecosystems over time.
Measuring the impact of grants programs, hackathons, or incentive campaigns.
Cross-referencing onchain deployments of code with activity on their associated git repos.
Tracking the growth of different programming languages in a particular ecosystem.
Building ecosystem health dashboards for foundations.
LICENSING
We have adopted a dual license structure that makes both the methods and the data easy to use in open source and commercial contexts.
Code: MIT License
All source code in the open-dev-data repository is available under the MIT License. You can use, modify, and redistribute the code, including in proprietary products, as long as you include the original copyright notice and license text.
Data: Creative Commons Attribution 4.0 (CC BY 4.0)
All datasets are licensed under CC BY 4.0. You can copy, redistribute, remix, transform, and build on the data for any purpose, including commercial applications, provided that you give appropriate credit, link to the license, and indicate if changes were made.
This structure removes unnecessary legal friction, supports reproducible research, enables commercial products that build on the dataset, and keeps attribution transparent as the community extends the work.
GET INVOLVED
Open source developers are the lifeblood of the decentralized web, and their activity is the most important health metric for our ecosystem. Whether you are a security researcher, a protocol foundation, a degen analyst, a startup founder, or part of an enterprise data team, you can build on this public good.
We can’t wait to see what you all build with Open Dev Data.




