Scale or Die 2025: Solving The Pains of Indexing On Solana
By accelerate-25
Published on 2025-05-19
The Graph's Substreams solution addresses major challenges in indexing Solana blockchain data, offering significant performance improvements and developer-friendly features.
In a presentation at Scale or Die 2025, Giuliano Francescangeli, Substreams Product Manager at The Graph, walked through how Substreams addresses the most pressing challenges developers face when indexing data on the Solana blockchain. The technology aims to make building applications on Solana faster and simpler, potentially unlocking a new wave of innovation in the ecosystem.
Summary
The Graph's Substreams technology addresses five major pain points in Solana indexing: real-time indexing, historical data processing, handling blockchain reorganizations (reorgs), accessing account state, and decoding Interface Description Language (IDL) data. By tackling these issues, Substreams offers a comprehensive solution that could significantly reduce development time and complexity for Solana projects.
Francescangeli highlighted the unique challenges posed by Solana's high-speed, high-throughput architecture, which requires near-instantaneous data processing to keep up with the blockchain's rapid pace. Traditional indexing methods often fall short, leading to delays and missed data. Substreams' approach involves parallel processing, efficient data compression, and innovative handling of blockchain-specific issues like reorgs and account state changes.
The presentation cited an average drift of just 1.5 seconds from the blockchain's head, dropping to as little as 600 milliseconds in some cases. For historical data processing, Substreams claims a 10,000% performance boost compared to traditional methods, potentially reducing processing times from weeks to days.
Key Points:
Real-time Indexing Solutions
Solana's high transaction throughput and short block times present a significant challenge for real-time indexing. Traditional approaches using RPCs or streaming services like Yellowstone often fall behind quickly, leading to missed data and inconsistencies.
Substreams tackles this issue with a high-performance, deterministic approach. It averages just 1.5 seconds of drift from the blockchain's head, and as little as 600 milliseconds in some cases, so indexers can keep pace with Solana's rapid block production. Using gRPC with Protocol Buffers for message encoding further optimizes data transmission, reducing bandwidth requirements and improving overall efficiency.
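To make the drift figure concrete, it can be read as the gap between when a block was produced and when the indexer finished handling it. The following is a minimal sketch of that measurement, not part of any Substreams API: the function names and the sample timestamps are made up for illustration.

```python
from statistics import mean


def drift_seconds(block_time: float, processed_at: float) -> float:
    """How long after block production the indexer handled the block."""
    return max(0.0, processed_at - block_time)


def summarize_drift(samples: list[tuple[float, float]]) -> dict[str, float]:
    """Aggregate drift over (block_time, processed_at) pairs."""
    drifts = [drift_seconds(b, p) for b, p in samples]
    return {"avg": mean(drifts), "min": min(drifts), "max": max(drifts)}


# Made-up (block_time, processed_at) pairs in Unix seconds, for illustration only.
samples = [(100.0, 100.6), (100.4, 102.0), (100.8, 103.1)]
stats = summarize_drift(samples)
```

An indexer that tracks this number per block can alert when the average starts to climb, which is usually the first sign that it is falling behind the head.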
Historical Data Processing
Processing historical blockchain data on Solana has been a major bottleneck for many projects. The sheer size of Solana blocks (100 blocks can total 40 megabytes or more) and the complexity of data structures like inner instructions make this task particularly challenging.
Substreams introduces a revolutionary approach to historical data processing. By grouping blocks into 1,000-block flat files and allocating workers to process these files in parallel, Substreams achieves a remarkable 10,000% performance boost compared to traditional linear processing methods. This improvement can reduce processing times from weeks to mere days, significantly accelerating development and data analysis tasks.
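The idea of splitting history into fixed-size segments and fanning them out to workers can be sketched as follows. This is an illustrative sketch only, not how Substreams is implemented: `process_segment` is a hypothetical stand-in for reading and decoding one flat file, and the worker count is an arbitrary assumption.

```python
from concurrent.futures import ThreadPoolExecutor

SEGMENT_SIZE = 1_000  # blocks per flat file, the figure given in the talk


def segments(start: int, stop: int, size: int = SEGMENT_SIZE) -> list[range]:
    """Split [start, stop) into ranges aligned to multiples of `size`."""
    out = []
    cursor = start
    while cursor < stop:
        end = min((cursor // size + 1) * size, stop)
        out.append(range(cursor, end))
        cursor = end
    return out


def process_segment(blocks: range) -> int:
    """Hypothetical stand-in for decoding one flat file; it only counts blocks."""
    return len(blocks)


def backfill(start: int, stop: int, workers: int = 8) -> int:
    """Process every segment in parallel and merge the per-segment results."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() yields results in segment order, so the merge stays deterministic.
        return sum(pool.map(process_segment, segments(start, stop)))
```

For example, `segments(2500, 5200)` yields four ranges (one partial segment at each end and two full ones), and `backfill(2500, 5200)` counts all 2,700 blocks. Because each segment is independent, throughput scales with the number of workers rather than with the length of the chain.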
Reorg Handling
Blockchain reorganizations (reorgs) are a common occurrence on Solana, presenting a significant challenge for data consistency and reliability. Traditional approaches often involve waiting for finalized blocks or implementing complex custom signals and database management systems.
Substreams simplifies reorg handling with an in-memory replication of the blockchain and its branches. This approach allows for flexible, safe, and easy management of reorgs without imposing additional burden on developers. Each unique Substreams request maintains its own chain representation, eliminating conflicts between different users and ensuring data consistency.
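One way to picture an in-memory chain with branches is a parent-pointer map plus a current head: when a new head arrives on a different branch, the tracker reports which blocks to undo and which to apply. The sketch below is an assumption-laden illustration, not Substreams' actual data structure. `ForkTracker` and its methods are hypothetical names, and fork choice is not modeled: every incoming block is assumed to be the new head chosen upstream.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ForkTracker:
    """Keeps every seen block's parent so branches can be compared in memory."""
    parents: dict[str, Optional[str]] = field(default_factory=dict)
    head: Optional[str] = None

    def _ancestors(self, block_id: Optional[str]) -> list[str]:
        """Walk parent pointers from block_id back to the oldest known block."""
        chain = []
        cur = block_id
        while cur is not None:
            chain.append(cur)
            cur = self.parents.get(cur)
        return chain

    def add_block(self, block_id: str, parent_id: Optional[str]) -> tuple[list[str], list[str]]:
        """Record a block, move head to it, and return (undo, apply) lists."""
        self.parents[block_id] = parent_id
        old_chain = self._ancestors(self.head)
        new_chain = self._ancestors(block_id)
        on_new, on_old = set(new_chain), set(old_chain)
        undo = [b for b in old_chain if b not in on_new]  # newest first
        apply = [b for b in reversed(new_chain) if b not in on_old]  # oldest first
        self.head = block_id
        return undo, apply
```

Feeding blocks `a`, `b` (child of `a`), and `c` (child of `b`), then `b2` (also a child of `a`), makes the last call return `(["c", "b"], ["b2"])`: the consumer rolls back two blocks and applies one. Keeping one such tracker per request mirrors the point that each request maintains its own chain representation.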
Account State Access
Solana's account-based model offers advantages in terms of parallelization but presents challenges when tracking historical state changes. Existing solutions often lack access to historical data and may face issues with large or frequent requests.
Substreams addresses these limitations by providing a three-month moving window of historical account state data. This feature allows developers to back-process state changes, track ownership transfers, and access the most recent account changes efficiently. By rounding account changes to the block level, Substreams reduces data overhead and simplifies state tracking for developers.
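Rounding account changes to the block level and keeping a moving window can be illustrated with two small functions. This is a sketch under stated assumptions rather than the Substreams implementation: the function names are hypothetical, and the window size in slots assumes a slot time of roughly 400 milliseconds.

```python
# Roughly 90 days of slots, assuming about 400 ms per slot (an assumption).
WINDOW_SLOTS = 90 * 24 * 3600 * 1000 // 400


def last_change_per_block(changes):
    """Collapse (slot, account, data) writes, given in execution order,
    to the final write per account in each block."""
    latest = {}
    for slot, account, data in changes:
        latest[(slot, account)] = data  # later writes in the same slot overwrite
    return latest


def prune_window(latest, head_slot, window_slots=WINDOW_SLOTS):
    """Drop entries older than the moving window behind the head slot."""
    cutoff = head_slot - window_slots
    return {key: data for key, data in latest.items() if key[0] >= cutoff}


# Made-up writes: account "A" is written twice in slot 10.
changes = [(10, "A", b"1"), (10, "A", b"2"), (10, "B", b"x"), (11, "A", b"3")]
latest = last_change_per_block(changes)
```

Here the two writes to account `A` in slot 10 collapse to a single entry holding `b"2"`, which is the data-reduction effect described above: consumers see one state per account per block instead of every intermediate write.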
IDL Compatibility and Decoding
The lack of standardization in Interface Description Language (IDL) formats and frequent breaking changes in Solana's ecosystem pose significant challenges for developers. Manual importing and mapping of instructions and events can be time-consuming and error-prone.
Substreams aims to automate IDL handling and decoding, addressing edge cases and streamlining the development process. By decoding data at the point of extraction and offering community support for unsupported IDLs, Substreams significantly reduces the burden on developers in managing versioning and compatibility issues.
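To show what IDL-driven decoding involves at its simplest, the sketch below maps the first 8 bytes of instruction data to an instruction name. It assumes the common Anchor convention in which the discriminator is the first 8 bytes of `sha256("global:<instruction_name>")` and assumes the IDL lists instruction names in snake_case. Programs with custom discriminators or other IDL layouts would need different handling, which is exactly the kind of edge case the talk describes. The IDL dictionary and helper names are hypothetical.

```python
import hashlib
from typing import Optional


def anchor_discriminator(ix_name: str) -> bytes:
    """First 8 bytes of sha256("global:<name>"), the assumed Anchor convention."""
    return hashlib.sha256(f"global:{ix_name}".encode()).digest()[:8]


def build_lookup(idl: dict) -> dict[bytes, str]:
    """Index an IDL's instructions by discriminator."""
    return {anchor_discriminator(ix["name"]): ix["name"] for ix in idl["instructions"]}


def decode_ix_name(lookup: dict[bytes, str], data: bytes) -> Optional[str]:
    """Return the instruction name for raw instruction data, or None if unknown."""
    return lookup.get(data[:8])


# Hypothetical, heavily trimmed IDL for illustration only.
idl = {"instructions": [{"name": "initialize"}, {"name": "swap"}]}
lookup = build_lookup(idl)
raw = anchor_discriminator("swap") + b"\x01\x02\x03"  # discriminator + payload
```

With this lookup, `decode_ix_name(lookup, raw)` returns `"swap"`, while data from an instruction missing from the IDL returns `None`. Decoding the arguments that follow the discriminator is the harder part and depends on the IDL version, so it is left out here.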
Facts + Figures
- Substreams achieves an average of 1.5 seconds drift from Solana's blockchain head
- In some cases, Substreams' drift drops to as little as 600 milliseconds behind the head
- Historical data processing sees a 10,000% performance boost with Substreams
- 100 Solana blocks can be up to or above 40 megabytes in size
- Substreams processes historical data in groups of 1,000-block flat files
- A three-month moving window of account state history is provided by Substreams
- Substreams is natively Rust-based, enhancing compatibility with Solana's ecosystem
- The solution can reduce historical data processing times from weeks to days
- Substreams uses gRPC protocol with protocol buffers for efficient data transmission
- The technology addresses five major pain points in Solana indexing: real-time indexing, historical data processing, reorg handling, account state access, and IDL compatibility
Top quotes
- "If you're not sitting extraction next to execution, you're missing out right away."
- "We really tried to push performance while maintaining determinism."
- "What we've seen in recent history is that 100 blocks may be up to or above 40 megabytes."
- "We're seeing 10,000% performance boost. You have to think weeks to days. It's crazy."
- "Reorgs will vary in size. The legacy approach to solving this problem is either you're going to wait for finalized blocks, but this again puts you quite behind head, or you implement custom signals."
- "We have an in-memory replication of the chain and its branches."
- "The account state model in Solana is great because you don't have this monolithic state structure like you have on other ecosystems like EVM."
- "We're really aiming for ease of mind here."
- "The core crux of the problem is the versioning itself. [...] Breaking changes are imposed on developers every week."
- "The most important thing is the user experience. The user experience is affected. It's very important. And especially on a chain like Solana, that's not acceptable."
Questions Answered
What is Substreams and how does it help with Solana indexing?
Substreams is a technology developed by The Graph to address major challenges in indexing Solana blockchain data. It offers solutions for real-time indexing, historical data processing, handling blockchain reorganizations, accessing account state, and decoding IDL data. By tackling these issues, Substreams significantly improves the speed and efficiency of data indexing on Solana, allowing developers to build applications more easily and quickly.
How does Substreams improve real-time indexing on Solana?
Substreams achieves near real-time indexing on Solana by maintaining an average of just 1.5 seconds of drift from the blockchain's head, and as little as 600 milliseconds in some cases. It uses a high-performance, deterministic approach and employs gRPC with Protocol Buffers for efficient data transmission. This ensures that indexers can keep up with Solana's high transaction throughput and short block times, which is crucial for maintaining data consistency and providing a smooth user experience.
What performance improvements does Substreams offer for historical data processing?
Substreams provides a remarkable 10,000% performance boost for historical data processing compared to traditional methods. It achieves this by grouping blocks into 1,000-block flat files and allocating workers to process these files in parallel. This approach can reduce processing times from weeks to mere days, significantly accelerating development and data analysis tasks on the Solana blockchain.
How does Substreams handle blockchain reorganizations (reorgs)?
Substreams simplifies reorg handling with an in-memory replication of the blockchain and its branches. This approach allows for flexible, safe, and easy management of reorgs without imposing additional burden on developers. Each unique Substreams request maintains its own chain representation, eliminating conflicts between different users and ensuring data consistency. This is a significant improvement over traditional methods that often require waiting for finalized blocks or implementing complex custom signals.
What solutions does Substreams offer for accessing account state on Solana?
Substreams provides a three-month moving window of historical account state data, allowing developers to back-process state changes and track ownership transfers efficiently. It also rounds account changes to the block level, reducing data overhead and simplifying state tracking. This approach addresses the limitations of existing solutions that often lack access to historical data and may face issues with large or frequent requests.
How does Substreams address IDL compatibility and decoding issues?
Substreams aims to automate IDL handling and decoding, addressing edge cases and streamlining the development process. It decodes data at the point of extraction and offers community support for unsupported IDLs. This significantly reduces the burden on developers in managing versioning and compatibility issues, which are particularly challenging in the Solana ecosystem due to frequent breaking changes and lack of standardization in IDL formats.
Why is Substreams important for developers building on Solana?
Substreams is crucial for Solana developers because it addresses major pain points in data indexing, which is a fundamental requirement for many blockchain applications. By simplifying and accelerating indexing processes, Substreams allows developers to focus more on building innovative features and improving user experience, rather than dealing with complex infrastructure issues. This can potentially lead to faster development cycles, more robust applications, and ultimately, a more vibrant Solana ecosystem.