Scale or Die 2025: Solving The Pains of Indexing On Solana

By accelerate-25

Published on 2025-05-19

The Graph's Substreams solution addresses major challenges in indexing Solana blockchain data, offering significant performance improvements and developer-friendly features.

The notes below are AI generated and may not be 100% accurate. Watch the video to be sure!

In a presentation at Scale or Die 2025, Giuliano Francescangeli, Substreams Product Manager at The Graph, laid out how Substreams addresses the most pressing challenges developers face when indexing data on the Solana blockchain. The technology aims to make building applications on Solana faster and easier, which could open the door to a new wave of projects in the ecosystem.

Summary

The Graph's Substreams technology addresses five major pain points in Solana indexing: real-time indexing, historical data processing, handling blockchain reorganizations (reorgs), accessing account state, and decoding Interface Description Language (IDL) data. By tackling these issues, Substreams offers a comprehensive solution that could significantly reduce development time and complexity for Solana projects.

Francescangeli highlighted the unique challenges posed by Solana's high-speed, high-throughput architecture, which requires near-instantaneous data processing to keep up with the blockchain's rapid pace. Traditional indexing methods often fall short, leading to delays and missed data. Substreams' approach involves parallel processing, efficient data compression, and innovative handling of blockchain-specific issues like reorgs and account state changes.

The presentation cited an average drift of 1.5 seconds from the blockchain's head, dropping to as little as 600 milliseconds in some cases. For historical data processing, Francescangeli reported a 10,000% performance boost over traditional linear methods, which he framed as cutting processing times from weeks to days.

Key Points:

Real-time Indexing Solutions

Solana's high transaction throughput and short block times present a significant challenge for real-time indexing. Traditional approaches using RPCs or streaming services like Yellowstone often fall behind quickly, leading to missed data and inconsistencies.

Substreams tackles this issue with a high-performance, deterministic approach. With an average drift of just 1.5 seconds from the blockchain's head, and as little as 600 milliseconds in some cases, indexers can keep pace with Solana's rapid block production. Messages are sent over gRPC and encoded with protocol buffers, which reduces bandwidth requirements and improves overall efficiency.
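To make "drift from the head" concrete, here is a minimal sketch of how an indexer might measure it: the gap between when a block was produced and when the indexer received it. The talk does not describe how its figures were measured, so the `BlockArrival` type, the field names, and the sample timestamps are all assumptions for illustration.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class BlockArrival:
    """Hypothetical record of one block reaching the indexer."""
    slot: int
    block_time: float   # seconds since epoch when the block was produced
    received_at: float  # seconds since epoch when the indexer received it


def drift_seconds(arrival: BlockArrival) -> float:
    """Drift is how far behind the chain head the indexer saw this block."""
    return arrival.received_at - arrival.block_time


def summarize_drift(arrivals):
    """Return average, best-case, and worst-case drift over a batch of arrivals."""
    drifts = [drift_seconds(a) for a in arrivals]
    return {"avg": mean(drifts), "min": min(drifts), "max": max(drifts)}


# Made-up sample data, chosen only to illustrate the calculation.
arrivals = [
    BlockArrival(slot=1, block_time=100.0, received_at=101.5),
    BlockArrival(slot=2, block_time=100.4, received_at=101.0),
    BlockArrival(slot=3, block_time=100.8, received_at=103.2),
]
summary = summarize_drift(arrivals)
```

Tracking the minimum alongside the average matters here because the two figures quoted in the talk (1.5 seconds and 600 milliseconds) describe typical and best-case behavior respectively.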

Historical Data Processing

Processing historical blockchain data on Solana has been a major bottleneck for many projects. The sheer size of Solana blocks (100 blocks can come to 40 megabytes or more) and the complexity of data structures like inner instructions make this task particularly challenging.

Substreams handles historical data by grouping blocks into 1,000-block flat files and allocating workers to process these files in parallel. According to the talk, this yields a 10,000% performance boost compared to traditional linear processing, reducing processing times from weeks to days and significantly accelerating development and data analysis tasks.
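The general idea of splitting a backfill into 1,000-block segments and handing them to parallel workers can be sketched as follows. This is not Substreams' implementation: `process_segment` is a hypothetical stand-in for reading one flat file and running indexing logic, and the worker count and thread pool are assumptions.

```python
from concurrent.futures import ThreadPoolExecutor

SEGMENT_SIZE = 1_000  # blocks per flat file, the figure given in the talk


def segments(start_block, end_block, size=SEGMENT_SIZE):
    """Split [start_block, end_block) into ranges aligned to `size`-block boundaries."""
    ranges = []
    cursor = start_block
    while cursor < end_block:
        # Next aligned boundary after the cursor, capped at the end of the request.
        boundary = min((cursor // size + 1) * size, end_block)
        ranges.append((cursor, boundary))
        cursor = boundary
    return ranges


def process_segment(block_range):
    """Hypothetical worker: a real one would read a flat file and index its blocks."""
    start, end = block_range
    return {"start": start, "end": end, "blocks": end - start}


def backfill(start_block, end_block, workers=8):
    """Process all segments in parallel and return the results in block order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() yields results in input order even if workers finish out of order.
        return list(pool.map(process_segment, segments(start_block, end_block)))


results = backfill(2_500, 6_200)
```

Because each segment is independent, throughput scales with the number of workers rather than with chain length, which is where a large speedup over linear processing would come from. For CPU-bound decoding in Python, a process pool would be the more realistic choice than threads.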

Reorg Handling

Blockchain reorganizations (reorgs) are a common occurrence on Solana, presenting a significant challenge for data consistency and reliability. Traditional approaches often involve waiting for finalized blocks or implementing complex custom signals and database management systems.

Substreams simplifies reorg handling with an in-memory replication of the blockchain and its branches. This approach allows for flexible, safe, and easy management of reorgs without imposing additional burden on developers. Each unique Substreams request maintains its own chain representation, eliminating conflicts between different users and ensuring data consistency.
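A rough sketch of what an in-memory view of the chain and its branches can look like: each block records its parent, and when the head moves to a different branch the tracker reports which blocks to undo and which to apply. The `ForkTracker` class and its method names are hypothetical, and it simply treats the newest block as the head, which is a simplification of real fork choice.

```python
class ForkTracker:
    """Hypothetical in-memory record of blocks and the branches they form."""

    def __init__(self, genesis_id):
        self.parent = {genesis_id: None}  # block id -> parent block id
        self.head = genesis_id

    def _ancestors(self, block_id):
        """Return the chain from `block_id` back to genesis, newest first."""
        chain = []
        while block_id is not None:
            chain.append(block_id)
            block_id = self.parent[block_id]
        return chain

    def add_block(self, block_id, parent_id):
        """Record a block, move the head to it, and return (undo, apply) lists.

        Assumption: the newest block always becomes the head.
        """
        if parent_id not in self.parent:
            raise KeyError(f"unknown parent {parent_id!r}")
        self.parent[block_id] = parent_id
        old_chain = self._ancestors(self.head)
        new_chain = self._ancestors(block_id)
        on_new, on_old = set(new_chain), set(old_chain)
        undo = [b for b in old_chain if b not in on_new]             # newest first
        apply = [b for b in reversed(new_chain) if b not in on_old]  # oldest first
        self.head = block_id
        return undo, apply


tracker = ForkTracker("g")
first = tracker.add_block("a1", "g")
second = tracker.add_block("a2", "a1")
reorg = tracker.add_block("b1", "g")  # head switches from the a-branch to the b-branch
```

Giving each request its own tracker instance mirrors the point that each Substreams request maintains its own chain representation, so one consumer's view of a reorg cannot interfere with another's.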

Account State Access

Solana's account-based model offers advantages in terms of parallelization but presents challenges when tracking historical state changes. Existing solutions often lack access to historical data and may face issues with large or frequent requests.

Substreams addresses these limitations by providing a three-month moving window of historical account state data. This feature allows developers to back-process state changes, track ownership transfers, and access the most recent account changes efficiently. By rounding account changes to the block level, Substreams reduces data overhead and simplifies state tracking for developers.
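Rounding account changes to the block level and keeping a moving window can be illustrated with a small sketch. It assumes changes arrive as `(slot, account, data)` tuples in execution order and that the window is expressed in slots; both are illustrative choices, since the talk describes the window only as three months.

```python
def round_to_block(changes):
    """Keep only the final state of each account within each slot.

    `changes` is an iterable of (slot, account, data) in execution order,
    so a later write in the same slot overwrites an earlier one.
    """
    latest = {}
    for slot, account, data in changes:
        latest[(slot, account)] = data
    return latest


def prune_window(block_level, head_slot, window_slots):
    """Drop entries that have fallen out of the moving window behind the head."""
    cutoff = head_slot - window_slots
    return {key: data for key, data in block_level.items() if key[0] > cutoff}


# Made-up account changes: acctA is written twice in slot 10.
changes = [
    (10, "acctA", b"\x01"),
    (10, "acctA", b"\x02"),
    (10, "acctB", b"\x09"),
    (11, "acctA", b"\x03"),
    (50, "acctB", b"\x0a"),
]
block_level = round_to_block(changes)
recent = prune_window(block_level, head_slot=50, window_slots=40)
```

Collapsing intra-block writes is what reduces data overhead: an account touched many times in one block contributes a single entry.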

IDL Compatibility and Decoding

The lack of standardization in Interface Description Language (IDL) formats and frequent breaking changes in Solana's ecosystem pose significant challenges for developers. Manual importing and mapping of instructions and events can be time-consuming and error-prone.

Substreams aims to automate IDL handling and decoding, addressing edge cases and streamlining the development process. By decoding data at the point of extraction and offering community support for unsupported IDLs, Substreams significantly reduces the burden on developers in managing versioning and compatibility issues.
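As an illustration of what IDL-driven decoding involves, the sketch below matches instruction data against an Anchor-style IDL. It assumes the Anchor convention that an instruction's discriminator is the first 8 bytes of the SHA-256 hash of `global:<instruction_name>`, followed by little-endian arguments. The IDL fragment and helper names are hypothetical, only a few integer types are handled, and the talk does not specify how Substreams itself performs decoding.

```python
import hashlib
import struct


def anchor_discriminator(instruction_name):
    """First 8 bytes of sha256("global:<name>"), the Anchor instruction convention."""
    return hashlib.sha256(f"global:{instruction_name}".encode()).digest()[:8]


# Hypothetical IDL fragment for illustration only.
IDL = {
    "instructions": [
        {"name": "initialize", "args": []},
        {"name": "deposit", "args": [{"name": "amount", "type": "u64"}]},
    ]
}

# Little-endian struct formats for the few primitive types this sketch supports.
STRUCT_FORMATS = {"u8": "<B", "u32": "<I", "u64": "<Q"}


def build_decoder(idl):
    """Map each instruction's discriminator to its IDL entry."""
    return {anchor_discriminator(ix["name"]): ix for ix in idl["instructions"]}


def decode_instruction(decoders, data):
    """Return the decoded instruction, or None if the IDL does not describe it."""
    ix = decoders.get(data[:8])
    if ix is None:
        return None
    offset, args = 8, {}
    for arg in ix["args"]:
        fmt = STRUCT_FORMATS[arg["type"]]
        args[arg["name"]] = struct.unpack_from(fmt, data, offset)[0]
        offset += struct.calcsize(fmt)
    return {"name": ix["name"], "args": args}


decoders = build_decoder(IDL)
raw = anchor_discriminator("deposit") + struct.pack("<Q", 1_000_000)
decoded = decode_instruction(decoders, raw)
```

The versioning problem described above shows up directly in a table like `decoders`: when a program renames an instruction or changes its arguments, the mapping has to be rebuilt from the matching IDL version, which is the manual work automated handling is meant to remove.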

Facts + Figures

  • Substreams achieves an average of 1.5 seconds drift from Solana's blockchain head
  • In some cases, drift from the head drops to as little as 600 milliseconds
  • Historical data processing sees a 10,000% performance boost with Substreams
  • 100 Solana blocks can be up to or above 40 megabytes in size
  • Substreams processes historical data in groups of 1,000-block flat files
  • A three-month moving window of account state history is provided by Substreams
  • Substreams is natively Rust-based, enhancing compatibility with Solana's ecosystem
  • The solution can reduce historical data processing times from weeks to days
  • Substreams uses gRPC protocol with protocol buffers for efficient data transmission
  • The technology addresses five major pain points in Solana indexing: real-time indexing, historical data processing, reorg handling, account state access, and IDL compatibility

Top quotes

  1. "If you're not sitting extraction next to execution, you're missing out right away."
  2. "We really tried to push performance while keeping maintaining determinism."
  3. "What we've seen in recent history is that 100 blocks may be up to or above 40 megabytes."
  4. "We're seeing 10,000% performance boost. You have to think weeks to days. It's crazy."
  5. "Reorgs will vary in size. The legacy approach to solving this problem is either you're going to wait for finalized blocks, but this again puts you quite behind head, or you implement custom signals."
  6. "We have an in-memory replication of the chain and its branches."
  7. "The account state model in Solana is great because you don't have this monolithic state structure like you have on other ecosystems like EVM."
  8. "We're really aiming for ease of mind here."
  9. "The core crux of the problem is the versioning itself. They impose breaking changes on and breaking changes are imposed on developers every week."
  10. "The most important thing is the user experience. The user experience is affected. It's very important. And especially on a chain like Solana, that's not acceptable."

Questions Answered

What is Substreams and how does it help with Solana indexing?

Substreams is a technology developed by The Graph to address major challenges in indexing Solana blockchain data. It offers solutions for real-time indexing, historical data processing, handling blockchain reorganizations, accessing account state, and decoding IDL data. By tackling these issues, Substreams significantly improves the speed and efficiency of data indexing on Solana, allowing developers to build applications more easily and quickly.

How does Substreams improve real-time indexing on Solana?

Substreams achieves near real-time indexing on Solana by maintaining an average drift of just 1.5 seconds from the blockchain's head, and as little as 600 milliseconds in some cases. It uses a high-performance, deterministic approach and sends messages over gRPC, encoded with protocol buffers, for efficient data transmission. This lets indexers keep up with Solana's high transaction throughput and short block times, which is crucial for maintaining data consistency and providing a smooth user experience.

What performance improvements does Substreams offer for historical data processing?

According to the talk, Substreams provides a 10,000% performance boost for historical data processing compared to traditional linear methods. It achieves this by grouping blocks into 1,000-block flat files and allocating workers to process these files in parallel. This approach can reduce processing times from weeks to days, significantly accelerating development and data analysis tasks on the Solana blockchain.

How does Substreams handle blockchain reorganizations (reorgs)?

Substreams simplifies reorg handling with an in-memory replication of the blockchain and its branches. This approach allows for flexible, safe, and easy management of reorgs without imposing additional burden on developers. Each unique Substreams request maintains its own chain representation, eliminating conflicts between different users and ensuring data consistency. This is a significant improvement over traditional methods that often require waiting for finalized blocks or implementing complex custom signals.

What solutions does Substreams offer for accessing account state on Solana?

Substreams provides a three-month moving window of historical account state data, allowing developers to back-process state changes and track ownership transfers efficiently. It also rounds account changes to the block level, reducing data overhead and simplifying state tracking. This approach addresses the limitations of existing solutions that often lack access to historical data and may face issues with large or frequent requests.

How does Substreams address IDL compatibility and decoding issues?

Substreams aims to automate IDL handling and decoding, addressing edge cases and streamlining the development process. It decodes data at the point of extraction and offers community support for unsupported IDLs. This significantly reduces the burden on developers in managing versioning and compatibility issues, which are particularly challenging in the Solana ecosystem due to frequent breaking changes and lack of standardization in IDL formats.

Why is Substreams important for developers building on Solana?

Substreams is crucial for Solana developers because it addresses major pain points in data indexing, which is a fundamental requirement for many blockchain applications. By simplifying and accelerating indexing processes, Substreams allows developers to focus more on building innovative features and improving user experience, rather than dealing with complex infrastructure issues. This can potentially lead to faster development cycles, more robust applications, and ultimately, a more vibrant Solana ecosystem.

