Arrow
Back

Building an On-Chain Token Analytics Tool – Gemmer Case Study

Building an On-Chain Token Analytics Tool - Gemmer Case Study
Gemmer Logo

Gemmer

Infrastructure

Gemmer is a private-access on-chain analytics platform for early-stage tokens. It tracks new token launches, early buyers, liquidity, and token distribution across multiple blockchains.
Technology Stack:
Table of contents

Key Requirements for Building a Token Analytics Tool

  • Detecting new tokens and liquidity pairs across multiple blockchains and DEXs in real time.
  • Tracking early buyers, categorizing wallets by behavior, and analyzing purchase patterns.
  • Aggregating and standardizing data from heterogeneous networks into consistent, structured signals.
  • Keeping end-to-end latency low so alerts could be generated within seconds of token creation.
  • Scale with high transaction volumes without impacting data consistency or signal accuracy.

System Architecture Overview for On-Chain Token Tracking

The system is built around a set of bots, each following the same core workflow:

  1. Pair creation monitoring

Each bot tracks the creation of new liquidity pairs in real time across its assigned DEXs and launchpads.

  1. Purchase aggregation

When a new token appears, the system records all purchases of that token in pairs with the blockchain’s native asset (e.g., SOL, ETH) or stablecoins (USDC, USDT).

  1. Signal trigger

Once the cumulative number of purchases for a token across all monitored DEXs reaches a predefined threshold, the system generates and dispatches a signal.

  1. Signal uniqueness

Each token generates only one signal, regardless of how many pairs or DEXs it appears on.

Bot Specification

Bot 1 (Solana): Pump.fun

Bot 2 (Solana): Pumpswap, Raydium, LaunchLab, Meteora, Meteora DBC

Bot 3 (Ethereum): Uniswap v2, Uniswap v3, Uniswap v4

Bot 4 (Base): Uniswap v2, Uniswap v3, Uniswap v4, Aerodrome

Bot 5 (BNB Chain): PancakeSwap v2, PancakeSwap v3, 4meme launchpad

Signal Structure

token analytics tool development - signal strucutre

Each signal is a structured message containing multiple layers of information:

1. Token metadata

  • Token address: e.g., CoCkPn1rLVumueRLxtBfr1oGJ4sXcKfBDktPCNAkdA2D
  • Market capitalization (MC): e.g., $176,919
  • Total liquidity (Liq): e.g., $19,831
  • Number of holders: e.g., Holders: 127
  • Holder distribution:
    • Top 10: 23.2% — percentage of total supply held by the top 10 wallets
    • Fresh: 56.1% — percentage held by first-time buyers (“Fresh” wallets)

2. Additional analytics

  • Good early buys: tokens previously purchased by two or more of the current token’s buyers, useful for identifying wallets with strong investment patterns
  • Links: social and market references, e.g., Twitter, Telegram, website, Dexscreener

3. Creator information (launchpads only)

  • Dev: wallet address of the token creator
  • Ownership: percentage of total supply held by the creator, e.g., 0.0%
  • Native token balance: e.g., (19.8 SOL)
  • Activity status: e.g., active
  • Previously deployed tokens: list of other tokens created by this address, e.g., (BNBTARD, 404, 404, Henry, Coin)

Buyer Categories and Transaction Details

Buyers are classified based on prior activity and purchase size:

  1. Fresh IDxs
    • Wallets that have never purchased any tokens before (“Fresh”)
    • Purchase amount ≤5 SOL (or equivalent in other blockchains)
  2. Fresh >5 SOL IDxs
    • First-time wallets with purchases >5 SOL (or equivalent)
  3. Inactive 3d IDxs
    • Wallets that haven’t made any token purchases in the last 3 days
  4. Inactive 10d IDxs
    • Wallets inactive for the last 10 days

For each transaction in the “Fresh” categories, the system records:

  • Purchase amount: in the blockchain’s native asset
  • % of total supply bought: share of the token acquired in that transaction
  • Wallet top-up timestamp: last time the wallet received funds
  • Funding source:
    • Unknown: EOA wallet
    • Known: CEX / bridge / mixer
    • Same Funded: multiple buyers funded from the same wallet (EOA), which may indicate coordinated activity or use of “camps”

This categorization allows the system to distinguish between organic and potentially coordinated early activity, providing a granular view of token adoption dynamics.

Challenges in High-Volume Blockchain Data Ingestion

1. Database performance under high-volume ingestion

Tracking thousands of tokens and millions of transactions across multiple chains creates high-cardinality tables and complex query paths. Without intervention, queries on these tables can become slow, affecting downstream processing.

Solution:

  • Batch queries were introduced for grouped operations, and indexes were added to high-traffic fields.
  • A background pruning process removes stale data automatically, keeping the working set size manageable.

2. Infrastructure challenges (mainly Solana)

Node reliability

Many Solana node providers couldn’t handle high transaction volumes and lacked complete historical event data.

High block rate and transaction volume

With blocks produced roughly every 2 seconds and a high transaction volume, collecting and processing all events in real time is challenging.

Solutions

  1. Provider selection
    After testing multiple Solana node providers, we chose Quicknode with gRPC streaming. It reliably delivered the full set of historical and real-time events without restrictive limits.
  2. Custom indexer
    We built a high-performance indexer to take full control over data ingestion. This allowed us to handle Solana’s high block rate and transaction volume efficiently.
  3. Aggregator service
    On top of the indexer, we added an aggregator that processes raw data and serves it in a structured format to downstream consumers (Pump bot, Solana bot). This separation of responsibilities improved both reliability and maintainability.

Architecture Impact and Operational Results

  • Early-stage token monitoring: Captures new token launches across multiple blockchains in real time.
  • Buyer behavior analysis: Classifies wallets by activity and purchase size, separating first-time buyers from returning or coordinated investors.
  • Transaction-level detail: Records purchase amounts, share of total supply, wallet top-up timestamps, and funding sources to provide a granular view of token distribution.
  • Scalable data ingestion: Batch queries, targeted indexes, and background pruning maintain performance when processing millions of transactions across high-cardinality tables.
  • Blockchain data processing: Custom indexers and aggregator layers handle high block rates and transaction volumes, structuring data for downstream consumers.
  • Market coverage: Parallel indexing across chains and DEXs provides extensive visibility without impacting latency or data consistency.

“At Rock'n'Block, we take your blockchain ideas
and turn them into tangible, innovative solutions."
“At Rock'n'Block, we take your blockchain ideas and turn them into tangible, innovative solutions."
Get a Free consultation
arrow
Rock'n'Block team
Product manager | Rock’n’Block
To:
Rock'n'Block logo
Rock n Block
This site is protected by reCAPTCHA and the Privacy Policyand Terms of Service apply.
done
Thank you! Your submission has been received!
Oops! Something went wrong while submitting the form.

Let’s Talk

Write to us

We’ll be in touch soon.

Awards