You're drowning in leads that never convert
The symptom
Your team is spending thousands per month on ZoomInfo, Apollo, or Clay credits to build lists. The lists have thousands of names. Your BDRs email them. Response rates are 0.5–1%.
The leads that do convert were usually already looking — they found you through a Google search or a peer recommendation. The list didn’t find them. The list found everyone else.
The pattern is always the same. You buy a list of 5,000 contacts that match your ICP on paper — right titles, right industries, right company sizes. You load them into Outreach or Salesloft. Your BDRs run a 4-touch sequence. Open rates look fine. Reply rates are dismal. The leads that actually book a meeting? They were already in-market. Your list just happened to include them.
The deeper problem: list-based prospecting treats all contacts in a segment as equally likely to buy. A VP of Engineering who posted on Reddit last week asking for a Bright Data alternative is fundamentally different from a VP of Engineering who hasn’t thought about proxies in two years. They’re on the same list. They get the same sequence. One is ready now. The other is noise.
Why current solutions fail
The standard approach is some combination of contact databases, enrichment workflows, and intent data overlays.
ZoomInfo and Apollo give you contact data, not intent data. You know who someone is — their title, their company, their email. You don’t know whether they’re looking. A list of 10,000 VPs of Engineering tells you nothing about which ones have a problem you can solve right now. You’re spraying into the dark and calling it targeting.
Clay enrichment workflows can layer on firmographic data, technographic signals, funding events, and job postings. But they’re still operating on static lists. Enriching a bad list gives you a well-decorated bad list. The enrichment tells you more about the company — it doesn’t tell you whether anyone at that company is actively looking for what you sell.
Intent data vendors like Bombora and G2 sell aggregated signals at the account level — “Company X is researching web scraping tools.” But you don’t know who at Company X, what specifically they said, or how urgent it is. The signal is a black box: some anonymized combination of content consumption and search behavior, averaged across an account, delivered weekly. And every competitor buying the same intent feed gets the same signal at the same time.
The ceiling: you can buy more data, enrich it further, and score it with increasingly complex models. But you can’t buy the signal that a specific person expressed a specific need on a specific platform three hours ago. That requires a system, not a subscription.
What a real system looks like
A lead intelligence layer doesn’t start with a list and enrich it. It starts with expressed intent and builds backward to the company and contact.
- Signal capture: 4 parallel scrapers (Reddit, GitHub, HackerNews, Twitter), keyword-driven, running every 4 hours
- Classification: intent tier assignment (1–4), company name extraction, signal type classification
- Lead creation: auto-created lead records, deduplication, Slack alerts for Tier 1 signals
- Company enrichment: LinkedIn company pages via Thor Data Scraper API (employee count, industry, tech stack)
- Contact identification: 3-tier API fallback for decision-maker profiles (Scraper API → Web Unlocker → SERP)
- Qualification scoring: signal tier × company fit × contact availability → priority queue
Six stages: signal capture from platforms where buyers express intent, AI-powered classification into urgency tiers, automatic lead creation with deduplication, company enrichment via LinkedIn scraping, contact identification for decision makers, and deterministic qualification scoring.
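To make the last stage concrete, here is a minimal sketch of multiplicative scoring feeding a priority queue. The three factors come from the pipeline above; the weights, the contact penalty, and the lead names are hypothetical placeholders, not the values used in production.

```python
import heapq
from dataclasses import dataclass, field

# Hypothetical weights: the pipeline names the three factors, not their values.
TIER_WEIGHT = {1: 1.0, 2: 0.6, 3: 0.3, 4: 0.0}


@dataclass(order=True)
class QueuedLead:
    sort_key: float                   # negated score, so heapq pops the best lead first
    name: str = field(compare=False)  # excluded from ordering


def score(tier: int, company_fit: float, contact_found: bool) -> float:
    """Multiply the three factors, so one weak factor lowers the whole score."""
    contact_weight = 1.0 if contact_found else 0.5  # assumed penalty
    return TIER_WEIGHT[tier] * company_fit * contact_weight


def build_queue(leads):
    """leads: iterable of (name, tier, company_fit in [0, 1], contact_found)."""
    heap = []
    for name, tier, fit, contact in leads:
        heapq.heappush(heap, QueuedLead(-score(tier, fit, contact), name))
    # Pop everything to get names ordered from highest to lowest score.
    return [heapq.heappop(heap).name for _ in range(len(heap))]


ranked = build_queue([
    ("lead_a", 2, 0.9, True),   # 0.6 * 0.9 * 1.0
    ("lead_b", 1, 0.8, True),   # 1.0 * 0.8 * 1.0
    ("lead_c", 4, 0.2, False),  # Tier 4 always scores 0
    ("lead_d", 1, 0.9, False),  # 1.0 * 0.9 * 0.5
])
print(ranked)
```

Because the score is a plain product of fixed inputs, the same lead always lands in the same position, which is what makes the qualification step deterministic.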
The system watches Reddit, GitHub, HackerNews, and Twitter — not for mentions of your brand, but for the language that signals buying intent. Someone posting “looking for a Bright Data alternative” in r/webscraping is a Tier 1 signal. Someone building a web scraping pipeline and asking about proxy infrastructure on GitHub is Tier 2. Someone discussing anti-bot detection theory on HackerNews is Tier 3. A student asking about proxies for a class project is Tier 4.
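As a rough illustration of the keyword-driven capture step, the sketch below filters posts by intent language before anything is classified. The keyword list and sample posts are assumptions for illustration. The filter only selects candidates; it does not assign tiers.

```python
import re

# Hypothetical keyword patterns; a real deployment would tune these per market.
KEYWORDS = {
    "alternative": re.compile(r"\balternatives?\b", re.IGNORECASE),
    "proxy": re.compile(r"\bprox(?:y|ies)\b", re.IGNORECASE),
    "anti-bot": re.compile(r"\banti-bot\b", re.IGNORECASE),
}


def matched_keywords(text: str) -> list[str]:
    """Return the labels of every keyword pattern found in the text."""
    return [label for label, pattern in KEYWORDS.items() if pattern.search(text)]


posts = [
    "Looking for a Bright Data alternative",
    "Which proxies work for a class project?",
    "My sourdough starter died again",
]

# Posts with at least one match move on to classification; the rest are ignored.
candidates = [post for post in posts if matched_keywords(post)]
print(candidates)
```

Note that the student's class-project post still passes this filter. Separating it from a real buyer is the classifier's job, not the keyword match's.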
Classification is AI-powered — Claude Haiku assigns the tier, extracts the company name when identifiable, and categorizes the signal type. But the response to each tier is deterministic: Tier 1 gets same-day outreach. Tier 2 enters a priority queue. Tier 3 goes to nurture. Tier 4 is dropped. No ambiguity.
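The deterministic half can be as small as a lookup table. This sketch assumes the classifier has already returned an integer tier; the `Action` names are illustrative labels for the four responses described above.

```python
from enum import Enum


class Action(Enum):
    SAME_DAY_OUTREACH = "same-day outreach"
    PRIORITY_QUEUE = "priority queue"
    NURTURE = "nurture"
    DROP = "drop"


# One fixed response per tier: no scoring and no judgment calls at this step.
ROUTES = {
    1: Action.SAME_DAY_OUTREACH,
    2: Action.PRIORITY_QUEUE,
    3: Action.NURTURE,
    4: Action.DROP,
}


def route(tier: int) -> Action:
    """Map a classifier-assigned tier to its fixed response."""
    if tier not in ROUTES:
        # Fail loudly rather than guess if the classifier returns an unknown tier.
        raise ValueError(f"unknown tier: {tier!r}")
    return ROUTES[tier]


print(route(1).value)
```

Keeping the model's output to a single integer and the response to a table lookup means a misrouted lead can always be traced to one of two places: the tier assignment or the table.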
The result: 36% of captured signals qualify as Tier 1. Compare that with the 0.5–1% reply rates from cold lists. You’re not finding more leads. You’re finding the right ones.
Company Context
DataForge AI builds price intelligence tools for e-commerce brands, scraping product data across 200+ retail sites daily. The team is 85 people, Series A funded, based in Austin. Their data pipeline is core infrastructure — proxy reliability directly impacts product accuracy and customer SLAs.
Key Contacts
- James Chen — CTO (LinkedIn)
- Sarah Okafor — VP Engineering (LinkedIn)
Signal Context
Posted in r/webscraping, a subreddit with 45K members focused on web scraping tools and infrastructure. The post received 12 replies, several recommending specific providers. The author described a specific use case (e-commerce price intelligence) and a specific pain point ($6K/month cost), and is actively evaluating alternatives. All three are Tier 1 indicators.
Recommended Response
- Lead with cost comparison — they cited $6K/month on Bright Data. Thor Data's pricing at their volume would be roughly 40% lower.
- Reference e-commerce scraping specifically — Thor Data's Web Unlocker has strong success rates on Shopify, Amazon, and major retail platforms.
- The Reddit post mentions "evaluating alternatives" — they're in active buying mode. Response within 24 hours is critical.
This is what we built for Thor Data’s US market entry. Four platforms. 910 signals captured. 332 high-intent prospects identified. Under $0.05 per qualified lead.
The system we've built for this
Lead Intelligence Layer
Signal capture, intent classification, enrichment, and qualification — built on the platforms where buyers actually talk
See the full system →
Proof
Building a Signal-Driven Lead Intelligence System for Thor Data
A web infrastructure company entering the US market with proxy, SERP, and scraping APIs
Read the case study →
Does this sound like your situation? Let's talk.