Building a Trial Conversion Engine for a Mid-Market SaaS Platform

Context

A $30M B2B SaaS company with 4,000+ customers in asset-intensive industries — manufacturing, logistics, hospitality, facilities management. Their customer list includes Unilever, Shell, JetBlue, Marriott, and Chick-fil-A. The product is a mobile-first CMMS positioned as the modern alternative to legacy enterprise tools like SAP PM and Maximo. Y Combinator 2017 batch, $50M in funding, profitable for six consecutive years.

The company had recently shipped an AI assistant embedded across every customer portal — a multi-modal agent with read/write access to maintenance data, voice-first mobile interface, autonomous scheduled actions, and integrations into Gmail, Outlook, Google Sheets, and Microsoft Teams. The product was moving fast.

The GTM motion, less so. The company was spending $40-50K per month on Google Ads, with most conversions driven by branded search terms. An SDR team ran outbound sequences via Outreach, with phone as the primary touch point. ZoomInfo for contact data. Salesforce and HubSpot running in parallel.

Free trials were the primary conversion mechanism — and a known gap. The company’s own internal documentation called trial conversion “a significant opportunity area.”

Smoothed identified this as an architecture problem worth solving. We diagnosed the opportunity, built the full system, and demonstrated the approach on sample leads: a complete trial conversion engine with ~10,300 lines of code, a processing pipeline, scoring infrastructure, and a three-view dashboard. This is how Smoothed works: when we see a problem worth engineering against, we build it.

The problem

The trial funnel had a structural gap: signups came in, but the GTM team couldn’t distinguish high-ICP prospects from tire-kickers.

The trial experience itself contributed to the problem. After signup, users landed directly in the product UI with no introductory flow, no guided onboarding, no contextual prompts. There was a chat interface in the bottom right corner, but it only expanded on hover — most users never found it. The help center had 21 onboarding articles under “Getting Started” — useful content that no one was being directed to.

Product usage data existed but wasn’t connected to sales routing. The SDR team had two options, both bad: call everyone on the signup list, burning hours on unqualified leads, or wait for prospects to book a meeting themselves, missing the window on high-intent leads who needed a push.

Trial expiration was silent attrition. A prospect could sign up, explore for two days, and disappear without generating a single signal to sales.

But the deeper issue was more fundamental than missing notifications.

This company serves seven distinct industry cohorts, each with different pain points, buying signals, and use cases. A logistics fleet manager evaluating the platform for vehicle maintenance tracking has completely different needs than a hospitality maintenance director managing HVAC systems across hotel properties. They need different conversations, different proof points, different urgency levels.

Generic scoring treats both of them as “trial signup with 3 logins.” A real system needs to understand who they are, what they’re doing in the product, and which of your existing customers look like them — and it needs to do this automatically, at scale, before the trial expires.

Diagnosis

Before writing a line of code, we conducted three parallel investigations.

The scrape

We scraped 1,500+ pages of the company’s public web presence — marketing site, blog, help center, customer stories, learning resources, and product documentation. This wasn’t a surface-level crawl. We built a 6-stage Python pipeline: site discovery, structured scraping, content analysis, keyword extraction via TF-IDF, competitive intelligence mapping, and ICP delta analysis.
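To make the keyword extraction stage concrete, here is a minimal TF-IDF sketch using only the standard library. It is an illustration under stated assumptions, not the pipeline's actual code: the function names and the sample page snippets are hypothetical, and a real run would also remove stop words and work over the full scraped corpus.

```python
import math
import re
from collections import Counter

def tokenize(text):
    """Lowercase and split on non-letters (no stop-word removal in this sketch)."""
    return re.findall(r"[a-z]+", text.lower())

def tfidf_keywords(docs, top_n=3):
    """Return up to top_n terms per document, ranked by TF-IDF.

    tf  = term count / total terms in the document
    idf = log(number of documents / documents containing the term)
    Terms that appear in every document get idf = 0 and are dropped.
    """
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    results = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens) or 1
        scores = {
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        }
        # Sort by score descending, then alphabetically so ties are stable.
        ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
        results.append([term for term, score in ranked[:top_n] if score > 0])
    return results

# Hypothetical snippets standing in for scraped pages.
pages = [
    "fleet maintenance for fleet vehicles",
    "hotel maintenance for hvac systems",
    "plant maintenance for production lines",
]
keywords = tfidf_keywords(pages)
```

In this toy corpus, "maintenance" appears on every page and so scores zero, while "fleet" ranks first for the first snippet. That is the property that makes TF-IDF useful for separating cohort-specific language from site-wide vocabulary.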

The scrape produced a keyword corpus organized by content category, a competitive positioning map, and — most importantly — the raw material for understanding how the company talks about its own customers. Every case study, every testimonial, every “how Company X uses our product” page was captured and analyzed.

This matters because the ICP cohorts need to be grounded in real customer language, not hypothetical personas. When the system scores a logistics company against the fleet/logistics cohort, the pain points, use cases, and success metrics in that cohort come from actual customers — not from a marketing brainstorm.

The ICP analysis

From the scraped content, we distilled 40+ real customer stories into 7 distinct cohorts:

Manufacturing & Industrial Plants — the largest cohort. Companies like Unilever (migrated from SAP), Oshkosh Defense, Rehrig Pacific ($1M+/year savings). Typical profile: 3-50+ technicians, $3,500-$160,000/hour downtime costs.
Multi-Location Franchise & Chain Operators — Orangetheory Fitness (37 locations), McDonald’s franchisees, Chick-fil-A operators, Raceway Car Wash (scaling from 44 to 200 locations). The pain is visibility across locations and standardization.
Facility & Property Management — hotels, storage facilities, mixed-use properties. Data-driven decisions and tenant satisfaction.
Energy, Utilities & Natural Resources — companies like Certarus ($500K/year savings), distributed operations across remote sites. Compliance, safety, fleet management.
Institutional, Government & Non-Profit — Caltech, school districts, the Salvation Army. Often the most digitally immature — migrating from pen-and-paper.
Healthcare — multi-location facilities with Joint Commission compliance requirements and thousands of tracked parts.
Agriculture — seasonal operations where equipment readiness is time-critical.

Each cohort was defined by industry vertical, typical company size, primary pain point, buying trigger, product usage pattern, and financial impact data where available. These aren’t personas — they’re derived from how actual customers describe their own transformation.

The trial flow audit

We signed up for a real trial. The immediate observation: the product pre-seeds trial accounts with demo data — sample assets, work orders, a preventive maintenance schedule. This matters for any engagement scoring system because those system-seeded interactions aren’t real user behavior. A scoring system that counts them as engagement will inflate scores across the board.
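One way to keep seeded demo data out of engagement counts is to filter it before scoring. The sketch below assumes the event export carries an event name and an object ID, and that the IDs of the pre-seeded objects are known. Both are assumptions, and the event names are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Event:
    name: str        # hypothetical event name, e.g. "work_order_created"
    object_id: str   # ID of the asset, work order, or PM schedule touched

def real_user_events(events, seeded_ids):
    """Drop creation events for pre-seeded demo objects; keep everything else.

    A user opening or editing a seeded work order is still real behavior,
    so only the seeding itself is filtered out.
    """
    return [
        e for e in events
        if not (e.name.endswith("_created") and e.object_id in seeded_ids)
    ]

seeded_ids = {"asset-demo-1", "wo-demo-1"}
events = [
    Event("asset_created", "asset-demo-1"),     # seeded at signup
    Event("work_order_created", "wo-demo-1"),   # seeded at signup
    Event("work_order_viewed", "wo-demo-1"),    # user looked at the demo data
    Event("work_order_created", "wo-1001"),     # user created their own
]
kept = real_user_events(events, seeded_ids)
```

Here four raw events reduce to two that reflect what the user actually did.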

We documented the full signup-to-first-session experience: the missing intro flow, the buried chat interface, the help articles that aren’t surfaced during onboarding, and the scheduling integration that only appears if you find the help center chat. We also mapped the API documentation, noting which product events are available for behavioral signal extraction.

The strategic opportunity was clear: use enrichment and cohort matching to contextualize the trial experience. When a logistics fleet manager signs up, the system should know — from firmographic enrichment, not from asking — that they’re in logistics, and proactively surface relevant features, case studies, and pain points.

The diagnosis as a whole is methodology, not just findings. The approach — scrape, analyze, segment, audit — is repeatable. The 7 cohorts are specific to this company; the method applies to any trial-driven SaaS.

The system

We built an 8-stage pipeline that processes trial signups from raw intake through to sales-ready output.

Intake Validation (deterministic): schema validation, deduplication, ICP pre-filter
Enrichment (AI, Claude): firmographic research, industry classification, company intelligence
Product Signal Extraction (deterministic): behavioral events → engagement classification (active / exploring / ghosted / churned)
ICP Scoring + Cohort Matching (hybrid): deterministic score (0–100) + AI cohort rationale across 7 segments
Routing Engine (deterministic): ICP tier × engagement → sales motion (AE fast-track / BDR / nurture / suppress)
Lead Dossier Generation (AI, Claude): intelligence brief with lookalike customers, pain points, talking points
Email Generation (AI, Claude): personalized first-touch email matched to routing context
Quality Gate (AI, separate instance): independent review against 6 criteria, numeric thresholds, max 2 regen attempts

Legend: deterministic = auditable, reproducible; AI-powered = judgment and synthesis; hybrid = deterministic formula + AI rationale

The architecture splits cleanly across the eight stages: three are deterministic (intake validation, product signal extraction, and the routing engine), four are AI-powered (enrichment, dossier generation, email generation, and the quality gate), and one is hybrid (ICP scoring uses a deterministic formula for the numeric score and an LLM for cohort rationale). Stage orchestration itself is deterministic.
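A rough sketch of how a staged pipeline like this could be orchestrated is shown below. The `Stage` type, `run_pipeline`, and the stubbed stage bodies are all hypothetical; the stage names and kinds follow the stage labels listed above, and the AI stages are replaced with fixed return values so the example runs offline.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass(frozen=True)
class Stage:
    name: str
    kind: str                     # "deterministic", "ai", or "hybrid"
    run: Callable[[dict], dict]   # takes the lead record, returns fields to merge in

def run_pipeline(lead, stages):
    """Run stages in order, merging each stage's output into the lead record.

    Stops at the first stage that raises, so a failure shows up in the trace
    instead of passing bad data downstream.
    """
    trace = []
    for stage in stages:
        try:
            lead = {**lead, **stage.run(lead)}
            trace.append((stage.name, "ok"))
        except Exception as exc:
            trace.append((stage.name, f"failed: {exc}"))
            break
    return lead, trace

def stub(fields):
    """Stand-in for a real stage: ignores the lead and returns fixed fields."""
    return lambda lead: dict(fields)

STAGES = [
    Stage("intake", "deterministic", stub({"valid": True})),
    Stage("enrichment", "ai", stub({"industry": "Energy"})),
    Stage("product_signals", "deterministic", stub({"engagement": "active"})),
    Stage("icp_scoring", "hybrid", stub({"score": 73, "tier": "tier_1"})),
    Stage("routing", "deterministic", stub({"route": "ae_fast_track"})),
    Stage("dossier", "ai", stub({"dossier": "..."})),
    Stage("email", "ai", stub({"email_draft": "..."})),
    Stage("quality_gate", "ai", stub({"approved": True})),
]

lead, trace = run_pipeline({"email": "someone@example.com"}, STAGES)
```

Keeping the orchestrator this plain is a deliberate choice in the sketch: the per-stage trace is what a pipeline view would need to show stage status for each lead.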

Here’s what it looks like in practice. A fleet manager at an energy services company signs up for a trial. The intake stage validates and deduplicates the record. Enrichment identifies the company: 800 employees, energy services sector, reliability engineering team. Product signal extraction parses their analytics events: they’ve created work orders, explored asset tracking, and viewed preventive maintenance reports across 4 sessions over 2 weeks. The system classifies them as “active”: this person is evaluating seriously.

ICP scoring runs against all 7 cohorts. The deterministic formula scores firmographic fit (industry match: 20 points, company size: 10, title match: 10), behavioral engagement (engagement score weighted at 0.3, contributing 12 points here), and intent signals (acquisition source: 15, milestone completions: 6). Total: 40 + 12 + 21 = 73 out of 100, which lands in Tier 1. The LLM cohort matcher identifies the energy/utilities cohort and writes a rationale explaining why.
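The deterministic part of the score can be sketched as a small function. The point values (20/10/10 for firmographics, the 0.3 engagement weight, 15 for acquisition source) come from the description above; the function name, the 0–100 engagement scale, and the 30-point cap on intent are assumptions inferred from the score breakdown shown in the dossier view later in this write-up.

```python
def icp_score(industry_match, size_match, title_match,
              engagement_score, acquisition_points, milestone_points):
    """Deterministic ICP score out of 100, split into its three components.

    engagement_score is assumed to be on a 0-100 scale, so the 0.3 weight
    yields at most 30 behavioral points.
    """
    firmographic = (
        (20 if industry_match else 0)
        + (10 if size_match else 0)
        + (10 if title_match else 0)
    )                                                        # max 40
    behavioral = round(0.3 * engagement_score)               # max 30
    intent = min(acquisition_points + milestone_points, 30)  # assumed cap of 30
    return {
        "firmographic": firmographic,
        "behavioral": behavioral,
        "intent": intent,
        "total": firmographic + behavioral + intent,
    }

# The walkthrough lead: an engagement score of 40 is back-solved from the
# stated total of 73, not a figure given in the text.
walkthrough = icp_score(True, True, True, 40, 15, 6)

# The Shearer's Foods breakdown (40 + 27 + 26) implies an engagement score of 90.
shearers = icp_score(True, True, True, 90, 15, 11)
```

Because every component is returned alongside the total, a reviewer can see exactly which points a lead did or did not earn, which is what makes calibration tractable.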

Routing is deterministic: Tier 1 ICP + active engagement = AE fast-track. No ambiguity.
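A deterministic router can be as simple as a lookup table keyed on tier and engagement. The sketch below includes only the combinations that appear in the sample results later in this write-up; the identifier spellings and the choice to raise on an unmapped combination are assumptions.

```python
ROUTES = {
    ("tier_1", "active"): "ae_fast_track",
    ("tier_1", "exploring"): "ae_fast_track",
    ("tier_1", "churned"): "bdr_priority",
    ("tier_2", "ghosted"): "nurture",
    ("tier_2", "churned"): "nurture",
    ("tier_3", "active"): "low_priority",
    ("tier_3", "churned"): "low_priority",
}

def route(tier, engagement):
    """Return the sales motion for a (tier, engagement) pair.

    Raises LookupError for combinations with no rule, so a gap in the table
    is surfaced instead of silently falling through to a default.
    """
    try:
        return ROUTES[(tier, engagement)]
    except KeyError:
        raise LookupError(f"no routing rule for {tier} + {engagement}") from None

decision = route("tier_1", "active")
```

Failing loudly on a missing pair is a design choice in this sketch: a routing table that quietly defaults is how leads with different engagement end up treated identically.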

The dossier agent produces an intelligence brief: company context from enrichment, ICP match reasoning, behavioral signals, and lookalike customers from the same cohort. When the dossier references Certarus and their $500K/year savings from the same industry vertical, that’s real customer data. Suggested talking points are derived from the cohort’s known pain points: compliance, distributed operations, fleet management.

The email agent drafts a personalized first-touch email — not a template, but a message that references the prospect’s industry, their product usage patterns, and a relevant resource.

Before anything reaches sales, the quality gate reviews it. A separate AI instance — different model, different prompt, different evaluation criteria — grades the dossier and email against 6 criteria: accuracy, personalization, tone, actionability, brevity, completeness. If the score falls below threshold, the system regenerates. If it fails twice, it flags for human review.
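The regenerate-then-escalate loop might look like the sketch below. The six criteria names come from the text; the 0.8 threshold, the rule that every criterion must pass, and the function signatures are assumptions. The sketch follows the "max 2 regen attempts" reading, so a draft is escalated after the original and two regenerations all fail. The generator and grader are stubs standing in for the two separate model calls.

```python
CRITERIA = ("accuracy", "personalization", "tone",
            "actionability", "brevity", "completeness")

def quality_gate(generate, grade, threshold=0.8, max_regens=2):
    """Grade a draft; regenerate on failure; escalate after max_regens retries.

    generate() returns a new draft.
    grade(draft) returns {criterion: score between 0 and 1}.
    """
    draft = generate()
    for attempt in range(max_regens + 1):
        scores = grade(draft)
        if all(scores[c] >= threshold for c in CRITERIA):
            return {"status": "approved", "draft": draft, "regens": attempt}
        if attempt < max_regens:
            draft = generate()
    return {"status": "needs_human_review", "draft": draft, "regens": max_regens}

# Stubbed usage: the third draft is the first one that passes.
drafts = iter(["v1", "v2", "v3"])
result = quality_gate(
    generate=lambda: next(drafts),
    grade=lambda d: {c: 0.9 if d == "v3" else 0.5 for c in CRITERIA},
)
```

Numeric per-criterion scores matter here: a gate that returns only pass or fail gives no way to tell a near miss from a draft that is wrong on accuracy.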

What we chose not to build

Scope decisions matter as much as architecture decisions:

No real-time product analytics integration. The system processes exported event data, not a live analytics connection. Demonstrating the pipeline architecture doesn’t require a real-time feed.
No automated CRM writes. The pipeline produces a routing recommendation and a dossier. It doesn’t create Salesforce tasks or enroll leads in Outreach sequences. Sales stays in control of the last mile.
No ML model training. The ICP cohorts are defined by analysis, not learned from historical conversion data. At the calibration stage, you want to see exactly why each lead scored the way it did.

The pre-mortem

Before writing code, we documented 5 specific failure modes and built mitigations into the architecture:

API rate limits during bulk processing — mitigated by pre-computed results stored in files. Results persist across restarts.
Garbage-in-garbage-out cascade — one bad enrichment poisons every downstream stage. Mitigated by confidence scoring at every AI stage.
LLM output schema violations — mitigated by a robust JSON extraction layer with typed defaults. The pipeline never crashes on malformed output.
Quality gate theater — mitigated by architectural separation and numeric grading criteria.
Dashboard state desynchronization — mitigated by file-based persistence and defensive rendering.
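As an illustration of the third mitigation, a JSON extraction layer with typed defaults could be sketched as follows. The field names and default values are hypothetical; the idea is that malformed or missing LLM output degrades to a known default rather than raising.

```python
import json
import re

# Hypothetical schema for an enrichment result: each default fixes the type.
DEFAULTS = {"industry": "unknown", "employee_count": 0, "confidence": 0.0}

def extract_json(raw, defaults=DEFAULTS):
    """Pull a JSON object out of LLM text and coerce fields to the default's type.

    Never raises on malformed output: anything unparseable or uncoercible
    falls back to the typed default for that field.
    """
    result = dict(defaults)
    match = re.search(r"\{.*\}", raw, flags=re.DOTALL)  # first "{" to last "}"
    if not match:
        return result
    try:
        parsed = json.loads(match.group(0))
    except json.JSONDecodeError:
        return result
    if not isinstance(parsed, dict):
        return result
    for key, default in defaults.items():
        if key in parsed:
            try:
                result[key] = type(default)(parsed[key])
            except (TypeError, ValueError):
                pass  # keep the typed default
    return result

# Model wrapped the JSON in chatter and returned a number as a string.
raw = 'Sure, here it is:\n{"industry": "Energy", "employee_count": "800"}\nAnything else?'
record = extract_json(raw)
```

Pairing this with a per-stage confidence value lets downstream stages tell a real enrichment result from a record that is mostly defaults.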

Every system we build starts with a pre-mortem — identifying how the system will fail and designing against those failures before writing the first function.

Outcomes

This is a speculative build — there are no production conversion metrics. The outcomes are architectural, and they’re substantial:

~10,300 lines of code across the pipeline, API server, dashboard, and tooling
7 ICP cohorts with scoring baselines derived from 40+ real customer stories and quantified financial impact data
1,500+ pages scraped, analyzed, and transformed into a keyword corpus and competitive intelligence map via a purpose-built 6-stage Python pipeline
12 sample leads processed through the full pipeline — 8 routed correctly on the first run, 4 misrouted with fully diagnosed root causes
A three-view dashboard — pipeline (all leads with stage status and routing), dossier (interactive score breakdown), observability (aggregate metrics and cost tracking)
Full pipeline cost: about $0.04 per lead ($0.49 across 12 leads)

Trial Conversion Pipeline

12 leads enriched, scored, and routed through an 8-stage pipeline

12 leads | $0.49 total cost | 46.7s avg

Company	Industry	Tier	Engagement	Route	Score
Shearer's Foods	Food Manufacturing	Tier 1	active	ae fast track	93
Tri-City Medical Center	Healthcare	Tier 1	active	ae fast track	95
Mueller Water Products	Water Infrastructure	Tier 1	active	ae fast track	85
Clearway Energy	Energy	Tier 1	churned	bdr priority	80
Drury Hotels	Hospitality	Tier 2	ghosted	nurture	57
Pretium Packaging	Manufacturing	Tier 2	churned	nurture	73
Tijuana Flats	Food Service	Tier 1	exploring	ae fast track	81
Cabot Creamery	Food & Beverage	Tier 2	churned	nurture	78
Salvation Army Kroc Centers	Non-profit	Tier 2	ghosted	nurture	62
Alum Rock Union School District	Education	Tier 2	churned	nurture	60
Braze	Technology	Tier 3	active	low priority	25
Plaid	Financial Technology	Tier 3	churned	low priority	25

Shearer's Foods

Food Manufacturing

Employees: 450

Routing Decision

Tier 1 ICP match (Manufacturing & Industrial Plants), score 93 with active engagement → ae fast track

ICP Score: 93/100

Firmographic 40/40

✓ Industry match (Manufacturing) +20 pts
✓ Company size (450 employees) +10 pts
✓ Title match (Reliability Engineer) +10 pts

Behavioral 27/30

✓ 8 active sessions over 2 weeks
✓ Created assets, work orders, PMs
✓ Invited 8 team members

Intent 26/30

✓ Google Ads acquisition (high intent) +15 pts
✓ Onboarding completed + milestones +11 pts

Company Summary

Shearer's Foods is a mid-market contract manufacturer and private label producer of snack foods serving major retail and foodservice brands. With 450 employees and complex food production lines, they likely face significant maintenance challenges around equipment downtime, FDA/cGMP compliance documentation, and inventory management across their manufacturing operations.

Pain Points

Production line downtime directly impacts contract fulfillment for major retail brands, potentially costing thousands per hour
FDA and cGMP compliance requirements demand detailed maintenance documentation and traceability
Contract manufacturing model requires consistent quality and uptime to maintain relationships with major retail partners

BDR Talking Points

Already exploring actively with 8 sessions — created work orders and assets. Ask about their maintenance team experience so far.
Companies like Water Lilies Food (which also serves Walmart and Target) reduced their downtime from 2–4 hours to just 8 minutes per shift.
With FDA and cGMP compliance requirements, digital maintenance records and traceability are crucial.

Pipeline Observability

Aggregate performance metrics, cost tracking, and quality monitoring across all processed leads

Total Leads: 12 | Total API Cost: $0.49 | Avg Confidence: 91.3% | Total Processing: 560.2s

Route Distribution (chart): nurture, ae fast track, low priority, bdr priority

Tier Distribution (chart): Tier 1, Tier 2, Tier 3

Stage Performance

Stage	Type	Avg Confidence	Pass Rate	Avg Duration	Cost / Lead	Total Cost
intake	deterministic	100.0%	100%	0ms	—	—
enrichment	claude-sonnet-4	80.0%	100%	5.3s	0.69¢	$0.083
product signals	deterministic	100.0%	100%	0ms	—	—
icp scoring	deepseek-v3	80.0%	100%	3.4s	0.03¢	$0.004
routing	deterministic	100.0%	100%	0ms	—	—
dossier	claude-sonnet-4	85.0%	100%	18.2s	2.10¢	$0.252
email	claude-sonnet-4	90.0%	100%	9.7s	1.05¢	$0.126
quality gate	deepseek-v3	88.0%	100%	5.1s	0.15¢	$0.024
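As a quick check on the Total Cost column above, the stage totals can be summed and divided by the 12 leads. The dictionary keys are just labels for the table rows.

```python
# Total Cost column from the Stage Performance table (AI stages only).
stage_total_cost = {
    "enrichment": 0.083,
    "icp_scoring": 0.004,
    "dossier": 0.252,
    "email": 0.126,
    "quality_gate": 0.024,
}
leads = 12

total_cost = sum(stage_total_cost.values())                   # dollars across all leads
cost_per_lead = total_cost / leads                            # dollars per lead
dossier_share = stage_total_cost["dossier"] / total_cost      # fraction of spend

print(f"total ${total_cost:.3f}, per lead ${cost_per_lead:.4f}, "
      f"dossier share {dossier_share:.1%}")
```

The totals come to roughly $0.49 overall and about 4 cents per lead, with dossier generation accounting for about half of the spend.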

The 4 misrouted leads deserve attention. Three independent root causes were diagnosed:

A hospitality company scored Tier 2 instead of Tier 1 because “Hospitality” wasn’t listed in the relevant cohort’s industry matching list — even though “Hotels” was. A 10-point scoring difference. Fix: add the missing industry terms.

An energy services company was classified as “exploring” instead of “churned” because the threshold required 2 or fewer features used — this company had used 3 features across 4 sessions but hadn’t logged in for over a week. Fix: raise the threshold.

Two Tier 2 leads with ghosted engagement were routed to BDR outreach instead of nurture, because the routing table had no engagement sub-rules for Tier 2. Active and ghosted leads got identical treatment. Fix: add engagement sub-rules.
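The first two fixes can be illustrated with a short sketch. Only the "Hotels" versus "Hospitality" gap, the 2-feature threshold, and the week of inactivity come from the diagnosis above; the function names, the remaining classification rules, and the specific session counts for "active" are assumptions made so the example runs.

```python
def industry_matches(industry, cohort_terms):
    """Exact match of a normalized industry label against a cohort's term list."""
    return industry.strip().lower() in cohort_terms

terms_before = {"hotels"}                   # "Hospitality" missing
terms_after = {"hotels", "hospitality"}     # fix: add the missing term

def classify_engagement(sessions, features_used, days_inactive,
                        churn_feature_max=2):
    """Classify trial engagement; all rules except the churn threshold are assumed."""
    if sessions == 0:
        return "ghosted"                    # assumption: never came back after signup
    if days_inactive > 7:
        # Inactive for over a week: churned only if usage was shallow enough.
        if features_used <= churn_feature_max:
            return "churned"
        return "exploring"
    if sessions >= 4 and features_used >= 3:
        return "active"                     # assumption: recent, repeated, broad usage
    return "exploring"

# The misclassified lead: 3 features, 4 sessions, no login for over a week.
before = classify_engagement(4, 3, 9)                        # threshold of 2
after = classify_engagement(4, 3, 9, churn_feature_max=3)    # raised threshold
```

With the original threshold the lead comes out as "exploring"; raising it to 3 yields "churned", matching the intended classification.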

Each fix was scoped to a specific file and line number. Each root cause was independent. The scoring calibration analysis runs to 13 pages.

Real systems need calibration. The architecture surfaces where scoring breaks and why — that’s the point.

What’s next

For this system, production deployment means CRM integration — connecting the pipeline to live trial signups rather than sample data. Real behavioral data flowing through product signal extraction. Ongoing scoring calibration as cohort definitions sharpen against live conversion outcomes. And eventually, closing the loop: the same intelligence that routes BDR outreach could contextualize the trial experience itself — surfacing relevant features, case studies, and pain points inside the product based on who the user is.

For anyone reading this with a similar problem: the architecture pattern — intake, enrich, score, route, generate, verify — applies to any trial-driven or product-led SaaS. The ICP cohorts change. The scoring weights change. The routing rules change. The pipeline doesn’t.

The full system is Smoothed IP, built to prove what’s possible when you engineer the trial funnel instead of automating it.