Why Case Studies Matter More Than Frameworks
There are plenty of frameworks for cold email. This entire series has been full of them — how to build your ICP, how to write a value proposition, how to structure a sequence, how to handle objections.
What frameworks can't fully convey is how these pieces actually interact in the real world. Which order to tackle things in. What the early failures feel like before the program clicks. Which specific optimizations moved the metrics and which were just noise. The judgment calls that don't fit neatly into a step-by-step guide.
This case study fills that gap. It's a ground-level walkthrough of a B2B cold email program built from scratch — a SaaS company in the HR tech space building its first outbound motion after 18 months of growing primarily through inbound and referrals. The names and specific metrics have been adjusted for confidentiality, but the structure, sequence of events, and lessons are real.
By the end of the case study, the program was generating 100+ qualified leads per quarter from cold outreach. Getting there took about five months and looked nothing like the clean linear process a framework suggests.
The Starting Conditions
The company: A SaaS product in the HR tech space, specifically a performance management and continuous feedback tool for mid-market companies (200–2,000 employees). ACV around $25,000. Sales cycle 2–4 months. Primary buyers: Chief People Officers, VPs of HR, and HR Directors.
Why outbound now: The company had grown to $2M ARR primarily through inbound content, product-led virality in small pockets, and founder relationships. Growth had plateaued. The leadership team wanted to build an outbound motion to reach companies that weren't finding them organically and to have a more predictable, controllable pipeline source.
The team starting point: One SDR hired specifically for this program. No prior cold email infrastructure. No sending domains. No list. No sequences. No tracking setup. The SDR had done cold calling at a previous company but had never run a cold email program at this level.
Month 0 timeline: 30 days to have infrastructure live, first sequences running, and initial data coming in.
Month 1: Building the Infrastructure — And Getting It Wrong
The first month was almost entirely infrastructure — and the first significant mistake happened here.
The initial instinct was to use the company's primary domain for cold email. It was clean, had a good reputation, and avoiding an extra setup step felt efficient. Within three weeks, that decision was reversed. Open rates on the primary domain were fine, but the first time an email landed in spam and a prospect replied, "I found your email in my junk folder," the reputational risk became concrete.
A secondary domain structure was set up from scratch: two secondary domains with slight variations on the brand name, each redirecting to the primary domain with a 301 redirect, and two inboxes per domain (four inboxes total). SPF, DKIM, and DMARC were configured per the guidance in SPF, DKIM, and DMARC Explained. Warm-up started through Lemwarm on a 6-week schedule before any real campaigns launched.
The list situation was more complex. The initial assumption was that the ICP — HR leaders at mid-market companies — was easy to find. Apollo had plenty of them. But early data revealed three problems: Apollo contact data for HR titles was less fresh than for Sales or Engineering titles (HR leaders move frequently and the database lagged), many of the contacts lacked verified email addresses, and the segment was much more heterogeneous than expected. A company with 300 employees in manufacturing and a company with 300 employees in tech were both in the "200–2,000 employee" filter but responded to completely different messages.
The ICP definition went through its first revision before the first email was sent. The initial ICP document was refined to separate companies by growth stage (fast-growth tech and Series B/C startups vs. stable mid-market traditional businesses) after it became clear that the problem the product solved — lightweight continuous feedback at scale — was primarily a pain point for companies growing fast enough that their performance management process was visibly straining.
The first month's real output: infrastructure live, ICP revised, list rebuilt with better segmentation, warm-up in progress. Zero sequence emails sent from the new domains.
Month 2: First Sequences — And the Humbling Reality of Early Reply Rates
With warm-up complete and list rebuilt, the first sequences launched in week 5. Two sequences: one for VP/Director of HR at Series B/C tech companies (50–500 employees), one for Chief People Officers at slightly larger companies (300–1,000 employees).
The first version of the cold email for the Series B/C segment:
Subject: Performance reviews at [Company]
Hi [Name],
I noticed [Company] has grown from 80 to 140 people in the past year — at that growth rate, the lightweight performance approach that worked at 50 people typically starts to create friction with managers and employees.
We work with Series B HR teams to replace the annual review cycle with continuous feedback that takes managers 15 minutes per month instead of 40 hours twice a year. Teams like [Reference Company] cut manager time on performance admin by 60% in the first quarter.
Worth a 15-minute call to compare notes on how you're handling this at [Company]'s current stage?
[Signature]
Reply rate after two weeks: 1.8%.
Not terrible for a first attempt, but below the 3%+ target. And the replies were revealing. Of the 9 replies from the initial 500-person send, 4 expressed positive interest, 2 were "we already have a solution," 2 were "not the right time," and 1 was "remove me." The positive replies came from companies that had explicitly been dealing with scaling pains around performance management. The rest came from companies where the problem apparently didn't resonate.
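The reply breakdown above is easy to recompute whenever a new batch of replies comes in. The following is a minimal sketch, assuming replies are hand-labeled with category names; the labels and the `reply_rates` helper are hypothetical, while the counts and the 500-prospect send size come from the text.

```python
from collections import Counter

# Category labels are hypothetical; the counts are the ones reported for the first send.
replies = (
    ["positive"] * 4
    + ["has_solution"] * 2
    + ["bad_timing"] * 2
    + ["unsubscribe"] * 1
)
prospects_emailed = 500


def reply_rates(labels, sent):
    """Return per-category and overall reply rates as percentages of prospects emailed."""
    if sent <= 0:
        raise ValueError("sent must be positive")
    counts = Counter(labels)
    rates = {label: 100 * n / sent for label, n in counts.items()}
    rates["overall"] = 100 * len(labels) / sent
    return rates


rates = reply_rates(replies, prospects_emailed)
print(f"overall reply rate:  {rates['overall']:.1f}%")   # 1.8%
print(f"positive reply rate: {rates['positive']:.1f}%")  # 0.8%
```

Tracking the positive rate separately from the overall rate matters here: the first version's 1.8% overall rate hides the fact that fewer than half of those replies showed interest.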
Month 3: The First Iteration That Moved the Needle
The critical insight from month 2 came from reading the non-replies carefully — which is to say, re-reading the email as a recipient rather than as the person who wrote it.
The problem: the email led with a growth reference ("grown from 80 to 140 people") that was specific but didn't immediately connect to why growth was relevant to performance management. The connection was logical — but it required the reader to make a mental leap from "our headcount grew" to "therefore our performance process is strained." The best cold email doesn't require the reader to make the leap; it makes it for them.
The revised version:
Subject: Manager time on reviews at [Company]
Hi [Name],
At [Company]'s current size, manager bandwidth is probably the hidden tax on your performance process — the quarterly or annual review cycle that takes each manager 30–40 hours and still doesn't give the team real-time feedback they can actually act on.
We help Series B HR teams replace that cycle with a continuous feedback structure that takes managers 15 minutes per month. [Reference Company] went from dreading review cycles to running them in the background — while managers spent the time they saved actually developing their people.
Is this a current frustration, or have you found a way to solve it that's actually working?
[Signature]
Two changes: led with the specific pain (manager bandwidth) rather than the growth signal, and changed the CTA from a meeting ask to an honest question about whether the problem was real for them. The second change was counterintuitive — asking "is this a problem?" rather than "let's meet" felt like giving up leverage. In practice, it generated more replies because it made responding feel low-commitment.
New reply rate: 3.7%. Positive reply rate: 2.1%. Both meaningfully above the first version.
Months 3–4: Scaling While Keeping Quality
With a working first email, the next challenge was scaling volume without degrading quality. The initial sequences had been running at 30–50 sends per day across the four inboxes. The goal was to reach 100–150 per day while maintaining deliverability and reply rates.
Infrastructure first: two additional secondary domains, two additional inboxes per domain, all warmed up over 6 weeks. This created the capacity headroom before volume was increased — not after. The mistake of scaling a list before scaling infrastructure is covered in Sending Limits & Scaling Safely, and it was deliberately avoided here.
List quality became a significant focus. An early batch sent to a larger Apollo export showed bounce rates of 4.2% — above the safe threshold. Running the full export through BulkMailVerifier before sending became a non-negotiable step. Post-verification, bounce rates dropped to 1.1%.
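A pre-send check like this can be written as a simple gate. The sketch below is illustrative only: the text says 4.2% was above the safe threshold without naming the threshold, so the 2% limit is an assumed placeholder, and the batch sizes are hypothetical numbers chosen to reproduce the two bounce rates mentioned.

```python
# Assumed limit: the article does not state the threshold it used; 2% is a placeholder.
MAX_BOUNCE_RATE = 0.02


def bounce_rate(bounced: int, sent: int) -> float:
    """Fraction of sent emails that bounced."""
    if sent <= 0:
        raise ValueError("sent must be positive")
    return bounced / sent


def safe_to_send_more(bounced: int, sent: int, limit: float = MAX_BOUNCE_RATE) -> bool:
    """True if the observed bounce rate is at or below the configured limit."""
    return bounce_rate(bounced, sent) <= limit


# Hypothetical batches of 1,000 sends matching the rates in the text.
print(safe_to_send_more(42, 1000))  # unverified export, 4.2% bounce rate -> False
print(safe_to_send_more(11, 1000))  # verified list, 1.1% bounce rate -> True
```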
Segmentation was also refined. The single "Series B/C tech companies" segment was split into three:
- Fast-growth tech, 50–150 employees: Earliest stage. Pain is starting to emerge. Copy led with the anticipatory framing — "you're about to hit this."
- Growth-stage tech, 150–400 employees: Core segment. Pain is currently felt. Copy was direct about the specific frustration.
- Scaling tech, 400–700 employees: Larger, more mature HR function. Needed different positioning — less about pain, more about upgrading from a legacy process.
Each segment got its own sequence, its own value proposition framing, and its own proof points (reference companies in the same stage range). Reply rates across segments ranged from 2.8% to 5.1%, with the 150–400 employee segment consistently performing best.
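The three-way split can be expressed as a small routing table. This is a sketch under stated assumptions: the segment keys and the `assign_segment` helper are hypothetical, and because the published ranges share the boundaries 150 and 400, the code assumes a boundary value belongs to the larger segment.

```python
from typing import Optional

# (inclusive lower bound, exclusive upper bound, hypothetical segment key, copy angle)
# Assumption: companies with exactly 150 or 400 employees fall into the larger segment.
SEGMENTS = [
    (50, 150, "fast_growth", "anticipatory: you're about to hit this"),
    (150, 400, "growth_stage", "direct about the current frustration"),
    (400, 701, "scaling", "upgrading from a legacy process"),
]


def assign_segment(employees: int) -> Optional[str]:
    """Return the segment key for a headcount, or None if it is outside every range."""
    for low, high, name, _angle in SEGMENTS:
        if low <= employees < high:
            return name
    return None


for headcount in (140, 150, 700, 30):
    print(headcount, assign_segment(headcount))
```

Returning `None` for out-of-range companies keeps them out of every sequence by default, rather than letting them fall into the nearest segment.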
Month 4: The Follow-Up Sequence That Changed Everything
A month 4 audit revealed something useful: 65% of positive replies were coming from emails 2–4 in the sequence, not from the initial cold email. The first email was planting seeds; the follow-ups were doing the converting.
This prompted a full rewrite of the follow-up sequence, which had been fairly generic up to this point. The revised follow-up structure, built around the framework from Follow-Up Emails: Timing & Structure:
Email 2 (Day 4): Added a social proof hook not in Email 1 — a specific mini case study in 2 sentences.
Email 3 (Day 9): Reframed the value proposition from the manager's perspective to the employee's perspective. "We talk a lot about manager time — the other side of this is that employees at fast-growth companies consistently cite unclear expectations and lack of feedback as the top reason they leave. The cost of a bad review cycle isn't just manager hours."
Email 4 (Day 15): Offered a genuinely useful resource — a one-page guide to assessing whether a company's performance process was ready for its next growth stage. No pitch. Just "this might be useful regardless of our timing."
Email 5 (Day 22): The direct honest ask — "I've reached out a few times now. Two questions: is continuous feedback on your radar for this year, and if so, are you the right person to talk to or should I be speaking with someone else?"
Email 6 (Day 30): The breakup email, written with specific references to the company's current growth stage and a genuine warm close.
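The cadence above maps directly onto a schedule of day offsets. The sketch below assumes the first email goes out on day 0 (the text only gives days for emails 2 to 6), uses a hypothetical start date, and does not skip weekends, which a real sending tool would likely handle.

```python
from datetime import date, timedelta

# (email number, day offset, theme). Day 0 for email 1 is an assumption.
SEQUENCE = [
    (1, 0, "initial cold email"),
    (2, 4, "social proof mini case study"),
    (3, 9, "employee-perspective reframe"),
    (4, 15, "useful resource, no pitch"),
    (5, 22, "direct honest ask"),
    (6, 30, "breakup email"),
]


def schedule(start: date):
    """Return (email number, send date, theme) for each step in the sequence."""
    return [(step, start + timedelta(days=offset), theme) for step, offset, theme in SEQUENCE]


# Hypothetical start date, used only for illustration.
for step, send_on, theme in schedule(date(2024, 1, 8)):
    print(f"Email {step}: {send_on.isoformat()}  {theme}")
```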
The revised sequence increased the overall sequence reply rate from 3.7% to 6.2%, a relative lift of about 68% driven almost entirely by better follow-up emails.
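The lift is relative, not a difference in percentage points, which is worth making explicit when comparing versions. A minimal sketch, with `relative_lift` as a hypothetical helper name:

```python
def relative_lift(before: float, after: float) -> float:
    """Relative change from `before` to `after`, as a percentage of `before`."""
    if before <= 0:
        raise ValueError("before must be positive")
    return (after - before) / before * 100


# Sequence-level reply rates from the text, in percent.
lift = relative_lift(3.7, 6.2)
print(f"absolute change: {6.2 - 3.7:.1f} percentage points")  # 2.5
print(f"relative lift:   {lift:.1f}%")                        # 67.6%
```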
Month 5: Hitting 100 Qualified Leads per Quarter
By month 5, the program had found its operating rhythm. The metrics at the end of month 5:
- Sends per week: ~600 (across 8 inboxes, 4 secondary domains)
- Delivery rate: 97.1%
- Open rate: 46% (directional, not absolute)
- Positive reply rate: 4.8%
- Meetings booked per week: 5–7
- Meeting-to-qualified-opportunity rate: 62%
- Qualified opportunities created in the quarter: 104 (first quarter at full pace)
- Average deal value in pipeline: $23,400
- Pipeline created: $2.4M
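The pipeline figure follows from the two lines above it: qualified opportunities multiplied by average deal value. A short sketch of that arithmetic, with `pipeline_value` as a hypothetical helper:

```python
def pipeline_value(qualified_opps: int, avg_deal_value: float) -> float:
    """Total pipeline as opportunity count times average deal value."""
    return qualified_opps * avg_deal_value


# Figures from the month 5 metrics.
total = pipeline_value(104, 23_400)
print(f"${total:,.0f}")        # $2,433,600
print(f"${total / 1e6:.1f}M")  # $2.4M
```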
The path to 100 leads per quarter wasn't a straight line from the first week to the last. It was: infrastructure mistake → rebuild → first campaign under-performance → rewrite → segmentation expansion → sequence overhaul → volume scale. Five months, four significant pivots, one SDR running the full operation.
The Specific Decisions That Mattered Most
Looking back across the five months, a handful of decisions had a disproportionate impact:
1. Rebuilding the ICP before the first send. The revised segmentation by growth stage — not just company size — was probably the single highest-leverage decision. Sending the right message to the right stage of company converted dramatically better than sending a generic message to a size range.
2. Running every list through verification. The difference between 4.2% bounce rate and 1.1% bounce rate is the difference between deliverability degradation and a clean sender reputation. The 30 minutes per list batch was never skipped after month 2.
3. Rewriting the follow-ups as independent conversion opportunities. Treating emails 2–6 as real content rather than check-in filler raised the sequence-level reply rate from 3.7% to 6.2%. This was the single change with the largest numerical impact.
4. Changing the CTA from meeting ask to discovery question. "Is this a current frustration?" generated more replies than "Worth a 15-minute call?" — and the replies themselves were more useful because they told the team which prospects had the active problem and which didn't.
5. Not scaling until infrastructure was ready. Building the additional inbox capacity before increasing volume prevented the deliverability problems that typically occur when teams add volume to existing infrastructure without appropriate headroom.
What Would Have Gone Faster With Better Tools
The program was built with a relatively lean stack — Apollo for prospecting, Instantly for sending, BulkMailVerifier for list hygiene, Lemwarm for warm-up, and standard spreadsheet-based tracking for the first two months before HubSpot integration was set up.
In retrospect, adding Clay earlier would have saved significant manual research time in the ICP refinement stage. The growth stage segmentation that drove the biggest performance improvement required manually researching funding rounds for each contact — a process that Clay could have automated through a Crunchbase integration.
Better attribution tracking from day one would also have made the month 4 audit easier. The finding that 65% of positive replies came from emails 2–4 required manually reviewing the reply attribution data, which was trackable but not surfaced automatically in the initial Instantly setup.
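That kind of attribution can be automated once each positive reply is tagged with the email number it answered. The sketch below assumes such tagging exists; the sample data is hypothetical and constructed only to illustrate a 65% share for emails 2 to 4, and it does not use any Instantly or HubSpot API.

```python
from collections import Counter


def share_by_step(reply_steps):
    """Fraction of positive replies attributed to each email number."""
    if not reply_steps:
        raise ValueError("no replies to attribute")
    counts = Counter(reply_steps)
    total = len(reply_steps)
    return {step: counts[step] / total for step in sorted(counts)}


def share_in_range(reply_steps, first, last):
    """Fraction of positive replies that answered emails first..last inclusive."""
    if not reply_steps:
        raise ValueError("no replies to attribute")
    hits = sum(1 for step in reply_steps if first <= step <= last)
    return hits / len(reply_steps)


# Hypothetical sample: 20 positive replies tagged by the email they responded to.
sample = [1] * 5 + [2] * 6 + [3] * 4 + [4] * 3 + [5] * 1 + [6] * 1

print(share_by_step(sample))
print(f"emails 2-4: {share_in_range(sample, 2, 4):.0%}")  # 65%
```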
The Lessons Generalized
The specific tactics from this case study won't transfer identically to every program — different ICP, different product, different price point, different market. But the underlying patterns generalize:
The first version is almost always wrong, and that's okay. The initial campaign underperformed. The response was to use the data — which replies came in, what they said, where engagement dropped — to make a specific change. Not to abandon cold email, and not to add more volume. To iterate.
Infrastructure problems are invisible until they're serious. The bounce rate problem could have been caught earlier with a standard pre-send verification step. The secondary domain decision should have been made before the first email went to a prospect. Infrastructure debt in cold email is paid with compounding interest.
Better segmentation outperforms better copy. Splitting one broad ICP segment into three stage-specific segments — each with tailored copy — improved results more than any single copywriting change. The best email to the wrong person performs worse than a good email to the right person.
The program compounds. Month 5 was significantly better than month 2 not because something magically clicked, but because 12+ specific improvements had compounded across targeting, infrastructure, copy, and process. Each improvement was small; collectively they were transformative.
Next up: Building a Cold Email System That Runs on Autopilot — how to systematize everything in this series into a program that operates reliably without constant manual intervention.
