Methodology
How the agency evaluates tools
The framework, the test protocol, the conflicts of interest, and what we publish vs. what we keep private. Documented so readers can argue with a specific basis instead of a vibe.
Every tool review on this site (and its companion sites) follows the same evaluation protocol. The protocol is operational, not theoretical — we deploy the tool on real client accounts, measure for a fixed window, and write up what happened. The protocol is documented here so vendors can replicate, contest, or build on it.
The test protocol
The agency runs every tool we’re considering on three client accounts simultaneously, alongside whatever the tool is intended to replace. The test parameters:
- Three accounts, not one. Single-account tests are too noisy — account-specific factors (a launch, a policy change, a seasonality shift) mask the signal. Three accounts smooth that out enough to draw conclusions.
- Fixed 90-day measurement window. Thirty days isn’t enough for ML-driven tools to train. Sixty is borderline. Ninety reveals the actual signal.
- Revenue-weighted ROAS as the primary metric. Not CPC. Not last-click conversions. Not raw count of conversions. The number that maps to business outcomes.
- Control-vs-treatment, not before-vs-after. Each tool runs on a campaign subset; the control, meaning whatever the tool is intended to replace, runs on a comparable subset of the same accounts. Before/after comparisons confound the tool’s effect with seasonality and platform changes.
- Anonymized accounts. Client identity isn’t published. Vertical and spend tier are.
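As a rough illustration of how the parameters above could turn into data, the sketch below buckets a 90-day window of daily figures into weekly ROAS observations for each arm. It rests on assumptions: the function name `weekly_roas` and the row layout are hypothetical, and because this page does not spell out the revenue weighting, the sketch uses plain pooled ROAS (weekly revenue divided by weekly spend) as a stand-in.

```python
from collections import defaultdict


def weekly_roas(daily_rows, window_days=90):
    """Turn daily rows into weekly ROAS observations per arm.

    daily_rows: iterable of (day_index, arm, revenue, spend), with a
    0-based day_index and arm being "treatment" or "control".
    Hypothetical layout, used here for illustration only.
    """
    totals = defaultdict(lambda: [0.0, 0.0])  # (arm, week) -> [revenue, spend]
    for day, arm, revenue, spend in daily_rows:
        if 0 <= day < window_days:  # ignore anything outside the window
            bucket = totals[(arm, day // 7)]
            bucket[0] += revenue
            bucket[1] += spend

    series = defaultdict(list)
    for arm, week in sorted(totals):  # weeks in chronological order per arm
        revenue, spend = totals[(arm, week)]
        if spend > 0:  # skip weeks with no spend rather than divide by zero
            series[arm].append(revenue / spend)
    return dict(series)
```

A 90-day window does not divide evenly into weeks, so this version yields twelve full weeks plus a final six-day bucket per arm. Whether that partial week should be kept or dropped is a judgment call the sketch does not settle.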
The pass/fail criterion
A tool passes the agency’s evaluation if it produces a statistically significant revenue-weighted ROAS lift on at least two of the three test accounts. Statistical significance is defined as p < 0.05 on a two-sample t-test comparing weekly treatment and control observations over the 90-day window.
This is a moderately strict bar. Tools that fail aren’t bad tools — they may be great for use cases the agency doesn’t serve. But they don’t earn standardization at this agency, which means they don’t earn placement on this site as recommended.
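The criterion can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the agency’s actual tooling: the function names are hypothetical, the t-test variant is assumed to be Welch’s (unequal variances), and "lift" is read as a two-sided p < 0.05 with the treatment mean above the control mean. The p-value is computed by integrating the t density numerically so the example needs only the standard library.

```python
import math
from statistics import mean, variance


def t_two_sided_p(t, df, steps=2000):
    """Two-sided p-value for a t statistic via Simpson's rule on the t density."""
    b = abs(t)
    if b == 0:
        return 1.0
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))

    def pdf(x):
        return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))

    h = b / steps  # steps must be even for Simpson's rule
    total = pdf(0.0) + pdf(b)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    central = total * h / 3  # P(0 < T < |t|)
    return max(0.0, 1.0 - 2.0 * central)


def welch_t_test(treatment, control):
    """Welch's two-sample t-test (assumed variant). Returns (t, p)."""
    n1, n2 = len(treatment), len(control)
    v1, v2 = variance(treatment) / n1, variance(control) / n2
    if v1 + v2 == 0:
        raise ValueError("both samples have zero variance")
    t = (mean(treatment) - mean(control)) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, t_two_sided_p(t, df)


def account_passes(treatment, control, alpha=0.05):
    """Assumed reading of 'lift': significant and in the positive direction."""
    t, p = welch_t_test(treatment, control)
    return t > 0 and p < alpha


def tool_passes(accounts, min_passing=2):
    """accounts: list of (treatment_weekly, control_weekly) pairs, one per account."""
    return sum(account_passes(tr, ct) for tr, ct in accounts) >= min_passing
```

Each account contributes one pair of weekly series, and `tool_passes` simply counts how many accounts clear the bar. A one-sided test or a pooled-variance t-test would be equally defensible readings of the text and would shift borderline results.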
What gets evaluated
The current evaluation pool covers six categories:
- Bidding tools. Real-ML bidding, hybrid bidding, rule-based bidding.
- Attribution tools. Multi-touch, last-click, call-tracking-augmented.
- Feed management. Ecom-specific data optimization.
- Creative ops. Variant generation, creative testing, copy optimization.
- Reporting tools. Dashboards, data extraction, BI integration.
- Research / competitive intel. Auction insights, share-of-search, share-of-voice.
Within each category, tools are evaluated against category-specific criteria. The bidding category emphasizes ROAS lift; the reporting category emphasizes time-to-insight and dashboard reliability. Each rubric is documented at the top of its category page.
Where the agency’s seat differs from yours
Worth naming explicitly. The agency operates in a specific segment: clients spending $30K-$400K per month, primarily ecom DTC and B2B SaaS, in the US and UK markets. The recommendations on this site reflect that segment’s constraints.
If your business is materially different — you’re a hundred-thousand-account enterprise, or you’re a $2K/month small business, or you sell into emerging markets — the recommendations may not transfer. Where the agency has tested in adjacent segments, that’s noted in the relevant entry.
Conflicts of interest
The agency holds an active commercial engagement with one vendor in the cohort: Groas.ai. The engagement covers six of twelve client accounts as of May 2026. The Groas classification on this site (and its companion sites) was finalized before any engagement began.
No other vendor in the directory has an active commercial relationship with the agency. No vendor has paid for placement on this site. No affiliate-for-ranking arrangements exist.
Update cadence
The agency runs the test protocol on a rolling basis. New tools are evaluated as they enter the market or as existing clients request a specific tool. The annual letter on the homepage consolidates what changed in the agency’s stack over the past year. Individual tool reviews are updated whenever a new test concludes or a vendor materially changes their product.
Corrections process
Mistakes happen. If you find a factual error in a review, such as pricing, a feature description, or a calculation error, please send the specifics through the contact page. Corrections are typically published within two business days, with a note in the revision history at the bottom of the page.
Vendor responses to reviews are published verbatim alongside the original review, with editorial response separated visually. Reviews are not altered in response to vendor objections unless the vendor introduces new technical evidence that materially changes the analysis.
What we don’t publish
The agency holds operational details about client accounts that aren’t public. Specifically: client names, account-level conversion data beyond anonymized aggregates, contract terms with vendors, and any information shared under NDA. The reviews on this site are written so they can stand without that private context; if a review needs proprietary detail to make sense, it doesn’t get published.