Choosing Your North Star Metric: A Framework for B2B SaaS

Almost every product team claims to have a north star metric. When you ask what it is, the answers tend to fall into two categories. The first category: revenue metrics. ARR, MRR, net revenue retention. These are important numbers and you should absolutely track them, but they are not north star metrics — they are outcome metrics. The second category: something that sounds like a product metric but is actually a lagging proxy for what the product is supposed to do. "Monthly active users" is the most common example. It's not meaningless, but it's also not a north star — it's a measurement of whether users showed up, not whether they got value.

This matters because the metric you optimize toward shapes the product decisions you make. A team optimizing for MAU will build features that increase login frequency. A team optimizing for a genuine north star — one that measures the rate at which users are getting the core value the product delivers — will build features that increase that rate, which may or may not include features that increase login frequency. The difference compounds over 12–18 months into meaningfully different products.

What a North Star Metric Actually Is

The north star concept, popularized through the Reforge curriculum and widely adopted in product-led growth thinking, rests on a specific claim: there exists a single metric for your product that simultaneously captures value delivery to users and predicts long-term revenue. The north star is the metric at the intersection of these two signals.

The formal test for a north star metric has three criteria:

  1. It captures value delivery. When the metric increases, users are getting more of what the product is supposed to give them — not just using the product more, but getting the outcome they came for.
  2. It predicts long-term retention and revenue. Users or accounts with higher north star metric values should retain and expand at higher rates than those with lower values. If you can't find a correlation between the metric and retention in your data, it's not a north star candidate; it's a usage metric.
  3. The product team can influence it. A metric the product team can't move is a measurement, not a target. Revenue and NPS fail this criterion for most product teams because the levers that move them are distributed across sales, marketing, and customer success, not just product.
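
Criterion 2 is the one you can check directly against your own data. Here is a minimal sketch, assuming a hypothetical list of per-account records that pair a candidate metric value with a 90-day retention flag (the data, the 90-day window, and the median split are all illustrative choices, not a prescribed method):

```python
from statistics import median

# Hypothetical per-account records: (candidate metric value, retained at 90 days?)
accounts = [
    (2, False), (3, False), (5, True), (8, False),
    (12, True), (15, True), (20, True), (30, True),
]

def retention_rate(rows):
    """Share of accounts in `rows` that were retained (0.0 for an empty group)."""
    return sum(1 for _, retained in rows if retained) / len(rows) if rows else 0.0

def retention_by_metric_half(rows):
    """Split accounts at the median metric value and compare retention rates."""
    cutoff = median(value for value, _ in rows)
    low = [r for r in rows if r[0] <= cutoff]
    high = [r for r in rows if r[0] > cutoff]
    return retention_rate(low), retention_rate(high)

low_rate, high_rate = retention_by_metric_half(accounts)
print(f"low half: {low_rate:.0%}, high half: {high_rate:.0%}")
```

If the two halves retain at roughly the same rate, the candidate is probably a usage metric. A real analysis would use more accounts and more than two buckets, but the comparison is the same shape.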

Why Revenue Metrics Fail as North Stars

We're not saying revenue doesn't matter — it matters enormously and should be tracked closely as a guardrail metric. What we're saying is that revenue is too downstream from product decisions to be useful as a directional metric for day-to-day product work.

The signal path from a product change to revenue impact is: product change → user behavior change → retention/expansion change → revenue change. The first link in that chain is the only one the product team controls. Revenue is three links removed. If you're using revenue as your north star, you're flying on instruments that lag by all three links, which in practice means months rather than weeks.

North stars need to be close enough to product actions that you can see the effect of product changes on them within 2–4 weeks. Revenue at monthly or quarterly cadence is too slow to iterate against.

North Star Patterns by Product Type

There's no universal north star metric, but there are patterns by product category that are worth knowing as starting points:

Productivity and Workflow Tools

North stars tend to be completion or throughput metrics: "tasks completed per active user per week," "documents published per account per month," "workflows run per team per week." The key is that the metric captures value delivered (work completed), not just usage (sessions started). A PM at a task management tool might initially propose "tasks created" as their north star, but tasks created without tasks completed is a graveyard metric. The value is in the completion, not the creation.
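
To make the creation-versus-completion distinction concrete, here is a rough sketch of "tasks completed per active user per week." The event names, the tuple layout, and the rule that any event makes a user active that week are assumptions for illustration:

```python
from collections import defaultdict
from datetime import date

# Hypothetical event log: (user_id, event_name, day)
events = [
    ("u1", "task_created", date(2024, 1, 1)),
    ("u1", "task_completed", date(2024, 1, 2)),
    ("u1", "task_completed", date(2024, 1, 3)),
    ("u2", "task_created", date(2024, 1, 3)),
    ("u3", "task_created", date(2024, 1, 4)),
    ("u2", "task_completed", date(2024, 1, 9)),
]

def completed_per_active_user(events):
    """Map ISO (year, week) to tasks completed per user active in that week."""
    completed = defaultdict(int)
    active = defaultdict(set)
    for user, name, day in events:
        iso = day.isocalendar()
        week = (iso[0], iso[1])
        active[week].add(user)  # assumption: any event counts as activity
        if name == "task_completed":
            completed[week] += 1
    return {week: completed[week] / len(users) for week, users in active.items()}

weekly = completed_per_active_user(events)
```

In this toy log, two of the three users active in the first week only created tasks, which pulls the per-user figure down even though "tasks created" would look healthy.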

Collaboration and Communication Tools

North stars tend to be network-depth metrics. "Messages sent across team" tells you the collaboration network is active, but "threads resolved per team per week" (for a support or asynchronous communication tool) tells you the collaboration is productive. The depth dimension, meaning not just that users are communicating but that communication is reaching closure, is usually more predictive of retention.

Data and Analytics Tools

North stars tend to be insight or output metrics: "dashboards viewed by non-creator" (a dashboard that only the creator views is not delivering value to the team), "queries resulting in action" (if you can track downstream decisions, even through proxy), or "reports shared per account per week." The metric should capture whether the data is actually being used to make decisions, not just whether users are logging in and generating queries.
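
As an illustration of the first of these, the sketch below counts dashboards that someone other than their creator has viewed. The `creators` mapping and `views` log are hypothetical structures; your warehouse schema will differ:

```python
# Hypothetical data: dashboard -> creator, and a view log of (dashboard, viewer)
creators = {"d1": "ana", "d2": "ben", "d3": "ana"}
views = [
    ("d1", "ana"), ("d1", "cy"),
    ("d2", "ben"),
    ("d3", "ben"), ("d3", "cy"),
]

def dashboards_viewed_by_non_creator(creators, views):
    """Return the set of dashboards with at least one view by a non-creator."""
    return {
        dashboard
        for dashboard, viewer in views
        if dashboard in creators and viewer != creators[dashboard]
    }

shared = dashboards_viewed_by_non_creator(creators, views)
print(len(shared), "of", len(creators), "dashboards reached someone else")
```

Here `d2` is viewed only by its creator, so it drops out of the count even though it generated a view.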

Developer Tools and APIs

North stars tend to be integration-depth or production-usage metrics: "API calls in production environment per account" (not just in development), "events processed successfully per week per integration," or "deployments triggered per team per week." Development-environment usage is cheap; production-environment usage is commitment. North stars for developer tools should anchor to production.
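
Anchoring to production can be as simple as a filter applied before counting. A small sketch, assuming each call record carries a hypothetical `env` field set to either "production" or "development":

```python
from collections import Counter

# Hypothetical API call log; the "env" field is an assumed label on each call
calls = [
    {"account": "acme", "env": "production"},
    {"account": "acme", "env": "development"},
    {"account": "acme", "env": "production"},
    {"account": "beta", "env": "development"},
]

def production_calls_per_account(calls):
    """Count calls per account, ignoring anything outside production."""
    return Counter(c["account"] for c in calls if c["env"] == "production")

prod = production_calls_per_account(calls)
```

The account that has only made development calls contributes nothing to the metric, which is the point: it has explored the product but not yet committed to it.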

Input Metrics and the North Star Tree

A north star metric in isolation is a measurement target. A north star metric with a decomposed set of input metrics is a strategy. The input metric framework (sometimes called the north star tree or metric tree) works as follows:

  • North star: "Weekly messages sent across teams" (a collaboration tool example)
  • Input metric 1 (breadth): "% of accounts with 3+ active users in past week" — how many accounts are generating cross-team activity at all
  • Input metric 2 (depth): "Messages per active account per week" — for accounts that are active, how actively are they communicating
  • Input metric 3 (frequency): "% of active users who send ≥1 message per day" — is communication happening daily (habit) or episodically

Each input metric is a lever that product teams can act on independently. Breadth is an activation problem — more accounts need to reach the threshold where cross-team messaging is happening. Depth is a feature-richness problem — within active accounts, what makes some use the product more intensively? Frequency is a habit-formation problem — what drives daily vs. weekly engagement patterns?
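
The tree above can be sketched in a few lines. This is an illustration under stated assumptions, not a canonical definition: the `week` structure is hypothetical, an active user is one who sent at least one message in the week, an active account has at least one active user, and "≥1 message per day" is read as a message on every one of the seven days.

```python
# Hypothetical week of data: account -> user -> messages sent on each of 7 days
week = {
    "acme": {
        "a1": [3, 2, 4, 1, 5, 1, 2],
        "a2": [0, 1, 0, 2, 0, 0, 0],
        "a3": [1, 0, 0, 0, 0, 0, 0],
    },
    "beta": {"b1": [2, 2, 2, 2, 2, 2, 2], "b2": [0, 0, 0, 0, 0, 0, 0]},
    "gamma": {"g1": [0, 0, 0, 0, 0, 0, 0]},
}

def metric_tree(week):
    """Compute the north star and its three input metrics for one week."""
    # Keep only users who sent at least one message this week
    active_users = {
        account: {u: days for u, days in users.items() if sum(days) > 0}
        for account, users in week.items()
    }
    active_accounts = [a for a, users in active_users.items() if users]
    all_active = [days for users in active_users.values() for days in users.values()]
    total = sum(sum(days) for days in all_active)
    return {
        "north_star": total,
        # Breadth: share of all accounts with 3+ active users
        "breadth": sum(1 for u in active_users.values() if len(u) >= 3) / len(week),
        # Depth: messages per active account
        "depth": total / max(len(active_accounts), 1),
        # Frequency: share of active users who sent a message every day
        "frequency": sum(1 for d in all_active if all(n >= 1 for n in d))
        / max(len(all_active), 1),
    }

tree = metric_tree(week)
```

Computing all four numbers from the same weekly snapshot keeps them comparable, so a move in the north star can be traced to whichever input moved with it.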

Decomposing the north star into input metrics converts a directional goal into a set of diagnostic questions, each of which maps to a specific area of the product. This is where the north star metric earns its value as a strategic tool rather than just a reporting number.

Guardrail Metrics: The North Star's Safety Net

Optimizing for a single metric always risks moving that metric in ways that harm other important outcomes. Guardrail metrics are the counters you watch to ensure that north star improvements aren't coming at the cost of something else:

  • If your north star is engagement-depth, your guardrail might be support ticket volume (to catch engagement that's happening because the product is confusing, not because it's valuable)
  • If your north star is messages sent, your guardrail might be user satisfaction score or reply rate (to catch volume increases that represent spam or low-quality communication)
  • If your north star is events processed, your guardrail might be API error rate (to catch throughput increases that are degrading data quality)
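
A guardrail check can run alongside every north star report. The sketch below flags any guardrail whose relative increase exceeds an allowed limit; the metric names, figures, and the 20% threshold are all made up for illustration:

```python
def guardrail_breaches(previous, current, limits):
    """Return guardrails whose relative increase exceeds their allowed limit."""
    breaches = {}
    for name, max_increase in limits.items():
        change = (current[name] - previous[name]) / previous[name]
        if change > max_increase:
            breaches[name] = change
    return breaches

# Hypothetical period-over-period values for a north star and one guardrail
previous = {"events_processed": 1_000_000, "api_error_rate": 0.010}
current = {"events_processed": 1_200_000, "api_error_rate": 0.015}
limits = {"api_error_rate": 0.20}  # assumed tolerance: at most a 20% relative rise

breaches = guardrail_breaches(previous, current, limits)
for name, change in breaches.items():
    print(f"guardrail breached: {name} rose {change:.0%}")
```

In this example the north star grew, but the error rate grew faster than the tolerance allows, so the gain should be investigated before it is celebrated.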

A metric tree with a north star, 3–5 input metrics, and 2–3 guardrail metrics is the standard operating model for a mature product analytics practice. Everything else in the analytics stack — funnels, cohort analyses, feature adoption metrics — is in service of explaining movements in this tree.

The Alignment Problem

One underappreciated dimension of north star metric selection: it needs to be legible to the whole organization, not just the product team. A north star metric that requires a three-paragraph explanation to understand is a bad north star metric, even if it perfectly captures product value. The metric needs to be something that a salesperson, a customer success manager, and a board member can intuitively grasp and connect to their own work.

This is why "weekly active teams" (for a collaboration tool) beats "weighted engagement score computed from a mix of events normalized by team size and filtered to accounts with ≥30 days tenure." Both might be equally predictive of retention. Only one can be the north star. The simpler one drives alignment; the complex one drives analysis paralysis.

The north star metric selection conversation is ultimately a strategic conversation, not a statistical one. The statistics tell you which metric is predictive. Judgment tells you which predictive metric will focus your team and your product in the direction you want to go. Both inputs matter. Neither is sufficient alone.