The photograph problem
You cannot zoom in on a blurry photograph and reveal detail that was never captured. The pixels don't exist—the information was never recorded.
Private equity investors experience the same frustration with portfolio company data. The questions they ask are precise: why margin eroded in Q3, which customer segments are churning fastest, whether pricing discipline is holding across the sales organization. The answers come back in averages and totals. Revenue was €2.1 million. Gross margin was 33 percent. Active customers reached 851.
These figures reconcile to the general ledger. They are not wrong; they satisfy auditors. But they lack the resolution to answer the questions being asked.
Three ways it shows up
The resolution gap appears in predictable patterns. Each represents a question that aggregated reporting cannot answer.
The margin that improved while everything got worse
The quarterly results show gross margin improving from 32 to 33 percent. Management presents this as evidence of operational progress—pricing discipline, procurement optimization, operating leverage. The trend line is positive, the narrative is coherent, and the board moves to the next agenda item.
Statisticians have a name for the phenomenon that makes this narrative unreliable: Simpson's Paradox. Identified in the mid-twentieth century, it describes situations where trends observed in aggregate data reverse when the data is segmented. The classic example involves UC Berkeley graduate admissions in the 1970s, where the aggregate figures suggested a bias against women, but department-level analysis found no such bias in most departments, and in several a slight tilt in the opposite direction.
The paradox is not a rare statistical curiosity. It is the predictable outcome of any dataset where composition shifts alongside the metric being measured. In business terms: whenever mix changes, aggregate trends become unreliable indicators of underlying performance.
Consider what actually happened with that margin improvement. The high-margin product line grew from 40 to 55 percent of revenue. The low-margin line declined proportionally. Both product lines experienced margin compression—the premium product fell from 44 to 42 percent, the volume product from 24 to 22 percent. But because the mix shifted toward the higher-margin line, the aggregate improved even as every component deteriorated.
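The arithmetic is easy to verify. Here is a minimal sketch using the illustrative mix and margin figures above (not data from any actual company):

```python
# Illustrative figures from the example above: each product line's margin falls,
# yet the blended margin rises because the revenue mix shifts toward the
# higher-margin line.

def blended_margin(mix):
    """Weighted-average gross margin for a {line: (revenue_share, margin)} dict."""
    return sum(share * margin for share, margin in mix.values())

prior = {"premium": (0.40, 0.44), "volume": (0.60, 0.24)}
current = {"premium": (0.55, 0.42), "volume": (0.45, 0.22)}

print(f"Prior quarter:   {blended_margin(prior):.1%}")    # 32.0%
print(f"Current quarter: {blended_margin(current):.1%}")  # 33.0%
# Both lines lost two points of margin; the aggregate still improved.
```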
Detection requires transactional data tagged by product, channel, and customer segment. Many portfolio companies book revenue to a single sales account. The resolution to identify the paradox does not exist in their reporting systems.
The stable price that hid a collapsing distribution
Average revenue per unit holds steady at €127 quarter over quarter. The metric suggests pricing power remains intact, validating a core assumption of the investment thesis.
The problem with averages is well understood in statistical theory but persistently underweighted in business practice. Francis Anscombe demonstrated this in 1973 with four datasets that share identical means, variances, and regression lines but look completely different when visualized. The lesson is straightforward: summary statistics can be identical for datasets with fundamentally different structures.
Applied to pricing: a stable average can mask a deteriorating distribution. Sixty-five percent of transactions might cluster between €140 and €160, aligned with list pricing. Another twenty percent might fall around €100, reflecting negotiated volume discounts. And fifteen percent might scatter between €50 and €80—aggressive discounting to close pipeline before quarter-end.
The mean holds at €127. But the distribution has developed a long left tail that represents systematic erosion of pricing discipline. The sales team is trading margin for volume, and the standard P&L does not surface the behavior. By the time degradation appears in the average, multiple quarters of value have already leaked.
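A minimal sketch makes the point concrete. The transactions below are synthetic, drawn to match the illustrative proportions above, and the thresholds are arbitrary; the percentiles expose what the mean conceals:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic prices matching the illustrative proportions: 65% near list price,
# 20% at negotiated volume discounts, 15% heavily discounted at quarter-end.
prices = np.concatenate([
    rng.uniform(140, 160, 650),
    rng.uniform(95, 105, 200),
    rng.uniform(50, 80, 150),
])

print(f"Mean price: €{prices.mean():.0f}")                    # ~€127
print(f"Transactions below €90: {(prices < 90).mean():.0%}")  # ~15%
print(f"10th percentile: €{np.percentile(prices, 10):.0f}")   # deep in the discount tail
# The mean looks stable; the left tail does not.
```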
The growing customer count that masked accelerating churn
Active customers: 851, up from 799 the prior year. The retention assumption in the model appears validated. Customer count is growing, so churn must be under control.
Point-in-time metrics are not trajectory metrics. They capture state without revealing dynamics. A snapshot of customer count cannot distinguish between a business with stable 90 percent retention and a business where retention is collapsing but acquisition is temporarily keeping pace.
Cohort analysis—tracking behavior by acquisition vintage—reveals dynamics that snapshots conceal. Restructure the customer data by acquisition year: 2021 customers show 73 percent still active. The 2022 cohort retains at 68 percent. For 2023, retention drops to 61 percent.
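The calculation itself is simple once the data carries the right fields. A minimal sketch, assuming a customer table with an acquisition date and an activity flag (column names here are illustrative):

```python
import pandas as pd

# Illustrative customer records: acquisition date plus a current-activity flag.
customers = pd.DataFrame({
    "customer_id": range(6),
    "acquired": pd.to_datetime(["2021-03-01", "2021-07-15", "2022-02-10",
                                "2022-11-05", "2023-04-20", "2023-09-30"]),
    "is_active": [True, False, True, True, False, True],
})

# Group by acquisition vintage and take the share still active.
retention_by_cohort = (
    customers
    .assign(cohort=customers["acquired"].dt.year)
    .groupby("cohort")["is_active"]
    .mean()
)
print(retention_by_cohort)
```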
Retention is degrading five to seven percentage points with each vintage. The headline customer count holds only because new acquisition masks accelerating churn. This is a business with a serious problem that will eventually surface in aggregate metrics—but by then, the cohort structure of the customer base will have shifted substantially, and the cost of remediation will have compounded.
Why resolution stays low
These resolution failures persist not because management teams are negligent or unsophisticated. They persist because SME financial systems were designed for a different purpose entirely.
The architecture of most portfolio company finance functions reflects their origins in compliance rather than analysis. The chart of accounts supports tax filing and statutory reporting. The ERP configuration prioritizes transaction processing over dimensional tagging. The monthly close process optimizes for producing reconciled statements, not segmented analytics.
When an investor asks for margin analysis by customer segment, the finance team faces a genuine structural problem. The data exists somewhere in the transactional systems, but extracting it requires days of manual work—pulling exports, building mappings, reconciling to control totals. The analysis is not a query away. It is a project.
This creates a painful dynamic. Investors request analytical views that would take weeks to produce. Management teams respond with the aggregated metrics their systems generate easily. Both parties are frustrated, and the underlying resolution problem remains unsolved.
Building the analytical layer
The solution is not better analysis of aggregated data. You cannot enhance the resolution of a photograph after it has been compressed. The solution is data architecture that preserves resolution from the point of capture.
Dimensional modeling—the approach Ralph Kimball formalized in the 1990s—provides the conceptual framework. Rather than designing databases around the transactions themselves, dimensional modeling designs around the questions analysts want to ask. The result is a star schema: a fact table containing metrics surrounded by dimension tables that define the "who, what, where, when" context.
In practical terms, this means implementing tagging at the transaction level. Every invoice carries codes for product line, customer segment, sales channel, geography, and sales rep. Customer records maintain acquisition dates and activity histories. The chart of accounts supports multi-dimensional reporting rather than just statutory requirements.
The result is a multidimensional model: a structure that can be sliced across any dimension without losing fidelity. When an investor asks why margin declined, the data exists to decompose the answer by product, by customer segment, by region, by time period—at whatever level of granularity the question requires.
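A minimal sketch of the structure, with illustrative table and column names, is a fact table of invoice lines carrying dimension keys, joined to small dimension tables and grouped along whatever axes the question requires:

```python
import pandas as pd

# Fact table: one row per invoice line, tagged with dimension keys.
fact_invoice_lines = pd.DataFrame({
    "product_id":  [1, 1, 2, 2],
    "customer_id": [10, 11, 10, 12],
    "revenue":     [120_000, 80_000, 150_000, 90_000],
    "cost":        [70_000, 48_000, 112_000, 70_000],
})

# Dimension tables: the "who, what, where, when" context.
dim_product = pd.DataFrame({"product_id": [1, 2],
                            "product_line": ["premium", "volume"]})
dim_customer = pd.DataFrame({"customer_id": [10, 11, 12],
                             "segment": ["enterprise", "mid-market", "mid-market"]})

# Margin by product line and customer segment becomes a join plus a group-by.
margin = (
    fact_invoice_lines
    .merge(dim_product, on="product_id")
    .merge(dim_customer, on="customer_id")
    .groupby(["product_line", "segment"])
    .agg(revenue=("revenue", "sum"), cost=("cost", "sum"))
    .assign(gross_margin=lambda d: 1 - d["cost"] / d["revenue"])
)
print(margin)
```

The same fact table answers the churn question, the pricing question, and the mix question; only the group-by changes.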
The resolution gap is not a reporting problem. It is an investment risk problem. Every quarter that passes with low-resolution data is a quarter where margin erosion goes undetected, pricing discipline goes unmeasured, and retention decay stays invisible until it surfaces in numbers too aggregated to act on.
The fix is not complex. It is architectural. Tag transactions at the point of capture. Preserve the dimensions that answer the questions you will eventually ask. Build once, query indefinitely.
This is where Fortivis operates—as the catalyst between raw transactional data and investor-grade analytics. The goal is not more data. It is data structured to answer the questions investors actually ask—no more, no less.
The alternative is another board meeting where sophisticated questions meet unsatisfying answers—and another quarter where the pixels that would reveal the picture were never captured at all.
Key terms
Simpson's Paradox
A statistical phenomenon where trends visible in aggregated data reverse when segmented. Common in any dataset where composition shifts alongside the metric being measured.
Anscombe's Quartet
Four datasets constructed by Francis Anscombe (1973) that share identical summary statistics but display completely different patterns when graphed. A demonstration that averages alone cannot characterize data.
Cohort Analysis
Tracking behavior by grouping customers according to acquisition date. Reveals trajectory dynamics that point-in-time snapshots obscure.
Dimensional Modeling
A database design approach (Ralph Kimball) that structures data around analytical questions rather than transactions. Fact tables (metrics) surrounded by dimension tables (context).
Sources
- Bickel, P.J., Hammel, E.A., & O'Connell, J.W. (1975). "Sex Bias in Graduate Admissions: Data from Berkeley." Science, 187(4175), 398-404.
- Anscombe, F.J. (1973). "Graphs in Statistical Analysis." The American Statistician, 27(1), 17-21.
- Kimball, R. & Ross, M. (2013). The Data Warehouse Toolkit: The Definitive Guide to Dimensional Modeling (3rd ed.). Wiley.
Maria Ntavou is an Analyst at Fortivis, where she develops analytical frameworks that identify operational inefficiencies and quantify improvement opportunities for portfolio companies. She holds an Integrated Master's Degree in Applied Mathematics and Physical Sciences from the National Technical University of Athens, specializing in Analysis and Applied Statistics.
