Case study metrics without theater

Useful portfolio metrics explain the product question, baseline, evidence, attribution, and caveat instead of decorating the page.

JP Casabianca
Designer/Engineer · Bogotá

Metrics make a case study stronger only when they make the work more legible.

The bad version is metric theater. A number appears because the page needs to feel impressive, but the reader cannot tell what changed, who owned the change, whether the number is clean, or how the number connects to the work shown on the page. The case study gets shinier and less trustworthy at the same time.

The useful version is quieter. It explains the product question, the baseline, the decision, the measurement window, the constraints, and the interpretation. It says what the number proves, what it does not prove, and what other evidence supports the claim. That kind of metric does not only make the project look bigger. It makes the candidate look more rigorous.

This matters for my own site because the goal is not to look like a marketer. The goal is to look like a product engineer who understands outcomes. If I show revenue, conversion, support load, activation, performance, or cycle time, the number should help a hiring manager understand my judgment. It should not ask them to accept a vague win at face value.

Claim: What moved? Name the behavior, business result, system health signal, or team operating measure.

Context: Why it matters. Explain the baseline, audience, constraint, and product surface connected to the work.

Evidence: How we know. Pair the metric with artifacts, screenshots, logs, release notes, or qualitative signal.

Figure 1: A useful metric is a claim, context, and evidence working together.

Start with the product question

Before choosing a metric, I want the product question written in plain language.

Not "increase conversion." That is too broad. The question should be closer to the actual work: did the new fit guidance reduce size anxiety? Did the pricing page make plan comparison easier? Did the admin table reduce support requests about who changed what? Did the AI review checklist catch release risks before merge? Did the checkout drawer make the next action clearer on mobile?

When the product question is specific, the metric becomes less theatrical. It stops being a loose badge and becomes a way to read the work.

For example, if the case study says I improved a checkout, there are many possible numbers:

  • checkout conversion rate
  • cart drawer open-to-checkout rate
  • express payment selection
  • address validation error rate
  • support questions about shipping
  • mobile drop-off between cart and payment
  • checkout page performance

Those are not interchangeable. Each one answers a different question. If I pick the wrong one, the case study might look impressive while saying very little.

The same is true for portfolio work. A portfolio project may use revenue, but if the actual contribution was a better product detail page, I need to explain how the page participated in the revenue story. If the number belongs to the whole business, I should say so. If I owned one part of the system, I should show the part I owned.

That honesty makes the page stronger, not weaker. It tells the reader I understand attribution.

Baseline is part of the story

A metric without a baseline is usually decoration.

If checkout conversion increased by 12 percent, I need to know from what. If support tickets dropped, I need to know which ticket type. If a build got faster, I need to know the old build time and whether the environment changed. If a dashboard improved decision speed, I need to know what the old decision process looked like.
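The "from what" question is also plain arithmetic. As a minimal sketch, using made-up rates and a hypothetical helper named `describe_lift` (neither comes from a real project), the same 12 percent relative lift can mean very different absolute movement depending on the baseline:

```python
def describe_lift(baseline_rate: float, new_rate: float) -> dict:
    """Return the relative lift (percent) and the percentage-point change."""
    if baseline_rate <= 0:
        raise ValueError("baseline_rate must be positive")
    delta = new_rate - baseline_rate
    return {
        "relative_lift_pct": round(delta / baseline_rate * 100, 1),
        "point_change": round(delta * 100, 2),
    }

# Hypothetical baselines: both cases are a 12 percent relative lift.
low = describe_lift(0.020, 0.0224)   # 2.0% -> 2.24%
high = describe_lift(0.400, 0.448)   # 40.0% -> 44.8%
print(low)
print(high)
```

Both calls report a 12.0 percent relative lift, but the first moves 0.24 percentage points and the second moves 4.8. Publishing the baseline next to the lift lets the reader tell which story they are looking at.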

Baseline is not only the old number. It is the old condition.

The old condition might include:

  • a confusing flow
  • a slow page
  • a manual process
  • an unclear owner
  • missing instrumentation
  • repeated support questions
  • a brittle implementation
  • a screen that worked only with demo data

Baseline · Intervention · Signal · Read

The number belongs inside a narrative. Otherwise the reader cannot tell why it moved.

Figure 2: The metric needs the old condition, the intervention, the signal, and the interpretation.

The baseline also protects against overclaiming. If a metric improved because a seasonal campaign started, the case study should not imply the interface did everything. If a conversion rate improved after traffic quality changed, the page should mention that. If support tickets dropped because a product was discontinued, that is not a UX win.

Senior people are trusted because they can separate signal from convenient story.

Use ranges when exact numbers are not right

Some numbers are sensitive. That does not mean the case study has to become vague.

Ranges are useful:

  • five-figure revenue
  • low six-figure annual volume
  • 20 to 30 percent reduction
  • hundreds of weekly support interactions
  • sub-second cart drawer load
  • single-digit percentage error rate

The range still gives the reader scale. It also avoids pretending that the exact number is the only valid evidence.

I like ranges when:

  • client confidentiality matters
  • the business is private
  • the number is directionally useful but not clean enough for exact display
  • the metric belongs to a broader system than my contribution
  • the public version of the page should not disclose operating details

What matters is that the range is honest. A range should not inflate the story. It should make the story publishable.
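One way to keep a range honest is to derive the label from the real number instead of picking the phrase that sounds best. The sketch below is only an illustration: the function name `magnitude_label` and the low/mid/high cutoffs are my own assumptions, not a standard.

```python
def magnitude_label(value: int) -> str:
    """Turn an exact positive number into a coarse label like 'low six-figure'."""
    if value < 1:
        raise ValueError("value must be a positive integer")
    names = {4: "four-figure", 5: "five-figure", 6: "six-figure", 7: "seven-figure"}
    digits = len(str(value))
    if digits not in names:
        raise ValueError("outside the range this sketch handles")
    leading = int(str(value)[0])
    # Assumed cutoffs: leading digit 1-3 is "low", 4-6 is "mid", 7-9 is "high".
    band = "low" if leading <= 3 else "mid" if leading <= 6 else "high"
    return f"{band} {names[digits]}"

# Hypothetical inputs, not real figures.
print(magnitude_label(48_250))    # mid five-figure
print(magnitude_label(130_000))   # low six-figure
```

Because the label is computed from the private number, the public page cannot quietly round a low five-figure result up to "six-figure."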

Pair metrics with artifacts

A number becomes more credible when the reader can see what produced it.

If I claim onboarding completion improved, show the old and new activation path. If I claim support load dropped, show the confusing state that caused support and the clearer state that replaced it. If I claim checkout got faster, show the performance budget, bundle change, or rendering path. If I claim a portfolio became stronger, show the proof artifacts added to the page.

Metric: +18% checkout CVR. The number tells the reader what changed in behavior.

Artifact: Mobile flow map. The artifact shows the decisions that could plausibly affect that behavior.

Support: Fewer shipping questions. The qualitative signal explains why the number was not only a traffic artifact.

Figure 3: A metric is stronger when an artifact and a secondary signal make it believable.

This is also a good way to make a portfolio feel less AI-generated. Generic AI content often states outcomes without artifacts. Authored work shows the trace: the system map, the state table, the decision matrix, the release note, the analytics taxonomy, the screenshot with real constraints.

Pick metrics by project type

Different work needs different proof.

For commerce work, I look for:

  • conversion rate
  • add-to-cart rate
  • product detail page engagement
  • checkout drop-off
  • return or exchange rate
  • support questions by topic
  • average order value
  • campaign revenue
  • page speed
  • payment error rate

For product workflow work:

  • activation
  • task completion
  • time to complete
  • error recovery
  • repeated action success
  • permission failures
  • support tickets
  • usage by role
  • retained usage after first week

For platform or design-system work:

  • adoption by surface
  • override rate
  • duplicated component count
  • release cycle time
  • bug class reduction
  • accessibility issue count
  • contribution time
  • token drift
  • migration progress

For AI workflow work:

  • review defects caught before merge
  • hallucinated assumptions removed
  • prompt-to-PR cycle time
  • tests added per change
  • human review time
  • rollback rate
  • production incidents
  • reused patterns
  • agent context quality

The metric should fit the work. A design-system project does not need a fake conversion story. A checkout project should not hide behind component adoption. A portfolio project should not use traffic if the goal is proof of judgment.

Name the measurement window

Metrics need time.

The measurement window might be the first week after launch, the first complete campaign, a before-and-after month, a cohort after activation, or a release window. Without that window, the number floats.

A good case study can say:

  • measured across the first two post-launch campaigns
  • compared to the previous 30 days with similar traffic mix
  • tracked during the first release week
  • counted after excluding internal test traffic
  • measured on mobile only because that was the affected surface
  • reviewed after the migration completed and old records were safe

This level of detail makes the page feel operated, not decorated.
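Those window statements can be written down as filters before any number is computed. This is a rough sketch under stated assumptions: the `Session` record, its fields, and the sample data are all hypothetical, and a real export would have its own shape.

```python
from dataclasses import dataclass
from datetime import date

@dataclass(frozen=True)
class Session:
    day: date
    device: str       # "mobile" or "desktop"
    internal: bool    # True for team or QA traffic
    completed: bool   # True if checkout finished

def completion_rate(sessions, start: date, end: date, device: str = "mobile") -> float:
    """Completion rate for external sessions on one device within [start, end]."""
    in_scope = [
        s for s in sessions
        if start <= s.day <= end and s.device == device and not s.internal
    ]
    if not in_scope:
        raise ValueError("no sessions in the measurement window")
    return sum(s.completed for s in in_scope) / len(in_scope)

# Made-up sessions for illustration only.
sessions = [
    Session(date(2026, 3, 2), "mobile", False, True),
    Session(date(2026, 3, 3), "mobile", False, False),
    Session(date(2026, 3, 3), "mobile", True, True),    # internal test, excluded
    Session(date(2026, 3, 4), "desktop", False, True),  # other surface, excluded
    Session(date(2026, 3, 20), "mobile", False, True),  # outside window, excluded
]
rate = completion_rate(sessions, date(2026, 3, 2), date(2026, 3, 8))
print(f"{rate:.0%}")   # 50%
```

The useful part is not the function. It is that the window, the surface, and the exclusions are explicit arguments, so the sentence on the page can repeat them exactly.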

Be explicit about attribution

Attribution is where many case studies get loose.

If the work shipped with a pricing change, a campaign, a traffic change, or a product launch, the case study should say that. The metric may still be useful, but it should be framed as a product result, not a solo design result.

I like language such as:

  • This work contributed to...
  • The product team saw...
  • The surface was part of...
  • My owned contribution was...
  • The clearest signal tied to my work was...
  • I would not attribute the full lift to this screen alone.

That language does not make the work smaller. It makes the author more trustworthy.
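A traffic change is the easiest of these to underestimate. The sketch below uses hypothetical segments and numbers, plus an assumed helper named `blended_rate`, to show that the overall conversion rate can move while every segment converts exactly as it did before:

```python
def blended_rate(segments: dict) -> float:
    """Overall conversion from {segment: (sessions, conversion_rate)}."""
    total_sessions = sum(n for n, _ in segments.values())
    total_orders = sum(n * rate for n, rate in segments.values())
    return total_orders / total_sessions

# Hypothetical numbers: per-segment rates are identical before and after.
before = {"returning": (2_000, 0.05), "paid_social": (8_000, 0.01)}
after = {"returning": (6_000, 0.05), "paid_social": (4_000, 0.01)}

print(round(blended_rate(before), 3))   # 0.018
print(round(blended_rate(after), 3))    # 0.034
```

The blended rate nearly doubles with no change in how either audience behaves. If the case study reported only the blended number, the interface would get credit for a shift in who showed up.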

The proof table I want in every serious case study

For each important claim, I want a small proof table.

Claim: what I am saying changed.

Owned work: what I actually designed, built, led, or operated.

Evidence: metric, artifact, screenshot, log, QA result, support signal, or customer behavior.

Confidence: high, medium, low, with a short reason.

Open caveat: what the number does not prove.

Claim: What changed? Use product language, not only metric language.

Ownership: What did I own? Separate team outcome from personal contribution.

Caveat: What else mattered? Campaigns, seasonality, traffic quality, data limits, or team changes.

Figure 4: A proof table lets the page be confident without pretending attribution is cleaner than it is.

That table is not just for the reader. It is useful for me. If I cannot fill it in, I probably do not understand the story well enough yet.
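The "can I fill it in" check is simple enough to automate while drafting. A minimal sketch, assuming a hypothetical `ProofRow` record whose fields mirror the five rows above (the sample values are placeholders, not results):

```python
from dataclasses import dataclass, fields

@dataclass
class ProofRow:
    claim: str = ""
    owned_work: str = ""
    evidence: str = ""
    confidence: str = ""    # "high", "medium", or "low", plus a short reason
    open_caveat: str = ""

    def missing(self) -> list[str]:
        """Names of the fields that are still blank."""
        return [f.name for f in fields(self) if not getattr(self, f.name).strip()]

# Hypothetical draft row, not a real result.
row = ProofRow(
    claim="Shipping questions dropped after the checkout copy change",
    owned_work="Rewrote and shipped the shipping-cost step",
    evidence="Support tags by topic, before and after release",
)
print(row.missing())   # ['confidence', 'open_caveat']
```

A row with anything left in `missing()` is a claim I should not publish yet.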

Metrics are interview hooks

The best case-study metric gives the interviewer a better question.

If I write "support tickets dropped 31 percent," the interviewer can ask which tickets, how support tagged them, whether the copy changed, and how I would validate causality. If I write "cart drawer FCP reached 0.4s," the interviewer can ask what changed in rendering, what tradeoffs I made, and how I protected the experience on slower devices.

That is the value. The metric opens a technical and product conversation.

Metric theater closes conversation because it asks the reader to clap. A good metric invites scrutiny.

A practical rewrite pass

When I review a case study, I can pressure-test each number with this pass:

  • Does the number answer a specific product question?
  • Is the baseline clear?
  • Is the measurement window named?
  • Is my contribution separated from the team outcome?
  • Is there a visual artifact connected to the number?
  • Is there a secondary signal?
  • Is the caveat stated without apology?
  • Does the metric create a useful interview question?
  • Would I be comfortable explaining how the number was collected?
  • Would the claim survive a skeptical engineering manager?

If the answer to any of these is no, the metric needs a rewrite.

The goal is not to make the page less impressive. The goal is to make it more durable. A durable case study can handle questions.

Where placeholders are honest

I am comfortable using placeholders while a case study is being drafted, but only when the placeholder is clearly marked and the structure around it is real. A placeholder like "XX orders" is fine during layout. A vague claim like "major growth" is not fine because it teaches the page to avoid evidence.

The placeholder should preserve the kind of proof the final page needs:

  • exact number or range
  • time window
  • source of truth
  • audience or segment
  • baseline
  • contribution
  • caveat
  • artifact

That means a case study draft can say: "Placeholder: Shopify orders during launch window, compared against the previous campaign window, excluding wholesale orders." That sentence is much better than a decorative metric card. It tells me what data I need to pull, what comparison will matter, and what can distort the read.

For a brand or commerce project, the placeholder might cover revenue, order count, conversion rate, average order value, support volume, return reasons, inventory turns, email signups, page speed, or campaign click-through. For a product engineering project, it might cover activation, completion rate, build time, incident count, support tickets, QA escapes, or cycle time.

The important thing is to avoid letting the placeholder become a fake result. If the public page is not ready for real numbers, I can still show the evidence model: "I would validate this with Shopify conversion, refund reasons, support questions, and checkout drop-off." That is still a useful signal because it shows how I think about outcomes.

Build the proof spine first

The easiest way to keep a case study from drifting into theater is to write the proof spine before designing the page.

My proof spine usually looks like this:

  1. The product needed to change because something was unclear, slow, fragile, expensive, or underperforming.
  2. I owned a specific part of the work.
  3. The work had constraints that made judgment necessary.
  4. I made decisions that can be inspected through artifacts.
  5. The result changed a product, user, team, or business signal.
  6. The caveat is clear enough that the claim still feels trustworthy.

If I cannot write those six sentences, the case study is not ready. The problem may be that the project was too small, the result was not measured, the story is not understood yet, or I am trying to make a support task sound like a strategic redesign. Any of those can still become a good page, but the page needs the right scale.

For example, a small admin-table improvement may not deserve a dramatic revenue claim. It might deserve a sharper operational proof: fewer support escalations, faster review, less manual lookup, fewer mistakes in fulfillment, or clearer audit history. That kind of proof is more credible than pretending the table singlehandedly moved the business.

The proof spine also helps me decide what visual assets belong on the page. A hero image can set the mood, but the proof needs artifacts. If the story is about checkout trust, I want a state map, payment error flow, and support-ticket before/after. If the story is about AI workflow, I want a review checklist, failure taxonomy, and example prompt context. If the story is about a brand and store, I want catalog logic, campaign structure, product photography, inventory states, and the operating dashboard.

Use visuals to reveal decisions

Visual assets should not be decoration. They should reveal a decision the prose would otherwise have to explain slowly.

The most useful visuals for metrics are rarely glossy screenshots. They are usually:

  • baseline-to-after diagrams
  • event funnels
  • support-tag matrices
  • tradeoff tables
  • annotated product states
  • before-and-after copy samples
  • measurement plans
  • release timelines
  • system maps
  • dashboards with only the relevant slice shown

If the visual is a chart, I want the caption to say what changed and why the chart matters. If the visual is a matrix, I want the rows and columns to encode the decision, not just fill space. If the visual is a screenshot, I want callouts that name the product problem, not generic labels like "clean layout" or "improved UI."

This is especially important for candidate positioning. Hiring managers have seen many polished portfolios. The page becomes more legitimate when the visual asset feels like something used during the work, not something invented afterward to make the case study look bigger.

A small artifact can carry a lot of weight. A table with "Claim, evidence, source, caveat, interview hook" makes the case study feel honest. A checkout state map shows that I understand recovery and edge cases. A support taxonomy shows that I can turn qualitative mess into product decisions. A component variant decision table shows that I understand design systems as operating models.

The visual should make the reader think: this person did not just style the page. They understood the system.

Separate vanity metrics from operating metrics

Not every impressive number helps the case study.

Vanity metrics are numbers that sound good but do not change the reader's understanding of the work. Total page views can be useful, but not if the project was about checkout error recovery. Social impressions can be useful, but not if the project was about reducing fulfillment mistakes. Revenue can be useful, but not if the page cannot explain how the work participated in that revenue.

Operating metrics are closer to the work. They show whether the product behaved better:

  • fewer users abandon after shipping appears
  • fewer customers ask the same sizing question
  • fewer support tickets require manual lookup
  • fewer QA defects escape into release
  • more users complete setup without handholding
  • faster render time on the route that matters
  • less duplicate code after a shared component ships
  • fewer failed payments require support recovery

Operating metrics may look smaller, but they often create stronger proof. They are closer to the decision. They also give engineering interviewers something concrete to discuss.

If I show a large business number, I want to pair it with an operating number. The business number says why the project mattered. The operating number says how the work made the product better.

The metric note I would put in the case study file

For real projects, I like keeping a short metric note next to the case study draft. It does not have to be public, but it keeps the public claims honest.

The note would include:

  • the exact sentence I plan to publish
  • the source for the number
  • the date range
  • any filters used
  • the baseline
  • the release or campaign that changed the surface
  • what else changed during the same period
  • why I think the claim is fair
  • what I will say if asked about attribution

That last line matters. The case study should not collapse when someone asks a reasonable follow-up. If I cannot explain attribution in a calm way, the metric is probably too broad or too confident.

This is also useful when the number is a placeholder. The note can say: "Replace with Shopify sales report for launch week once exported." Or: "Use range if exact revenue should stay private." Or: "Do not publish until support tags are cleaned." The page then has a path to truth instead of a pile of optimistic text.

What I would say in an interview

A good metric should give me a natural interview answer.

If someone asks, "How do you know the change worked?" I do not want to answer with only "conversion went up." I want to say something like:

"The cleanest signal was mobile checkout completion in the two weeks after launch, but I did not rely on that alone because campaign traffic changed during the same window. I also looked at payment-related support questions and cart abandonment after the order total changes. The direction was consistent, so I was comfortable saying the checkout became clearer. I would not claim the entire revenue lift belonged to that UI work."

That answer is more credible than a louder number. It shows measurement judgment, product judgment, and humility about causality.

For engineering roles, this matters because teams do not only need people who can ship code. They need people who can understand whether the code helped. The metric conversation is where product engineering becomes visible.

The artifact I would keep downloadable

This is one of the places where a companion resource makes sense. A case-study proof template is useful because the hard part is not writing prettier sentences. The hard part is forcing each claim to earn its place.

The downloadable version should not be a fancy worksheet with vague prompts. It should be a working document with rows like:

  • project claim
  • user or business problem
  • baseline
  • decision I made
  • artifact that proves the decision
  • metric or signal
  • caveat
  • interview question it supports

That structure helps me write, but it also helps me edit. If a section has no artifact, it may be too vague. If a metric has no caveat, it may be too confident. If a claim creates no interview question, it may be portfolio decoration instead of proof.

I would use the same template before publishing any major Work page. For Casabianca, the rows might cover fit confidence, catalog structure, local payment trust, product photography, campaign merchandising, fulfillment communication, and support learning. For a software product, the rows might cover activation, admin workflows, analytics, performance, release process, and design-system contribution.

The point is to turn portfolio writing into evidence design. A strong candidate page should be able to survive a reviewer asking, "How do you know?" The template makes that question less scary because it has already been answered before the page goes live.

The publish test

Before publishing a case study with metrics, I want to run one final test: could I sit with a skeptical engineering manager and explain every number without changing my tone?

That test catches weak claims quickly. If I find myself getting defensive, the claim probably needs more context. If I have to say "I think" too many times, the source may be weak. If I cannot explain the baseline, the number is not ready. If I cannot separate my contribution from the team outcome, the page is claiming too much.

The publish test is not about making the story smaller. It is about making it steadier. A steady story can still be ambitious. It can still show growth, revenue, craft, and leadership. It just does not need to hide the messy parts.

That steadiness is what makes a candidate feel senior. The page should sound like someone who has shipped real work, read the results, understood the limits, and can still explain why the work mattered.

It should also make the next step obvious. If a reviewer likes the story, they should know which technical conversation to start: analytics quality, attribution, checkout recovery, support reduction, performance, or system adoption. Metrics are useful when they point toward that conversation instead of ending it.

The final standard

The final standard is simple: every number should make the work clearer.

If a number only makes the page feel louder, cut it or rewrite it. If the number explains the user behavior, the operating constraint, the system improvement, or the business result, keep it and give it context.

That is how metrics make a portfolio more legitimate. They do not decorate the work. They explain the work.

Companion artifacts

Use this after reading.

Practical downloads and templates that turn the article into something you can bring into a product review, implementation pass, or agent workflow.

Template · Jun 2026

Portfolio Case Study Proof Template

A case-study structure for proving judgment, constraints, tradeoffs, messy-middle artifacts, and outcomes.

Portfolio · Hiring · Proof

Download · Jun 2026

Personal Site Content Audit Template

A portfolio audit template for sharpening positioning, credibility, proof, content structure, and recruiter-facing signals.

Portfolio · Content · Hiring

Template · Jun 2026

Product Analytics Event Taxonomy

A naming and planning template for defining product events, properties, funnels, activation signals, and instrumentation ownership.

Analytics · Product · Growth