Design System Health Scorecard

A scoring framework to measure your design system's health across 6 dimensions. Use it for quarterly reviews, stakeholder updates, or before starting an AI integration.

What This Is

A structured scorecard for measuring your design system’s health. Score each dimension 1-5, get an overall grade, and know exactly where to invest your time.

Use it for:

  • Quarterly health checks
  • Stakeholder reporting (“here’s where we are, here’s what needs work”)
  • Pre-assessment before adding AI tooling
  • Tracking improvement over time

The Scorecard

1. Token Coverage (1-5)

Score  Meaning
1      No tokens. Raw hex values everywhere.
2      Some primitive tokens exist but inconsistently used.
3      Primitive + semantic layers exist. Most components use tokens.
4      Full coverage. All components use semantic tokens. Dark mode works.
5      Tokens drive everything. Schema documented. AI can validate naming.

Your score: ___
Evidence:

  • Primitive tokens defined
  • Semantic layer aliases to primitives
  • No raw hex values in components
  • Dark mode/theming works via token swap
  • Token naming follows documented convention
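
One way to gather evidence for the "no raw hex values" item is to scan component source for hex color literals. The sketch below is a minimal illustration, not part of the scorecard itself: the function name find_raw_hex and the sample stylesheet are hypothetical, and a plain regex like this will also flag non-color strings that happen to look like hex (for example a CSS id selector such as #add).

```python
import re

# Matches 3-, 4-, 6-, or 8-digit hex colors such as #fff or #1A2B3C.
# Assumption: anything of this shape in a component file is a raw color value.
HEX_COLOR = re.compile(r"#(?:[0-9a-fA-F]{8}|[0-9a-fA-F]{6}|[0-9a-fA-F]{3,4})\b")


def find_raw_hex(source: str) -> list[tuple[int, str]]:
    """Return (line_number, hex_value) pairs for every hex color literal found."""
    hits = []
    for line_number, line in enumerate(source.splitlines(), start=1):
        for match in HEX_COLOR.finditer(line):
            hits.append((line_number, match.group(0)))
    return hits


# Hypothetical component stylesheet: one tokenized value, one raw hex value.
sample_css = """\
.button {
  color: var(--color-text-primary);
  background: #1A2B3C;
}
"""

for line_number, value in find_raw_hex(sample_css):
    print(f"line {line_number}: raw hex value {value}")
```

Running a check like this across the component directory gives a count you can record as evidence next to the score.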

2. Component Health (1-5)

Score  Meaning
1      Components exist but no consistency. Many detached instances.
2      Core components standardized. Many one-offs still in use.
3      Component library covers 70%+ of UI. Variants documented.
4      Full library with variants, states, and responsive behavior.
5      Components have intent docs, usage guidelines, and automated testing.

Your score: ___
Evidence:

  • Component inventory exists
  • Variants cover all use cases
  • States documented (hover, active, disabled, focus, error)
  • Responsive behavior defined
  • Detached instance count is low (under 5%)

3. Naming Consistency (1-5)

Score  Meaning
1      No convention. Mixed formats (camelCase, kebab, slash, dot).
2      Convention exists but not enforced. Many violations.
3      Convention documented. Most tokens/components follow it.
4      Convention enforced via tooling. Violations flagged automatically.
5      Naming is machine-readable. AI tools can parse and validate.

Your score: ___
Evidence:

  • Naming convention documented
  • Consistent separator usage
  • Consistent casing
  • Automated validation exists
  • New additions follow convention
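
Automated validation can be as small as a pattern check over the list of token names. The sketch below assumes a convention of lowercase, dot-separated segments (for example color.text.primary). That convention, the pattern, and the function name find_naming_violations are illustrative assumptions; substitute whatever your documented convention actually is.

```python
import re

# Assumed convention: two or more lowercase alphanumeric segments joined by dots,
# starting with a letter, e.g. "color.text.primary" or "spacing.4".
TOKEN_NAME = re.compile(r"[a-z][a-z0-9]*(?:\.[a-z0-9]+)+")


def find_naming_violations(names: list[str]) -> list[str]:
    """Return the names that do not match the assumed convention, in input order."""
    return [name for name in names if not TOKEN_NAME.fullmatch(name)]


# Hypothetical token names mixing the formats listed under score 1.
token_names = [
    "color.text.primary",
    "color.bg.surface",
    "colorTextPrimary",  # camelCase
    "spacing/md",        # slash separator
    "font-size.lg",      # kebab segment
]

for name in find_naming_violations(token_names):
    print(f"violation: {name}")
```

The share of names that pass is a simple number to track between reviews.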

4. Documentation (1-5)

Score  Meaning
1      No documentation beyond component names.
2      Some components have descriptions. No usage guidelines.
3      Most components documented. Props, variants, and basic usage.
4      Full docs with do/don’t examples, accessibility notes, and code.
5      Machine-readable docs. Component intent, knowledge graph, AI-ready.

Your score: ___
Evidence:

  • Component descriptions exist
  • Usage guidelines (when to use / when not to use)
  • Props/API documented
  • Do/don’t examples
  • Accessibility notes per component

5. Accessibility (1-5)

Score  Meaning
1      No accessibility considerations.
2      Some color contrast checks. No keyboard or screen reader testing.
3      WCAG AA compliance for core components. Basic keyboard support.
4      Full AA compliance. Keyboard nav, focus management, ARIA attributes.
5      AAA where possible. Automated a11y testing in CI. Reduced motion support.

Your score: ___
Evidence:

  • Color contrast meets AA (4.5:1 for normal text, 3:1 for large text and UI components)
  • Focus indicators visible
  • Keyboard navigation works
  • ARIA attributes correct
  • Screen reader tested
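
The contrast item can be checked numerically. The sketch below follows the WCAG 2.x definitions of relative luminance and contrast ratio and applies the AA thresholds from the checklist. The function names and the restriction to 6-digit hex input are assumptions made for this example.

```python
def _linearize(channel: int) -> float:
    """Convert an 8-bit sRGB channel to its linear value (WCAG 2.x formula)."""
    c = channel / 255
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(hex_color: str) -> float:
    """Relative luminance of a 6-digit hex color such as "#1A2B3C"."""
    digits = hex_color.lstrip("#")
    r, g, b = (int(digits[i:i + 2], 16) for i in (0, 2, 4))
    return 0.2126 * _linearize(r) + 0.7152 * _linearize(g) + 0.0722 * _linearize(b)


def contrast_ratio(foreground: str, background: str) -> float:
    """Contrast ratio, lighter luminance over darker, each offset by 0.05."""
    lighter, darker = sorted(
        (relative_luminance(foreground), relative_luminance(background)),
        reverse=True,
    )
    return (lighter + 0.05) / (darker + 0.05)


def meets_aa(foreground: str, background: str, large_or_ui: bool = False) -> bool:
    """AA needs 4.5:1 for normal text and 3:1 for large text or UI components."""
    threshold = 3.0 if large_or_ui else 4.5
    return contrast_ratio(foreground, background) >= threshold


# Two near-identical grays on white fall on opposite sides of the 4.5:1 line.
for gray in ("#767676", "#777777"):
    print(gray, "passes AA for normal text:", meets_aa(gray, "#FFFFFF"))
```

A calculation like this covers only the color contrast item; keyboard, focus, ARIA, and screen reader checks still need manual or tool-assisted testing.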

6. Design-Code Parity (1-5)

Score  Meaning
1      Design and code are completely disconnected.
2      Some alignment. Manual sync. Frequent drift.
3      Token values match between Figma and code. Components differ.
4      Automated sync for tokens. Components closely match.
5      Bi-directional sync. Drift detected automatically. Parity dashboard.

Your score: ___
Evidence:

  • Token values match between design and code
  • Component structure matches
  • Variant coverage matches
  • Automated drift detection exists
  • Sync pipeline documented
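
At its simplest, token drift detection is a comparison of two name-to-value maps, one exported from the design tool and one read from code. The sketch below assumes both sides have already been flattened into plain dictionaries. The function name find_token_drift, the sample token names, and the case-insensitive value comparison are illustrative choices, not a description of any particular sync tool.

```python
def find_token_drift(design: dict[str, str], code: dict[str, str]) -> dict[str, list]:
    """Compare two flat token maps and report where they disagree."""
    shared = design.keys() & code.keys()
    return {
        # Tokens defined in design but absent from code.
        "missing_in_code": sorted(design.keys() - code.keys()),
        # Tokens defined in code but absent from design.
        "missing_in_design": sorted(code.keys() - design.keys()),
        # Tokens present on both sides whose values differ (ignoring letter case).
        "mismatched": sorted(
            (name, design[name], code[name])
            for name in shared
            if design[name].lower() != code[name].lower()
        ),
    }


# Hypothetical exports from the design tool and from the codebase.
design_tokens = {
    "color.text.primary": "#1A2B3C",
    "color.bg.surface": "#FFFFFF",
    "spacing.md": "16px",
}
code_tokens = {
    "color.text.primary": "#1a2b3c",  # same value, different case
    "color.bg.surface": "#FAFAFA",    # drifted value
    "radius.sm": "4px",               # exists only in code
}

for category, items in find_token_drift(design_tokens, code_tokens).items():
    print(category, items)
```

An empty report on every run is reasonable evidence for the "token values match" item; component structure and variant coverage need a separate comparison.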

Overall Score

Total  Grade  Meaning
25-30  A      AI-ready. Your system can power automated workflows.
19-24  B      Strong foundation. A few gaps to close before AI integration.
13-18  C      Functional but needs investment. Start with naming and tokens.
7-12   D      Early stage. Focus on foundations before adding tools.
6      F      Starting from scratch. That’s okay. Everyone starts here.

Your total: ___ / 30
Your grade: ___
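
If you track scores in a script or spreadsheet export, the total and grade can be computed directly from the six dimension scores. The sketch below encodes the grade bands from the Overall Score table. The function name overall_grade and the sample scores are hypothetical.

```python
DIMENSIONS = (
    "Token Coverage",
    "Component Health",
    "Naming Consistency",
    "Documentation",
    "Accessibility",
    "Design-Code Parity",
)

# Lower bound of each grade band, taken from the Overall Score table.
GRADE_BANDS = ((25, "A"), (19, "B"), (13, "C"), (7, "D"), (6, "F"))


def overall_grade(scores: dict[str, int]) -> tuple[int, str]:
    """Return (total, grade) for a complete set of six 1-5 dimension scores."""
    missing = [name for name in DIMENSIONS if name not in scores]
    if missing:
        raise ValueError(f"missing scores for: {', '.join(missing)}")
    for name in DIMENSIONS:
        if not 1 <= scores[name] <= 5:
            raise ValueError(f"{name} must be scored 1-5, got {scores[name]}")
    total = sum(scores[name] for name in DIMENSIONS)
    # The minimum possible total is 6, so the last band always matches.
    for lower_bound, letter in GRADE_BANDS:
        if total >= lower_bound:
            return total, letter
    raise AssertionError("unreachable: totals below 6 are rejected above")


# Hypothetical scores for one review.
sample_scores = {
    "Token Coverage": 3,
    "Component Health": 4,
    "Naming Consistency": 2,
    "Documentation": 3,
    "Accessibility": 2,
    "Design-Code Parity": 3,
}

total, letter = overall_grade(sample_scores)
print(f"Your total: {total} / 30, your grade: {letter}")
```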

Action Plan

Based on your lowest scores, prioritize:

  1. Lowest dimension first: _______________
  2. Second lowest: _______________
  3. Quick win (easiest to improve): _______________

Review again in 3 months.
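
The first two action plan slots follow mechanically from the scores, so they can be filled in by sorting. The sketch below is one way to do that, with a hypothetical function name and sample scores. The quick win slot is a judgment call about effort, so it is left to the reviewer.

```python
def lowest_dimensions(scores: dict[str, int], count: int = 2) -> list[str]:
    """Return the `count` lowest-scoring dimensions, lowest first.

    sorted() is stable, so tied dimensions keep the order in which they
    appear in `scores` (here, scorecard order).
    """
    ranked = sorted(scores, key=lambda name: scores[name])
    return ranked[:count]


# Hypothetical scores for one review, listed in scorecard order.
review_scores = {
    "Token Coverage": 3,
    "Component Health": 4,
    "Naming Consistency": 2,
    "Documentation": 3,
    "Accessibility": 2,
    "Design-Code Parity": 3,
}

first, second = lowest_dimensions(review_scores)
print(f"1. Lowest dimension first: {first}")
print(f"2. Second lowest: {second}")
```

Keeping each quarter's scores in the same structure makes it easy to compare priorities between reviews.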