Index Weights
The S&P 500 as a single pie — not a logo wall
Most visualizations of the S&P 500 show you the top ten and call it a day. This one draws every one of 503 constituents at its live market-cap weight — then lets you flip to the NASDAQ-100 (101 slices, $40.28T), switch cap-weight to equal-weight, and watch the top-10 share collapse from 42.10% → 2.00%. The chart is the argument: index concentration is a design problem, not a chart-library default.
Concentration is a design problem before it is a risk problem
Both indices are cap-weighted: a stock’s share of the pie scales with its market cap. When that cap doubles, its slice roughly doubles (exactly, a weight w becomes 2w / (1 + w), because the total grows too). Six numbers summarize what the pie is actually doing.
Source · data/indices/spx.json + data/indices/ndx.json, refreshed via scripts/refresh-indices.py. See methodology for refresh cadence and known limitations.
Every constituent, drawn to live weight
Four controls: index (SPX / NDX), weighting (cap / equal), focus (top N shaded, tail dimmed), and search (ticker or name). Hover a slice or row to cross-highlight the other. The readout panel on the right is the same field pattern a terminal user would expect — ticker, name, sector, cap, weight, price — no screenshots, all derived from the live snapshot.
Same top five names. Very different pies.
The top five constituents are identical across the two indices: NVDA, GOOGL, GOOG, AAPL, MSFT. What differs is how much air there is around them. In the SPX, 490 smaller names absorb the rest of the chart; in the NDX, only 91 smaller names do, and they do it in a quarter of the area.
If you are a risk officer, the NDX HHI of 638.2 is the number that changes your position sizing. If you are a product designer, it is the number that tells you the chart has to show the difference at a glance, not in a footnote.
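For readers who want the arithmetic behind an HHI figure: the index is the sum of squared percentage weights. The sketch below is a minimal illustration using made-up weights, not the real snapshot; the function name `hhi` is hypothetical and not taken from the project's code.

```python
def hhi(weights_pct):
    """Herfindahl-Hirschman Index: sum of squared percentage weights.

    Entries that are None (constituents with no usable weight) are skipped.
    """
    return sum(w * w for w in weights_pct if w is not None)


# Toy four-name index with hypothetical weights (percent, summing to 100).
toy = [40.0, 30.0, 20.0, 10.0]
print(hhi(toy))  # 3000.0

# Equal-weight floor: N identical slices give 10,000 / N.
n = 503
equal = [100.0 / n] * n
print(round(hhi(equal), 2))  # 19.88
```

The equal-weight case is a useful reference point: it is the lowest HHI an index with that many names can reach, so any excess over it is attributable to weighting alone.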
Eleven sectors. Two of them own the pie.
GICS splits the investable universe into 11 sectors. In the SPX, Information Technology + Communication Services already add to 48.21%; in the NDX, the same two add to 76.68%. Real Estate in the NDX is 0.04% — a single-pixel slice on a 1000×1000 canvas.
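Sector totals like the ones quoted above are a group-by over constituent weights. A minimal sketch, assuming each constituent is a dict with `sector` and `weight` keys (the key names and the tickers are assumptions for illustration; the real JSON schema may differ):

```python
from collections import defaultdict


def sector_weights(constituents):
    """Sum percentage weights per sector, largest sector first."""
    totals = defaultdict(float)
    for row in constituents:
        if row.get("weight") is None:
            continue  # constituents without a weight contribute nothing
        totals[row["sector"]] += row["weight"]
    return dict(sorted(totals.items(), key=lambda kv: kv[1], reverse=True))


# Hypothetical five-name index, weights in percent.
sample = [
    {"ticker": "AAA", "sector": "Information Technology", "weight": 30.0},
    {"ticker": "BBB", "sector": "Communication Services", "weight": 20.0},
    {"ticker": "CCC", "sector": "Information Technology", "weight": 15.0},
    {"ticker": "DDD", "sector": "Real Estate", "weight": 0.5},
    {"ticker": "EEE", "sector": "Health Care", "weight": 34.5},
]
by_sector = sector_weights(sample)
print(by_sector["Information Technology"])  # 45.0
```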
GICS is the taxonomy S&P Global + MSCI agreed on in 1999. Most institutional asset managers price, report, and hedge against these 11 buckets, which is why the colors in this chart are not decorative: they are the vocabulary compliance will use when it asks you to explain a position.
Four choices that separate a demo from a product surface
Most index visualizations default to a treemap and a top-10 table. I made the opposite call for four reasons. Each is a trade-off I would defend to a design director, a CIO, and a compliance lead in the same room.
Pie, not treemap.
Treemaps are spatially accurate but semantically wrong for “what share of the market is this?” A treemap rectangle encodes value as area, but viewers read width and height separately, not area. A pie slice’s angle is one-dimensional and matches the mental model of “portion of whole.” I took the 10% hit on fine-detail accuracy in exchange for the 100% gain in semantic fit.
When I need the fine detail — ordering of names 40–80 — I add the leaderboard beside the chart. Two views, one screen, no mode toggle.
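The "angle is one-dimensional" point maps directly onto SVG arc math: each weight becomes a sweep angle, and each slice is a move, a line, and one arc command. The sketch below is an illustration in Python rather than the page's actual rendering code; the function name, canvas center, and radius are all assumptions.

```python
import math


def pie_paths(weights, cx=500.0, cy=500.0, r=480.0):
    """Return one SVG path 'd' string per weight, clockwise from 12 o'clock."""
    total = sum(weights)
    paths = []
    angle = -math.pi / 2  # 12 o'clock in SVG coordinates (y grows downward)
    for w in weights:
        sweep = 2 * math.pi * w / total  # slice angle is proportional to weight
        x0, y0 = cx + r * math.cos(angle), cy + r * math.sin(angle)
        angle += sweep
        x1, y1 = cx + r * math.cos(angle), cy + r * math.sin(angle)
        large_arc = 1 if sweep > math.pi else 0  # needed for slices over 180 degrees
        paths.append(
            f"M {cx:.2f} {cy:.2f} L {x0:.2f} {y0:.2f} "
            f"A {r:.2f} {r:.2f} 0 {large_arc} 1 {x1:.2f} {y1:.2f} Z"
        )
    return paths


paths = pie_paths([50.0, 25.0, 25.0])
print(paths[0])
# M 500.00 500.00 L 500.00 20.00 A 480.00 480.00 0 0 1 500.00 980.00 Z
```

One caveat of this simple form: a single slice covering the whole circle has identical start and end points, so the arc degenerates and would need to be drawn as a circle instead.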
All 503 slices. Every one.
The common compromise is “top 10 + ‘Other 57.9%’ lump.” That lump is the point of the chart. The tail is what makes the SPX different from the NDX: 490 small companies absorbing 57.9% of the pie. Collapsing them to one slice erases the distinction that a risk officer most wants to see.
At the tail, slices are smaller than a pixel — fine. They are not meant to be individually readable; they are meant to make the shape of the tail visible.
Color by sector, not by name.
503 distinct colors is a perceptual catastrophe. 11 sectors is the taxonomy the industry already speaks. Compliance memos talk sectors. Model-risk docs talk sectors. Using GICS as the color primary maps the chart onto a vocabulary the viewer has already paid for.
The eleven hues are deuteranopia-tested and avoid the gold reserved for the portfolio accent. They are tokens, not decisions a designer makes per page.
Cap-weighted vs equal-weighted on one click.
The quietest signal in institutional investing is the gap between SPX and SPXEW — the equal-weight sibling. When cap-weighted outperforms equal-weighted, mega-caps are pulling the market; when equal outperforms cap, the broad economy is. Making that flip a single button is the difference between a “nice chart” and a morning-meeting artifact.
Equal-weight is where the animation earns its keep — watching 503 slices re-snap to 1/503 each is how a viewer feels the concentration, not reads it.
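The cap/equal flip is two weighting functions over the same list of market caps, and the top-N share is a sum over the largest weights. A minimal sketch with a hypothetical four-name universe (the function names and the caps are illustrative, not the project's):

```python
def cap_weights(caps):
    """Percentage weights proportional to market cap."""
    total = sum(caps)
    return [100.0 * c / total for c in caps]


def equal_weights(caps):
    """Every constituent gets the same percentage weight, 100 / N."""
    n = len(caps)
    return [100.0 / n] * n


def top_n_share(weights, n=10):
    """Combined percentage weight of the n largest slices."""
    return sum(sorted(weights, reverse=True)[:n])


# Hypothetical market caps: one giant, one large, two small.
caps = [700.0, 200.0, 50.0, 50.0]
print(top_n_share(cap_weights(caps), n=2))    # 90.0
print(top_n_share(equal_weights(caps), n=2))  # 50.0
```

Under equal weighting the top-N share depends only on N and the number of names, which is why the collapse is so visible when the toggle flips.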
The pipeline, the caveats, the credits
A chart that can’t explain its own numbers is a screenshot in a fancy frame. Here is exactly where every slice came from, when it was refreshed, and what it will not tell you.
Data pipeline
1. Constituent list: scraped from Wikipedia’s List of S&P 500 companies and NASDAQ-100 articles via pandas.read_html. Both pages are curated by editors against the underlying index methodologies. UA header required to avoid Wikipedia’s 403 on default urllib.
2. Market cap & price: pulled per-ticker from yfinance’s fast_info endpoint (camelCase keys marketCap, lastPrice), with fallback to the .info dict. 0.12s throttle between calls, 2 retries.
3. Sector normalization: NDX Wikipedia page lacks a sector column, so sectors are carried from the SPX overlap first (87 tickers), fall back to yfinance info.sector (14 tickers), then normalized with an alias map (Technology → Information Technology, Healthcare → Health Care, Consumer Cyclical → Consumer Discretionary, etc.) to one canonical GICS 11 taxonomy.
4. Weight & write: each constituent’s weight is 100 × market_cap / ∑market_cap, then sorted descending and written atomically (.tmp → os.replace) to data/indices/spx.json and ndx.json. Missing caps are set to null and excluded from the denominator.
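The last two pipeline steps can be sketched in a few lines. This is an illustration under stated assumptions, not the contents of scripts/refresh-indices.py: the row keys (`ticker`, `sector`, `market_cap`), the function names, and the sample rows are hypothetical, and the alias map holds only the three pairs named above.

```python
import json
import os
import tempfile

# Only the aliases named in the text; the real map is longer.
SECTOR_ALIASES = {
    "Technology": "Information Technology",
    "Healthcare": "Health Care",
    "Consumer Cyclical": "Consumer Discretionary",
}


def normalize_sector(raw):
    """Map a vendor sector label to its GICS name; unknown labels pass through."""
    return SECTOR_ALIASES.get(raw, raw)


def with_weights(rows):
    """Attach weight = 100 * cap / sum(caps); rows with no cap get weight None."""
    total = sum(r["market_cap"] for r in rows if r["market_cap"] is not None)
    out = []
    for r in rows:
        cap = r["market_cap"]
        weight = None if cap is None else 100.0 * cap / total
        out.append({**r, "sector": normalize_sector(r["sector"]), "weight": weight})
    # Largest weight first; rows without a weight sink to the end.
    out.sort(key=lambda r: (r["weight"] is None, -(r["weight"] or 0.0)))
    return out


def write_atomic(path, payload):
    """Write JSON to a sibling .tmp file, then swap it into place."""
    tmp = path + ".tmp"
    with open(tmp, "w", encoding="utf-8") as fh:
        json.dump(payload, fh, indent=2)
    os.replace(tmp, path)  # readers never see a half-written file


rows = with_weights([
    {"ticker": "AAA", "sector": "Technology", "market_cap": 300.0},
    {"ticker": "BBB", "sector": "Healthcare", "market_cap": None},
    {"ticker": "CCC", "sector": "Energy", "market_cap": 700.0},
])
target = os.path.join(tempfile.mkdtemp(), "toy.json")
write_atomic(target, {"constituents": rows})
```

Writing to a temporary file in the same directory and renaming it keeps the swap on one filesystem, which is what lets `os.replace` behave atomically.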
Known limitations
- Snapshot, not stream. yfinance’s free tier is intraday delayed. The JSON is dated (as_of: 2026-04-20). For a real desk, swap in an IEX or Polygon feed: the shape of the JSON stays identical, the numbers refresh in seconds.
- Class A + Class C siblings. GOOGL and GOOG are both in the pie, as they are in the index itself. Combined, their share is 11.90% of SPX, which is the real Alphabet weight if you treat them as one economic entity. I left them split because the index does.
- Weights ≠ methodology rules. S&P and Nasdaq apply float-adjusted, capping-rule-constrained weights (Nasdaq reweights quarterly to cap the largest constituents at 24% of NDX). The raw market-cap ratios here are a first-order approximation — typically within a few basis points of the published weights, not guaranteed equal.
- No time series (yet). Today’s pie, not history. A second pass could add a “playback” slider across quarterly snapshots to show the tech rotation through 2020–2025. Explicitly out of scope for v1.
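To make the capping caveat in the list above concrete, here is a generic cap-and-redistribute loop. It is explicitly not Nasdaq's published procedure, which has additional rules; it only illustrates why capped weights diverge from raw market-cap ratios. The function name and sample weights are hypothetical, and the input is assumed to sum to 100.

```python
def cap_and_redistribute(weights_pct, cap=24.0, tol=1e-9):
    """Clamp weights at `cap` and spread the excess pro rata over the rest.

    Repeats until no uncapped weight exceeds the cap, because redistribution
    can push a previously compliant name over the limit.
    """
    w = list(weights_pct)
    capped = set()
    while True:
        over = [i for i, x in enumerate(w) if i not in capped and x > cap + tol]
        if not over:
            return w
        capped.update(over)
        free = [i for i in range(len(w)) if i not in capped]
        free_total = sum(w[i] for i in free)
        remaining = 100.0 - cap * len(capped)
        if not free or free_total <= 0.0:
            raise ValueError("cap is too tight for this many constituents")
        for i in capped:
            w[i] = cap
        scale = remaining / free_total  # uncapped names keep their relative sizes
        for i in free:
            w[i] *= scale


# Hypothetical weights; a 35% cap forces two rounds of redistribution.
capped_weights = cap_and_redistribute([50.0, 30.0, 15.0, 5.0], cap=35.0)
print([round(x, 2) for x in capped_weights])  # [35.0, 35.0, 22.5, 7.5]
```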
References & credits
- Index methodologies: S&P U.S. Indices Methodology · Nasdaq-100 Index Methodology
- GICS taxonomy: MSCI Global Industry Classification Standard (S&P Global + MSCI, 1999).
- Concentration math: U.S. DOJ, Herfindahl-Hirschman Index · “under 1,500” = unconcentrated; “1,500–2,500” = moderately; “above 2,500” = highly.
- Data tooling: pandas + yfinance + vanilla SVG arc math. No charting library.
- Design precedent: Bloomberg Terminal WEI function (world equity indices overview) · finviz.com/map (treemap precedent: the thing I chose to not copy).
Where this case study sits in the larger web
Every problem we solve for clients has multiple valid approaches — different costs, different ROI, different risk profiles. These threads show how the approach on this page compares to others in the portfolio.
Concentration, Risk & Agents
Portfolio-level math primitives — HHI, beta, VaR, regime — rendered into UI defaults and AI-assisted decision surfaces.
- Index Weights · SPX + NDX: Concentration math exposed. Low eng cost · 503+101 slices · HHI 230.6 / 638.2
- Macro Signal Network: Regime classifier feeding routing. Low eng cost · 4 prints → 14 surfaces
- Nova · Institutional Trading Terminal: Portfolio risk surfaces on one shell. High eng cost · Beta/Duration/Kelly/VaR unified
- TradeX Institutional: Stress test & position sizing. High eng cost · 12 holdings · 3-factor composite
- Hedge Fund · 19-Agent Committee: Multi-persona investment consensus. High eng cost · Agent reliability overhead
- AI Cowork · Agent UX: AI in the decision loop. Med eng cost · Trust fallbacks · hallucination guardrails
- Private Banking Advisory: Diversification mandate. High eng cost · Concentration compliance · FINRA 2111
Evidence & Verification Discipline
How we prove design claims with data — A/B, pooled-SD, cohort analysis, and the rigor behind every number quoted on this site.
- Finlogix · Retail Trader Education: A/B tested retention. Low eng cost · Cohen's d 2.47 · 90d post-launch
- ACY Securities · Regulated Broker System: Mixpanel audit. High eng cost · 90-day funnel · component adoption
- Index Weights · SPX + NDX: Concentration math verified. Low eng cost · Wikipedia + yfinance pipeline
- AI Cowork · Agent UX: AI reliability measurement. Med eng cost · Hallucination rates · fallback UX