How to scrape the Steam Market API at scale (2026)

I run case-sim.com, a free CS2 case opening simulator. One of the things people like about it is that every item shows its real Steam Market price, updated daily. Twenty-thousand-something items, refreshed automatically, served free.

I'm going to walk through how that actually works, because Steam's pricing API is one of the most aggressively rate-limited and poorly documented public endpoints I've worked with, and the standard "just hit it in a loop" approach gets your IP soft-banned within the hour.

If you're building any tool that needs Steam Market prices — a trading bot, a portfolio tracker, an inventory valuation page, a case ROI calculator — most of this will apply directly. The patterns also generalize to any rate-limited third-party API where you need to keep a large dataset fresh.

The endpoint everyone uses

The unofficial endpoint:

https://steamcommunity.com/market/priceoverview/
  ?appid=730
  &currency=1
  &market_hash_name=AK-47%20%7C%20Redline%20%28Field-Tested%29

appid=730 is CS2 (formerly CS:GO; same app ID). currency=1 is USD. The response:

{
  "success": true,
  "lowest_price": "$11.43",
  "volume": "1,247",
  "median_price": "$11.51"
}

That's it. No API key needed. No documentation. Steam doesn't officially endorse this endpoint — it's reverse-engineered from the Market UI — and that means it can change at any time. In four years of running case-sim it has changed exactly once (a response field added in 2022). Pretty stable for an undocumented endpoint, but treat that stability as luck, not contract.

Why naive scraping breaks

If you write the obvious loop:

for (const item of items) {
  const price = await fetchPrice(item);
  await save(price);
}

You'll get:

About 20 successful requests
A few hundred milliseconds of nothing in particular
Then 429 Too Many Requests
Then your IP gets soft-banned for ~5 minutes
You retry the loop and the ban gets longer
After about an hour of this, Steam holds your IP in the doghouse for the better part of a day

Steam doesn't publish its rate limits anywhere I've found. From my own measurements: roughly 20 requests per minute per IP on /priceoverview, with small bursts allowed but a sliding window that pulls you back to ~1 request every 3 seconds sustained. The longer you keep the sustained rate up, the more aggressive the throttle becomes.

If you have 20,000 items and you're doing 20/min, refreshing the whole catalog takes about 17 hours. That's the budget you're working with.
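
That arithmetic is worth keeping as a function, because the per-minute figure is a measured estimate rather than a published limit and you will want to re-run it when the estimate changes. A minimal sketch (the function name is made up for illustration):

```javascript
// Hours needed to fetch every item once at a fixed sustained request rate.
// requestsPerMinute is an assumption based on measurement, not a documented limit.
function fullRefreshHours(itemCount, requestsPerMinute) {
  if (requestsPerMinute <= 0) {
    throw new RangeError('requestsPerMinute must be positive');
  }
  return itemCount / requestsPerMinute / 60;
}

console.log(fullRefreshHours(20_000, 20).toFixed(1)); // 16.7
console.log(fullRefreshHours(20_000, 15).toFixed(1)); // 22.2
```

Dropping to a safer 15/min pushes a full pass past 22 hours, which is why refreshing everything at the same cadence doesn't leave much room.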

Tiering: not all items deserve fresh prices

Here's the insight that made the whole thing tractable: nobody cares about the exact current price of an AK-47 Safari Mesh (Battle-Scarred). It's worth $0.04. It will be worth $0.04 tomorrow. It will be worth $0.04 next year.

Knives, gloves, and high-tier covert skins, on the other hand, change price meaningfully day-to-day and are the ones users actually look at.

So I tier items by importance:

const REFRESH_TIERS = {
  S: { interval: '6h',  description: 'knives + gloves',          count: 500   },
  A: { interval: '24h', description: 'covert skins',             count: 800   },
  B: { interval: '72h', description: 'classified + popular caps',count: 2000  },
  C: { interval: '7d',  description: 'restricted skins',         count: 6000  },
  D: { interval: '30d', description: 'mil-spec + stickers',      count: 10700 },
};

The math:

S: 500 items × 4 fetches/day = 2,000 requests/day
A: 800 × 1/day = 800
B: 2,000 ÷ 3 ≈ 670/day
C: 6,000 ÷ 7 ≈ 860/day
D: 10,700 ÷ 30 ≈ 360/day

Total: ~4,690 requests per day. At a safe sustained rate of 15/min, that's about 5 hours of actual work, comfortably distributable across the day with breathing room.
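
The same budget can be computed from the tier config so it stays honest when you move items around. This is a sketch, with the intervals rewritten as hours for easy arithmetic; the helper names are illustrative:

```javascript
// Tier sizes and refresh intervals from the config above, with intervals in hours.
const TIERS = {
  S: { intervalHours: 6, count: 500 },
  A: { intervalHours: 24, count: 800 },
  B: { intervalHours: 72, count: 2000 },
  C: { intervalHours: 168, count: 6000 },
  D: { intervalHours: 720, count: 10700 },
};

// Average requests per day for one tier: each item is fetched 24/interval times a day.
function requestsPerDay({ intervalHours, count }) {
  return count * (24 / intervalHours);
}

// Total daily requests and the hours of queue time they need at a given rate.
function dailyBudget(tiers, requestsPerMinute) {
  const total = Object.values(tiers)
    .reduce((sum, tier) => sum + requestsPerDay(tier), 0);
  return { total, hoursOfWork: total / requestsPerMinute / 60 };
}

const budget = dailyBudget(TIERS, 15);
console.log(Math.round(budget.total));      // 4680
console.log(budget.hoursOfWork.toFixed(1)); // 5.2
```

The unrounded total comes out about ten requests under the ~4,690 figure, because the hand calculation rounds each tier up before summing.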

Critically: it scales by what users care about, not what's easiest to query. You can shift items between tiers based on traffic — if a particular skin is suddenly trending because of a tournament, bump it to A or S manually.
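
Picking what to fetch next then comes down to comparing each item's last fetch time against its tier interval. Here is a rough sketch, assuming each item record carries a `tier` and a `lastFetchedAt` timestamp in milliseconds (both field names are assumptions, not something the schema later in this post defines):

```javascript
const HOUR_MS = 60 * 60 * 1000;
const TIER_INTERVAL_MS = {
  S: 6 * HOUR_MS,
  A: 24 * HOUR_MS,
  B: 72 * HOUR_MS,
  C: 7 * 24 * HOUR_MS,
  D: 30 * 24 * HOUR_MS,
};

// An item is due when it has never been fetched or its tier interval has elapsed.
function isDue(item, now = Date.now()) {
  if (item.lastFetchedAt == null) return true;
  return now - item.lastFetchedAt >= TIER_INTERVAL_MS[item.tier];
}

// Due items, oldest fetch first (never-fetched items sort to the front).
function dueItems(items, now = Date.now()) {
  return items
    .filter(item => isDue(item, now))
    .sort((a, b) => (a.lastFetchedAt ?? 0) - (b.lastFetchedAt ?? 0));
}

// Manual promotion for a trending item; returns a copy rather than mutating.
function promote(item, tier) {
  return { ...item, tier };
}
```

Promoting an item to a faster tier makes it due sooner on the next pass without any other bookkeeping.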

The scheduler

import PQueue from 'p-queue';
 
// Steam tolerates ~20/min sustained. We stay below that.
const queue = new PQueue({
  concurrency: 1,
  interval: 4000,    // 4s between requests = 15/min
  intervalCap: 1,
});
 
async function refreshTier(items) {
  // allSettled: one item that exhausts its retries must not reject the whole batch.
  return Promise.allSettled(
    items.map(item =>
      queue.add(() => fetchWithBackoff(item))
    )
  );
}
 
async function fetchAndSavePrice(item) {
  const url = `https://steamcommunity.com/market/priceoverview/`
    + `?appid=730&currency=1`
    + `&market_hash_name=${encodeURIComponent(item.marketHashName)}`;
  
  const res = await fetch(url, {
    headers: { 'User-Agent': 'case-sim price refresher (contact@case-sim.com)' },
  });
 
  if (res.status === 429) {
    throw new RateLimitError(item);
  }
  if (!res.ok) {
    throw new Error(`HTTP ${res.status} for ${item.marketHashName}`);
  }
 
  const data = await res.json();
  if (!data.success) {
    // Item exists but has no current listings. Don't overwrite the previous price.
    return;
  }
 
  await db.prices.insert({
    itemId: item.id,
    currencyCode: 1,
    lowestPrice: parsePrice(data.lowest_price),
    medianPrice: parsePrice(data.median_price),
    volume: parseVolume(data.volume),
    fetchedAt: new Date(),
  });
}

p-queue handles the rate gating. Four seconds between requests is conservative — three works most of the time, but I've watched it slip into ban territory under sustained load. The few extra seconds are cheap insurance.
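
The scheduler code calls `RateLimitError`, `parsePrice`, and `parseVolume` without showing them. The versions below are one plausible way to write them, not necessarily what runs on case-sim. They assume USD formatting (a leading symbol, comma thousands separators, dot decimal point) and return `null` rather than throwing when a field is missing, since the response shape is undocumented:

```javascript
// Thrown on HTTP 429 so callers can tell throttling apart from other failures.
class RateLimitError extends Error {
  constructor(item) {
    super(`Rate limited on ${item.marketHashName}`);
    this.name = 'RateLimitError';
    this.item = item;
  }
}

// "$11.43" -> 11.43, "$1,234.56" -> 1234.56. Assumes USD formatting (currency=1).
function parsePrice(text) {
  if (typeof text !== 'string') return null;
  const cleaned = text.replace(/[^0-9.]/g, '');
  if (cleaned === '') return null;
  const value = Number(cleaned);
  return Number.isFinite(value) ? value : null;
}

// "1,247" -> 1247. Returns null if the field is absent or not a string.
function parseVolume(text) {
  if (typeof text !== 'string') return null;
  const cleaned = text.replace(/[^0-9]/g, '');
  return cleaned === '' ? null : Number.parseInt(cleaned, 10);
}

console.log(parsePrice('$11.43'));  // 11.43
console.log(parseVolume('1,247'));  // 1247
```

Other currencies format prices differently (decimal commas, trailing symbols), so this parser would need per-currency rules if you fetch anything besides USD.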

The User-Agent matters more than people think. Steam doesn't require one, but if your IP shows up in their logs as suspicious, a contactable User-Agent can be the difference between a permanent ban and a temporary one. Be a good citizen.

Handling 429s with exponential backoff

When Steam throttles you, the response is unhelpful:

HTTP/1.1 429 Too Many Requests
Content-Length: 0

No Retry-After header. No JSON body. No hint at how long to wait. You just have to back off and try again.

The pattern that works:

async function fetchWithBackoff(item, attempt = 0) {
  try {
    return await fetchAndSavePrice(item);
  } catch (err) {
    if (err instanceof RateLimitError && attempt < 4) {
      const delay = 60_000 * Math.pow(2, attempt); // 1m, 2m, 4m, 8m
      console.warn(`Rate limited on ${item.marketHashName}, sleeping ${delay / 1000}s`);
      await sleep(delay);
      return fetchWithBackoff(item, attempt + 1);
    }
    throw err;
  }
}

After the fourth retry I drop the item, log it, and move on. If 429s keep happening across multiple consecutive items, I pause the whole queue for an hour. Steam holds onto your IP's reputation for a while; aggressive retries make it worse, not better.
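
The "pause the whole queue" step isn't shown above, so here is a hedged sketch of how it could look. It assumes a queue object with `pause()` and `start()` methods, which p-queue provides. The class name and the threshold of three consecutive failures are my own choices for illustration, and the `sleep` helper used by the backoff code is defined here as well:

```javascript
const sleep = ms => new Promise(resolve => setTimeout(resolve, ms));

// Counts consecutive rate-limited items and pauses the queue once a threshold is hit.
class RateLimitBreaker {
  constructor(queue, { threshold = 3, cooldownMs = 60 * 60 * 1000, wait = sleep } = {}) {
    this.queue = queue;
    this.threshold = threshold;
    this.cooldownMs = cooldownMs;
    this.wait = wait; // injectable so tests don't have to sleep for an hour
    this.consecutive = 0;
    this.cooling = false;
  }

  // Call after any successful fetch.
  recordSuccess() {
    this.consecutive = 0;
  }

  // Call when an item is dropped after exhausting its retries.
  // Resolves to true if this call triggered (and finished) a cooldown.
  async recordRateLimit() {
    this.consecutive += 1;
    if (this.consecutive < this.threshold || this.cooling) return false;
    this.cooling = true;
    this.queue.pause();
    try {
      await this.wait(this.cooldownMs);
    } finally {
      this.consecutive = 0;
      this.cooling = false;
      this.queue.start();
    }
    return true;
  }
}
```

In the scheduler you would call `recordSuccess()` after a fetch that saved a price and `recordRateLimit()` wherever `fetchWithBackoff` finally rethrows a `RateLimitError`.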

The `market_hash_name` encoding gotcha

The market_hash_name is finicky. Some examples that bit me:

StatTrak items use StatTrak™ with the actual ™ character: StatTrak™ AK-47 | Redline (Field-Tested)
Souvenir items prefix with Souvenir: Souvenir AWP | Dragon Lore (Factory New)
Stickers use Sticker | <name> (<finish>) | <event>: Sticker | Astralis (Holo) | Antwerp 2022
Cases use the exact case name with no wear: Operation Riptide Case
Doppler phases use Phase X in the name: ★ Karambit | Doppler (Factory New) - Phase 2

The ™ character is what catches most people. URL-encoded it's %E2%84%A2. Some tutorials online tell you to strip it. Do not strip it. If you query for StatTrak AK-47 (no ™) instead of StatTrak™ AK-47, you get success: false and no price. Your StatTrak prices then quietly fall behind by months without anyone noticing.

I keep a single source-of-truth registry of market_hash_name strings, generated from Valve's items_game.txt and validated against a known-good list of recent listings. When Valve adds a new case, that registry gets a new entry, and only then does the scheduler know to fetch it.

The ★ character on knives (e.g., ★ Karambit | Fade) is similarly mandatory. Don't strip the star.
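
To see what actually goes over the wire, here is a small sketch of a URL builder plus a guard that catches names whose ™ was lost upstream. Both function names are illustrative. Note that `encodeURIComponent` leaves parentheses unescaped, unlike the hand-encoded URL near the top of this post; both forms are valid in a query string:

```javascript
// Builds a priceoverview URL. encodeURIComponent percent-encodes the UTF-8 bytes
// of ™, ★, |, and spaces, and leaves parentheses as-is.
function priceOverviewUrl(marketHashName, { appId = 730, currency = 1 } = {}) {
  return 'https://steamcommunity.com/market/priceoverview/'
    + `?appid=${appId}&currency=${currency}`
    + `&market_hash_name=${encodeURIComponent(marketHashName)}`;
}

// Cheap guard: "StatTrak" not immediately followed by ™ means the name was mangled.
function hasStrippedStatTrak(marketHashName) {
  return /\bStatTrak(?!™)/.test(marketHashName);
}

console.log(encodeURIComponent('StatTrak™'));
// StatTrak%E2%84%A2
console.log(encodeURIComponent('★ Karambit | Fade'));
// %E2%98%85%20Karambit%20%7C%20Fade
console.log(hasStrippedStatTrak('StatTrak AK-47 | Redline (Field-Tested)'));
// true
```

Running the guard over the whole registry at build time is cheaper than discovering months later that a tier of prices silently stopped updating.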

Multiple currencies (the cheap way)

The currency query param accepts numeric codes:

Code	Currency
1	USD
2	GBP
3	EUR
5	RUB
7	BRL
8	JPY
23	CNY

If you want multi-currency on your site, you have two options:

Option 1: Fetch every item in every currency. With my 4,690/day USD budget and 7 currencies, that's ~33,000 requests/day, or about 36 hours of work at 15/min, which no longer fits in a day.
Option 2: Fetch in USD and convert client-side using FX rates from a much faster API. I do this. The Steam-quoted USD price is the source of truth; everything else is derived.

Option 2 isn't perfectly accurate — Steam's own non-USD prices are slightly different from a clean FX conversion because they bake in regional pricing adjustments — but it's within a few percent for any major currency, which is fine for a free simulator. If you're running a real trading platform, option 1 might be necessary, in which case you need multiple worker IPs and that's a whole different conversation.
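
The conversion itself is a one-liner once you have a rates table. A minimal sketch, assuming rates are expressed as units of the target currency per 1 USD. The numbers below are made up for illustration, and where you source real rates is up to you:

```javascript
// Hypothetical FX rates (target currency units per 1 USD). Not real market data.
const exampleRates = { USD: 1, EUR: 0.9, GBP: 0.8 };

// Converts a USD amount to another currency, rounded to two decimals.
// Zero-decimal currencies such as JPY would need different rounding.
function convertFromUsd(usdAmount, currency, rates) {
  const rate = rates[currency];
  if (typeof rate !== 'number' || !(rate > 0)) {
    throw new Error(`No FX rate for ${currency}`);
  }
  return Math.round(usdAmount * rate * 100) / 100;
}

console.log(convertFromUsd(11.43, 'GBP', exampleRates)); // 9.14
console.log(convertFromUsd(11.43, 'EUR', exampleRates)); // 10.29
```

Throwing on a missing rate is deliberate: showing no converted price is better than showing one computed with a rate of zero or undefined.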

The fallback nobody talks about

Sometimes Steam just... goes down. Or returns success: false for an item that definitely has listings. Or returns a stale price because the underlying market data is itself behind.

Every price in my database has a fetched_at timestamp. The site surfaces "as of X hours ago" if the data is older than expected. If the latest fetch failed, I serve the previous successful fetch's value. Users see real prices that occasionally lag, instead of $0 or "unavailable" placeholders.

This is the "graceful degradation" pattern but it matters more here than in most apps because Steam's market is genuinely flaky on the scale of hours, not just minutes. Plan for it from day one.
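
In code, the fallback amounts to "take the newest row that has a usable price, and say how old it is". The sketch below assumes rows shaped like `{ lowestPrice, fetchedAt }` with `fetchedAt` in milliseconds; the function name and the `stale` flag are illustrative:

```javascript
const HOUR_MS = 60 * 60 * 1000;

// Picks the most recent fetch with a usable price and flags it as stale
// when it is older than the item's expected refresh interval.
function priceForDisplay(history, expectedIntervalMs, now = Date.now()) {
  const latest = history
    .filter(row => row.lowestPrice != null)
    .sort((a, b) => b.fetchedAt - a.fetchedAt)[0];
  if (!latest) return null; // never fetched successfully

  const ageMs = now - latest.fetchedAt;
  return {
    price: latest.lowestPrice,
    ageHours: Math.floor(ageMs / HOUR_MS),
    stale: ageMs > expectedIntervalMs,
  };
}
```

The UI can then render the "as of X hours ago" note only when `stale` is true, and fall back to an "unavailable" state only when the function returns `null`.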

Schema

The minimal table:

CREATE TABLE prices (
  item_id            INT NOT NULL,
  currency_code      SMALLINT NOT NULL DEFAULT 1,
  lowest_price       DECIMAL(10, 4),
  median_price       DECIMAL(10, 4),
  volume             INT,
  fetched_at         TIMESTAMPTZ NOT NULL DEFAULT NOW(),
  PRIMARY KEY (item_id, currency_code, fetched_at)
);
 
CREATE INDEX idx_prices_latest 
  ON prices (item_id, currency_code, fetched_at DESC);

Then a latest-price query using PostgreSQL's DISTINCT ON:

SELECT DISTINCT ON (item_id)
  item_id, lowest_price, median_price, volume, fetched_at
FROM prices
WHERE item_id = ANY($1::int[])
  AND currency_code = $2
ORDER BY item_id, fetched_at DESC;

I keep all historical fetches, not just the latest. Storage is cheap; the historical record is what lets you draw a price chart for any item without separate scraping. After two years of operation my prices table has ~3M rows, takes ~240MB, and queries the latest set in <5ms with the index. Easy.

If you're on MySQL (8.0 or later) or another database without DISTINCT ON, the equivalent is a window function:

WITH ranked AS (
  SELECT item_id, lowest_price, median_price, volume, fetched_at,
    ROW_NUMBER() OVER (PARTITION BY item_id ORDER BY fetched_at DESC) AS rn
  FROM prices
  WHERE currency_code = ?
    AND item_id IN (?)  -- filter before ranking so only the requested items are ranked
)
SELECT item_id, lowest_price, median_price, volume, fetched_at
FROM ranked
WHERE rn = 1;

What I'd do differently

If I were starting today:

Use Steam's published Web API where possible. There's a real API at api.steampowered.com that requires a developer key but has documented rate limits. It doesn't expose the market endpoints I need, but for inventory or profile data, use it. Don't scrape what you can authenticate to.
Don't bother with proxy rotation. People will tell you to rotate residential proxies to bypass the rate limit. It's a maintenance nightmare and gets you banned permanently when Steam catches on. The tier-and-schedule approach is enough for any legitimate use case.
Subscribe to a third-party feed if you can afford it. SteamDT, Steamlytics, CSFloat, and a few others aggregate prices and serve them via a real API with documented limits. They're cheap relative to engineering time. I rolled my own because case-sim is free and the cost has to stay near zero, but if I were running a commercial product, the build-versus-buy answer leans heavily toward buy.
Cache the cache. I serve prices from Postgres directly. Adding a Redis layer in front for the hottest 1,000 items would shave 90% of the latency on popular pages. Haven't bothered yet but I should.
Track price changes, not just snapshots. For interesting analytics — "biggest gainers this week" — what you want is the delta, not the latest. A materialized view that pre-computes 24h/7d/30d changes per item is way easier to query than computing it on the fly.
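
For the last point, the calculation a materialized view would pre-compute looks roughly like this when done in application code. It is a sketch over an in-memory array of `{ medianPrice, fetchedAt }` snapshots (a shape assumed for illustration), not the view itself:

```javascript
// Percentage change between the latest snapshot and the newest snapshot
// that is at least windowMs old. Returns null when there is no usable baseline.
function percentChange(history, windowMs, now = Date.now()) {
  const sorted = [...history].sort((a, b) => b.fetchedAt - a.fetchedAt);
  const latest = sorted[0];
  const baseline = sorted.find(row => now - row.fetchedAt >= windowMs);
  if (!latest || !baseline || !(baseline.medianPrice > 0)) return null;
  return ((latest.medianPrice - baseline.medianPrice) / baseline.medianPrice) * 100;
}

const DAY_MS = 24 * 60 * 60 * 1000;
const now = Date.now();
const snapshots = [
  { medianPrice: 15, fetchedAt: now },
  { medianPrice: 10, fetchedAt: now - 25 * 60 * 60 * 1000 },
];
console.log(percentChange(snapshots, DAY_MS, now)); // 50
```

With tiered refresh, the baseline for a slow-tier item may be much older than the window asked for, so it is worth surfacing the baseline's timestamp alongside the delta.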

Wrap

Steam Market scraping is one of those problems where the simple version (loop + fetch) doesn't work, and the over-engineered version (residential proxies, headless browsers, account farms) is the bad-vibes-flavored overkill that ends in account bans. The middle path — tier your refresh frequency by item importance, respect 429s with exponential backoff, keep historical data for fallback — is what actually scales.

If you want to see this in practice, every item on case-sim.com shows its current Steam Market price with a "last updated" timestamp. You can also see how stale the prices get on rarely-traded items, which is part of the honest tradeoff: free, accurate, and occasionally a few days old on niche stuff. There's a full breakdown of every CS2 case at case-sim.com/cases if you want to compare item pricing across collections.

Comments open if you've solved similar API-rate-limited refresh problems. Curious if anyone's gotten cleaner solutions for the 429-with-no-Retry-After pattern, because that one always feels like a hack no matter how I write it.
