{"@context":"https://schema.org","@type":"BlogPosting","headline":"How ChatGPT Pulls Hotel Prices (2026): It Scrapes the OTAs, Then Cites Reddit","description":"A network-source forensic study of how ChatGPT sources hotel prices. Across 240 ChatGPT captures and 3,092 fetched documents for hotel price questions: 98.8% trigger a web search (never the no-search text bucket), OTAs are 46% of fetched documents but cited only 11% of the time, Reddit is cited 100% of the time it is fetched, and every tier-tagged cited price source comes through the licensed labrador retrieval tier rather than the Bright Data scraper tier. Price provenance shifts by hotel segment: palaces are priced via official and loyalty channels, independents via OTAs.","datePublished":"2026-06-26","dateModified":"2026-06-26","url":"https://nicolassitter.com/research/chatgpt-hotel-price-sources-2026","category":"research","keywords":["how ChatGPT gets hotel prices","ChatGPT hotel pricing","ChatGPT result_source","OTA scraping ChatGPT","labrador tier","AI search retrieval tiers","GEO hotels"],"articleSection":"Research","wordCount":1700,"readTime":"7 min","articleBody":"June 2026 · AI Search · Pricing\n\n# How ChatGPT pulls hotel prices: it scrapes the OTAs, then cites Reddit\n\n**TL;DR:** Ask ChatGPT what a hotel costs and it goes to the live web **98.8%** of the time. It fetches online travel agencies (Booking, Expedia, Klook) **more than any other source class (46% of fetched documents)** but cites them **least (11%)**. It reads the price off those pages, then footnotes someone more readable: **Reddit gets cited 100%** of the time it's fetched. And the price you see almost always rides one licensed retrieval tier (`labrador`), not the scraper tier.\n\nNicolas Sitter\n\nPublished June 26, 2026\n\n**240** price captures · **3,092** documents fetched · **11%** OTA cite rate · **100%** Reddit cite rate\n\n[Summary](#executive-summary)[1\\. Always searches](#always-searches)[2\\. 
Fetch ≠ cite](#fetch-vs-cite)[3\\. Who gets cited](#reddit-wins)[4\\. The licensed tier](#tiers)[5\\. By hotel tier](#by-segment)[6\\. The actual prices](#real-prices)[Methodology](#methodology)[FAQ](#faq)\n\n## Executive Summary\n\nChatGPT treats online travel agencies as price oracles to read from — not as the sources it shows you.\n\nWe asked ChatGPT 30 hotel-price questions — 12 named hotels across four segments and three cities, plus open “cheapest hotel” queries — and ran each eight times through Bright Data. For every answer we parsed ChatGPT's raw response stream down to its _network-source layer_: each document it fetched, the retrieval tier that served it (`result_source`), and whether it was actually cited. That's **240 captures and 3,092 fetched documents**.\n\nThe plumbing is consistent and a little surprising. A price question almost always hits the live web. ChatGPT casts a wide net over the OTAs to read the number, then cites a much smaller, more human-readable set — forums and editorial win the footnote, official sites win for chains and luxury, and the cited price rides the licensed `labrador` tier rather than the Bright Data scraper tier the trade press associates with shopping.\n\nSection 1\n\n## A price question always reaches the web\n\nBefore searching, ChatGPT files each query into a bucket (`turn_use_case`) that decides whether the web is touched. For generic hotel questions a large share land in `text` — answered from memory, no retrieval. Attach a **price** and that disappears: every price prompt routed to a live-search bucket.\n\nQuery routing across 240 ChatGPT hotel-price captures. 
98.8% triggered a web search.\n\n| turn\\_use\\_case | Share of price turns | Hits the web? |\n| --- | --- | --- |\n| instant search | 85% | Yes |\n| local | 15% | Yes (maps / places) |\n| text | 0% | No (none landed here) |\n\n**Asking for a price forces ChatGPT onto the live web.** Where a generic “best hotels in Paris” can be answered from training data, “how much is a room at the Ritz on these dates” is treated as freshness-sensitive, so the answer is only as good as what it can fetch and read in that moment.\n\nSection 2\n\n## It fetches the OTAs the most and cites them the least\n\nHere is every document ChatGPT fetched for a price question, grouped by source class, against how often a fetched document of that class was actually cited. The gap between the two is the whole story.\n\nFetched vs cited per source class across 3,092 documents (471 cited). OTAs are 46% of everything fetched.\n\n| Source class | Fetched | Cited | Cite rate |\n| --- | --- | --- | --- |\n| Forum / social (Reddit) | 51 | 51 | 100% |\n| Editorial | 29 | 17 | 59% |\n| Deal / loyalty / points | 170 | 54 | 32% |\n| Metasearch (Kayak, Trivago, Google) | 477 | 89 | 19% |\n| Official hotel site | 554 | 87 | 16% |\n| OTA (Booking, Expedia, Klook) | 1,434 | 154 | 11% |\n\nOTAs are **46% of every document ChatGPT fetches**, nearly as much as all other classes combined, yet they have the **lowest cite rate of any source class in the table (11%)**. ChatGPT scrapes a wide OTA spread to read the live nightly rate off the page, then footnotes a much smaller, more human-readable set. The aggregator is the price oracle; it is not the citation.\n\nSection 3\n\n## Reddit gets cited every single time\n\nThe flip side of the OTA gap: when ChatGPT pulls a forum thread or a magazine piece about what a hotel _really_ costs, it usually surfaces it. Reddit threads were cited every time they were fetched, editorial pieces 59% of the time. These are the most-cited individual price domains.\n\nMost-cited price domains across the study. 
Reddit is the #2 most-cited domain after Booking.com.\n\n| Domain | Class | Times cited |\n| --- | --- | --- |\n| booking.com | OTA | 64 |\n| reddit.com | Forum / social | 51 |\n| kayak.com | Metasearch | 37 |\n| expedia.com | OTA | 36 |\n| hilton.com | Official | 32 |\n| klook.com | OTA | 20 |\n| saverrooms.co.uk | Deal / loyalty | 14 |\n| google.com | Metasearch | 11 |\n\n**Two different games.** Being on Booking gets your number _read_; being discussed on Reddit or written up in a magazine gets you _shown_. For a hotel chasing visibility in price answers, earned forum and editorial coverage is worth more per citation than another OTA listing.\n\nSection 4\n\n## The cited price rides the licensed tier, not the scraper\n\nChatGPT stamps each fetched page with `result_source`, the pipeline that served it. The trade press associates the `bright` (Bright Data) tier with shopping and finance. For hotels it does show up in the _fetch_ layer, but it nearly vanishes from what gets _cited_.\n\nresult\\_source tier of fetched vs cited documents. Every tier-tagged cited price source resolves to labrador.\n\n| Retrieval tier | Fetched | Cited |\n| --- | --- | --- |\n| labrador (licensed / quality-gated) | 2,447 | 392 |\n| bright (Bright Data datasets) | 170 | 3 |\n| serp (open-web baseline) | 4 | 0 |\n| untagged (footnote-only) | 471 | 76 |\n\n`bright` appears in the fetch layer (170 documents) but is cited just 3 times. Every cited price source we can tier resolves to `labrador`. So the “Bright Data dominates shopping” story holds for what ChatGPT **fetches**, but for hotels, the price you **read** came through the licensed tier. (The untagged-but-cited rows are Reddit and editorial, which arrive via a separate footnote path and carry no tier.)\n\nSection 5\n\n## Where the price comes from depends on the hotel\n\n“Where does the price come from” has no single answer: it tracks how much of a hotel's inventory the OTAs control. Cited source-class share, by hotel segment:\n\nCited source-class share by hotel segment. 
Each row is the mix of sources ChatGPT cited for that group's price.\n\n| Segment | OTA | Official | Deal / loyalty | Metasearch | Forum |\n| --- | --- | --- | --- | --- | --- |\n| Palace (Ritz, Savoy, Plaza) | 21% | 30% | 23% | 11% | 7% |\n| Global chain (Hilton) | 32% | 31% | 10% | 12% | 15% |\n| Boutique indie | 37% | 15% | 10% | 19% | 10% |\n| Budget chain (ibis, Premier Inn, Pod) | 30% | 17% | 15% | 27% | 11% |\n| City-level open queries | 44% | ~0% | n/a | 25% | 12% |\n\n**Luxury is priced off official and loyalty channels; independents off OTAs.** Palaces lean on their own site plus Amex Fine Hotels & Resorts and points blogs (OTAs only 21%). For global chains, official sites roughly tie with OTAs: a strong `brand.com` holds its own. But boutique independents are priced from OTAs 37% of the time versus only 15% from their own site, even when they run a strong direct-booking site. Open “cheapest hotel” questions lean on aggregators hardest of all.\n\nSection 6\n\n## What it actually quoted, and which numbers to distrust\n\nBelow is the per-night figure ChatGPT typically quoted for each named hotel (median across 16 captures), next to the source it leaned on. The source is a tell: when it's the official site, the number tends to hold; when it's a deal blog or an unrelated chain, distrust it.\n\nMedian per-night price ChatGPT quoted per named hotel for the same dates, with its dominant cited source. 
Figures are estimates; the source predicts reliability.\n\n| Hotel | Segment | Typical / night | Dominant cited source |\n| --- | --- | --- | --- |\n| Hilton Times Square | Global chain | ~$235 | hilton.com (official) |\n| The Plaza | Palace | $1,250–$1,900 | theplazany.com (official) |\n| Hilton Paris Opéra | Global chain | €260–€350 | hilton.com + Kayak |\n| The Savoy | Palace | £1,000–£1,900 | Booking.com |\n| Ritz Paris | Palace | ~$2,900 | Travelzoo, Amex FHR, points blogs |\n| The Greenwich Hotel | Boutique indie | $1,200–$1,800 | Kayak + official |\n| Hilton London Park Lane | Global chain | £250–£500 | Booking.com + hilton.com |\n| Grand Hôtel du Palais Royal | Boutique indie | ~$800–$850 | Expedia / Booking ⚠ |\n| Hazlitt’s | Boutique indie | £450–£600 | hotelpricewatch, Expedia, deal blogs |\n| ibis Paris Tour Eiffel | Budget chain | €160–€170 | Klook (unusual channel) |\n| Pod 51 | Budget chain | $105–$130 | Booking, Google, Kayak |\n| Premier Inn County Hall | Budget chain | £100–£260 | saverrooms.co.uk (deal blog) ⚠ |\n\n**Three tells where the source flags a shaky number.** (1) **Premier Inn** was priced almost entirely from `saverrooms.co.uk`, a third-party deal blog, rather than premierinn.com, which publishes hard, JavaScript-light rates of its own; the £100–£260 spread is suspiciously wide. (2) **ibis Paris** was priced off Klook, a Southeast-Asia-centric OTA, over Accor's own site. (3) **Grand Hôtel du Palais Royal**, an independent, cited `hilton.com`, a wrong-entity citation, so its ~$800 may be anchored to an unrelated property. 
By contrast the official-sourced figures (Hilton Times Square, The Plaza) should track reality closely.\n\nMethodology\n\n## Study Design\n\n### Data Collection\n\n-   30 frozen price prompts: 12 named hotels (4 segments × 3 cities) plus open city-level questions, each run 8× via Bright Data, country US, 25 Jun 2026 — 240 captures.\n-   Each capture parsed from the raw SSE stream into its network-source layer: every fetched document with `result_source` tier and cited flag — 3,092 documents, 471 cited.\n-   Prices extracted from the prose answer per capture; source classes curated from observed domains.\n\n### Caveats\n\n-   N = 240, one capture day, US proxy, English only — a snapshot, not a trend.\n-   Cited documents arrive via a footnote path with different URL strings than the tier-tagged fetches, so a cited source's tier is imputed from its domain's modal tier. All raw rows are stored.\n-   The structured shopping block never fired for hotels (it is a products surface); prices come via the map panel and prose.\n-   `oxylabs` did not appear in this hotel run; `serp` appeared 4× and was never cited.\n\n### The 30 frozen prompts\n\nEvery prompt carries the same concrete, live-bookable dates (**July 10–12**, ~2 weeks out from the run) and was sent logged-out from a US IP, 8× each. 
The set is the four templates below crossed with the 12-hotel panel and 3 cities: 2 named-hotel templates × 12 hotels = 24, plus 2 open templates × 3 cities = 6, for 30 prompts.\n\n| Scope | Template | Prompt |\n| --- | --- | --- |\n| Named hotel | `price_point` | How much is a room at \\[hotel\\] for the nights of July 10–12? |\n| Named hotel | `cheapest` | What’s the cheapest way to book \\[hotel\\] for July 10–12? |\n| Open · city | `budget_list` | What are the best hotels in \\[city\\] under \\[budget\\] a night for July 10–12? |\n| Open · city | `cheapest` | Where can I find the cheapest hotel in \\[city\\] for July 10–12? |\n\n`[hotel]` draws from the 4 segments × 3 cities panel; `[city]` is Paris, London or New York; `[budget]` is €250 / £250 / $250 respectively.\n\n| Segment | Paris | London | New York |\n| --- | --- | --- | --- |\n| Palace | Ritz Paris | The Savoy | The Plaza |\n| Global chain | Hilton Paris Opéra | Hilton London Park Lane | Hilton Times Square |\n| Boutique / indie | Grand Hôtel du Palais Royal | Hazlitt’s | The Greenwich Hotel |\n| Budget chain | ibis Paris Tour Eiffel Cambronne 15ème | Premier Inn London County Hall | Pod 51 Hotel |\n\n**Open data.** Headline stats and the underlying tables are published as CSV: [summary.csv](/data/chatgpt-hotel-price-sources-2026/summary.csv), [fetch\\_vs\\_cite.csv](/data/chatgpt-hotel-price-sources-2026/fetch_vs_cite.csv), [tier\\_distribution.csv](/data/chatgpt-hotel-price-sources-2026/tier_distribution.csv), [by\\_segment.csv](/data/chatgpt-hotel-price-sources-2026/by_segment.csv), [top\\_cited\\_domains.csv](/data/chatgpt-hotel-price-sources-2026/top_cited_domains.csv), [quoted\\_prices.csv](/data/chatgpt-hotel-price-sources-2026/quoted_prices.csv).\n\nFAQ\n\n## Frequently Asked Questions\n\n## Continue Reading\n\nMore field tests of how AI engines find, price and cite sources.\n\n[ChatGPT's Hidden result\\_source](/research/chatgpt-result-source-retrieval-tiers-2026)[Dates & Prices Move the Answer](/research/chatgpt-hotel-dates-prices-2026)\n\n[All Research](/research)","author":{"@type":"Person","name":"Nicolas 
Sitter","url":"https://nicolassitter.com/about","sameAs":["https://www.linkedin.com/in/nicolassitternolleau/","https://github.com/Nicositter88","https://hotelrank.ai"]},"publisher":{"@type":"Person","name":"Nicolas Sitter","url":"https://nicolassitter.com"},"image":"https://nicolassitter.com/api/og/chatgpt-hotel-price-sources-2026","mainEntityOfPage":{"@type":"WebPage","@id":"https://nicolassitter.com/research/chatgpt-hotel-price-sources-2026"},"tags":["AI Search","ChatGPT","Pricing","GEO","Hotels"],"sameAs":["https://hotelrank.ai/research/chatgpt-hotel-price-sources-2026"],"alternateFormat":{"html":"https://nicolassitter.com/research/chatgpt-hotel-price-sources-2026","json":"https://nicolassitter.com/api/post/chatgpt-hotel-price-sources-2026","rss":"https://nicolassitter.com/rss.xml"},"datasets":[{"name":"summary","contentUrl":"https://nicolassitter.com/data/chatgpt-hotel-price-sources-2026/summary.csv","encodingFormat":"text/csv"}]}