{"@context":"https://schema.org","@type":"BlogPosting","headline":"ChatGPT Hotel Index vs Live Web: What Changes When Search Goes Offline","description":"400 hotel queries on GPT-5.4 and GPT-5.3 in live and cached mode. 83% of cited domains differ.","datePublished":"2026-04-09","dateModified":"2026-04-09","url":"https://nicolassitter.com/research/chatgpt-hotel-index-vs-live-web-2026","category":"research","keywords":["ChatGPT index","ChatGPT live search","GPT-5.4","ChatGPT hotel sources"],"articleSection":"Research","wordCount":5400,"readTime":"22 min","articleBody":"April 2026 · Live vs Cached\n\n# ChatGPT's Hotel Index Is a Different Web\n\nWe flipped one API parameter and got different hotel recommendations. 400 API calls: 100 prompts, 2 models, 2 search modes. **83% of cited domains change** when ChatGPT uses its own index instead of the live web.\n\n**TL;DR:** OpenAI's API has a live web switch: `external_web_access`. Set it to `false` and ChatGPT searches only its cached corpus. We ran 100 hotel prompts on GPT-5.4 and GPT-5.3 in both modes. **83–85% of cited domains differ.** For 13 prompts, **the overlap is literally zero.**\n\nNicolas Sitter\n\nPublished April 9, 2026\n\n[Read Methodology](#methodology) [What This Means for Hotels](#implications)\n\n## Key Findings\n\nOpenAI has an [external\\_web\\_access parameter](https://developers.openai.com/api/docs/guides/tools-web-search#live-internet-access) in their web search tool. Set it to `false` and the model searches only cached/indexed results, which confirms that OpenAI maintains its own search index alongside live web access.\n\nThis is not a minor technical detail. For hotel marketers asking “is my property visible in ChatGPT?”, the answer depends on _which ChatGPT_: the one with live web (aka Google) access, or the one running on OpenAI's own index. 
They return fundamentally different source sets, cite different domains, and recommend different properties.\n\n-   **17%** domain overlap (GPT-5.4), live vs cached\n-   **15%** domain overlap (GPT-5.3), live vs cached\n-   **0%** overlap on 13 prompts, completely disjoint\n\n**The one-parameter audit trap:** if a hotel marketer runs an “is my property visible in ChatGPT?” check through the API without controlling `external_web_access`, they could get two contradictory answers from the same model, on the same prompt, seconds apart.\n\n1\n\n## The Index Is a Different Web\n\nAcross 100 prompts, only 15–17% of cited domains overlap between live and cached responses. The overlap at the URL level is even lower (6–10%).\n\nOverall Jaccard similarity between live and cached modes\n\n| Model | Domain Jaccard | URL Jaccard | Query Jaccard |\n| --- | --- | --- | --- |\n| gpt-5.4 | 0.17 | 0.06 | 0.02 |\n| gpt-5.3-chat-latest | 0.15 | 0.10 | 0.43 |\n\nFor **13 prompts** (across both models), the domain Jaccard is **exactly 0.0**: the live and cached answers don't share a single source domain.\n\n### Zero-overlap examples\n\n-   gpt-5.4: Best hotels in Cape Town\n-   gpt-5.4: Best hotels in Tokyo\n-   gpt-5.4: Best hotels in Barcelona Gothic Quarter\n-   gpt-5.3: Best hotels in Dubai\n-   gpt-5.3: Best boutique hotels in Rio de Janeiro\n-   gpt-5.3: Best hotels in Marrakech Medina\n\n### Example: “Best hotels in Dubai”\n\nGPT-5.3-chat-latest · Domain Jaccard = 0.0 (not a single shared source)\n\nLive web (4 sources)\n\n-   en.wikipedia.org: Hotel history & location\n-   bulgarihotels.com: Official brand site\n-   worldtravelawards.com: Award listing\n-   agluxuryproperties.com: Dubai luxury guide\n\nCached index (4 sources)\n\n-   traveltodubai.ae: Dubai tourism portal\n-   themiddleeastinsider.com: Regional travel blog\n-   timesofindia.indiatimes.com: Indian newspaper travel section\n-   dubaivisitvisa.online: Visa & travel guide site\n\n**Same model, same prompt, same moment.** The live web pulls Wikipedia and the actual Bulgari hotel site. 
The index pulls a Dubai visa site and an Indian newspaper. These are not slightly different source mixes; they are entirely different information ecosystems producing different hotel recommendations.\n\n### Example: “Best boutique hotels in Tokyo”\n\nGPT-5.3-chat-latest · Domain Jaccard = 0.0 (17 sources, zero overlap)\n\nLive web (10 sources)\n\n-   en.wikipedia.org: Hotel & neighborhood articles\n-   wallpaper.com: Design & architecture magazine\n-   thehoteljournal.com: Boutique hotel editorial\n-   smallboutique-hotels.com: Boutique hotel directory\n-   travel.rakuten.com: Japanese booking platform\n-   whimsysoul.com: Travel blog\n-   blog.bespoke-discovery.com: Japan travel blog\n-   jasumo.com: Japan travel guide\n-   cccj.or.jp: Canadian Chamber of Commerce Japan\n-   team.interaction-design.org: Design community\n\nCached index (7 sources)\n\n-   tripadvisor.com: Review platform\n-   thehotelguru.com: Hotel comparison site\n-   luxuryhotel.guru: Luxury hotel directory\n-   trulytokyo.com: Tokyo travel guide\n-   touristjapan.com: Japan tourism site\n-   hikemasterjapan.com: Japan outdoor travel\n-   localsinjapan.com: Japan expat blog\n\n**17 total sources, not a single one in common.** The live web finds Wikipedia, Wallpaper\\* magazine, and Rakuten Travel. The index falls back to TripAdvisor, niche hotel directories, and Japan-focused blogs. 
A hotel visible on one side is invisible on the other.\n\n### Example: “Best hotels in Singapore Marina Bay”\n\nGPT-5.4 · Domain Jaccard = 0.67 (when there _is_ overlap, it's hotel brand sites)\n\nLive web (5 sources)\n\n-   ● marinabaysands.com: Official hotel site\n-   ● fullertonhotels.com: Official hotel site\n-   ● mandarinoriental.com: Official hotel site\n-   ● ritzcarlton.com: Official hotel site\n-   hilton.com: Official hotel site\n\nCached index (5 sources)\n\n-   ● marinabaysands.com: Official hotel site\n-   ● fullertonhotels.com: Official hotel site\n-   ● mandarinoriental.com: Official hotel site\n-   ● ritzcarlton.com: Official hotel site\n-   panpacific.com: Official hotel site\n\n**● = shared across both modes.** Four of the six distinct domains are shared, hence 4/6 ≈ 0.67. When there _is_ convergence, it's on **official hotel brand websites**, the domains GPT-5.4 actively hunts with `site:` queries. Marina Bay Sands, Fullerton, Mandarin Oriental, and Ritz-Carlton appear in both modes because they're major brands with strong web presence. The only difference: live mode picks Hilton, cached mode picks Pan Pacific. Brand authority is the stabilizing force.\n\n**Cape Town, Tokyo, Barcelona, Dubai, Rio, Marrakech** are not obscure destinations. These are tier-1 travel cities where the index and the live web produce _zero_ shared sources. If your hotel is in one of these markets, which ChatGPT your guest uses matters.\n\n2\n\n## Africa Is the Index's Blind Spot\n\nGPT-5.3's index barely covers Africa: its domain Jaccard there is just 0.061, meaning the cached and live results share almost nothing. 
GPT-5.4 is dramatically more uniform across continents.\n\n### GPT-5.3: Uneven coverage\n\n### GPT-5.3 live vs cached domain overlap by continent\n\nBar chart showing GPT-5.3 domain Jaccard by continent, Africa highlighted in red at 6.1%\n\n### GPT-5.4: Uniform coverage\n\n### GPT-5.4 live vs cached domain overlap by continent\n\nBar chart showing GPT-5.4 domain Jaccard by continent, relatively uniform between 15-20%\n\nPer-continent domain Jaccard (live vs cached)\n\n| Continent | GPT-5.3 | GPT-5.4 |\n| --- | --- | --- |\n| MENA | 0.217 | 0.202 |\n| North America | 0.210 | 0.187 |\n| Oceania | 0.177 | 0.173 |\n| Latin America | 0.169 | 0.165 |\n| Asia | 0.137 | 0.165 |\n| Europe | 0.101 | 0.163 |\n| Africa | 0.061 | 0.155 |\n\n**GPT-5.4's fan-out strategy equalizes coverage.** Because it issues brand-targeted `site:` queries, it finds common ground in both modes even where coverage is thin. GPT-5.3's simple one-shot searches expose the raw state of the index, and in Africa that index is nearly empty.\n\n3\n\n## 3-Star Queries Are 2x More Reproducible\n\nBudget/star-rating queries collapse to a small set of OTAs (Booking, Expedia, Hotels.com, TripAdvisor) that exist in _both_ the index and the live web. Boutique and persona queries fan out to editorial sources where the divergence is much higher.\n\n### Live vs cached domain overlap by query type\n\nBar chart showing 3-star queries at ~28% overlap versus all other tiers at 10-18%\n\nPer-tier domain Jaccard (live vs cached)\n\n| Query Type | GPT-5.3 | GPT-5.4 |\n| --- | --- | --- |\n| Broad (\"Best hotels in {city}\") | 0.111 | 0.146 |\n| Boutique | 0.098 | 0.125 |\n| 3-star | 0.298 | 0.262 |\n| Neighborhood | 0.102 | 0.184 |\n| Persona (couples) | 0.123 | 0.142 |\n\n**Operational takeaway:** if you're auditing your hotel's visibility in ChatGPT, 3-star queries give the most stable results across modes. 
Boutique and luxury audits are **mode-dependent**: always run both and reconcile.\n\n4\n\n## GPT-5.4 Does Keyword Research; GPT-5.3 Does Not\n\nThe two models have completely different search strategies. GPT-5.4 behaves like an SEO analyst; GPT-5.3 is a simple one-shot retriever.\n\nSearch behavior comparison\n\n| Metric | GPT-5.3 | GPT-5.4 |\n| --- | --- | --- |\n| Searches per response | 1.0 | ~2.0 |\n| Avg query length (words) | 6.5 | 10.9 |\n| Max query length | 11 | 27 |\n| % with year (2023+) | 53% | 27% |\n| % with site: operator | 0% | 31% |\n| % containing \"official\" | 0% | 87% |\n| % containing \"review\" | 3% | 13% |\n\n### GPT-5.3 queries\n\n-   \"best boutique hotels Paris 2026\"\n-   \"top luxury hotels Tokyo 2026\"\n-   \"best 3-star hotels Barcelona\"\n\nSimple, natural language. No operators.\n\n### GPT-5.4 queries\n\n-   \"site:cntraveler.com best boutique hotels paris 2025\"\n-   \"site:michelin.com MICHELIN Guide Barcelona hotel\"\n-   \"site:booking.com Rome 3-star hotel official rating\"\n\nLong, intent-loaded. 87% include \"official\".\n\n**GPT-5.4 searches by brand name, for both editorial and hotel brands.** Forbes, Michelin, CN Traveler, Booking.com appear by name in its queries. It also targets individual hotel domains with `site:` queries to verify location, amenities, and room types directly from the source. If your hotel's own website is unindexed, blocked to AI crawlers, or has poor structure, GPT-5.4 cannot find it through this path.\n\n### Brands GPT-5.4 searches for by name (across 381 queries)\n\nForbes ×48, Michelin ×43, Booking ×24, TripAdvisor ×21, Hyatt ×11, Conde Nast ×8, Park Hyatt ×7, Four Seasons ×7, Hilton ×7, Mandarin ×6\n\nGPT-5.3 issued **zero queries** containing any brand or publisher name.\n\n5\n\n## Who Powers Each Mode\n\nThe source mix shifts dramatically between modes. Wikipedia dominates GPT-5.3 live mode. Michelin dominates GPT-5.4 live mode. 
TripAdvisor leads the cached index for both models.\n\n### GPT-5.3 — Cached (index)\n\n### GPT-5.3 cached: top cited domains\n\nTripAdvisor leads at 49, followed by Oyster at 27 and Expedia at 17\n\n### GPT-5.3 — Live\n\n### GPT-5.3 live: top cited domains\n\nWikipedia explodes to #1 with 56 citations, TripAdvisor drops to 23\n\n**Wikipedia: absent from the index, #1 on the live web.** For GPT-5.3, Wikipedia jumps from _not appearing in the top 20_ in cached mode to **#1 with 56 citations** in live mode. Hotel Wikipedia pages are an underrated visibility lever — but only for the live web path.\n\n### GPT-5.4 — Cached (index)\n\n### GPT-5.4 cached: top cited domains\n\nTripAdvisor leads at 26, CN Traveler at 19, Forbes Travel Guide at 18\n\n### GPT-5.4 — Live\n\n### GPT-5.4 live: top cited domains\n\nMichelin Guide jumps to #1 with 22 citations in live mode\n\n**The index favors TripAdvisor. The live web favors editorial authority.** In cached mode, TripAdvisor leads for both models. In live mode, Michelin Guide (#1 for GPT-5.4) and Wikipedia (#1 for GPT-5.3) take over. This means TripAdvisor is a critical presence in OpenAI's own index — but editorial prestige (Michelin Keys, Forbes ratings) matters more when the live web is accessible.\n\n6\n\n## GPT-5.4 Actually Browses Pages\n\nGPT-5.4 doesn't just search — it opens pages and reads them. It issued 17 `open_page` and 4 `find_in_page` actions in live mode. 
GPT-5.3 issued zero.\n\nBrowsing actions by model and mode\n\n| Model + Mode | search | open\\_page | find\\_in\\_page |\n| --- | --- | --- | --- |\n| GPT-5.4 live | 181 | 17 | 4 |\n| GPT-5.4 cached | 200 | 5 | 0 |\n| GPT-5.3 live | 100 | 0 | 0 |\n| GPT-5.3 cached | 100 | 0 | 0 |\n\n### Where GPT-5.4 opens pages (live mode)\n\n-   tripadvisor.com ×6\n-   booking.com ×4\n-   guide.michelin.com ×3\n-   cntraveler.com ×2\n-   oneandonlyresorts.com ×1\n-   casonaroma.com ×1\n\n### What GPT-5.4 searches for _inside_ pages\n\n-   \"#1 Best Value\" on tripadvisor.com/Hotels-...-Paris\n-   \"4.5 of 5 bubbles\" on tripadvisor.com/Hotels-...-London\n-   \"Palacio Duhau\" on cntraveler.com/gallery/best-hotels-in-buenos-aires\n\nThe model has learned TripAdvisor's ranking labels and hunts for them by name. This is competitive research, not text generation.\n\n**21 out of 22 `open_page` URLs also appear in citations.** Browsing doesn't unlock new sources; it's a deep read of sources the model already found via search. The signal is _which_ domains GPT-5.4 considers worth reading: TripAdvisor, Booking, Michelin, CN Traveler. Those are the publishers it actively trusts enough to read the body, not just the snippet.\n\n7\n\n## What This Means for Hotels\n\n### 1. It's about authority sources, not just one publisher\n\nGPT-5.4 searches for authority sources by name (Michelin Guide, Forbes Travel Guide, CN Traveler, TripAdvisor, Booking.com), with 75+ brand mentions across 100 prompts. But the broader point is that **any trusted editorial source** matters: travel magazines, award bodies (World Travel Awards, World's 50 Best Hotels), national tourism boards, and respected travel blogs all feed into the live web path. In cached mode, the index falls back to a narrower set dominated by OTAs and niche aggregators. 
The takeaway isn't “get on Michelin”; it's that **editorial authority is the currency of live-web AI recommendations**, and hotels that invest in PR, awards, and media coverage have a structural advantage in the live path.\n\n### 2. Your hotel's own website matters, by name\n\n87% of GPT-5.4's queries contain “official” and ~30% use `site:` against specific hotel domains. Hotels with sites that are blocked to AI crawlers, slow to render, or missing structured data are invisible to GPT-5.4's strongest research pattern. See our [robots.txt study](/research/hotel-robots-ai-blocking-study-2026) for how many hotels block AI crawlers.\n\n### 3. TripAdvisor is the index workhorse\n\nTripAdvisor leads the cached index for both models. GPT-5.4 reads specific TripAdvisor list pages with `find_in_page` and pulls the “#1 Best Value” label. TripAdvisor visibility translates more directly into LLM citations than any other aggregator. See our [TripAdvisor in ChatGPT study](/research/tripadvisor-chatgpt-hotels-study-2026).\n\n### 4. Wikipedia is GPT-5.3's live-mode favorite\n\nWikipedia jumps from absent in cached mode to **#1 with 56 citations** in live mode for GPT-5.3. Hotel Wikipedia pages are an underrated visibility lever for the chat-tuned model line.\n\n### 5. Always audit in both modes\n\n3-star queries give roughly 2x the live-vs-cached overlap of the other tiers (domain Jaccard of 0.26–0.30 versus 0.10–0.18). Boutique and luxury audits are mode-dependent. Run both modes and reconcile. And for African markets, GPT-5.4 produces noticeably more consistent cross-mode results than GPT-5.3 (0.155 versus 0.061).\n\nMethodology\n\n## How We Collected This Data\n\n### Setup\n\n-   **Models:** GPT-5.4 (latest model, available to paid users) and GPT-5.3-chat-latest (the model currently powering ChatGPT.com for free users). We chose these two to cover both ends of the user base. 
Note: using the API does not perfectly replicate the ChatGPT.com UI experience (different system prompt, no memory, no tool orchestration), but it lets us isolate the `external_web_access` variable cleanly.\n-   **API:** OpenAI Responses API with `web_search` tool\n-   **Modes:** `external_web_access=true` (live) and `false` (cached)\n-   **Tool choice:** forced via `tool_choice={type: \"web_search\"}` — every call searches\n\n### Prompts\n\n-   **100 hotel discovery prompts**\n-   **20 cities** × 5 prompt tiers spanning all inhabited continents\n-   **5 tiers:** broad, boutique, 3-star, neighborhood, persona (couples)\n-   **1 run per (model, mode, prompt)** → 400 total calls, all succeeded\n\n**Why 100 prompts, not 1,000?** This study measures _source-level divergence_ (which domains get cited), not answer-level variation (which hotels get named). Source sets stabilize fast — by 100 prompts across 20 cities and 5 tiers, the signal is already unambiguous: 83% domain divergence with 13 zero-overlap cases. More prompts would add volume without changing the conclusion.\n\n### Captured Per Call\n\n-   Full response text\n-   All `web_search_call.action` items (search, open\\_page, find\\_in\\_page)\n-   All `url_citation` annotations (URL + title + offsets)\n-   Latency and token usage\n\n### Analysis Metrics\n\n-   **Domain Jaccard:** intersection/union of cited domains between live and cached for each prompt\n-   **URL Jaccard:** same metric at the exact-URL level\n-   **Query Jaccard:** overlap in the search queries the model issues\n-   Query Jaccard vs Query Jacquouille: Just watch Les Visiteurs\n-   Results aggregated per continent, per prompt tier, and per model\n\n### Data Summary\n\n-   **400 API calls** (100 prompts × 2 models × 2 modes)\n-   **100% success rate** — all 400 calls returned results\n-   Data collected: **April 2026**\n\n### Data Access\n\nWe believe in open research. 
Contact us for access to the raw data, analysis scripts, and methodology details.\n\n[Request Data Access](/contact)\n\n## Continue Reading\n\nThis study is part of our ongoing research into how AI search engines recommend hotels.\n\n[Anatomy of ChatGPT Hotel Search](/research/anatomy-chatgpt-hotel-search-2026) [AI Hotel Landscape 2026](/research/ai-hotel-landscape-2026)","author":{"@type":"Person","name":"Nicolas Sitter","url":"https://nicolassitter.com/about","sameAs":["https://www.linkedin.com/in/nicolassitternolleau/","https://github.com/Nicositter88","https://hotelrank.ai"]},"publisher":{"@type":"Person","name":"Nicolas Sitter","url":"https://nicolassitter.com"},"image":"https://nicolassitter.com/api/og/chatgpt-hotel-index-vs-live-web-2026","mainEntityOfPage":{"@type":"WebPage","@id":"https://nicolassitter.com/research/chatgpt-hotel-index-vs-live-web-2026"},"tags":["ChatGPT","Web Search","Index","GPT-5.4"],"sameAs":["https://hotelrank.ai/research/chatgpt-hotel-index-vs-live-web-2026"],"alternateFormat":{"html":"https://nicolassitter.com/research/chatgpt-hotel-index-vs-live-web-2026","json":"https://nicolassitter.com/api/post/chatgpt-hotel-index-vs-live-web-2026","rss":"https://nicolassitter.com/rss.xml"},"datasets":[{"name":"summary","contentUrl":"https://nicolassitter.com/data/chatgpt-hotel-index-vs-live-web-2026/summary.csv","encodingFormat":"text/csv"}]}