June 2026 · AI Memory

AI Hotel Memory 2026: What does a chatbot remember about hotels, with the web turned off?

Every other study here measures what AI recommends after it searches the web. This one measures the opposite: what it has actually memorised. We turned web search off and asked three cheap models to name hotels — and to give each one’s website. Verifying those websites turns recall into a confabulation lie-detector.

TL;DR. Hotel chains are known cold: every model names Marriott, Hilton and Four Seasons, and returns a live chain website 88% to 99% of the time. For individual hotels it cracks: models confidently return a website that does not resolve 3% to 53% of the time, worst in Paris. The cheapest model tested (Gemini 3.1 Flash-Lite) had the best memory for individual hotels. And the failure mode is unsettling: the model knows the hotel exists, then invents its web address.

  • 0 web searches: pure parametric memory
  • 99% chain websites correct (GPT-5.4-mini): every model knows the major chains
  • 47% worst city accuracy: nano, Paris hotels
  • ~1,400 generations: 3 models · chains + 4 cities

People increasingly ask chatbots for hotels. Usually the model searches the web first — so its answer reflects what’s online, not what it knows. We removed that crutch. With tools disabled, an LLM can only answer from the patterns baked into its weights during training: its parametric memory. So we asked, repeatedly: name hotels you know — in JSON, with each hotel’s website and address. Because a website is checkable, we can do something a pure recall test can’t: measure whether the memory is correct, not just present.

The result is a clean gradient. Chains live in every model’s memory perfectly. Famous palace hotels mostly do too. But the long tail of real, individual hotels is where models start to confabulate — and they do it with total confidence, returning a tidy JSON record for a hotel whose website doesn’t exist.

1. The experiment

The design borrows from Dejan Marketing’s AI Brand Authority Index, which ranks brands by how often a model names them unprompted. We adapt it to hotels and add one twist that fixes Dejan’s biggest limitation — he could only normalise raw name strings, never verify them.

  • Web search OFF. No tools, no retrieval. Pure memory. (In normal use, ChatGPT searches the web — we deliberately don’t.)
  • Three cheap models: GPT-5.4-nano, GPT-5.4-mini, and Google’s Gemini 3.1 Flash-Lite.
  • Two cuts: “name hotel chains” (global), and “name hotels in {city}” for Paris, Dubai, London and New York.
  • JSON output: each run returns {name, website, address}. The website becomes the hotel’s identity — no fuzzy matching against a database needed.
  • Verification: we check whether each returned domain actually resolves. A live domain ≈ real memory; a dead one ≈ confabulation.
  • ~150 runs per model for chains, ~80 per model per city, temperature 1.0, repeated to surface what’s consistently top-of-mind.
Asking for the website is the whole trick. “Name a hotel” is unfalsifiable — almost any string could be a hotel somewhere. “Name a hotel and its website” forces a checkable commitment, and a model with only a fuzzy memory of a place will reliably invent a plausible-but-wrong domain.
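
The verification step can be sketched in a few lines of Python. This is a minimal illustration, not the study's actual code: the function names (`domain_of`, `resolves`, `check_run`) are hypothetical, and it assumes each run is a JSON array of objects with `name`, `website` and `address` keys. The resolver is passed in as a parameter so the logic can be exercised without network access.

```python
import json
import socket
from urllib.parse import urlparse


def domain_of(website: str) -> str:
    """Reduce a returned website string to a bare, lowercase host name."""
    text = website.strip().lower()
    # urlparse only fills in .hostname when a scheme is present.
    if "://" not in text:
        text = "http://" + text
    host = urlparse(text).hostname or ""
    return host[4:] if host.startswith("www.") else host


def resolves(domain: str) -> bool:
    """True if the system resolver returns at least one address."""
    try:
        return bool(socket.getaddrinfo(domain, None))
    except (socket.gaierror, UnicodeError):
        return False


def check_run(raw_json: str, resolver=resolves) -> list[tuple[str, str, bool]]:
    """Parse one run and return (name, domain, is_live) for each hotel."""
    results = []
    for hotel in json.loads(raw_json):
        # Assumption: a missing or null website counts as not live.
        domain = domain_of(hotel.get("website") or "")
        is_live = bool(domain) and resolver(domain)
        results.append((hotel.get("name", ""), domain, is_live))
    return results
```

Calling `check_run(raw)` with the default resolver performs real DNS lookups; passing `resolver=lambda d: d in known_live` (a hypothetical set) replays a cached result instead.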

2. Chains are known cold

The base layer of hotel memory is rock-solid. Asked for hotel chains, every model returns the majors — and their correct websites — essentially every time. GPT-5.4-mini hit a 99% live-website rate on chains; even nano managed 88%. Chains appear in training data on a single, predictable domain millions of times, so the association is overlearned.

Top hotel chains by recall (GPT-5.4-mini, 150 runs, web search off). 'Recalled' = share of runs that named the chain. All websites shown resolve.
Rank | Chain | Recalled | Avg rank | Website (verified)
#1 | Marriott Hotels | 100% | 19.1 | marriott.com
#2 | Hilton Hotels & Resorts | 100% | 16 | hilton.com
#3 | Hyatt Regency | 100% | 9.6 | hyatt.com
#4 | InterContinental Hotels & Resorts | 100% | 12.7 | ihg.com
#5 | Best Western | 100% | 22.6 | bestwestern.com
#6 | Radisson Blu | 99% | 23.7 | radissonhotels.com
#7 | Ritz-Carlton | 98% | 15.7 | ritzcarlton.com
#8 | Quality Inn | 93% | 25.1 | choicehotels.com
#9 | Four Seasons Hotels and Resorts | 91% | 16.2 | fourseasons.com
#10 | Super 8 | 85% | 27.4 | wyndhamhotels.com

3. What AI remembers, city by city

Drop to the individual-hotel level and a city’s “memory leaderboard” emerges — the properties a model names again and again, unprompted. These are the hotels with the strongest grip on the model’s mind. Tables below are GPT-5.4-mini; “Recalled” is the share of 80 runs that named the hotel.

Paris

Most-remembered hotels in Paris (GPT-5.4-mini, 80 runs, web search off).
Rank | Hotel | Recalled | Avg rank | Website (verified)
#1 | Hôtel de Crillon, A Rosewood Hotel | 100% | 6.3 | rosewoodhotels.com
#2 | Four Seasons Hotel George V | 100% | 3.8 | fourseasons.com
#3 | Mandarin Oriental, Paris | 100% | 6.6 | mandarinoriental.com
#4 | Shangri-La Paris | 100% | 5.1 | shangri-la.com
#5 | Hôtel Plaza Athénée | 100% | 6.1 | dorchestercollection.com
#6 | The Ritz Paris | 96% | 3.3 | ritzparis.com
#7 | Hôtel Molitor Paris - MGallery | 83% | 20.3 | all.accor.com
#8 | Le Bristol Paris | 81% | 5.6 | oetkercollection.com

Dubai

Most-remembered hotels in Dubai (GPT-5.4-mini, 80 runs, web search off).
Rank | Hotel | Recalled | Avg rank | Website (verified)
#1 | Burj Al Arab Jumeirah | 100% | 8.2 | jumeirah.com
#2 | Atlantis, The Palm | 100% | 2.9 | atlantis.com
#3 | Address Downtown | 100% | 11.3 | addresshotels.com
#4 | The Ritz-Carlton, Dubai | 100% | 10 | ritzcarlton.com
#5 | W Dubai - The Palm | 100% | 18.5 | marriott.com
#6 | Hilton Dubai The Walk | 99% | 22.2 | hilton.com
#7 | One&Only The Palm | 95% | 11.4 | oneandonlyresorts.com
#8 | Raffles Dubai | 95% | 16.3 | raffles.com

London

Most-remembered hotels in London (GPT-5.4-mini, 80 runs, web search off).
Rank | Hotel | Recalled | Avg rank | Website (verified)
#1 | The Savoy | 100% | 1.6 | thesavoylondon.com
#2 | The Dorchester | 100% | 5.3 | dorchestercollection.com
#3 | Shangri-La The Shard, London | 100% | 8.2 | shangri-la.com
#4 | The Ned | 100% | 14.5 | thened.com
#5 | Rosewood London | 100% | 8.3 | rosewoodhotels.com
#6 | The Ritz London | 99% | 2.6 | theritzlondon.com
#7 | Claridge's | 99% | 3 | claridges.co.uk
#8 | The Langham, London | 99% | 5.5 | langhamhotels.com

New York

Most-remembered hotels in New York (GPT-5.4-mini, 80 runs, web search off).
Rank | Hotel | Recalled | Avg rank | Website (verified)
#1 | The Plaza Hotel | 100% | 1 | theplazany.com
#2 | The St. Regis New York | 100% | 13.4 | marriott.com
#3 | Park Hyatt New York | 100% | 16 | hyatt.com
#4 | Conrad New York Downtown | 100% | 12.2 | hilton.com
#5 | The Langham, New York, Fifth Avenue | 99% | 8 | langhamhotels.com
#6 | Mandarin Oriental, New York | 98% | 5.8 | mandarinoriental.com
#7 | Four Seasons Hotel New York Downtown | 98% | 5.3 | fourseasons.com
#8 | 1 Hotel Central Park | 98% | 14.9 | 1hotels.com

4. The website lie-detector

Here is the headline. For each model and place, what share of the hotels it named came with a website that actually resolves? Chains: near-perfect. Individual hotels: a different story — and it gets worse the weaker the model and the more independent the city.

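Share of named hotels whose website resolves, by model and place (web search off).
Model | Global chains | Paris | Dubai | London | New York
Gemini 3.1 Flash-Lite | 92% | 71% | 97% | 96% | 89%
GPT-5.4-mini | 99% | 58% | 94% | 92% | 82%
GPT-5.4-nano | 88% | 47% | 77% | 67% | 68%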

The failure mode is the interesting part. The model usually knows the hotel exists — it just fabricates the web address, often a clean, plausible guess that happens to be dead:

Hotel (real) | What nano invented | The real website
Le Bristol Paris · Paris | bristolparis.com ✗ dead | oetkercollection.com
Hôtel Plaza Athénée · Paris | plazaathenee-paris.com ✗ dead | dorchestercollection.com
The Ritz Paris · Paris | theritzparis.com ✗ dead | ritzparis.com
Pod Times Square · New York | pod-hotels.com ✗ dead | thepodhotel.com

GPT-5.4-nano returns bristolparis.com for Le Bristol Paris, a dead domain. The hotel is one of the most famous in the world; its real site is oetkercollection.com. The model didn’t fail to recall the hotel: it recalled the hotel and hallucinated the URL. That is exactly the kind of confident-but-wrong detail that slips past a reader.

5. Why Paris is the hardest city

Paris is the worst-remembered of the four cities for every model (down to 47% live-website rate on nano), and it also produces the longest tail of invented names — nano emitted 469 distinct “Paris hotels” across 80 runs, versus a tight 177 for Gemini. The reason is structural: Paris’s top hotels are independent palaces on collection domains the model can’t predict — Le Bristol on oetkercollection.com, Plaza Athénée on dorchestercollection.com. A chain trains the model on one obvious domain; an independent does not, so the model guesses — and misses.

[Figure: AI hotel memory, the invented tail (2026)]

A tighter set with more live websites (Gemini) signals a sharper, more reliable memory; a sprawling set with dead domains (nano) signals a model padding its answer with invention.
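
The "tight versus sprawling" comparison boils down to two numbers per model and city: how many distinct hotels it named, and what share of the distinct domains are live. The sketch below is one way to compute them. The function name `tail_profile`, the name normalisation (lowercase, collapsed whitespace) and the choice to count distinct domains rather than mentions are all assumptions, not the study's confirmed method.

```python
def tail_profile(runs, is_live):
    """Return (distinct hotel names, share of distinct domains that are live).

    runs    -- list of runs, each a list of (name, domain) pairs
    is_live -- callable mapping a domain to True/False (e.g. a DNS check)
    """
    names = set()
    domains = set()
    for run in runs:
        for name, domain in run:
            # Assumed normalisation: case-insensitive, whitespace collapsed.
            names.add(" ".join(name.casefold().split()))
            if domain:
                domains.add(domain)
    live = sum(1 for d in domains if is_live(d))
    share = live / len(domains) if domains else 0.0
    return len(names), share
```

A model that keeps naming the same hotels with the same working domains yields a small count and a high share; one that pads its lists yields a large count and a low share.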

6. The cheap-model surprise

The counterintuitive finding: the cheapest model had the best memory for individual hotels. Google’s Gemini 3.1 Flash-Lite returned the highest share of working websites in all four cities (97% in Dubai, 96% in London), beating the pricier GPT-5.4-mini and finishing far ahead of GPT-5.4-nano. Only on chains did GPT-5.4-mini lead (99% versus 92%). It’s not about price; it’s about how much specific, rare detail a model retains. This is also why Dejan’s original brand index could run on cheap Gemini at all: that family punches above its cost on real-world entity recall. Two cheap models, two very different memories.

7. What it means for hotels

  • If you’re a chain or a famous palace, the model knows you — name and website. You’re in the memory layer, not just the search layer.
  • If you’re an independent or boutique hotel, the model may know your name but invent your web address. When a model answers from memory (or a user copies its output), that’s a wrong link pointing away from you.
  • Memory ≠ retrieval. With web search on, models ground their answers and this mostly disappears. But the memory layer still shapes which hotels a model reaches for first, and what it “believes” before it searches.
  • The fix is the same as the rest of AI visibility: a consistent, well-linked web presence is what turns a hotel from a fuzzy memory into a correctly-remembered one.

Methodology

Models: gpt-5.4-nano, gpt-5.4-mini (OpenAI), gemini-flash-lite-latest — which resolved to Gemini 3.1 Flash-Lite (Google), called with no tools and temperature 1.0. Runs: ~150 per model for the chains cut; ~80 per model for each of Paris, Dubai, London, New York. Output: JSON, requesting name + website + address per hotel.
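
As a quick check on the headline figure of ~1,400 generations, the run counts above multiply out as follows (using the approximate per-cut numbers as if they were exact):

```latex
\[
3 \times \bigl(150 + 4 \times 80\bigr) = 3 \times 470 = 1410 \approx 1{,}400 \text{ generations}
\]
```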

Identity & scoring: hotels are keyed by their returned website domain (falling back to a normalised name). “Recalled” is the share of runs that named the hotel; “avg rank” is its mean position in the list. Website verification: each distinct domain is checked for DNS resolution. A resolving domain is treated as a (conservative) signal of real memory; a non-resolving one as confabulation.
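
The two scoring columns can be reproduced with a short aggregation. This is a sketch under stated assumptions rather than the study's code: the function name `score` is hypothetical, each run is taken to be an ordered list of hotel keys (domain, or normalised name as a fallback), a hotel repeated within one run is counted once at its first position, and the average rank is taken only over the runs that named the hotel.

```python
from collections import defaultdict


def score(runs: list[list[str]]) -> dict[str, tuple[float, float]]:
    """Map each hotel key to (recalled share, average rank)."""
    positions = defaultdict(list)
    for run in runs:
        seen = set()
        for rank, key in enumerate(run, start=1):  # ranks are 1-based
            if key in seen:
                continue  # keep only the first position within a run
            seen.add(key)
            positions[key].append(rank)
    total = len(runs)
    return {
        key: (len(ranks) / total, sum(ranks) / len(ranks))
        for key, ranks in positions.items()
    }
```

Under these assumptions a hotel named first in every run scores a recall of 1.0 and an average rank of 1.0, which matches how The Plaza Hotel appears in the New York table (100%, rank 1).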

Caveats: the DNS-resolution rate is an upper bound on accuracy. A domain can resolve yet still belong to the wrong hotel, so true confabulation is somewhat higher than reported. This measures the memory of three specific cheap models, not the full ChatGPT or Gemini consumer products (which use larger models and web search). Cost of the entire study was under €20.
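
The direction of that bound can be made explicit. The symbols are introduced here for illustration only: a model names N hotels, R of them come with a domain that resolves, and W of those R domains belong to the wrong site. Assuming a non-resolving domain is never the correct one:

```latex
\[
\text{accuracy} = \frac{R - W}{N} \le \frac{R}{N},
\qquad
\text{confabulation} = 1 - \frac{R - W}{N} \ge 1 - \frac{R}{N}
\]
```

For GPT-5.4-nano in Paris, R/N is 47%, so the true confabulation rate is at least 53%.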

FAQ