Do AI models actually know real hotels without searching the web?

For hotel chains, yes: every model tested named Marriott, Hilton, Hyatt and Four Seasons and gave their correct websites essentially 100% of the time. For individual hotels it depends on the model and the city: the two stronger models returned a working website for 90%+ of recalled hotels in Dubai and London, but only 58–71% in Paris. The weakest model fell to 47% in Paris.

How can you tell if an AI is making a hotel up?

We asked each model to return the hotel’s official website, then checked whether the domain actually resolves. A real memory yields a real, live domain; a fabricated one yields a plausible-but-dead domain. Example: GPT-5.4-nano returns bristolparis.com for Le Bristol Paris — a dead domain — while the real site is oetkercollection.com. The hotel is real; the website is invented.

Why is Paris harder for AI than Dubai or London?

Paris’s hotel scene is dominated by independent and palace hotels with idiosyncratic, non-chain domains (Le Bristol lives on oetkercollection.com, Plaza Athénée on dorchestercollection.com). Chain hotels train the model on a single predictable domain; independents do not, so models guess a plausible standalone domain and get it wrong far more often.

Does this mean ChatGPT recommends fake hotels?

Not in normal use — when ChatGPT has web search on, it grounds answers in live results. This study deliberately turns search OFF to isolate what the model has memorized. The takeaway for hoteliers is about the memory layer: if a model has to answer from memory, it knows the big chains and famous properties cold, but will confidently invent your website if you are a smaller or independent hotel.

June 2026 · AI Memory

AI Hotel Memory 2026: Models remember the hotel, then invent the website

Every other study here measures what AI recommends after it searches the web. This one measures the opposite: what it has actually memorised. We turned web search off and asked three low-cost models to name hotels — and to give each one’s website. A hotel name is hard to verify in isolation; a website is a checkable commitment, which turns recall into a confabulation lie-detector.

TL;DR. Hotel chains are known cold: every model names Marriott, Hilton and Four Seasons and returns their correct website ~99% of the time. Individual hotels are different: with web search off, models confidently return a non-resolving (dead) website 3% to 53% of the time, worst in Paris. The cheapest model tested, Gemini 3.1 Flash-Lite, had the best live-domain rate in all four cities. The failure mode is the point: the model knows the hotel exists, then invents its web address.

No web searches · pure parametric memory
99% chain websites correct · every model knows chains
47% lowest live-domain rate · nano, Paris hotels
~1,400 generations · 3 models, chains + 4 cities

People increasingly ask chatbots for hotels. Usually the model searches the web first — so its answer reflects what’s online, not what it knows. We removed that crutch. With tools disabled, an LLM can only answer from the patterns baked into its weights during training: its parametric memory. So we asked, repeatedly: name hotels you know — in JSON, with each hotel’s website and address. Because a website is checkable, we can do something a pure recall test can’t: measure whether the memory is correct, not just present.

The result is a clean gradient. Chains live in every model’s memory perfectly. Famous palace hotels mostly do too. But the long tail of real, individual hotels is where models start to confabulate — and they do it with total confidence, returning a tidy JSON record for a hotel whose website doesn’t exist.

1. The experiment

The design borrows from Dejan Marketing’s AI Brand Authority Index, which ranks brands by how often a model names them unprompted. We adapt it to hotels and add one twist that fixes the index’s biggest limitation: it could only normalise raw name strings, never verify them.

Web search OFF. No tools, no retrieval. Pure memory. (In normal use, ChatGPT searches the web — we deliberately don’t.)
Three low-cost models: GPT-5.4-nano, GPT-5.4-mini, and Google’s Gemini 3.1 Flash-Lite.
Two cuts: “name 40 hotel chains or brands” (global), and “name 30 hotels in {city}” for Paris, Dubai, London and New York.
JSON output: each run returns {name, website, address}. The website becomes the hotel’s identity — no fuzzy matching against a database needed.
Verification: we check whether each returned domain resolves (DNS). A dead domain is strong evidence of confabulation; a live one is only weak evidence of correct recall, since it may still be a generic chain homepage, a parent-collection domain, or the wrong property.
Sampling: ~150 runs per model for chains, ~80 per model per city, at temperature 1.0. The high temperature is intentional: the goal isn’t one safe answer but to sample the distribution of hotels that surface from memory across repeated generations.

Asking for the website is the whole trick. “Name a hotel” is unfalsifiable — almost any string could be a hotel somewhere. “Name a hotel and its website” forces a checkable commitment, and a model with only a fuzzy memory of a place will reliably invent a plausible-but-wrong domain.
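
The verification step described above can be sketched roughly as follows. This is a minimal illustration, not the study’s actual code: the names `resolves` and `label_domains` and the injectable `lookup` parameter are assumptions introduced here. It uses Python’s standard `socket.gethostbyname`, which only reports whether a name resolves, not whether the site belongs to the hotel.

```python
import socket
from typing import Callable, Dict, Iterable


def resolves(domain: str,
             lookup: Callable[[str], object] = socket.gethostbyname) -> bool:
    """Return True if `lookup` resolves the domain, False on a DNS failure.

    `lookup` is injectable (an assumption of this sketch) so the logic can be
    exercised without touching the network.
    """
    try:
        lookup(domain)
        return True
    except (socket.gaierror, UnicodeError):
        # gaierror: the resolver found no address for the name.
        # UnicodeError: the string is not a valid hostname at all.
        return False


def label_domains(domains: Iterable[str],
                  lookup: Callable[[str], object] = socket.gethostbyname
                  ) -> Dict[str, str]:
    """Check each distinct domain once and label it 'live' or 'dead'."""
    return {d: ("live" if resolves(d, lookup) else "dead")
            for d in sorted(set(domains))}
```

Calling `label_domains(["ritzparis.com", "bristolparis.com"])` would perform real DNS lookups, so the labels depend on the network and on the date of the check. A transient resolver failure would be mislabelled as dead, which is one reason a real pipeline might retry before scoring.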

2. Chains are known cold

The base layer of hotel memory is rock-solid. Asked for hotel chains or brands, every model returns the majors, with their correct websites, essentially every time. GPT-5.4-mini hit a 99% live-website rate on chains, Gemini 3.1 Flash-Lite 92%, and even nano managed 88%. Chains appear in training data on a single, predictable domain millions of times, so the association is overlearned.

Top hotel chains by recall (GPT-5.4-mini, 150 runs, web search off). 'Recalled' = share of runs that named the chain. All websites shown resolve.

Rank	Chain	Recalled	Avg rank	Website (verified)
#1	Marriott Hotels	100%	19.1	marriott.com
#2	Hilton Hotels & Resorts	100%	16	hilton.com
#3	Hyatt Regency	100%	9.6	hyatt.com
#4	InterContinental Hotels & Resorts	100%	12.7	ihg.com
#5	Best Western	100%	22.6	bestwestern.com
#6	Radisson Blu	99%	23.7	radissonhotels.com
#7	Ritz-Carlton	98%	15.7	ritzcarlton.com
#8	Quality Inn	93%	25.1	choicehotels.com
#9	Four Seasons Hotels and Resorts	91%	16.2	fourseasons.com
#10	Super 8	85%	27.4	wyndhamhotels.com

3. What AI remembers, city by city

Drop to the individual-hotel level and a city’s “memory leaderboard” emerges — the properties a model names again and again, unprompted. These are the hotels with the strongest grip on the model’s mind. Tables below are GPT-5.4-mini; “Recalled” is the share of 80 runs that named the hotel.

Paris

Most-remembered hotels in Paris (GPT-5.4-mini, 80 runs, web search off).

Rank	Hotel	Recalled	Avg rank	Website (verified)
#1	Hôtel de Crillon, A Rosewood Hotel	100%	6.3	rosewoodhotels.com
#2	Four Seasons Hotel George V	100%	3.8	fourseasons.com
#3	Mandarin Oriental, Paris	100%	6.6	mandarinoriental.com
#4	Shangri-La Paris	100%	5.1	shangri-la.com
#5	Hôtel Plaza Athénée	100%	6.1	dorchestercollection.com
#6	The Ritz Paris	96%	3.3	ritzparis.com
#7	Hôtel Molitor Paris - MGallery	83%	20.3	all.accor.com
#8	Le Bristol Paris	81%	5.6	oetkercollection.com

Dubai

Most-remembered hotels in Dubai (GPT-5.4-mini, 80 runs, web search off).

Rank	Hotel	Recalled	Avg rank	Website (verified)
#1	Burj Al Arab Jumeirah	100%	8.2	jumeirah.com
#2	Atlantis, The Palm	100%	2.9	atlantis.com
#3	Address Downtown	100%	11.3	addresshotels.com
#4	The Ritz-Carlton, Dubai	100%	10	ritzcarlton.com
#5	W Dubai - The Palm	100%	18.5	marriott.com
#6	Hilton Dubai The Walk	99%	22.2	hilton.com
#7	One&Only The Palm	95%	11.4	oneandonlyresorts.com
#8	Raffles Dubai	95%	16.3	raffles.com

London

Most-remembered hotels in London (GPT-5.4-mini, 80 runs, web search off).

Rank	Hotel	Recalled	Avg rank	Website (verified)
#1	The Savoy	100%	1.6	thesavoylondon.com
#2	The Dorchester	100%	5.3	dorchestercollection.com
#3	Shangri-La The Shard, London	100%	8.2	shangri-la.com
#4	The Ned	100%	14.5	thened.com
#5	Rosewood London	100%	8.3	rosewoodhotels.com
#6	The Ritz London	99%	2.6	theritzlondon.com
#7	Claridge's	99%	3	claridges.co.uk
#8	The Langham, London	99%	5.5	langhamhotels.com

New York

Most-remembered hotels in New York (GPT-5.4-mini, 80 runs, web search off).

Rank	Hotel	Recalled	Avg rank	Website (verified)
#1	The Plaza Hotel	100%	1	theplazany.com
#2	The St. Regis New York	100%	13.4	marriott.com
#3	Park Hyatt New York	100%	16	hyatt.com
#4	Conrad New York Downtown	100%	12.2	hilton.com
#5	The Langham, New York, Fifth Avenue	99%	8	langhamhotels.com
#6	Mandarin Oriental, New York	98%	5.8	mandarinoriental.com
#7	Four Seasons Hotel New York Downtown	98%	5.3	fourseasons.com
#8	1 Hotel Central Park	98%	14.9	1hotels.com

4. The website lie-detector

Here is the headline. For each model and place, what share of the hotels it named came with a website that actually resolves? Chains: near-perfect. Individual hotels: a different story — and it gets worse the weaker the model and the more independent the city.

Model	Global chains	Paris	Dubai	London	New York
Gemini 3.1 Flash-Lite	92%	71%	97%	96%	89%
GPT-5.4-mini	99%	58%	94%	92%	82%
GPT-5.4-nano	88%	47%	77%	67%	68%
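
The "3% to 53%" dead-website range quoted for individual hotels follows from the city columns of this table by simple subtraction. The sketch below reproduces that arithmetic; it assumes every named hotel came with some domain, so that the dead rate is 100 minus the live rate, and the variable names are illustrative only.

```python
# Live-domain rates (%) for the four cities, copied from the table above.
LIVE_RATE = {
    "Gemini 3.1 Flash-Lite": {"Paris": 71, "Dubai": 97, "London": 96, "New York": 89},
    "GPT-5.4-mini": {"Paris": 58, "Dubai": 94, "London": 92, "New York": 82},
    "GPT-5.4-nano": {"Paris": 47, "Dubai": 77, "London": 67, "New York": 68},
}


def dead_rates(live):
    """Convert live-domain percentages into dead-domain percentages."""
    return {model: {city: 100 - pct for city, pct in cities.items()}
            for model, cities in live.items()}


DEAD_RATE = dead_rates(LIVE_RATE)
all_dead = [pct for cities in DEAD_RATE.values() for pct in cities.values()]

# Best case is Gemini in Dubai, worst case is nano in Paris.
print(min(all_dead), max(all_dead))  # 3 53
```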

The failure mode is the interesting part. The model usually knows the hotel exists — it just fabricates the web address, often a clean, plausible guess that happens to be dead:

Hotel (real)	What nano invented	The real website
Le Bristol Paris · Paris	bristolparis.com ✗ dead	oetkercollection.com ✓
Hôtel Plaza Athénée · Paris	plazaathenee-paris.com ✗ dead	dorchestercollection.com ✓
The Ritz Paris · Paris	theritzparis.com ✗ dead	ritzparis.com ✓
Pod Times Square · New York	pod-hotels.com ✗ dead	thepodhotel.com ✓

GPT-5.4-nano returns bristolparis.com for Le Bristol Paris — a dead domain. The hotel is one of the most famous in the world; its real site is oetkercollection.com. The model didn’t fail to recall the hotel — it recalled the hotel and hallucinated the URL. That is exactly the kind of confident-but-wrong detail that slips past a reader.

One nuance the live-domain test deliberately flattens: for a hotel, “correct” isn’t a single canonical domain. A property can live on its own domain (ritzparis.com), on a correct parent-collection domain (oetkercollection.com), or on an invented-but-plausible one (bristolparis.com). We score only the first two as live; a future pass should separate exact-property recall from brand-level recall, because naming marriott.com for a specific St. Regis is a weaker kind of memory than naming the property’s own site.

Pulling those apart, the misses fall into four distinct failure modes. Only the first shows up as a dead domain; the middle two are “live yet weak” recall that the resolving-domain test cannot tell from the real thing, and the fourth concerns the hotel names rather than the websites:

Failure mode	Example	What it means
Invented property domain	bristolparis.com ✗ dead	Knows the hotel, guesses a plausible URL. Pure confabulation, and the only mode the dead-domain test catches.
Parent-collection domain	oetkercollection.com for Le Bristol	Correct, but brand-level. Resolves (scored live), yet it’s weaker than recalling the property’s own site.
Generic chain homepage	marriott.com for a St. Regis	Right parent, no property-level specificity. Live, but it doesn’t identify the hotel you asked about.
Name padding	nano’s 469 distinct “Paris hotels” in 80 runs	The model invents names to fill the requested list beyond what it actually remembers.

5. Why Paris is the hardest city

Paris is the worst-remembered of the four cities for every model (down to 47% live-website rate on nano), and it also produces the longest tail of invented names — nano emitted 469 distinct “Paris hotels” across 80 runs, versus a tight 177 for Gemini. The reason is structural: Paris’s famous hotels are often independent palaces, collection-owned properties, or historically branded names with non-obvious domains. The model knows “Le Bristol Paris” but its site lives under Oetker Collection (oetkercollection.com); it knows “Plaza Athénée” but the domain sits under Dorchester Collection (dorchestercollection.com). A chain trains the model on one obvious domain; an independent or collection-owned palace does not, so the model guesses — and misses.

A tighter set with more live websites (Gemini) signals a sharper, more reliable memory; a sprawling set with dead domains (nano) signals a model padding its answer with invention.
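
The size of that tail is just a count of distinct names across runs. A rough sketch of the count is below; the name normalisation (case-folding and collapsing whitespace) is an assumption made here, and the study may have keyed hotels differently, for example by domain.

```python
from typing import Iterable, List


def normalise_name(name: str) -> str:
    """Case-fold and collapse whitespace so trivial variants count once."""
    return " ".join(name.casefold().split())


def distinct_names(runs: Iterable[List[str]]) -> int:
    """Number of distinct normalised hotel names across all runs."""
    return len({normalise_name(name) for run in runs for name in run})
```

For scale: about 80 runs of 30 names gives a ceiling of roughly 2,400 names per model and city, so a model that keeps returning the same hotels stays far below it, while a model that pads its lists climbs toward it.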

6. The cheap-model surprise

The counterintuitive finding: the cheapest model had the best memory for individual hotels. Google’s Gemini 3.1 Flash-Lite returned the highest share of working websites in every city (97% in Dubai, 96% in London), beating the pricier GPT-5.4-mini and far ahead of GPT-5.4-nano; only on global chains did GPT-5.4-mini edge it out, 99% to 92%. It’s not about price; it’s about how much specific, rare detail a model retains. This is also why Dejan’s original brand index could run on low-cost Gemini at all: that family punches above its cost on real-world entity recall. Two low-cost models, Flash-Lite and nano, with two very different memories.

7. What it means for hotels

If you’re a chain or a famous palace, the model knows you — name and website. You’re in the memory layer, not just the search layer.
If you’re an independent or boutique hotel, the model may know your name but invent your web address. When a model answers from memory (or a user copies its output), that’s a wrong link pointing away from you.
Memory ≠ retrieval. With web search on, models ground their answers and this mostly disappears. But the memory layer still shapes which hotels a model reaches for first, and what it “believes” before it searches.
The fix: make the association Hotel Name → canonical domain boringly obvious everywhere — one canonical URL across your Google Business Profile, schema (Hotel with url + sameAs), OTAs, press and Wikidata; no abandoned domains; keep redirects alive. The risk isn’t that the model has never heard of you — it’s that it has heard of you imprecisely.
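
As one illustration of the schema item in the fix above, the snippet below builds schema.org `Hotel` markup with a single canonical `url` and a `sameAs` list. It is a sketch under stated assumptions: the hotel name, all URLs and the Wikidata identifier are placeholders, and `hotel_jsonld` is a hypothetical helper, not part of any library.

```python
import json
from typing import List


def hotel_jsonld(name: str, url: str, same_as: List[str]) -> str:
    """Serialise a schema.org Hotel record with one canonical url and sameAs links."""
    record = {
        "@context": "https://schema.org",
        "@type": "Hotel",
        "name": name,
        "url": url,          # the one canonical domain used everywhere
        "sameAs": same_as,   # other profiles that describe the same hotel
    }
    return json.dumps(record, indent=2, ensure_ascii=False)


# Placeholder values only; replace with the property's real profiles.
markup = hotel_jsonld(
    "Hôtel Exemple Paris",
    "https://www.hotel-exemple.example/",
    [
        "https://www.wikidata.org/wiki/Q_PLACEHOLDER",
        "https://ota.example/hotels/hotel-exemple-paris",
    ],
)
```

The resulting string would typically be embedded in the hotel’s pages inside a `script` tag of type `application/ld+json`. The point is less the markup itself than that the same `url` appears in every profile listed under `sameAs`.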

Retrieval can correct memory — but memory still decides what a model reaches for before it checks. For chains, that memory is crisp. For an independent hotel it can be a foggy postcard: the name is right, the address looks plausible, and the website is invented. That is the part worth fixing.

Methodology

Models: gpt-5.4-nano, gpt-5.4-mini (OpenAI), gemini-flash-lite-latest — which resolved at capture time to Gemini 3.1 Flash-Lite (Google), called with no tools and temperature 1.0. Runs: ~150 per model for the chains cut (40 brands requested per run); ~80 per model for each of Paris, Dubai, London, New York (30 hotels per run). Output: JSON, requesting name + website + address per hotel.

Identity & scoring: returned URLs are normalised to their registrable domain — protocol, path and www stripped — then keyed by that domain (falling back to a normalised name). “Recalled” is the share of runs that named the hotel; “avg rank” is its mean position in the list. Website verification: each distinct domain is checked for DNS resolution, not full HTTP or page validation. This is a conservative lie-detector: a non-resolving domain is treated as a confabulated website, while a resolving one is treated as live but not necessarily correct — it may be a generic chain homepage, a parent-collection domain, or the wrong property. So the reported live-domain rate is an upper bound on correctness: a non-resolving domain is definitely wrong, but some resolving ones are wrong too, so true confabulation is at least the dead-domain rate and probably higher.
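
The identity and scoring rules above can be sketched as follows. This is a simplified illustration rather than the study’s code: `domain_key` and `recall_stats` are hypothetical names, and the sketch only strips the scheme, path and a leading `www.`. A true registrable domain would need the Public Suffix List, which is skipped here.

```python
from collections import defaultdict
from typing import Dict, List, Tuple
from urllib.parse import urlsplit


def domain_key(url: str) -> str:
    """Lower-case host with scheme, path and a leading 'www.' removed."""
    url = url.strip()
    if "//" not in url:
        url = "//" + url  # lets urlsplit read 'ritzparis.com/rooms' as a host
    host = (urlsplit(url).hostname or "").lower()
    return host[4:] if host.startswith("www.") else host


def recall_stats(runs: List[List[dict]]) -> Dict[str, Tuple[float, float]]:
    """Map each key to (share of runs naming it, mean 1-based list position)."""
    positions = defaultdict(list)
    for run in runs:
        seen = set()
        for rank, hotel in enumerate(run, start=1):
            # Key by domain, falling back to a normalised name.
            key = domain_key(hotel.get("website", "")) or hotel["name"].strip().lower()
            if key not in seen:  # count a key once per run, at its first position
                seen.add(key)
                positions[key].append(rank)
    return {key: (len(p) / len(runs), sum(p) / len(p))
            for key, p in positions.items()}
```

One consequence of keying by domain is that several properties sharing a generic chain domain (for example `marriott.com`) collapse into one key, which is part of why a live domain says little about property-level recall.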

Caveats: because a domain can resolve and still be the wrong hotel, the live-domain rate overstates true accuracy — read it as a ceiling, not a verdict. This measures the memory of three specific low-cost models, not the full ChatGPT or Gemini consumer products (which use larger models and web search). Cost of the entire study was under €20.
