{"@context":"https://schema.org","@type":"BlogPosting","headline":"Hotel robots.txt & AI Blocking Study 2026","description":"105,002 hotel robots.txt files parsed across 7 countries. Only 3.3% block any AI crawler. GPTBot most blocked at 2.9%.","datePublished":"2026-03-20","dateModified":"2026-03-21","url":"https://nicolassitter.com/research/hotel-robots-ai-blocking-study-2026","category":"research","keywords":["hotel robots.txt","AI blocking","GPTBot","ClaudeBot","CCBot"],"articleSection":"Research","wordCount":7800,"readTime":"31 min","articleBody":"March 2026\n\n# Do Hotels Block AI Crawlers?\n\nWe checked robots.txt on 105,002 hotel websites. **96.7% have zero AI-specific blocking rules.** The industry is wide open.\n\n-   **82.2%** have a robots.txt\n-   **3.3%** block any AI crawler\n-   **7.5%** block in France (the outlier)\n-   **2.1%** use the \"smart\" strategy\n\n[Summary](#executive-summary) · [robots.txt](#robots-adoption) · [AI Blocking](#ai-blocking-overview) · [Per Bot](#per-bot-rates) · [Selective](#selective-blocking) · [By Country](#by-country) · [By Stars](#by-stars) · [Distribution](#blocking-distribution) · [Who Blocks](#whos-blocking) · [Opting Out](#opting-out) · [FAQ](#faq) · [Methodology](#methodology)\n\n## TL;DR\n\nWe parsed robots.txt files from **105,002** hotel websites across 7 countries. Only **3.3% block any AI crawler**, and just 0.9% block all of them. GPTBot (OpenAI) is the most commonly blocked at 2.9%, while AI search bots like PerplexityBot and OAI-SearchBot are blocked by just 1.0%. The most interesting signal: **2.1% of hotels use a \"smart\" strategy**, blocking training bots while allowing search bots through. France is a clear outlier at 7.5%, more than twice the rate of any other country.\n\n## Executive Summary\n\nThe robots.txt file is the first line of defense for any website. It tells crawlers what they can and cannot access. 
With the rise of AI-powered search engines (ChatGPT, Perplexity, Gemini) and AI model training, hotels face a new decision: should they allow AI bots to crawl their content?\n\nOur analysis of 105,002 hotel websites reveals that the vast majority have not yet made this decision, or have decided to leave the door wide open. Only 3.3% block any AI crawler at all. That leaves **96.7% of hotel websites fully accessible** to AI training bots and AI-powered search engines alike.\n\nThe distinction between training and search bots matters. Training bots (GPTBot, ClaudeBot, CCBot) scrape content to build AI models. Search bots (PerplexityBot, OAI-SearchBot) fetch content to answer user queries in real time. Blocking the first protects your content from being used in training. Blocking the second removes you from AI-powered search results. Understanding this distinction is critical, and our [anatomy of ChatGPT hotel search](/research/anatomy-chatgpt-hotel-search-2026) article explains how these bots work.\n\n-   **96.7%** have no AI blocking rules\n-   **3.3%** block at least one AI bot\n-   **~2.5x** gap between training and search blocking\n\n**The key finding:** Hotels that do block AI crawlers are making a deliberate distinction between training and search. Training bots are blocked ~2.5x more often than search bots. This \"smart\" strategy (blocking training while allowing search) is emerging as the most sophisticated approach, adopted by 2.1% of hotels.\n\n## robots.txt Adoption\n\nHow many hotel websites have a robots.txt file? (n=105,002 hotels)\n\n| Status | Hotels | % of Total |\n| --- | --- | --- |\n| Have robots.txt | 86,348 | 82.2% |\n| Have Sitemap | 63,110 | 60.1% |\n| No robots.txt | 18,654 | 17.8% |\n| Blanket Disallow | 958 | 0.9% |\n\n_Chart: robots.txt status breakdown_\n\n**The 60.1% sitemap rate is a positive signal.** Hotels that declare a sitemap in their robots.txt are actively helping crawlers discover their content. 
Combined with the 82.2% robots.txt adoption rate, this suggests that most hotel websites have at least basic crawl management in place; they just haven't updated it for the AI era.\n\n## AI Blocking Overview\n\nHow does AI bot blocking compare to traditional search engine blocking?\n\n| Rule | Hotels | % of Total |\n| --- | --- | --- |\n| Block Any AI | 3,458 | 3.3% |\n| Block All AI | 957 | 0.9% |\n| Block Googlebot | 1,325 | 1.3% |\n| Block Bingbot | 1,160 | 1.1% |\n\n_Chart: AI blocking vs traditional search engine blocking_\n\n**AI blocking (3.3%) is higher than traditional search blocking.** Hotels that block Googlebot (1.3%) or Bingbot (1.1%) are likely misconfigured, since blocking your primary search engines is almost never intentional. AI blocking at 3.3% mostly represents a deliberate choice by hotels that are specifically targeting AI crawlers while keeping traditional search open; the exception is the roughly 0.9% that block every crawler with a blanket rule.\n\n## Per-Bot Blocking Rates\n\nWhich AI bots are hotels blocking? Bots are grouped into three categories: training, search, and user agent.\n\n_Chart: AI bot blocking rates by bot_\n\n**GPTBot leads at 2.9%.** Training bots cluster between 2.5% and 2.9%, while search and user-agent bots hover around 1.0%. 
The ~2.5x gap between training and search bot blocking is the key finding: hotels that actively manage AI access are distinguishing between content scraping for model training and real-time search retrieval.\n\nFull AI bot blocking data\n\n| Bot | Provider | Category | Hotels Blocking | % of Total |\n| --- | --- | --- | --- | --- |\n| GPTBot | OpenAI | training | 3,036 | 2.9% |\n| Google-Extended | Google | training | 2,793 | 2.7% |\n| CCBot | Common Crawl | training | 2,847 | 2.7% |\n| Bytespider | ByteDance | training | 2,782 | 2.6% |\n| ClaudeBot | Anthropic | training | 2,742 | 2.6% |\n| Applebot-Extended | Apple | training | 2,669 | 2.5% |\n\n## Selective Blocking: Training vs Search\n\nAmong hotels that block AI, what strategy are they using? (Percentages are of all 105,002 hotels.)\n\n| Strategy | What it does | % of Total |\n| --- | --- | --- |\n| \"Smart\" Strategy | Block training, allow search | 2.1% |\n| Block Everything | All AI bots blocked | 0.9% |\n| Reverse Strategy | Block search, allow training | 0.1% |\n\n_Chart: AI blocking strategy distribution_\n\n**The \"smart\" strategy is the most interesting signal in this data.** 2.1% of hotels (2,214 properties) block training bots like GPTBot and ClaudeBot while allowing search bots like PerplexityBot and OAI-SearchBot to crawl freely. This means they protect their content from model training while remaining visible in AI-powered search results. Only 58 hotels (0.1%) do the reverse, blocking search while allowing training, which suggests either misconfiguration or a very unusual strategy.\n\n#### Understanding OpenAI's 3 crawlers\n\nPer [OpenAI's official documentation](https://platform.openai.com/docs/bots), each crawler serves a distinct purpose:\n\n**GPTBot** (training): crawls content for training generative AI models. Blocking it means your content won't be used for training. _This is the one most hotels should consider blocking._\n\n**OAI-SearchBot** (search): indexes content for ChatGPT's search features. **Blocking it means your site won't appear in ChatGPT search answers**, though it may still show up as a navigational link. 
Hotels wanting AI search visibility should allow this.\n\n**ChatGPT-User** (user): triggered when a user asks ChatGPT to browse a page. It's user-initiated, and OpenAI states \"robots.txt rules may not apply.\" **Blocking this bot is largely pointless**, yet 1,190 hotels do it.\n\n**The visibility trade-off is real.** Hotels blocking OAI-SearchBot opt out of ChatGPT search results. Hotels blocking PerplexityBot vanish from Perplexity. As AI search becomes a primary discovery channel for travelers, blocking search bots is equivalent to delisting from a search engine. [Read our anatomy of ChatGPT hotel search](/research/anatomy-chatgpt-hotel-search-2026) to understand how search bots retrieve and present hotel information.\n\n## AI Blocking by Country\n\nFrance is a clear outlier. Germany, despite being another GDPR market, is the lowest.\n\n_Chart: % of hotels blocking any AI crawler, by country_\n\n**France at 7.5% is more than 3x the US (2.1%) or UK (2.0%) rate.** But the number is misleading: most of it comes from a single chain.\n\n### The Logis Effect: One Chain Explains France's Outlier Status\n\n**Logis Hotels** is a French cooperative of ~2,300 independent hotels, restaurants, and guesthouses. Their shared CMS/platform includes a robots.txt that blocks 6 AI training bots (GPTBot, ClaudeBot, Google-Extended, CCBot, Bytespider, Applebot-Extended) while allowing all search bots. This single decision affects **955 properties in our dataset**.\n\n-   **72.1%** of French blockers are Logis\n-   **2.1%** is France's rate without Logis, equal to the US rate and no longer an outlier\n\nRemove Logis from the data, and **France drops from 7.5% to 2.1%**, matching the US rate. The \"French GDPR culture\" hypothesis largely evaporates. 
What looks like a national trend is actually a single platform decision by a cooperative that bundles AI blocking into its shared infrastructure.\n\nFrance blocking breakdown (1,317 total blockers):\n\n-   Logis Hotels: 950 (72.1%)\n-   Independent/Other: 283 (21.5%)\n-   Aparthotel chains: ~90 (6.9%)\n-   Sofitel + Others: 13 (1.0%)\n\nThe irony: Logis's blocking is actually the \"smart\" pattern. They block training bots while allowing search bots, so their hotels remain visible in ChatGPT search and Perplexity. This makes them the largest coordinated example of the training-only blocking strategy in our dataset.\n\n**Germany at 1.7% undercuts the GDPR hypothesis.** If data protection regulation drove AI blocking, Germany, with its equally strong GDPR enforcement, would match France. Instead, it has the lowest rate of any country in our dataset. AI blocking in hospitality is driven by platform decisions and agency culture, not regulation.\n\nFull country-level blocking data\n\n| Country | Hotels | Has robots.txt | Blocks Any AI | GPTBot | ClaudeBot |\n| --- | --- | --- | --- | --- | --- |\n| France | 17,634 | 89.8% | 7.5% | 7.2% | 6.5% |\n| Italy | 27,319 | 71.7% | 3.3% | 2.6% | 2.4% |\n| Spain | 16,411 | 83.6% | 2.6% | 2.0% | 2.0% |\n| Netherlands | 2,891 | 86.6% | 2.2% | 1.8% | 1.5% |\n| USA | 7,445 | 90.8% | 2.1% | 1.8% | 1.7% |\n| UK | 10,547 | 89.4% | 2.0% | 1.7% | 1.5% |\n\n## AI Blocking by Star Classification\n\nDoes hotel quality affect AI blocking decisions?\n\n_Chart: AI blocking rates by star classification_\n\n**No meaningful effect.** The range is tight: 2.6% to 3.8% across all star classifications. Whether a hotel is 1-star or 5-star has no meaningful impact on whether it blocks AI crawlers. 
The slightly higher rate for \"Unclassified\" properties (3.8%) may reflect a different mix of website platforms rather than a deliberate strategic choice.\n\nBlocking rates by star classification\n\n| Stars | Hotels | Blocks Any AI | Blocks All AI |\n| --- | --- | --- | --- |\n| 1-star | 2,699 | 2.6% | 1.2% |\n| 2-star | 10,222 | 3.1% | 0.8% |\n| 3-star | 30,199 | 3.0% | 1.1% |\n| 4-star | 16,548 | 2.6% | 1.0% |\n| 5-star | 2,062 | 3.1% | 0.6% |\n| Unclassified | 43,272 | 3.8% | 0.8% |\n\n## Blocking Distribution\n\nHow many AI bots do hotels block? The pattern is bimodal: 0 or most.\n\n_Chart: number of AI bots blocked per hotel_\n\n**Hotels either block 0 or block most/all.** The distribution is bimodal: 101,544 hotels (96.7%) block zero AI bots, while 3,065 hotels (2.9%) block 9-14 bots. Very few hotels block just 1-3 bots (704, or 0.7%). This suggests that AI blocking is typically an all-or-nothing decision: when hotels add AI blocking rules, they tend to copy comprehensive blocklists rather than selectively choosing individual bots.\n\nHotels by number of AI bots blocked\n\n| Bots Blocked | Hotels | % of Total |\n| --- | --- | --- |\n| 0 | 101,544 | 96.7% |\n| 1-3 | 704 | 0.7% |\n| 4-8 | 1,689 | 1.6% |\n| 9-14 | 3,065 | 2.9% |\n\n## Who's Actually Blocking?\n\nThe 3,458 blocking hotels aren't random: most blocking is chain or platform-driven.\n\n### Logis Hotels — 944 properties (27% of all blockers)\n\n6 bots blocked: GPTBot, ClaudeBot, Google-Extended, CCBot, Bytespider, Applebot-Extended\n\nThe French cooperative hotel chain accounts for the single largest share of AI blocking. Their blocking is **training-only**: they allow ChatGPT-User, OAI-SearchBot, and PerplexityBot. This is the \"smart\" pattern: block AI training, stay visible in AI search. Logis alone explains most of France's 7.5% outlier rate.\n\n### Block-everything hotels — 957 properties\n\n14 bots blocked (all AI crawlers)\n\nThese hotels use a blanket `Disallow: /` for all user agents, which blocks every crawler including AI. 
Many are Italian resort booking platforms (bookitalyhotels.com, Greenblu) or vacation club networks (Diamond Resorts). Notable 5-star blockers: **Grand Hotel Des Iles Borromee** (Stresa, 4.7★), **Aquatio Cave Luxury Hotel & Spa** (Matera, 4.7★), **Hotel Masseria San Domenico** (Fasano, 4.7★).\n\n### Partial search bot blocking — ~90 properties\n\nGPTBot fully blocked + OAI-SearchBot blocked on specific paths\n\nSome hotel chains block GPTBot entirely (no training) but only restrict OAI-SearchBot from sensitive paths like `/booking/`. This is actually a **nuanced, smart strategy**: the hotel remains visible in ChatGPT Search for discovery queries, but protects its booking funnel. Our detection flags any `Disallow` rule as a \"block,\" but these hotels are still discoverable.\n\n### Sercotel Hotels — 71 properties (Spain)\n\n9 bots blocked: GPTBot, ChatGPT-User, ClaudeBot, anthropic-ai, Google-Extended, PerplexityBot, cohere-ai, YouBot, Applebot-Extended\n\nThe Spanish chain blocks both training _and_ search bots — including PerplexityBot and YouBot. They allow OAI-SearchBot and Claude-Web but block ChatGPT-User. Per [OpenAI's docs](https://platform.openai.com/docs/bots), blocking ChatGPT-User is largely pointless: it's user-initiated and \"robots.txt rules may not apply.\" Meanwhile, blocking PerplexityBot means Sercotel hotels are invisible to Perplexity search — a real visibility loss.\n\n### Sofitel (Accor luxury) — blocks only Google-Extended\n\n1 bot blocked: Google-Extended\n\nSofitel Le Scribe Paris Opéra, Sofitel Paris le Faubourg, Sofitel Paris Arc de Triomphe, Sofitel London St James, Sofitel Legend The Grand Amsterdam — they all block only Google-Extended (Gemini training). GPTBot, ClaudeBot, and search bots pass freely. This is the most minimal blocking policy: stop Google from training Gemini on your content, allow everything else.\n\n### Paris Spotlight: 35 hotels block AI\n\nOf the ~4,000+ hotels in our Paris dataset, only 35 block any AI crawler. 
Notable 5-star blockers:\n\n-   **Hôtel Madame Rêve**: 5★ · 4.6 rating · blocks 6 training bots\n-   **Sofitel Le Scribe Paris Opéra**: 5★ · 4.6 rating · blocks Google-Extended only\n-   **Sofitel Paris le Faubourg**: 5★ · 4.6 rating · blocks Google-Extended only\n-   **Sofitel Paris Arc de Triomphe**: 5★ · 4.6 rating · blocks Google-Extended only\n\nThe majority of Paris blockers are **Logis Hotels** (via their CMS) and **aparthotel chains** (corporate policy). The palace hotels (Ritz, Plaza Athénée, Le Bristol, Four Seasons George V) do not block any AI crawler.\n\n### 5-Star Hotels: 64 Block AI\n\nOnly 64 out of 2,062 five-star hotels (3.1%) block any AI crawler. The most notable:\n\n| Hotel | Location | Rating | Blocking |\n| --- | --- | --- | --- |\n| Villa d'Este | Cernobbio, IT | 4.7★ | GPTBot + ChatGPT-User |\n| Aman Venice | Venice, IT | 4.8★ | Google-Extended only |\n| Villa la Massa | Candeli, IT | 4.8★ | GPTBot + ChatGPT-User |\n| Grand Hotel Des Iles Borromee | Stresa, IT | 4.7★ | All 14 bots |\n| Equinox Hotel New York | New York, US | 4.4★ | anthropic-ai only |\n| The Royal Horseguards | London, GB | 4.4★ | 6 training bots |\n| Hôtel Madame Rêve | Paris, FR | 4.6★ | 6 training bots |\n| Gran Hotel Inglés | Madrid, ES | 4.7★ | 6 training bots |\n\nItaly dominates the 5-star blocking list. Villa d'Este and Villa la Massa (both luxury Italian properties) block GPTBot and ChatGPT-User specifically, which reads as an anti-OpenAI stance. Aman blocks only Google-Extended. Equinox New York blocks only anthropic-ai. Each has a different, seemingly deliberate policy.\n\n**Most AI blocking is a chain or CMS decision, not an individual hotel decision.** Logis alone (944 hotels) accounts for 27% of all blockers. Add other chains and blanket-blocking platforms (~957), and you've explained ~60% of all AI blocking with just a few patterns. 
The remaining ~40% is a mix of individual hotels, smaller chains, and hosting providers with default blocking rules.\n\n## 1,071 Hotels Are Invisible to ChatGPT Search\n\nBlocking OAI-SearchBot has nothing to do with training: it removes your hotel from ChatGPT's search results.\n\n-   **1,071** block OAI-SearchBot (1.0% of all hotels)\n-   **1,083** block PerplexityBot (1.0% of all hotels)\n-   **58** block only search bots (0.1%, a deliberate opt-out)\n\nMost hotels that block OAI-SearchBot do so as part of a blanket blocklist: they're blocking _everything_, not specifically targeting search. But **58 hotels block search bots while allowing training bots**, which is the exact opposite of the \"smart\" strategy. These hotels are opting out of AI discovery while still letting their content be used for model training.\n\n### Three blocking patterns we observed\n\n**Full block: blanket `Disallow: /` for all crawlers**\n\n~957 hotels block every crawler including all AI bots. These are typically platform-level decisions (booking platforms, resort networks) rather than individual hotel choices. The hotel is invisible to every AI search engine.\n\n**Partial block: OAI-SearchBot blocked from specific paths only**\n\nSome hotel chains block OAI-SearchBot only from sensitive paths (e.g. `/booking/`) while allowing it on the rest of the site. This is actually a **nuanced, smart strategy**: the hotel remains visible in ChatGPT Search but protects its booking funnel from being scraped. Our detection counts these as \"blocking OAI-SearchBot,\" but the hotel is still discoverable.\n\n**Smart pattern: block GPTBot (training), allow OAI-SearchBot (search)**\n\n2.1% of hotels block training bots while keeping search bots open. This is the optimal approach: your content won't be used to train AI models, but your hotel still appears when travelers ask ChatGPT for recommendations. 
Some chains go further by also protecting booking paths from search bots — the most sophisticated policy we observed.\n\n### Common mistake: blocking ChatGPT-User instead of OAI-SearchBot\n\n**1,190 hotels block ChatGPT-User** — but this is largely pointless. Per [OpenAI's documentation](https://platform.openai.com/docs/bots): ChatGPT-User is triggered when a user asks ChatGPT to visit a page or interacts with a Custom GPT. It's user-initiated, not automated crawling — and **robots.txt rules may not apply**.\n\nCritically, **ChatGPT-User is not used to determine whether content appears in ChatGPT Search**. That's OAI-SearchBot. Hotels blocking ChatGPT-User think they're opting out of ChatGPT — but they're blocking the wrong bot.\n\nWe also observed the reverse mistake: some hotel chains block GPTBot + ChatGPT-User but _allow_ OAI-SearchBot. The result is correct (visible in search, opted out of training) — but likely achieved by accident rather than by understanding the bot taxonomy.\n\n### Note on partial blocks\n\nOur detection flags any `Disallow` rule targeting OAI-SearchBot as a \"block.\" But some of the 1,071 hotels only block specific paths (like `/booking/`) — not the entire site. These hotels are still discoverable in ChatGPT Search. The true \"fully invisible\" count is lower than 1,071, concentrated among blanket blockers and platform-level decisions.\n\n**Blocking OAI-SearchBot is the new \"noindex\".** When a hotel blocks OAI-SearchBot, it won't appear when travelers ask ChatGPT for recommendations — even if the hotel has great reviews and a strong web presence. As AI search grows as a discovery channel, this is equivalent to delisting yourself from a search engine. 
Hotels that want to opt out of ChatGPT Search should block OAI-SearchBot — not ChatGPT-User.\n\n## Frequently Asked Questions\n\n## Methodology\n\n### Data Collection\n\n-   **Source:** Same 105,002 reachable hotel websites from our [schema adoption study](/research/hotel-schema-adoption-study-2026)\n-   7 countries: France (17.6K), Italy (27.3K), Spain (16.4K), Netherlands (2.9K), USA (7.4K), UK (10.5K), Germany (22.3K)\n-   robots.txt fetched from each domain root with Chrome-like user agent, 10-second timeout\n-   Each robots.txt parsed for User-agent directives and Disallow rules\n-   14 AI-specific bots tracked across training, search, and user categories\n\n### Bot Classification\n\n-   **Training bots:** GPTBot, Google-Extended, CCBot, Bytespider, ClaudeBot, Applebot-Extended, anthropic-ai, cohere-ai, Diffbot\n-   **Search bots:** PerplexityBot, OAI-SearchBot, YouBot\n-   **User agent bots:** ChatGPT-User, Claude-Web\n-   \"Blocks any AI\" = at least one AI bot has a Disallow rule\n-   \"Smart strategy\" = blocks at least one training bot but allows all search bots\n\n## Continue Reading\n\nExplore more research on AI hotel search.\n\n[AI Hotel Landscape 2026](/research/ai-hotel-landscape-2026)\n\n[Anatomy of ChatGPT Search](/research/anatomy-chatgpt-hotel-search-2026)[Schema Adoption Study](/research/hotel-schema-adoption-study-2026)[Google AI Mode Study](/research/google-ai-mode-hotel-study-2026)[Hotel Blog Study](/research/french-hotel-blog-study-2026)[All Research](/research)","author":{"@type":"Person","name":"Nicolas Sitter","url":"https://nicolassitter.com/about","sameAs":["https://www.linkedin.com/in/nicolassitternolleau/","https://github.com/Nicositter88","https://hotelrank.ai"]},"publisher":{"@type":"Person","name":"Nicolas 
Sitter","url":"https://nicolassitter.com"},"image":"https://nicolassitter.com/api/og/hotel-robots-ai-blocking-study-2026","mainEntityOfPage":{"@type":"WebPage","@id":"https://nicolassitter.com/research/hotel-robots-ai-blocking-study-2026"},"tags":["robots.txt","AI Crawlers","Hotel Websites","Content Access"],"sameAs":["https://hotelrank.ai/research/hotel-robots-ai-blocking-study-2026"],"alternateFormat":{"html":"https://nicolassitter.com/research/hotel-robots-ai-blocking-study-2026","json":"https://nicolassitter.com/api/post/hotel-robots-ai-blocking-study-2026","rss":"https://nicolassitter.com/rss.xml"},"datasets":[{"name":"summary","contentUrl":"https://nicolassitter.com/data/hotel-robots-ai-blocking-study-2026/summary.csv","encodingFormat":"text/csv"}]}