{"@context":"https://schema.org","@type":"BlogPosting","headline":"AI Search for Bookstores in Tokyo (2026): Where AI Search Stops Looking Western","description":"22 prompts × 4 AI engines (Perplexity returned no usable Tokyo data this run), EN + JA, US/JP proxies, 584 Tokyo bookstores resolved, 336 captures and 2,322 citations. Three Tokyo-only findings: Gemini cites store sites just 5% of the time and runs on a 58% local-guide web (whenin.tokyo, Tokyo Weekender, GaijinPot — a layer with no Western equivalent at this density); on ChatGPT, Japanese prompts cite .jp store domains 5× more than English prompts (the sharpest language→TLD coupling in the study); DAIKANYAMA T-SITE and Kinokuniya tie at the top (122 each). Ward-level geography is respected (Shibuya 100%); neighborhood precision is real but unmeasurable against ward-labeled seed data.","datePublished":"2026-05-26","dateModified":"2026-05-26","url":"https://nicolassitter.com/research/bookstores-tokyo-ai-search-2026","category":"research","keywords":["AI search bookstores","Tokyo bookstores AI","ChatGPT bookstores Japan","Gemini local guides","Japanese language AI","DAIKANYAMA T-SITE","Kinokuniya","AI local search study"],"articleSection":"Research","wordCount":2900,"readTime":"12 min","articleBody":"May 2026 · AI Search Studies\n\n# AI Search for Bookstores in Tokyo (2026): Where AI search stops looking Western\n\n**TL;DR:** Same playbook as the hotel and yoga studies — 22 prompts, EN and JA, US and JP proxies, every answer parsed and matched to 584 real bookstores — pointed at Tokyo. Three Tokyo-only findings emerge. **Gemini almost never cites a store’s own website (just 5%)** — instead it lives in a dense local-guide web (58% third-party guides like whenin.tokyo and Tokyo Weekender) that has no Western equivalent at the same density. **ChatGPT cites .jp store domains 5× more in Japanese than in English** — the sharpest language-to-TLD coupling in the whole study. 
And the top of the leaderboard is a clean duopoly: **DAIKANYAMA T-SITE and Kinokuniya tie at a score of 122**, with English-friendly stores and the Jimbocho used-book cluster filling the breadth tier below.\n\n4 AI engines · 22 prompt templates · 584 stores · EN + JA (both languages)\n\n## Executive Summary\n\nTokyo is where AI search stops looking Western.\n\nThe methodology is the same one we ran for Paris yoga studios and the AI hotel landscape — a fixed set of prompts, fired at every major AI engine, in two languages, from two countries, in a single week — except this time aimed at Tokyo bookstores, and with one structural caveat: Perplexity returned no usable data for this scrape, so only 4 engines carry the analysis (ChatGPT, Copilot, Gemini, Google AI Mode).\n\nThree things are true of Tokyo and only of Tokyo in our studies so far:\n\n1.  **A dense local-guide citation web that Gemini lives in almost entirely.** Just **5%** of Gemini’s Tokyo citations are store websites; **58%** are third-party guides (39% local editorial like whenin.tokyo and Tokyo Weekender + 19% expat media like GaijinPot and Japan-Guide). A Tokyo-specific taxonomy pass dropped the residual “other” bucket from 36% to 15% — the long tail wasn’t noise, it was a whole stratum of local guides that no Western city has at that density.\n2.  **The strongest language → TLD bias we have measured.** On ChatGPT, .jp store domains are cited **5× more** by Japanese prompts than English ones. Amsterdam was neutral at ~1.0×; Tokyo is the opposite extreme. Ask in Japanese and the engine leans hard into native .jp domains; ask in English and those same domains nearly vanish.\n3.  
**A clean DAIKANYAMA T-SITE / Kinokuniya duopoly at the top.** The Tsutaya design flagship and the national chain tie at **122** when mentions are aggregated across the four engines (T-SITE is surfaced by all four, Kinokuniya by three). Below them, the breadth tier is dominated by English-friendly stores (Aoyama Book Center, Infinity Books) and the Jimbocho used-book cluster — exactly the stores Western editorial covers.\n\nThe core laws from the hotel and yoga studies survive intact (per-engine source strategies, platform-specific blindness, near-universal web-search triggering on ChatGPT at 99%). What’s new in Tokyo is _which_ source strategy each engine picks, and how aggressively the language of the prompt steers which top-level domain gets cited.\n\nSection 1\n\n## Source mix by platform\n\nFor every cited URL we bucketed the source — the store’s own website, a social post, a local editorial outlet, an expat-media site, a global book-press outlet, a travel blog, a Google SERP, or other. A Tokyo-specific taxonomy pass resolved the long tail of local guides (whenin.tokyo, gltjp.com, Truly Tokyo, GaijinPot, Japan-Guide, visit-chiyoda) so the “other” bucket fell from **36% → 15%** overall. The mix is wildly different per engine, and it maps to four fundamentally different ways of sourcing an answer.\n\nNote: Perplexity is absent from this run — the scrape returned no usable Tokyo bookstore citations. All percentages and counts on this page are over the 4 engines that did return data.\n\n### Entity engine\n\n89%\n\nCopilot\n\nCites the store's own domain directly. Same lookup behavior we see for hotels in every market — Copilot is an entity-resolution machine. To win here, your store's own site has to be the canonical answer.\n\n### Self-referential engine\n\n61%\n\nGoogle AI Mode\n\nCites google.com URLs back to itself 61% of the time — essentially re-presenting its own search results page. 
Real source diversity is low; ranking in ordinary Google results is the path in.\n\n### Social / editorial engine\n\n21%\n\nChatGPT\n\nSocial (Reddit-led) is its single biggest bucket at 21%, with another 21% editorial\\_local and 13% book-press. Only 8% of ChatGPT's Tokyo citations are store websites. ChatGPT learns about Tokyo bookstores from communities and journalists.\n\n### Local-guide engine — the Tokyo signature\n\n58%\n\nGemini\n\nGemini cites store websites just 5% of the time. Instead it runs on the local-guide web — 39% editorial\\_local + 19% expat\\_media = 58% third-party guides (whenin.tokyo, Tokyo Weekender, gltjp.com, GaijinPot, Japan-Guide). For Tokyo bookstores, Gemini is effectively a travel-guide aggregator, not an entity engine.\n\n**The Tokyo signature is the Gemini column.** In Paris yoga, Gemini cited store/studio sites 41% of the time. In Tokyo bookstores it’s 5%. The difference isn’t Gemini — it’s that Tokyo has a layer of English-expat and Japanese city-guide sites with no Western analogue at the same density, and Gemini grounds in that layer instead. The same store gets surfaced through completely different intermediaries depending on the engine: your own domain (Copilot), Reddit (ChatGPT), or a Tokyo expat blog (Gemini).\n\n**There is no single “AI search” to optimise for in Tokyo.** An optimisation that wins Copilot (own your domain) is irrelevant on Gemini (be on whenin.tokyo / Tokyo Weekender) and on ChatGPT (be discussed on Reddit and in book press). Four engines, four jobs.\n\nSection 2\n\n## The Tokyo bookstore AI leaderboard\n\nAggregating mentions across all 4 engines (chain locations merged), these are the most-cited Tokyo bookstores. “Engines” is the number of the 4 engines that surfaced the store at all — a breadth signal. 
The top of the table is a tie: **DAIKANYAMA T-SITE** (the Tsutaya design flagship, surfaced by all 4 engines) and **Kinokuniya** (the national chain, surfaced by 3) both score 122.\n\nThe Tokyo bookstores AI recommends most — scored across ChatGPT, Copilot, Gemini and Google AI Mode combined (Perplexity returned no usable Tokyo data this run).\n\nTop 12 Tokyo bookstores by cross-platform citation score (chain-aggregated).\n\n| Rank | Store | Score | Engines |\n| --- | --- | --- | --- |\n| #1 | DAIKANYAMA T-SITE | 122 | 4 / 4 |\n| #2 | Kinokuniya | 122 | 3 / 4 |\n| #3 | magmabooks | 59 | 2 / 4 |\n| #4 | Kitazawa Bookstore | 57 | 2 / 4 |\n| #5 | Aoyama Book Center | 55 | 4 / 4 |\n| #6 | Infinity Books Japan | 47 | 4 / 4 |\n\n**Two tiers, two stories.** The very top is a clean duopoly between the most recognisable domestic brands — DAIKANYAMA T-SITE (Tsutaya design flagship) and Kinokuniya — tied at 122. The breadth tier directly below is dominated by **English-friendly stores** (Aoyama Book Center, Infinity Books Japan) and the **Jimbocho used-book cluster** (Kitazawa Bookstore, Book House Jinbocho, Komiyama Book Store). That isn’t a coincidence — it’s exactly the slice of Tokyo bookstores that Western editorial covers, and three of the four engines lean editorial-heavy.\n\nSection 3\n\n## Top cited sources, cross-platform\n\nThe buckets are one thing; the actual domains are another. These are the most-cited sources across all 4 engines — the Tokyo bookstore citation backbone. 
Two patterns stand out: **Reddit (104 cites) and Tokyo Weekender (67) are the editorial anchors**, and **whenin.tokyo (47) is the single biggest pure local-guide site** — exactly the kind of domain a naive taxonomy would dump into “other”.\n\nMost-cited domains for Tokyo bookstores across 4 AI engines.\n\n| Engines | Cites | Bucket | Domain |\n| --- | --- | --- | --- |\n| 4 | 31 | store\\_website | aoyamabc.jp |\n| 3 | 104 | social | reddit.com |\n| 3 | 67 | book\\_press\\_global | tokyoweekender.com |\n| 3 | 57 | store\\_website | store.kinokuniya.co.jp |\n| 3 | 57 | store\\_website | store.tsite.jp |\n| 3 | 47 | editorial\\_local | whenin.tokyo |\n\n**Reddit is huge here.** 104 cites — and it’s almost entirely a ChatGPT signal (the Reddit-led social bucket is 21% of ChatGPT’s Tokyo citations). On Copilot, Gemini and Google AI Mode, Reddit barely moves the needle. So “get talked about on Reddit” is genuine Tokyo bookstore advice — but _ChatGPT-specific_. For Gemini you need to be on whenin.tokyo or Tokyo Weekender; for Copilot, your own .jp domain has to be the canonical answer.\n\nSection 4\n\n## Language and TLD bias\n\nSame intent, different language, different stores. On the same JP proxy, English and Japanese prompts returned mixed top-5 overlaps — some neighborhood queries fully converge, but genre and “iconic Tokyo bookstore” queries diverge sharply.\n\nEN vs JA overlap per prompt template, ChatGPT on JP proxy. 
Higher = more language-agnostic.\n\n| Template | EN↔JA overlap (Jaccard) |\n| --- | --- |\n| dist\\_shimokita (Shimokitazawa) | 100% |\n| dist\\_aoyama | 67% |\n| late\\_night | 67% |\n| dist\\_daikanyama | 50% |\n| aesthetic | 50% |\n| english\\_speaking | 43% |\n\nThe pattern is consistent: **specific neighborhoods converge** (Shimokitazawa is the same place in both languages, and the same stores get named), while **generic “iconic Tokyo bookstore” queries split** — English prompts surface tourist-famous and English-language stores, Japanese prompts surface a different local set entirely.\n\n### .jp vs .com is a language-affinity signal — the sharpest we’ve measured\n\n.jp vs .com Tokyo-bookstore citation balance by prompt language (ChatGPT).\n\n| TLD | Store entities | EN cites | JA cites | JA/EN ratio | Reading |\n| --- | --- | --- | --- | --- | --- |\n| .jp | 8 | 3 | 15 | 5.00× | local-biased (sharp) |\n| .com | 2 | 2 | 3 | 1.50× | local-biased (mild) |\n\nOn ChatGPT, **.jp store domains are cited 5× more by Japanese prompts than English ones**. This is the strongest language-to-TLD coupling we have measured anywhere — Amsterdam bookstores ran neutral at ~1.0×, Paris yoga’s comparable .com/.fr split was 0.59×. Ask ChatGPT in Japanese and it leans hard into native .jp domains; ask in English and those same domains nearly vanish from its citations.\n\nFor a Japanese-market business, the TLD is doing real work. A .jp domain doesn’t just signal origin — on ChatGPT, it’s effectively a Japanese-audience routing rule. An English-only .com presence will under-earn citations from Japanese-language users even when you’re the right answer.\n\nSection 5\n\n## Geography: ChatGPT respects wards\n\nTokyo geography is the test case where our usual accuracy metric stops being honest, so it’s worth reading the table carefully. 
When we asked ChatGPT for stores in a specific area, the results split along a structural line — **ward-level prompts are accurate; neighborhood-level prompts look 0% only because of a labeling mismatch in the seed data**.\n\nChatGPT in-target accuracy by Tokyo prompt area. \\*Caveat below: seed labels stores by ward, not neighborhood.\n\n| Prompt target | Type | In-target accuracy |\n| --- | --- | --- |\n| Shibuya | ward | 100% |\n| Shinjuku | ward | 79–92% |\n| Jimbocho | neighborhood (in Chiyoda) | 0%\\* |\n| Aoyama | neighborhood (in Minato) | 0%\\* |\n| Daikanyama | neighborhood (in Shibuya) | 0%\\* |\n| Shimokitazawa | neighborhood (in Setagaya) | 0%\\* |\n\n**The 0% rows are a labeling caveat, not AI error.** Jimbocho, Aoyama, Daikanyama and Shimokitazawa are _neighborhoods inside_ wards (Chiyoda, Minato, Shibuya, Setagaya). The Apify seed stores the _official ward name_, so a perfectly correct “Jimbocho” result is recorded against “Chiyoda City” and fails the string match. Where the prompt target _is_ a ward name (Shibuya, Shinjuku), accuracy is 79–100%. The honest takeaway: **ChatGPT respects ward-level geography well; neighborhood-level precision is real but unmeasurable against ward-labeled seed data.** A different seed — indexed by neighborhood — would be needed to score the four neighborhood rows fairly.\n\n### The 584 Tokyo bookstores on the map\n\nThe full reference set every answer was resolved against — 584 Tokyo bookstores with websites, covering the 23 Special Wards plus Musashino and Mitaka. 
Density follows Chiyoda (Jimbocho), Shinjuku and Shibuya — the historic used-book core and the two largest commercial wards.\n\n![Geographic scatter map of 584 Tokyo bookstores across the 23 Special Wards plus Musashino and Mitaka, showing dense clusters in Chiyoda (Jimbocho), Shinjuku and Shibuya, with sparser distribution east and west.](/_next/image?url=%2Fresearch%2Fbookstores-tokyo-ai-search-2026%2Fbookstores_tokyo_geo_scatter.png&w=3840&q=75&dpl=dpl_H4iFpRn3vQ7W7vEpqXGjb44m5WjB)\n\n### Stores per ward — the supply side\n\nThe same distribution, counted by ward. Chiyoda (the Jimbocho cluster) and the two big commercial wards lead; the outer wards are comparatively thin.\n\n![Bar chart of Tokyo bookstores per ward, showing Chiyoda, Shinjuku and Shibuya leading the count and outer wards trailing.](/_next/image?url=%2Fresearch%2Fbookstores-tokyo-ai-search-2026%2Fbookstores_tokyo_by_district.png&w=3840&q=75&dpl=dpl_H4iFpRn3vQ7W7vEpqXGjb44m5WjB)\n\nSection 6\n\n## AI search behavior — and a data gap to flag\n\nTwo behavioural signals to report; one we measured cleanly, one we couldn’t.\n\n### Web-search trigger rate\n\n99%\n\n**ChatGPT triggered live web search on 95 of 96 captures** — one prompt answered from training data alone. As with Paris yoga (96%) and our hotel studies, Tokyo bookstore queries are almost entirely search-driven, not pre-baked. The visible answer is grounded in live retrieval practically every time.\n\n### Fan-out queries — a genuine gap\n\nn/a\n\nNot captured for any Tokyo platform this run. AI Mode, Copilot and Gemini don’t expose the trigger / fan-out fields through our scraper, and Perplexity (which carried single reformulated queries for Amsterdam bookstores) returned no usable Tokyo data. 
Fan-out behaviour is a real data gap here — we lean on the ChatGPT trigger rate, the source-mix breakdown and the TLD findings instead, and flag the absence rather than papering over it.\n\n**Two structural gaps to keep in mind reading this study.** Perplexity returned no usable Tokyo bookstore data this run, and fan-out behaviour wasn’t observable on any platform here. Everything on this page is over the 4 engines that did return data (ChatGPT, Copilot, Gemini, Google AI Mode), and we don’t extrapolate to the missing fifth.\n\nMethodology\n\n## Study design\n\n### Data collection\n\n-   22 prompt templates × 2 languages (EN/JA) × 2 proxy countries (US/JP) × 4 AI engines\n-   Engines with usable data: ChatGPT, Copilot, Gemini, Google AI Mode. Perplexity returned no usable Tokyo bookstore citations in this run and is excluded everywhere on this page.\n-   Captured 2026-05-23 → 2026-05-24 via Bright Data\n-   336 captures · 2,322 cited URLs · 858 distinct map entities\n-   For each answer we logged both the rendered text and every cited URL\n\n### What we measured\n\n-   Stores named per answer (chain-aggregated leaderboard + per-engine breadth)\n-   Cited URLs, bucketed into a Tokyo-specific source taxonomy\n-   Geographic accuracy vs the prompted ward (with the neighborhood caveat)\n-   EN/JA overlap per template; .jp vs .com citation balance by prompt language\n-   ChatGPT live-web-search trigger rate\n\n### The Tokyo bookstore reference set\n\nStores were resolved against a 1,310-row Apify Google Maps seed covering Tokyo’s 23 Special Wards plus Musashino and Mitaka, filtered to book / used-book / rare / kids / magazine stores. Comic shops, hobby shops, stationery shops and thrift shops were flagged `is_real = false` and excluded. After filtering for stores with verifiable websites the working universe is **584 Tokyo bookstores**. 
As in the yoga study, this is the universe an answer can resolve _to_, not a list we fed to the models.\n\n### How we turned answers into stores: the NER pipeline\n\nAI answers are free text — “You could try Kinokuniya in Shinjuku, or the Jimbocho used-book cluster…” — not a clean list of businesses. To count anything, we first had to extract the store mentions. We ran each answer (and each citation’s anchor text) through a **named-entity-recognition (NER)** pass that works in four steps:\n\n1.  **Span detection.** A transformer NER model tags candidate spans — business names and the location phrases attached to them (ward names, neighborhood names, station names, street addresses). A Tokyo-specific gazetteer (“Kinokuniya,” “Tsutaya,” “Daikanyama,” “Jimbocho,” “shoten,” “書店”) boosts recall on names the base model would otherwise miss, including kanji-script store names.\n2.  **Normalisation.** Each candidate is lower-cased, romanised where the script differs, and stripped of boilerplate — “Tokyo,” ward suffixes (“-ku,” “City”), “Co. Ltd.,” trademark glyphs and punctuation — so “Kinokuniya Bookstore Shinjuku Main Store” and “紀伊國屋書店” collapse toward the same key.\n3.  **Entity resolution.** Normalised mentions are matched to the 584-store registry using fuzzy string similarity plus a domain match when the answer cited the store’s own website. A mention only counts if it resolves above a confidence threshold; ambiguous or sub-threshold spans are dropped rather than guessed.\n4.  **Chain aggregation.** Resolved entities that belong to the same brand (Kinokuniya, Tsutaya, Book 1st…) are merged so a multi-location chain isn’t double-counted, while the per-location coordinates are retained for the maps.\n\nThe 584 stores are the universe an answer can resolve _to_; it is not a list we showed the models. 
Everything in the leaderboard is a store an engine surfaced on its own and that the NER pipeline could confidently identify.\n\n### Caveats\n\n-   **Perplexity excluded.** The Bright Data Perplexity scrape returned no usable Tokyo bookstore citations in this run. Every chart, table and percentage on this page is over the four engines that did return data. We do not extrapolate or impute Perplexity numbers.\n-   **Fan-out queries not captured.** AI Mode, Copilot and Gemini don’t expose query trigger / fan-out fields through the scraper. The Amsterdam dataset carried Perplexity-side fan-out; the Tokyo dataset has none. We flag this as a genuine data gap.\n-   **Neighborhood-vs-ward labeling.** The seed indexes stores by official ward, so neighborhood-prompt accuracy (Jimbocho, Aoyama, Daikanyama, Shimokitazawa) reads as 0% even when results are correct. Ward prompts (Shibuya, Shinjuku) are the honest accuracy figures.\n-   **Source taxonomy is approximate.** The Tokyo taxonomy pass moved roughly 21 percentage points out of “other” into editorial\\_local / expat\\_media; ~15% of cites remain in the residual bucket (long-tail Tokyo blogs, niche review sites).\n-   **Google AI Mode’s heavy reliance on google.com URLs** (61% of its citations) may inflate its citation count relative to other engines.\n-   **No author affiliation.** The author has no commercial relationship with any of the stores in this study.","author":{"@type":"Person","name":"Nicolas Sitter","url":"https://nicolassitter.com/about","sameAs":["https://www.linkedin.com/in/nicolassitternolleau/","https://github.com/Nicositter88","https://hotelrank.ai"]},"publisher":{"@type":"Person","name":"Nicolas 
Sitter","url":"https://nicolassitter.com"},"image":"https://nicolassitter.com/api/og/bookstores-tokyo-ai-search-2026","mainEntityOfPage":{"@type":"WebPage","@id":"https://nicolassitter.com/research/bookstores-tokyo-ai-search-2026"},"tags":["AI Search","Local Search","Bookstores","Tokyo","Gemini","Japanese"],"sameAs":["https://hotelrank.ai/research/bookstores-tokyo-ai-search-2026"],"alternateFormat":{"html":"https://nicolassitter.com/research/bookstores-tokyo-ai-search-2026","json":"https://nicolassitter.com/api/post/bookstores-tokyo-ai-search-2026","rss":"https://nicolassitter.com/rss.xml"},"datasets":[{"name":"summary","contentUrl":"https://nicolassitter.com/data/bookstores-tokyo-ai-search-2026/summary.csv","encodingFormat":"text/csv"}]}