# Arahi AI - Robots.txt
# https://arahi.ai/robots.txt
# Updated: 2026-02-18

# Default: Allow all legitimate crawlers
User-agent: *
Allow: /
# Block Next.js internal paths only
Disallow: /api/
Disallow: /_next/
Disallow: /admin/
Disallow: /private/

# Sitemaps
Sitemap: https://arahi.ai/sitemap.xml

# LLM Content Policy
# See https://arahi.ai/llms.txt for structured content for AI systems

# ─── AI Search Engine Crawlers ───

# OpenAI (ChatGPT, GPT search)
User-agent: GPTBot
Allow: /

User-agent: ChatGPT-User
Allow: /

User-agent: OAI-SearchBot
Allow: /

# Anthropic (Claude)
User-agent: Claude-Web
Allow: /

User-agent: ClaudeBot
Allow: /

User-agent: anthropic-ai
Allow: /

# Google (Gemini, AI Overviews, SGE)
User-agent: Google-Extended
Allow: /

User-agent: Googlebot
Allow: /

# Perplexity AI
User-agent: PerplexityBot
Allow: /

# Microsoft (Copilot, Bing AI)
User-agent: Bingbot
Allow: /

User-agent: BingPreview
Allow: /

# You.com
User-agent: YouBot
Allow: /

# Brave Search (powers Claude's web search)
User-agent: Brave
Allow: /

# Meta AI
User-agent: FacebookBot
Allow: /

User-agent: meta-externalagent
Allow: /

# Apple (Siri, Spotlight)
User-agent: Applebot
Allow: /

# Cohere
User-agent: cohere-ai
Allow: /

# Common Crawl (used by many AI training pipelines)
User-agent: CCBot
Allow: /

# ─── SEO Tools - Block aggressive crawlers ───

User-agent: AhrefsBot
Disallow: /

User-agent: SemrushBot
Disallow: /

User-agent: MJ12bot
Disallow: /

User-agent: DotBot
Disallow: /

User-agent: BLEXBot
Disallow: /

User-agent: DataForSeoBot
Disallow: /