Schema Markup & Structured Data for AI Search Inclusion

28 November 2025

Keith Nallawalla

Having worked in SEO since 2010, I’ve watched Google’s search results page evolve from ten blue links into something that increasingly resembles an AI-powered knowledge repository. We’re now at a point where ChatGPT, Perplexity, Google’s AI Overviews, and various other LLMs are scraping the web and presenting information in ways that make traditional SEO look quaint by comparison.

I’ve seen a lot of SEO budgets get wasted over the years. Companies are still pouring money into outdated tactics whilst ignoring the fact that AI search engines now answer queries without users ever clicking through to their websites. If you’re not implementing structured data properly in 2025, you’re essentially invisible to the AI systems that increasingly control how information gets discovered and presented online.

What Actually Is Schema Markup and Why Should You Care?

Schema markup is structured data vocabulary that you add to your website’s HTML to help search engines and AI systems understand what your content actually means, not just what it says. It’s like the difference between someone reading your menu and actually understanding that “Margherita” is a type of pizza with specific ingredients, not just a random word on a page.

The vocabulary comes from Schema.org, which was launched in 2011 by Google, Microsoft, and Yahoo, with Yandex joining shortly afterwards, back when they all still cared about working together on web standards. Schema.org markup can be written in JSON-LD, Microdata, or RDFa, and it covers everything from products and reviews to events, recipes, articles, and local businesses.

Think of it this way: when you write “Open Monday-Friday 9am-5pm” on your website, a human understands you’re describing business hours. An AI system without structured data might interpret it as random text. With proper schema markup, you’re explicitly telling the AI “this is an OpeningHoursSpecification object with specific days and times” in a language it can actually process and use.
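To make the opening hours example concrete, here is a minimal sketch in Python that builds the equivalent structured data and prints it as JSON-LD. The business name is a hypothetical placeholder; the type and property names come from the Schema.org vocabulary.

```python
import json

# Hypothetical business used purely for illustration.
opening_hours = {
    "@context": "https://schema.org",
    "@type": "LocalBusiness",
    "name": "Example Cafe",
    "openingHoursSpecification": [
        {
            "@type": "OpeningHoursSpecification",
            # "Open Monday-Friday 9am-5pm" expressed as explicit days and 24-hour times
            "dayOfWeek": ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday"],
            "opens": "09:00",
            "closes": "17:00",
        }
    ],
}

print(json.dumps(opening_hours, indent=2))
```

The point is that the days and times stop being a sentence to interpret and become named fields a machine can read directly.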

How LLMs and AI Overviews Actually Work

Large Language Models like GPT-4, Claude, and Gemini were trained on massive datasets scraped from the internet. When you ask ChatGPT a question, it’s not searching the web in real-time (unless it’s using a tool to do so). It’s generating responses based on patterns it learned during training.

Google’s AI Overviews, however, are different. They’re actively pulling information from indexed web pages to generate those AI-written summaries you see at the top of search results. This is where structured data becomes absolutely critical. When your content has proper schema markup, AI systems can extract precise, factual information with confidence about what that information represents.

Without schema markup, an AI might see your pricing table and make assumptions. With schema markup, it knows exactly what currency you’re using, what the price includes, and what conditions apply. That’s the difference between being cited accurately in an AI Overview and being ignored entirely.
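As a rough illustration of what "knowing the currency" looks like in practice, the sketch below builds a Product with an Offer in Python. The product name, SKU, and price are made-up placeholders, and build_offer is a hypothetical helper, not part of any library.

```python
import json

def build_offer(price: str, currency: str, in_stock: bool) -> dict:
    """Return a schema.org Offer with an explicit currency and availability."""
    availability = "InStock" if in_stock else "OutOfStock"
    return {
        "@type": "Offer",
        "price": price,              # numeric value only, no currency symbol
        "priceCurrency": currency,   # ISO 4217 code such as "AUD"
        "availability": f"https://schema.org/{availability}",
    }

# Placeholder product for illustration.
product = {
    "@context": "https://schema.org",
    "@type": "Product",
    "name": "Example Widget",
    "sku": "EX-001",
    "offers": build_offer("49.95", "AUD", in_stock=True),
}

print(json.dumps(product, indent=2))
```

Keeping the amount and the currency in separate fields is what removes the guesswork: nothing has to infer whether "$49.95" means Australian or US dollars.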

The Schema Types That Actually Matter for AI

Not all schema types are created equal when it comes to feeding AI systems. Based on what I’ve seen working with clients and analysing which types of markup actually get surfaced in AI responses, here’s what you should prioritise:

Organization Schema: This tells AI systems who you are, what you do, and how to contact you. Include your legal name, trading names, logo, social profiles, and contact details. This becomes your identity in the AI’s knowledge graph.

Article & NewsArticle Schema: For any content marketing or blog posts, this markup helps AI systems understand authorship, publication dates, and article structure. Include headline, datePublished, dateModified, author details, and publisher information.

Product & Offer Schema: If you sell anything, this is non-negotiable. Include name, description, price, currency, availability, SKU, brand, and review aggregates. AI shopping assistants are increasingly relying on this data to make product recommendations.

FAQPage Schema: This is increasingly important as AI systems love to pull from FAQ sections. Each question-answer pair becomes a discrete piece of information that AI can cite or use to construct responses.

HowTo Schema: For instructional content, this breaks down your process into clear steps that AI can understand and potentially reproduce in its own responses. Include estimated time, required tools, and supply lists.

LocalBusiness Schema: For any business with a physical location, this is essential for appearing in location-based AI responses. Include opening hours, service areas, accepted payment methods, and price ranges.

Review & AggregateRating Schema: AI systems use this to assess credibility and quality. Include review counts, rating values, and ideally link to actual review sources.
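Two of the types above, FAQPage and Article, can be sketched as small Python builders. This is an illustration under assumptions: faq_page and article are hypothetical helper names, and the questions, names, and dates are placeholders.

```python
import json

def faq_page(pairs):
    """Build FAQPage markup from (question, answer) tuples."""
    return {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in pairs
        ],
    }

def article(headline, author_name, published, modified, publisher_name):
    """Build Article markup with authorship and ISO 8601 dates."""
    return {
        "@context": "https://schema.org",
        "@type": "Article",
        "headline": headline,
        "datePublished": published,
        "dateModified": modified,
        "author": {"@type": "Person", "name": author_name},
        "publisher": {"@type": "Organization", "name": publisher_name},
    }

# Placeholder content for illustration.
faq = faq_page([
    ("Do you offer weekend appointments?", "Yes, on Saturdays from 9am to 1pm."),
    ("Which areas do you service?", "Metropolitan Melbourne."),
])
post = article("Example headline", "Jane Example", "2025-01-15", "2025-02-01", "Example Publisher")

print(json.dumps([faq, post], indent=2))
```

Each question-answer pair ends up as its own Question object, which is what makes it a discrete, citable unit.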

Implementation: JSON-LD vs Microdata

I’ve implemented schema markup using various formats over the years, and there’s really only one format worth using in 2025: JSON-LD.

JSON-LD (JavaScript Object Notation for Linked Data) sits in a script tag with the type application/ld+json in your page’s head or body. It’s clean, it’s separate from your visible HTML, and it’s what Google explicitly recommends. More importantly, it’s what AI systems can parse most easily.

Here’s a basic example for a local business:
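What follows is an illustrative sketch rather than markup from a real site: the business details are hypothetical placeholders, and to_script_tag is an assumed helper name. It generates the JSON-LD in Python and wraps it in the script tag you would paste into a page.

```python
import json

# Placeholder business details for illustration only.
local_business = {
    "@context": "https://schema.org",
    "@type": "LocalBusiness",
    "name": "Example Plumbing Co",
    "url": "https://www.example.com",
    "telephone": "+61-3-5550-0100",
    "address": {
        "@type": "PostalAddress",
        "streetAddress": "1 Example Street",
        "addressLocality": "Melbourne",
        "addressRegion": "VIC",
        "postalCode": "3000",
        "addressCountry": "AU",
    },
    "openingHours": "Mo-Fr 09:00-17:00",  # compact text form of the hours
    "priceRange": "$$",
}

def to_script_tag(data: dict) -> str:
    """Serialise a dict as JSON-LD inside a script tag."""
    # Escape "</" so a stray "</script>" inside a value cannot close the tag early;
    # "\/" is a valid JSON escape for "/", so the data is unchanged when parsed.
    payload = json.dumps(data, indent=2).replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{payload}\n</script>'

print(to_script_tag(local_business))
```

Generating the tag from one data structure, instead of hand-editing JSON in templates, also makes it easier to keep the markup in step with what the page actually shows.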

Microdata embeds the markup directly into your HTML using attributes like itemscope and itemprop. It’s messier, it’s harder to maintain, and whilst it still works, there’s no compelling reason to use it over JSON-LD.

Common Schema Mistakes That Make Your Data Useless

I’ve audited hundreds of websites over the years, and the same schema markup mistakes keep appearing. These errors don’t just annoy Google’s testing tools – they make your structured data completely useless for AI systems.

Marking up invisible content: If users can’t see it on your page, don’t mark it up. Adding review schema for reviews that don’t appear on the page will get you penalised and ignored by AI systems.

Incorrect data types: If you’re marking up a price, put the numeric value in the price property of an Offer and the ISO 4217 currency code (such as AUD) in priceCurrency. Don’t just throw “$49.95” into a text field and hope for the best. AI systems are literal and will interpret incorrect types as garbage data.

Incomplete markup: Adding Organization schema but forgetting to include your logo URL or contact details makes the markup largely pointless. AI systems want complete entity information.

Multiple conflicting schemas: Having three different Organization schemas on your homepage with different information confuses everyone. Pick one source of truth and maintain it properly.

Not updating dynamic content: If your opening hours, prices, or availability change, your schema needs to update too. Stale structured data is worse than no structured data because it feeds AI systems incorrect information that they may repeat with confidence.

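Ignoring required properties: Schema.org itself doesn’t mandate properties, but Google’s rich result documentation lists required and recommended properties for each type it supports. Miss a required one and the page is ineligible for that rich result, and the markup is far less useful to anything else reading it. Use Google’s Rich Results Test or Schema Markup Validator regularly.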
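Several of these mistakes can be caught before deployment with a small check. The sketch below is a minimal illustration, not a replacement for the official tools: lint is a hypothetical function, the EXPECTED property sets are assumptions chosen for the example, and it assumes each block has a single string @type.

```python
from collections import Counter

# Assumed minimum property sets for illustration; check the official
# documentation for the real requirements of each type.
EXPECTED = {
    "Organization": {"name", "url", "logo"},
    "Offer": {"price", "priceCurrency"},
    "Article": {"headline", "datePublished", "author"},
}

def lint(blocks: list[dict]) -> list[str]:
    """Return human-readable problems found in a page's JSON-LD blocks."""
    problems = []

    # Multiple conflicting schemas: more than one Organization block on a page.
    org_count = Counter(b.get("@type") for b in blocks)["Organization"]
    if org_count > 1:
        problems.append(f"{org_count} Organization blocks found; keep one source of truth")

    for block in blocks:
        schema_type = block.get("@type")

        # Incomplete markup: expected properties that are absent.
        missing = EXPECTED.get(schema_type, set()) - set(block)
        for prop in sorted(missing):
            problems.append(f"{schema_type}: missing {prop}")

        # Incorrect data types: a price string carrying symbols or text.
        price = block.get("price")
        if isinstance(price, str) and not price.replace(".", "", 1).isdigit():
            problems.append(f"{schema_type}: price {price!r} should be a plain number")

    return problems
```

Running something like this in a build or CI step means broken markup is flagged when it is introduced, not months later in an audit.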

Testing and Validation

You cannot implement schema markup without testing it. I’ve seen too many websites with broken markup sitting in production for months because nobody bothered to validate it.

Use these tools:

Google’s Rich Results Test – This shows you if Google can read your markup and if you’re eligible for rich results. It’s the baseline test.

Schema Markup Validator – The validator hosted at validator.schema.org, which checks your markup’s syntax and vocabulary against Schema.org without applying Google-specific rich result rules.

Google Search Console – This shows you which pages have markup errors or warnings. Check it regularly.
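Before reaching for those tools, a quick local check can confirm that every JSON-LD block on a page at least parses as JSON. This is a minimal sketch using only the Python standard library; JsonLdExtractor is a hypothetical class name, and it checks syntax only, not whether the vocabulary is used correctly.

```python
import json
from html.parser import HTMLParser

class JsonLdExtractor(HTMLParser):
    """Collect and parse every <script type="application/ld+json"> element."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.blocks = []   # successfully parsed JSON-LD objects
        self.errors = []   # messages for blocks that are not valid JSON

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            try:
                self.blocks.append(json.loads("".join(self._buffer)))
            except json.JSONDecodeError as exc:
                self.errors.append(str(exc))

# Example page: one valid block, one with a trailing comma, one ordinary script.
page = """<html><head>
<script type="application/ld+json">{"@context": "https://schema.org", "@type": "Organization", "name": "Example"}</script>
<script type="application/ld+json">{"@type": "Offer", "price": "10.00",}</script>
<script>var x = 1;</script>
</head></html>"""

extractor = JsonLdExtractor()
extractor.feed(page)
extractor.close()
print(len(extractor.blocks), "valid block(s),", len(extractor.errors), "error(s)")
```

A trailing comma like the one in the second block is a typical way for markup to break silently in production, and it is exactly the kind of thing a parse check catches.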

The reality is that even valid schema markup doesn’t guarantee you’ll appear in AI Overviews or get cited by LLMs. But invalid or missing schema markup almost guarantees you won’t be. It’s like showing up to a job interview without a resume – technically possible to get hired, but you’re making it unnecessarily difficult.

AI-Specific Considerations

Here’s what most SEO guides won’t tell you: AI systems aren’t just looking for valid schema markup. They’re looking for comprehensive entity information that helps them build knowledge graphs.

Entity disambiguation: If you’re “Smith & Co”, make sure your schema clearly identifies which Smith & Co you are. Use sameAs properties to link to your Wikipedia page, Wikidata entry, and authoritative social profiles.

Relationship mapping: Use schema properties like founder, employee, member, and parentOrganization to show relationships between entities. AI systems use this to understand organisational structures and expert attribution.

Temporal information: Always include dates: publication dates, modification dates, event dates, validity periods for offers. AI systems need to understand when information was current.

Geographic specificity: Don’t just say “Melbourne”. Use proper address schema with postal codes, coordinates if possible, and service area definitions. Location-based AI queries need precise geographic data.

Citation and provenance: Use citation and isBasedOn properties to show where your information comes from. AI systems that cite sources will prefer content that itself cites sources.
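Pulling several of these considerations together, the sketch below shows an Organization with sameAs links and relationships, plus an Article that carries dates and cites its sources. Every name, URL, and identifier here is a placeholder invented for illustration, including the Wikidata ID.

```python
import json

ORG_ID = "https://www.example.com/#organization"  # placeholder identifier

organization = {
    "@context": "https://schema.org",
    "@type": "Organization",
    "@id": ORG_ID,
    "name": "Smith & Co",
    "url": "https://www.example.com",
    # Entity disambiguation: point at profiles that identify this exact entity.
    "sameAs": [
        "https://www.wikidata.org/wiki/Q00000000",   # placeholder ID
        "https://www.linkedin.com/company/example",  # placeholder profile
    ],
    # Relationship mapping.
    "founder": {"@type": "Person", "name": "Jane Smith"},
    "parentOrganization": {"@type": "Organization", "name": "Example Holdings"},
    # Geographic specificity.
    "address": {
        "@type": "PostalAddress",
        "addressLocality": "Melbourne",
        "addressRegion": "VIC",
        "postalCode": "3000",
        "addressCountry": "AU",
    },
    "areaServed": "Victoria, Australia",
}

article = {
    "@context": "https://schema.org",
    "@type": "Article",
    "headline": "Example headline",
    # Temporal information.
    "datePublished": "2025-01-15",
    "dateModified": "2025-02-01",
    # Link back to the organisation above by its @id instead of repeating it.
    "publisher": {"@id": ORG_ID},
    # Citation and provenance.
    "citation": "https://www.example.org/source-report",
    "isBasedOn": "https://www.example.org/original-dataset",
}

print(json.dumps([organization, article], indent=2))
```

Referencing the organisation by @id keeps a single definition of the entity, which also avoids the conflicting-schemas problem described earlier.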

Looking Forward

We’re at the beginning of something that fundamentally changes how information gets discovered and consumed online. Traditional SEO focused on ranking for keywords. The future is about becoming a trusted entity that AI systems cite and reference.

Schema markup is your way of speaking directly to these AI systems in their native language. It’s the difference between being a website that humans read and machines guess about, versus being a structured knowledge source that machines can confidently use and cite.

I’ve been in SEO long enough to see tactics come and go. Yellow Pages advertising, exact-match domains, keyword density, and private blog networks – all had their moment and faded. Schema markup isn’t a tactic. It’s the foundational language of the semantic web. If you’re not implementing it properly, you’re not just behind on SEO. You’re actively becoming invisible to the AI systems that increasingly mediate how people access information.

The businesses that win in this new AI-mediated search environment won’t be the ones with the most backlinks or the highest domain authority. They’ll be the ones whose structured data clearly tells AI systems exactly who they are, what they do, and why they should be cited as authoritative sources.

Get your schema markup sorted. Test it properly. Keep it updated. Your visibility in AI search depends on it.