Many people ask: How can we get our content used as a source for the answers that ChatGPT and similar LLM systems provide to conversational searches related to topics important to our business?
The answer is simple: Semantics and SEO.
I know, the answer is anticlimactic considering that the mainstream consensus is that optimizing for classic Search and earning visibility in LLMs are two very different things, so much so that we have seen the usual invention of new acronyms like GEO or LLMO. But please, bear with me.
A little case study.
The context.
Let me present to you Tratos, a client of mine:
It is a company in the industrial cable industry, a niche that relied (and still relies) heavily on traditional marketing, but that the COVID-19 pandemic forced to start paying attention to online marketing and SEO in order to maintain a constant flow of leads and potential new contracts.
Because of budget limitations, the strategy for this client relied more on redesigning the website, restructuring its architecture and creating content able to fill the gaps and demonstrate its expertise and experience in order to build trust and authority, and less on classic link building and digital PR.
However, thanks to classic marketing initiatives, the Tratos website had built a relevant and thematically consistent, albeit not very large, link profile.
SEO Strategy and tactics.
If you have had the kindness to read what I have published on this blog in the past, you know that for many years now I have been preaching the need to think of SEO as targeting the interests and needs of an audience all along the Search Journey and the Messy Middle, in and outside Google; hence, treating visibility as the most relevant metric and, on the contrary, ceasing to think in terms of isolated keyword rankings.
If you do not remember what I said, here are the links for diving into these concepts:
- Secrets Google hides in plain sight, and how to use them to assess our SEO strategy.
- Oh, My MUM. Or how to think SEO in the era of algorithms based on AI.
- Search Intent and SEO. A practical guide.
To sum up, I tend to develop my SEO strategy in these tactical steps:
- Defining the ontology domain of the client, aka what its business is about (industrial cables).
- Defining the entities related to the ontology domain, for example the industries it serves (e.g. mining, telecom, energy providers), its products and their components.
- Using Google as a tool and retrieving all the queries that Google associates with our entity seeds by scraping features like topic filters, People Also Ask, People Also Search For, Image Search tags, Google Suggest and many more.
- Performing a Named Entity Recognition analysis of all the queries retrieved from analyzing Google’s SERPs to further improve the entity search for the website.
- Determining a taxonomy based on entity search.
- Analyzing the targeted audience's search journeys (discovery, evaluation and decision-making).
- Clustering the phases of the search journeys through a carefully crafted internal linking strategy and implementation, to strengthen the so-called topical authority for all the ontology domains.
- Running embedding and cosine-vicinity analyses on the existing content (and, where useful, on competitors' content) to identify gaps (a minimal sketch of this step follows the list).
- Using the insights from the previous steps to update existing content and create new content, considering entity salience and clarity of language (monosemanticity) as key elements, along with subject-matter expertise.
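Before moving on, here is a minimal sketch of the gap analysis mentioned in the last two steps. It assumes a sentence-embedding library such as sentence-transformers; the model name, page texts, query list and the 0.5 threshold are illustrative placeholders, not a prescription.

```python
# Minimal gap-analysis sketch: embed our pages and our target queries, then
# flag queries that no existing page covers with sufficient cosine similarity.
from sentence_transformers import SentenceTransformer
from sklearn.metrics.pairwise import cosine_similarity

model = SentenceTransformer("all-MiniLM-L6-v2")

our_pages = {
    "medium-voltage-cables": "Overview of medium voltage cable construction and uses...",
    "fire-resistant-cables": "How fire resistant cables are tested and rated...",
}
target_queries = [
    "what is voltage rating",
    "what is tri rated cable",
    "cables for the mining industry",
]

page_vectors = model.encode(list(our_pages.values()))
query_vectors = model.encode(target_queries)

# For each query, find the closest existing page; a low score suggests a content gap.
similarities = cosine_similarity(query_vectors, page_vectors)
for query, row in zip(target_queries, similarities):
    best = row.argmax()
    label = "OK " if row[best] >= 0.5 else "GAP"
    print(f"{label} {query!r} -> {list(our_pages)[best]} ({row[best]:.2f})")
```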
Another point (it could be numbered as 3.1) is related to content formats.
In fact, when we analyze the nature of the SERPs, we can also identify the patterns Google follows in rendering them for the entities and queries important to our business.
Therefore, when we are in the last phase (updating and creating content), we can use those patterns to enrich the written content (with images, videos and/or specific formatting of the text) or to decide that a visual format is more relevant than a textual one.
The consequences.
Thanks to this methodology, strongly biased toward Semantics, the visibility of Tratos (as well as that of other clients of mine) has increased steadily and meaningfully, which means obtaining visibility not for irrelevant query sets but for those that matter for its business needs and objectives.
For instance, Tratos started competing relatively soon for queries like “[cable type]” and “[cable type] + manufacturer”, as well as for more generic informational query sets tied to topics and subtopics of the industrial cable industry such as “What is voltage rating” or “What is tri rated”, and not only with classic search results (including an increase in featured snippets) but also with visibility in People Also Ask, Things to Know, the Image Search box, Videos and even as branded People Also Search For.
Another consequence was that the semantically optimized content, clustered in topic hubs, started to be visible for an impressively high number of query variations that were not explicitly targeted.
Then, we started to see the website’s contents constantly cited and linked as sources for answers related to the industrial cable ontology domain’s topics and subtopics in LLMs like ChatGPT, Perplexity and Gemini, and in environments like AI Overviews.
Example of Tratos visibility in AI Overview.
Example of Tratos visibility in ChatGPT/SearchGPT.
Today this is not only still true, but it has become a steadily growing phenomenon.
Understanding the reasons behind a website’s visibility in LLMs.
So, how did a website like Tratosgroup.com become a source for LLMs’ answers to cable industry questions, if those systems had never been purposefully targeted with specific optimization actions?
Semantics, again, is the answer.
I will try to “reverse engineer” the reasons why a normal website like Tratosgroup.com is obtaining visibility on LLMs without – for instance – executing very costly digital PR or link building campaigns.
I will try to be as clear as possible and, for that, I will use the Socratic method to explain myself: questions and answers.
What is an LLM? A simple definition
A Large Language Model (LLM) is like a supercharged version of the autocomplete feature on our phone. Just like our phone predicts what we’re going to type next based on the words we’ve already typed, an LLM predicts and generates text based on the words and patterns it has learned from huge amounts of text (like books, websites, and articles).
An LLM chooses the words it uses based on probability; it predicts the next word by calculating which word is most likely to come next based on the patterns it has learned from all the text it was trained on.
An LLM does this at a very large scale and complexity.
- It breaks down the sentence into smaller parts (like words or even fragments of words).
- It looks at the context (the whole sentence or the whole conversation) to understand what we’re asking or saying.
- It calculates the probability of what the next word should be based on patterns it has learned from the billions of sentences it has seen before.
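To make the “supercharged autocomplete” idea tangible, here is a deliberately naive toy sketch in Python: it predicts the next word purely from bigram counts over a tiny corpus. Real LLMs work on subword tokens with neural networks, so treat this only as an illustration of “choose the most probable continuation”.

```python
# Toy next-word prediction by probability (bigram counts on a tiny corpus).
from collections import Counter, defaultdict

corpus = (
    "industrial cables carry power . industrial cables carry data . "
    "industrial cables need regular testing"
).split()

# Count which word follows which word in the corpus.
following = defaultdict(Counter)
for current_word, next_word in zip(corpus, corpus[1:]):
    following[current_word][next_word] += 1

def predict_next(word):
    counts = following[word]
    total = sum(counts.values())
    # Turn raw counts into probabilities and pick the most likely continuation.
    probabilities = {w: c / total for w, c in counts.items()}
    return max(probabilities, key=probabilities.get), probabilities

print(predict_next("carry"))   # e.g. ('power', {'power': 0.5, 'data': 0.5})
```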
Is Entity Search key for how an LLM understands the context of a question?
Yes, it is because it’s how LLMs figure out what we’re talking about, not just how to form a response.
An entity is any specific person, place, thing, or concept. Entities give context to language: without understanding which entity we’re referring to, an LLM can’t give an accurate answer.
How do LLMs use entity search?
When we ask a question, LLMs try to figure out what the key entities are and how they relate to each other.
For example, if we ask: “What’s the weather like in Valencia, Spain?”, then, the LLMs will:
- Recognize “Valencia” as an entity.
- Use context to figure out it means the city in Spain, not Valencia, California or Valencia oranges.
- Understand that “weather” is related to environmental conditions like temperature and precipitation.
It does this by analyzing patterns from all the data it’s trained on. If it has seen “Valencia” in weather-related contexts more often with “Spain” than other meanings, it assigns a higher probability to “Valencia” meaning the city.
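We can observe this entity recognition step ourselves with standard NLP tooling. A minimal sketch with spaCy, assuming the small English model en_core_web_sm is installed:

```python
# Named entity recognition on the example query: spaCy typically tags both
# "Valencia" and "Spain" as GPE (geopolitical) entities, which is the kind of
# signal a system uses to disambiguate the city from oranges or California.
import spacy

nlp = spacy.load("en_core_web_sm")
doc = nlp("What's the weather like in Valencia, Spain?")

for entity in doc.ents:
    print(entity.text, entity.label_)
```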
Is context important for entity disambiguation?
It is indeed. When there’s ambiguity, LLMs rely on the surrounding context to clarify it.
- In “Apple sales are down”, if the surrounding content says “sales of iPhones and MacBooks”, then “Apple” means the company.
- In “I picked an apple from the tree”, the fruit context tells the LLM that we mean “apple” literally.
The LLM doesn’t “know” this like a human does but it calculates based on patterns.
If 90% of the time “Apple” plus “sales” refers to the company, that’s what it will assume.
Entity search allows LLMs to move from simple pattern-matching to actual understanding — connecting the dots between people, places, and things to provide intelligent responses.
So, is SEO based on entity search essential for obtaining visibility both on classic search engines and LLMs?
Yes! And this is why.
Classic (and old) SEO focuses on keywords. Oversimplifying: placing the right words in the right spots so that search engines will match your page to a search query.
LLMs, and for some years now also search engines like Google, are instead moving beyond keyword matches; they’re trying to understand meaning by recognizing entities and the relationships between them.
For example, the old way of doing SEO would have tried to make a URL rank for “Best pizza NYC” by focusing on classic on-page SEO factors like keywords in titles, headings, and content.
On the contrary, Entity-based SEO will optimize that URL so that Google and LLMs will easily understand these things:
- Formatting of the content, e.g. using the review format or the listicle format.
- Entity salience, i.e. talking about “pizza” in relation to its ingredients, cooking time, types of pizza and so on.
- Offering clear “geo-localization” signals that make it unmistakable that the geographical context is New York City (e.g. addresses, phone numbers, places).
So it’s no longer about keywords alone but it’s about helping search engines and LLMs understand that our content is about pizza restaurants in NYC and why it’s trustworthy.
Why are entities central to LLMs and search visibility?
LLMs are trained to mimic the way search engines work by building a “knowledge graph” of entities and how they relate to each other.
A knowledge graph works like this:
- “Valencia” → City → In Spain → Known for paella and the Fallas festival.
- “Paella” → Spanish cuisine → Traditional dish → Made with rice, chicken, saffron and other ingredients.
- “Best” → High ratings → Reviews → User-generated content.
When an LLM generates an answer, it pulls from this type of structured understanding. Therefore, if our content is well-linked to these entities and fits into the knowledge graph, the LLM is more likely to surface it.
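As an illustration only (real knowledge graphs are vastly larger and richer), the relationships above can be pictured as simple subject-predicate-object triples:

```python
# A toy picture of knowledge-graph style entity relationships as triples.
triples = [
    ("Valencia", "is_a", "City"),
    ("Valencia", "located_in", "Spain"),
    ("Valencia", "known_for", "Paella"),
    ("Valencia", "known_for", "Fallas festival"),
    ("Paella", "is_a", "Traditional dish"),
    ("Paella", "made_with", "Rice"),
]

def neighbors(entity):
    # Everything directly connected to an entity in this tiny graph.
    return [(predicate, obj) for subject, predicate, obj in triples if subject == entity]

print(neighbors("Valencia"))
```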
How to optimize for entity-based SEO?
To make our content visible to LLMs and search engines, we need to structure it in a way that reinforces entity recognition and context.
- Using structured data (more about this here).
Schema helps define entities for search engines and LLMs; therefore, using the LocalBusiness structured data helps search engines (and indirectly LLMs) clearly understand the entities and relationships involved.
- Building topical authority (context matters).
We should not just write about “pizza.” Instead, we should cover related topics like:
- Best pizza ingredients.
- History of pizza in NYC.
- Differences between New York and Chicago pizza.
Creating a network of content (content hub) reinforces that our site is an authority on the “pizza” entity.
- Reinforcing entity relationships through internal linking.
A consequence of thinking in content hubs is internal linking optimization. We need to link to related pages to strengthen the connection between entities.
Example: A page about NYC pizza styles should link to our review of the top 10 pizza places in NYC.
This signals to search engines and LLMs that your content is part of a broader, meaningful structure.
- Using entity-rich language (aka natural language).
Instead of just saying “Pizza is popular,” we should write: “New York-style pizza is known for its thin crust and was introduced to the city by Italian immigrants in the early 20th century.”
Doing so, we’re embedding entities like New York, pizza, thin crust, and Italian immigrants — which makes the content easier to classify and retrieve.
- Optimizing for featured snippets and direct answers.
LLMs are often trained on structured and direct answers from search engines. So, we should format our content to match the way direct answers are given in search:
- Q&A format.
- Lists and bullet points.
- Clear, direct explanations.
This makes it easier for LLMs to extract and present our content in responses.
In other words, if our content tends to obtain featured snippets or be visible in the People Also Ask SERP feature (as it was for my client Tratos), then it has a good chance of being used by LLMs.
Why does this matter for LLM-generated answers?
When an LLM like ChatGPT or Google Gemini generates an answer, it’s essentially “crawling” a virtual version of the web based on entity relationships.
If our content is entity-optimized, then:
- LLMs can identify it as a trusted source.
- Our content is more likely to appear in AI-generated answers.
- It positions our site as part of the “knowledge graph”, which increases visibility in both search and AI-driven platforms.
If our content is keyword-stuffed without a clear entity structure:
- LLMs might not recognize the context.
- Our content could get overlooked or misinterpreted.
Now that we know that Entity Search is the framework on which we must base our SEO strategy, we can dive deeper into its specifics.
If Entity Search is our framework, does this mean that we should think about embeddings?
Yes!
Embeddings are the secret ingredient that allows LLMs and modern search engines to truly “understand” language and context at scale, especially in the context of entity search and content visibility.
What are embeddings?
An embedding is a way of turning words, phrases, and entities into mathematical representations (essentially, a set of numbers) that capture their meaning and relationships.
Instead of seeing words as isolated strings of letters, LLMs convert them into vectors, i.e. points in a multi-dimensional space where similar meanings are grouped closely together.
For instance,
- “Cat” and “dog” would have embeddings close together because they’re both animals.
- “Cat” and “lion” would be even closer because they’re both part of the feline family.
- “Cat” and “car” would be far apart because they’re not semantically related.
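If you want to see this with your own eyes, here is a rough demonstration with a general-purpose embedding model; the model name is an illustrative choice and exact scores vary by model.

```python
# Compare "cat" against related and unrelated words: the animal terms score
# noticeably higher than "car" in cosine similarity.
from sentence_transformers import SentenceTransformer
from sklearn.metrics.pairwise import cosine_similarity

model = SentenceTransformer("all-MiniLM-L6-v2")
words = ["cat", "dog", "lion", "car"]
vectors = model.encode(words)

similarities = cosine_similarity(vectors)
for i, word in enumerate(words[1:], start=1):
    print(f"cat ~ {word}: {similarities[0][i]:.2f}")
```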
How do embeddings help LLMs understand entities?
When LLMs process text, they don’t see words: they see embeddings.
Each word, entity, or phrase gets mapped to a vector.
Entities with similar meanings or relationships are positioned closer together in this high-dimensional space.
LLMs rely on the distances and relationships between embeddings to understand context and meaning.
Embeddings give LLMs the ability to handle ambiguity because they rely on the “closeness” of meanings rather than just keyword matching.
Why are embeddings critical for entity search?
Entity search relies heavily on embeddings because entities are not just words; they represent concepts and relationships.
Let’s take the entity “Amazon.”
Embeddings for “Amazon” as a river and “Amazon” as a company are close to very different sets of concepts:
- “Amazon” + “jungle” → river.
- “Amazon” + “Prime” → company.
The context of the query determines which embedding cluster the LLM selects from.
Entity embeddings capture more than just word similarity; they capture the entire concept.
- “Amazon” as a company is close to “e-commerce” and “logistics.”
- “Amazon” as a river is close to “South America” and “rainforest.”
Do embeddings help with context over long conversations (or search journeys)?
LLMs don’t just map individual words; they build embeddings for entire sentences and documents to preserve context.
For instance, we can ask:
“Who is the father of Luke Skywalker?” → Answer: “Darth Vader.”
Then we follow up with: “What’s his connection to Leia?”
The LLM understands that “his” refers to “Darth Vader” because the embeddings for “father” and “Luke Skywalker” remain active in the context.
Embeddings create a “memory” for the conversation, allowing the LLM to preserve context across a conversational search journey.
Embeddings give LLMs a sense of continuity because they know that entities remain linked across a conversation, which is critical for complex answers.
How do embeddings improve visibility in LLMs?
Since LLMs generate answers based on embeddings, our content’s visibility depends on how well its entities are embedded and connected.
To obtain this we must:
- Use entity-rich language.
The more relevant entities are included in your content, the better the embeddings become.
- Build semantic relationships.
As said before, we should link internally and externally to content that reinforces entity connections, because this helps LLMs map embeddings to a wider network of connected entities, hence increasing our authority in the knowledge graph.
- Optimize for entity-based structured data.
Schema markup tells search engines (and indirectly LLMs) exactly what an entity is.
Example: A product review for a pizza restaurant should explicitly define entities such as the restaurant itself, its location, and the cuisine it serves.
- Optimize for semantic search, not just keywords.
This means being focused on concepts and context.
LLMs use embeddings to group related concepts, so our content gets more visibility if it fits into a wider semantic map of related terms and entities.
How do embeddings power retrieval-augmented generation (RAG)?
Many modern LLMs (like ChatGPT) use a process called retrieval-augmented generation (RAG):
The model receives a query and:
- It searches a database of embeddings for the most relevant information.
- It pulls the most relevant pieces and generates an answer.
If our content is well-structured and entity-rich:
- The embeddings for our content will match the query embeddings.
- Our content becomes more likely to be retrieved and shown in the answer.
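Putting the pieces together, here is a minimal sketch of the retrieval step of a RAG pipeline. It is not a description of any specific vendor’s system: the model name, the toy documents and the top-k value are placeholders.

```python
# Retrieval step of a RAG pipeline: embed documents, embed the query, rank by
# cosine similarity, and pass the best passages to the generative model.
from sentence_transformers import SentenceTransformer
from sklearn.metrics.pairwise import cosine_similarity

model = SentenceTransformer("all-MiniLM-L6-v2")

documents = [
    "Tri-rated cable is a flexible single-core cable approved to three standards.",
    "Voltage rating indicates the maximum voltage a cable can safely carry.",
    "Our company history, locations and certifications.",
]
document_vectors = model.encode(documents)

query = "what does tri rated cable mean"
query_vector = model.encode([query])

# Rank all documents by similarity to the query and keep the two best ones.
scores = cosine_similarity(query_vector, document_vectors)[0]
top_k = scores.argsort()[::-1][:2]
context = "\n".join(documents[i] for i in top_k)

prompt = f"Answer using only this context:\n{context}\n\nQuestion: {query}"
print(prompt)  # this prompt would then be sent to the generative model
```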
So, does “embedding optimization” mean that our content will be surfaced in AI-generated answers?
Yes!
- Entities define what our content is about.
- Embeddings define how it fits into the wider context of knowledge.
- Search engines and LLMs now rely on embeddings (not just keywords) to surface content.
Therefore, we should follow this strategy:
- Optimize for entities, so search engines and LLMs can identify our content.
- Optimize for embeddings, so LLMs can retrieve and generate answers from our content.
- Build semantic connections, so our content is part of the broader knowledge graph.
So, we now know that Entity Search is central to modern SEO and that embeddings are the framework that gives it meaning and structure; therefore, we must understand why cosine similarity is the mechanism that determines how closely and accurately content matches a query and, ultimately, how visible it becomes in search and LLM-generated answers.
How does cosine similarity relate to embeddings and entity search?
When an LLM receives a query, it calculates the cosine similarity between:
- The embedding of the query.
- The embeddings of entities and content that are stored in its training data or knowledge base.
For instance, if we search for “Best paints for Warhammer minis“, the LLM calculates the cosine similarity between:
- The query embedding.
- Embeddings for:
- Mini painting techniques.
- Reviews of different paint brands.
- Discussions about Warhammer mini painting in online forums.
The higher the cosine similarity, the closer the embeddings are, which makes it more likely that the LLM will select that content to generate a response.
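The calculation itself is straightforward: cosine similarity is the dot product of two vectors divided by the product of their lengths, i.e. how closely the two vectors point in the same direction. A minimal sketch with toy three-dimensional vectors (real embeddings have hundreds of dimensions):

```python
# Cosine similarity between a query embedding and two pieces of content.
import numpy as np

def cosine_similarity(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

query_embedding = np.array([0.9, 0.1, 0.3])
painting_guide = np.array([0.8, 0.2, 0.4])   # semantically close content
cooking_recipe = np.array([0.1, 0.9, 0.2])   # unrelated content

print(cosine_similarity(query_embedding, painting_guide))  # ~0.98, highly relevant
print(cosine_similarity(query_embedding, cooking_recipe))  # ~0.27, not relevant
```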
Is cosine vicinity the same as relevance score?
When we search for something in an LLM or search engine, cosine similarity essentially acts as a relevance score:
- High cosine similarity → High relevance → More likely to appear in the answer.
- Low cosine similarity → Low relevance → Less likely to be selected.
If the cosine similarity between the query and the content embeddings is high, the LLM interprets the content as highly relevant, therefore increasing the chances it will be retrieved and included in the generated response.
Is cosine vicinity also key to entity disambiguation?
Yes, it is.
Cosine similarity helps LLMs determine which version of an entity we are referring to based on context.
For example, if we search for the best paints for minis, the LLM might calculate high cosine similarity with:
- Paints for Warhammer 40K minis.
- Paint brands like Citadel or Vallejo.
- Techniques for shading and layering.
If our content is well-optimized for entities and embeddings, it will sit in the right cluster, hence (again) increasing the chance that LLMs will select it when generating an answer.
What is the relation between cosine similarity and entity co-occurrence?
One of the most powerful aspects of cosine similarity is that it allows LLMs to understand related entities even if they aren’t mentioned directly.
For instance, take a query like: What’s the best way to paint Space Marines?
An LLM might retrieve content that mentions:
- “Primaris Marines”
- “Citadel paints”
- “Edge highlighting”
Even if the exact phrase painting Space Marines isn’t present, the LLM understands that the embedding for “Space Marines” is close in cosine similarity to “Primaris Marines” and “Citadel paints”, and it also understands that edge highlighting is a mini-painting technique.
This means that our content can rank or appear in LLM answers even if it doesn’t match the query exactly, as long as the embeddings are close in cosine space.
How to increase cosine similarity for better visibility?
- Strengthening entity clusters.
We must write about related concepts to strengthen embedding relationships.
This increases the “density” (I know, I don’t like this word either) of the embedding cluster and raises the overall cosine similarity between your content and related queries.
- Using natural language (plus structured data to label it).
- Creating content that reflects high-similarity clusters.
We must analyze the top-ranking content in our niche and structure our content similarly (i.e. using similar entity relationships), so that it sits within the same cosine vicinity as that top-ranking content.
- Including semantic variations.
Synonyms and variations increase cosine similarity by increasing the number of embeddings linked to our content.
All these embeddings will sit close together, increasing your visibility for any variation of the query.
Cosine similarity in retrieval-augmented generation (RAG).
To put it schematically, we can say that:
- In RAG-based LLMs, a query is converted into an embedding.
- The LLM calculates cosine similarity between the query and stored content embeddings.
- The most similar embeddings (highest cosine similarity) are retrieved.
- The LLM uses this content to generate an answer.
If our content is not close enough in cosine similarity, it won’t be retrieved even if it’s technically relevant.
So if:
- Entity Search is central in modern SEO.
- Embeddings are the framework that gives it meaning and structure.
- Cosine similarity is the mechanism that determines how closely and accurately content matches a query.
Then, we must go even deeper and understand why Entity Salience is the signal that tells search engines and LLMs which entities matter most — shaping relevance, context, and retrieval accuracy.
Why does Entity Salience matter for Search and LLMs?
If we write an article titled The Best Warhammer 40K Space Marine Minis for Competitive Play and mention Space Marines in the title, intro, and key headings, while also referencing related terms like Games Workshop, Primaris Marines, and 40K tournaments, then the entity Space Marines becomes highly salient.
Search engines and LLMs will recognize that Space Marines is the core focus because it appears consistently in important structural elements and is supported by semantically related terms.
High entity salience improves cosine similarity between our content and queries about Space Marines, increasing the chances that it will rank in search and be selected for an LLM-generated answer.
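For illustration only, here is a crude heuristic that makes the idea of salience concrete: count the mentions of an entity and weight those in the title and headings more heavily than those in the body. Real salience models used by search engines and NLP APIs are far more sophisticated; the weights below are arbitrary.

```python
# Toy entity-salience heuristic: weighted mention counts by structural position.
def salience_score(entity, title, headings, body):
    entity = entity.lower()
    score = 3.0 * title.lower().count(entity)                       # title weighs most
    score += 2.0 * sum(h.lower().count(entity) for h in headings)   # headings next
    score += 1.0 * body.lower().count(entity)                       # body mentions
    return score

title = "The Best Warhammer 40K Space Marine Minis for Competitive Play"
headings = ["Why Space Marines dominate the meta", "Painting your Space Marines"]
body = "Space Marines remain the backbone of competitive 40K lists..."

print(salience_score("space marine", title, headings, body))
```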
How to improve Entity Salience?
To boost salience, we must structure the content strategically.
We should start the article with a strong opening like: Space Marines are the backbone of any competitive Warhammer 40K army. Primaris Marines, in particular, have become essential in the current meta due to their versatility and durability.
Then, we should include detailed descriptions, like: Games Workshop recently updated the Space Marines kit with new weapon options, making them even more competitive in 40K tournaments.
We should also reinforce the entity by using related terms like Games Workshop, Primaris, and 40K meta throughout the article, and link internally to other relevant Warhammer 40K content.
How does Salience translate into visibility?
If the entity “Space Marines” is highly salient in the article, search engines and LLMs will establish strong semantic links between “Space Marines”, “Warhammer 40K”, and “competitive play.”
This raises the cosine similarity between our content and queries about Warhammer 40K Space Marines.
If a user asks an LLM What are the best Space Marines for competitive 40K?, the system will recognize that our content is a strong semantic match, thus increasing the chance that it will be retrieved and cited in the response.
We now go even deeper. Having started from Entity Search and then descended into embeddings, cosine similarity and, at the level of a single piece of content, entity salience, we must consider one more element: monosemanticity, because it ensures that the intended meaning of an entity is clearly defined and unambiguous, increasing the precision of embeddings and raising cosine similarity, which improves both search ranking and LLM retrieval accuracy.
What is Monosemanticity and why is it important?
Monosemanticity refers to the idea that a word, phrase, or concept has a single, unambiguous meaning in a given context.
A “monosemantic” word or concept is easy for an LLM to process because it only has one valid interpretation within the context of a query or sentence.
Here are some examples:
- Frodo is the bearer of the One Ring. → “Frodo” is monosemantic in this case because it unequivocally refers to the character Frodo Baggins from The Lord of the Rings.
- Mount Doom is where the One Ring was forged. → “Mount Doom” has one clear identity as a geographical entity in the context of The Lord of the Rings.
In both cases, the cosine similarity between embeddings would be very high because the relationship is direct and unambiguous.
On the contrary, these are examples of polysemic terms (i.e. terms with more than one possible meaning):
- Apple → Could mean the fruit or the company.
- Amazon → Could refer to the company, the river, or the region.
In these cases, the cosine similarity will depend on context.
If the context is vague, the cosine similarity will be low between the query and the correct concept.
Therefore, a stronger context improves monosemanticity, which increases the cosine similarity between the terms used in the content.
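We can see context resolving polysemy with a quick embedding test; the model name and exact scores are illustrative. The same word “apple” moves toward the intended meaning once the sentence provides context.

```python
# Each sentence should score higher against the meaning its context implies.
from sentence_transformers import SentenceTransformer
from sklearn.metrics.pairwise import cosine_similarity

model = SentenceTransformer("all-MiniLM-L6-v2")

sentences = [
    "Apple announced record iPhone and MacBook sales this quarter.",
    "I picked a ripe apple from the tree in the orchard.",
]
references = ["technology company", "edible fruit"]

sentence_vectors = model.encode(sentences)
reference_vectors = model.encode(references)

similarities = cosine_similarity(sentence_vectors, reference_vectors)
for sentence, row in zip(sentences, similarities):
    print(sentence, "->", dict(zip(references, row.round(2))))
```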
How can we optimize content for cosine similarity and monosemanticity?
To make our content more visible to LLMs, we need to reduce ambiguity and strengthen entity-specific context.
In other words, we must guide the LLM toward a monosemantic interpretation.
How?
By doing these things:
- Strengthening entity embedding clusters.
We should mention related terms and entities close together.
For example, instead of writing Shatterpoint is a fun game, it’s better to write: Star Wars: Shatterpoint by Atomic Mass Games features dynamic skirmish battles between iconic characters like Anakin Skywalker and Darth Maul.
The additional entities (Star Wars, Atomic Mass Games, Anakin Skywalker, Darth Maul) clarify the context, which increases the cosine similarity by reinforcing the semantic relationship between the game and the Star Wars universe.
- Using Schema markup to disambiguate.
We should define entities explicitly using schema.
Classic examples of this are using the Organization type, or the “brand” and “sameAs” properties, when marking up a product.
- Increasing context “density”.
More context means a higher chance of resolving ambiguity.
For instance, instead of saying: The Wheel of Time is popular, we should write: The Amazon TV series The Wheel of Time, based on the novels by Robert Jordan, has become popular thanks to Rosamund Pike’s portrayal of Moiraine and its quite faithful adaptation of the source material.
Stronger context translates into higher cosine similarity, hence greater monosemanticity.
- Strengthening internal linking to reinforce monosemanticity.
Link to related content to establish strong entity relationships.
For instance, a post about mini-painting techniques should link to:
- A guide on edge highlighting for Warhammer minis.
- A comparison of Citadel vs. Vallejo paints.
- A tutorial on shading and layering techniques.
Reinforcing entity relationships increases the embedding cluster’s strength, raising cosine similarity.
- Using synonyms and variations thoughtfully.
Can semantics be applied to multimodal visibility optimization?
When people talk of optimizing content for LLMs, they tend to forget the multimodal nature of the latest models (in the case of Google, consider AI Mode).
The good news is that what has been said before about entity search, embeddings, cosine similarity, entity salience and monosemanticity applies completely to multimodal responses.
How does Cosine Similarity work in multimodal systems?
Cosine similarity still operates at the core of a multimodal model, but it now works across different types of embeddings.
For instance, we can ask: “Show me a picture of the Eiffel Tower at sunset.”
- Text input → Embedding: the text is converted into a vector embedding.
- Image retrieval. The system looks for an image embedding that has high cosine similarity with the text embedding for “Eiffel Tower at sunset.”
- Output. The image with the highest cosine similarity is selected and presented.
Cosine similarity, then, allows the LLM to link different modalities (text/image) based on their shared position in the embedding space.
Multimodality is essential for LLMs to improve accuracy and relevance through:
- Combining multiple types of information to clarify context.
- Resolving ambiguity by cross-referencing between different modalities.
A good example can be a query like “Explain how to paint a Warhammer Space Marine”:
- High cosine similarity between the query and step-by-step text instructions.
- But adding an image or video increases the cosine similarity between the query and the response because:
- Text embeddings provide conceptual clarity.
- Image embeddings provide procedural clarity.
In other words, the multimodal output increases cosine similarity and improves answer quality.
How does cross-modal cosine similarity work?
Multimodal models map different types of inputs into a shared embedding space using a common embedding architecture.
Example:
- “Dog” (text) → Embedded at (2, 3, 5).
- Picture of a dog → Embedded at (2.1, 3.1, 5.0).
- Sound of a dog barking → Embedded at (2.2, 3.0, 5.1).
All three embeddings cluster closely together because they represent the same underlying concept, even though they come from different modalities.
Cosine similarity, then, increases because the embeddings share a similar position in the shared space.
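Models like CLIP make this shared embedding space very concrete: text and images are encoded into the same vector space, so cosine similarity can be measured across modalities. A minimal sketch, assuming sentence-transformers with a CLIP checkpoint; the image file path is a placeholder.

```python
# Cross-modal similarity: the caption that describes the image should score highest.
from PIL import Image
from sentence_transformers import SentenceTransformer
from sklearn.metrics.pairwise import cosine_similarity

model = SentenceTransformer("clip-ViT-B-32")

image_vector = model.encode(Image.open("eiffel_tower_sunset.jpg"))  # placeholder file
text_vectors = model.encode([
    "the Eiffel Tower at sunset",
    "a bowl of paella",
])

print(cosine_similarity([image_vector], text_vectors))
```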
To conclude, we should think multimodal because:
- It improves monosemanticity.
Text alone might leave ambiguity unresolved; by using an image or an audio clip we can provide immediate context and reinforce entity identification.
- It increases cosine similarity through context reinforcement.
- A query like “Explain how to paint a Space Marine” might have high similarity with text instructions.
- Adding a video or image increases cosine similarity by reinforcing the same concept across multiple modalities.
- It enhances user experience.
- Multimodal content is more natural and engaging.
- People process information more effectively when they receive visual + verbal + textual inputs simultaneously.
- It matches how humans process information.
- Humans naturally combine sensory input when processing language.
- Multimodal LLMs (and search engines) are mimicking this human cognitive process.
How can we optimize for multimodal?
Good news! We don’t have to do anything differently from what we have always done (well, many SEOs have always neglected the next tasks, considering them time-consuming and not impactful):
- Doing image and visual search SEO.
- Using schema markup for images, videos and audio.
- Enriching our content with multimedia.
Returning to the case study of Tratos.
Now that we “know” all these things about Semantic SEO, can we say that applying an SEO strategy meant to make a website visible all along the search journey and across the multi-format search features (e.g. video box, local pack or image search) also meant, in practice, laying the foundation for that website to be optimized for LLM visibility?
Yes, because that way of doing SEO is aligned with the way LLMs structure their knowledge.
By covering the entire search journey, Tratos created a dense, highly relevant embedding cluster, which translated into higher cosine similarity, which in turn increased its chances of being retrieved accurately by LLMs.
Then, the decision to use videos and images both for usability and for visibility in media SERP features improved what we can call cross-modal cosine similarity, which translates into even bigger chances of being used as a source by LLMs.
But, Gianluca, aren’t you forgetting the importance of brand mentions for visibility in LLMs?
There is a current stream of opinion according to which brand mentions on other websites that are used as sources by LLMs are the most important way, if not the only one, to earn visibility for a website.
While I do not deny that brand mentions are valuable, I suspect that many think the mechanism behind why they matter is still link-based authority.
It is not. It is semantic authority and entity relevance.
In short: you do not necessarily need to have your brand mentioned in, for instance, a listicle on another website that links back to yours.
Brand mentions now matter not because they increase “authority” directly but because they strengthen the position of the brand as an entity within the broader semantic network.
When a brand is mentioned across multiple (trusted) sources:
- The entity embedding for the brand becomes stronger.
- The brand becomes more tightly connected to related entities.
- The cosine similarity between the brand and related concepts increases.
- The LLM “learns” that this brand is relevant and authoritative within that topic space.
So… before starting to apply a crappier version of link building to try to earn brand visibility in LLMs, please understand that brand mentions only matter when they strengthen semantic relationships, and consider this:
- Low-value mentions:
- Random mentions with no contextual relationship.
- Mentions on low-authority or spam websites.
- Mentions without entity reinforcement.
- High-value mentions:
- Contextually relevant mentions.
- Mentions alongside related entities.
- Mentions within authoritative and topically relevant content.
- Mentions that reinforce existing entity relationships.
Brand mentions matter because they reinforce semantic relationships, and not because they “increase authority” directly.
Why I insist on considering structured data important.
Structured data is now often dismissed in the context of large language models (LLMs) because LLMs don’t directly process structured data.
In my opinion, this is a weird way of thinking because structured data still plays a crucial indirect role in improving search visibility and the quality of LLM-generated answers.
Search engines like Google crawl and process structured data to understand entities, their relationships, and context. This information is then incorporated into Knowledge Graphs, which LLMs use as part of their training data.
For example, if a page is marked with structured data linking “Games Workshop” to “Warhammer 40k” and “Space Marines,” the Knowledge Graph reinforces those connections.
LLMs trained in this structured data environment will generate more accurate answers because the embeddings for these terms are more tightly linked.
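To make this concrete, here is an illustrative JSON-LD snippet (generated from Python just for readability) along the lines of that example; the names, URLs and property choices are placeholders to adapt to the real page, not a guaranteed recipe.

```python
# Illustrative Product markup tying the manufacturer entity to the product line.
import json

structured_data = {
    "@context": "https://schema.org",
    "@type": "Product",
    "name": "Space Marines Tactical Squad",
    "description": "Plastic Space Marine miniatures for Warhammer 40,000.",
    "brand": {"@type": "Brand", "name": "Warhammer 40,000"},
    "manufacturer": {
        "@type": "Organization",
        "name": "Games Workshop",
        "sameAs": "https://en.wikipedia.org/wiki/Games_Workshop",
    },
}

# The string you would place inside a <script type="application/ld+json"> tag.
print(json.dumps(structured_data, indent=2))
```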
Structured data also improves Knowledge Graph accuracy, which directly impacts LLM performance.
Structured data, then, helps resolve polysemy (words with multiple meanings) by explicitly defining the intended context.
Additionally, RAG systems also benefit from structured data, as it improves semantic similarity and the accuracy of content retrieval during LLM responses.
Finally, structured data powers AI Overviews and similar search features directly influenced by LLMs.
Product and business information marked with schema can appear in AI Overviews, and LLMs trained on these enhanced data points will improve retrieval accuracy for related queries.
In short, while LLMs don’t read structured data directly, they rely on search engines and Knowledge Graphs shaped by structured data, which is a key factor in improving search visibility and LLM performance.
[UPDATE – MARCH 28, 2025]
A few days ago, Dan Petrovic (also known as DejanSEO on X) made an important comment that I think is worth adding to this guide.
“LLMs use attention mechanisms* and not cosine similarity to weigh the importance of different parts of the input when generating each output token.
It still uses vector representations, but the relationship is more complex.
Cosine similarity measures fixed distances between vectors, while attention mechanisms use learned weights that determine how each word should focus on other words in context.
These relationships are learned during training, not calculated by a fixed formula. Essentially, it is a sophisticated contextual weighting system.
Each word decides how much to pay attention to every other word based on learned patterns, not just semantic similarity.
Another interesting distinction is that cosine similarity gives you a single similarity score, while attention mechanisms capture multiple types of relationships simultaneously through different attention heads (some might focus on grammatical relationships, others on topical connections).
Essentially, attention allows models to grasp structure and hierarchy in content, not just flat relationships.
So this actually supports your main argument in the article more strongly than cosine similarity.”
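To make Dan’s distinction a bit more concrete, here is a minimal numpy sketch of scaled dot-product attention; the token embeddings and projection matrices are random placeholders where a trained model would have learned weights.

```python
# Scaled dot-product attention in miniature: every token attends to every other
# token with learned (here: random placeholder) weights, not a fixed similarity.
import numpy as np

np.random.seed(0)
tokens = ["Space", "Marines", "are", "durable"]
d = 8                                    # toy embedding size
X = np.random.rand(len(tokens), d)       # placeholder token embeddings

# In a real model these projection matrices are learned during training.
W_q, W_k, W_v = (np.random.rand(d, d) for _ in range(3))
Q, K, V = X @ W_q, X @ W_k, X @ W_v

scores = Q @ K.T / np.sqrt(d)            # how strongly each token relates to each other token
weights = np.exp(scores) / np.exp(scores).sum(axis=1, keepdims=True)  # row-wise softmax
output = weights @ V                     # context-aware representation of each token

print(weights.round(2))                  # one row of attention weights per token
```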
Follow Dan on X and LinkedIn.