Navigating Agentic AI in the World of SEO

Feb 13, 2025

27 min read

Join our amazing host, Gianluca Fiorelli, on the first episode of The Search Session as he chats with Andrea Volpini, CEO of WordLift and a true visionary in the world of semantic search. They explore the fascinating evolution of AI-powered SEO, the importance of structured data, and how AI agents are changing the search landscape. 

Andrea shares the story behind WordLift, discusses the critical need for SEOntology, and explains the innovative concept of llms.txt. This episode is packed with valuable insights on optimizing your strategy for AI and what the future of search might hold. And, as always, they wrap things up with a lighthearted Proust Questionnaire!

Enjoy!

Andrea Volpini

CEO and Co-founder of WordLift

Andrea has been transforming how we understand and leverage web content and is one of the most vocal SEO professionals in educating the industry about semantic search and the rise of AI.

Resources

Your next client isn't human

Why do we need an ontology?

Is structured data still relevant?

How can we influence a language model?

Transcript

Gianluca Fiorelli: Welcome to the first episode of The Search Session, the podcast where we will explore the ever-changing world of SEO. I'm your host, Gianluca Fiorelli, an international and strategic SEO consultant, and I'm thrilled to have you join us on this journey. In each episode we'll feature conversations with industry leaders, and we will explore how companies can address the challenges posed by the evolving search landscape.

Our goal is simple. We want to equip in-house SEOs, agency professionals and consultants with the essential insights and practical strategies and tactics needed to stay ahead of the curve. So get ready, learn, explore, and transform the way you think about search with us. And remember to subscribe to our YouTube channel and click on the bell to receive updates on new episodes.

Without further ado, let's start the first episode of The Search Session.

Our very first guest is Andrea Volpini, CEO of WordLift. Andrea has been transforming how we understand and leverage web content and is one of the most vocal SEO professionals in educating the industry about semantic search and the rise of AI.

In this episode, we will explore all these topics. And let's see where the conversation is going to bring us.

Welcome, Andrea, and happy to have you as the first guest of The Search Session.

Andrea Volpini: Super honored to be the first guest. I’m happy to discuss with you what the future holds for us.

The Origins of WordLift

Gianluca Fiorelli: Let's talk a little bit about you and WordLift. How did you and your partners come up with this quite prophetic idea? I've known you for many years, and for all that time you've been talking about semantics and everything related to what is now AI.

How did you come up with this idea and this vision for WordLift?

Andrea Volpini: Yeah. Let me explain it in a few steps. I started a long time ago, around 1995, when I first approached the web and began building for it. My first companies were about content management systems and creating platforms to help people publish content online. As we progressed, I started to feel the challenge of organizing and architecting information—information architecture and accessibility became core topics. At that time, in the early 2000s, I started to work with the Italian parliament, so we had to deal with a lot of regulations and laws, and the web was still a fairly new channel for citizens to interact with. Open data started later on as a movement to enable people to access and compute information.

And so, at that time I started to do research on the semantic web and how it could be applied to content management systems. This is really how it started—very far from SEO, more grounded in information architecture and publishing systems.

Gianluca Fiorelli: Cool. And from there, you made a long journey and landed in the SEO world.

Andrea Volpini: Yeah, and to be honest, it was unexpected. Like many startups, you don’t always have a perfect market fit at the beginning. WordLift actually started as a research project around 2014–2015, and then it became a company in 2017.

We initially focused on enabling people to publish data because we understood the importance of feeding autonomous agents. But let me take a step back.

Twenty-five years ago, Tim Berners-Lee—the guy who invented the web—along with a couple of other pioneers, started thinking about how to make information easily processed by autonomous agents. This led to a famous and widely debated article in Scientific American, introducing the concept of the Semantic Web. The Semantic Web was all about making information accessible to machines, structuring content in a machine-readable way, and envisioning the role of autonomous agents.

For many years, the focus was primarily on structuring content and adding metadata because we didn’t yet have truly capable autonomous agents. But in the last two or three years, with the rise of large language models in natural language processing, we’re finally starting to see what autonomous agents can be. And it’s fascinating to explore how we can blend structured data with AI-driven agency.

That’s really been my journey—realizing at some point that search engines needed this structured data to train their systems. So, what started as an editor for helping people create linked open data evolved into an SEO tool. And even now, I feel like our approach still sits at the edge of the SEO industry.

There’s a lot of cross-pollination from different industries in the work we do. More and more, we’re being approached by product teams rather than just SEO teams because they want to integrate our technology into their platforms and systems. So yeah, we’re in a really interesting place right now, I think.

Agentic Search: The New SEO Challenge

Gianluca Fiorelli: Cool. So, let's say WordLift got on its feet around 2015, maybe 2017. It feels like just a few years ago, but in reality, a lot has changed since then.

Andrea Volpini: Oh, yeah.

Gianluca Fiorelli: Yeah. For instance, in 2017, nobody was thinking about AI—aside from what we were just discussing. Okay, Google started using machine learning to refine its algorithms, which was probably the closest thing to AI in search at the time. But now, in just the past three years, we've seen an explosion of AI advancements—LLMs, then LLMs combined with RAG, and now agentic search. These concepts have quickly become mainstream, not just among visionaries or strategic thinkers but across industries, even down to small businesses and local companies that are now seeing AI integrated into everything they do.

The latest buzzword making waves is "agentic search" or "AI agents." It’s becoming a hot topic, and you’ve been quite vocal about it, especially through testing and experimentation. You once said that as professionals, we need to rethink who we’re really communicating with. Sure, our ultimate audience is still our potential customers or users, but now there’s a new layer—the AI agent. These systems act as intermediaries, filtering and delivering our content, services, and brand messaging to users.

This isn’t entirely different from how we used to approach SEO. In the past, we always emphasized speaking to the audience, which made sense. But we were really doing that through a medium—Google. We relied on search data and Google's suggestions to understand what users were looking for. Now, we face a similar shift: we're still talking to an audience, but there’s an extra step—we’re communicating through AI agents.

Given this, what do you think are the biggest challenges for companies and SEO professionals like us? How should businesses adapt to stay visible and effectively reach their users in this new AI-driven landscape?

Andrea Volpini: We have to look at different layers, right? And we also need to dive into the SEO know-how behind how information extraction works. This remains foundational.

There are a few key elements to keep in mind. First, information delivery must be as fast and efficient as possible. Why? Because AI crawlers, especially at inference time, aim to retrieve responses quickly. They might pull from an existing index, like those provided by Microsoft Bing or Google, or they might actively discover new information in real-time to generate a response. So, speed is crucial.

If speed was already a major factor for human users—given all the stats about decreasing attention spans and lower click-through rates when pages take more than three seconds to load—it’s even more critical for AI-driven search engines or agent systems retrieving information on behalf of users.

Beyond speed, there are technical factors to consider, such as a website’s crawlability. It’s important to provide clear directives about how content can be used—whether for inference, training, or both. This is where bot management and robots.txt settings come in, allowing us to control what parts of our content are accessible.
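
As a sketch of what those directives can look like in practice, here is a minimal robots.txt that treats training crawlers differently from inference-time crawlers. The user-agent tokens below are ones the vendors currently document; the policy itself is only an example, not a recommendation:

  # Block crawlers that collect training data
  User-agent: GPTBot
  Disallow: /

  User-agent: Google-Extended
  Disallow: /

  # Allow crawlers that retrieve pages at inference time, on behalf of users
  User-agent: OAI-SearchBot
  Allow: /

  User-agent: PerplexityBot
  Allow: /

  # Everyone else
  User-agent: *
  Allow: /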

While AI crawlers share some similarities with traditional web crawlers, they also differ because their data pipelines are often less sophisticated—unless they’re backed by major commercial search engines.

Lastly, structure matters. How we organize information plays a key role in helping AI agents retrieve data effectively. Well-structured content makes a significant difference in how efficiently AI systems can access and interpret information.

Introducing llms.txt

Gianluca Fiorelli: Let's go back to crawlability for a bit. Is this where the new kid on the block, llms.txt, comes in?

Andrea Volpini: Yeah.

Gianluca Fiorelli: It's a way to better guide AI bots—essentially LLMs—on what to crawl, store, and use as a source. It also helps companies, both large enterprises and small businesses, protect the information they don't want to be public or unintentionally included in AI-generated responses.

Andrea Volpini: Yes, llms.txt is an emerging standard, but it hasn't been formalized in a W3C specification yet. It's still an evolving protocol designed to help AI systems consume content at inference time. However, it's not used to indicate what models should be trained on. Instead, llms.txt provides a brief overview of the content available on your site.

Originally, it was developed as a compact way to describe content on technical documentation websites. It started within the ecosystem of tools like FastAPI and other Python-centric technologies that wanted to make their technical documentation more accessible. But over time, its usage has expanded.

An llms.txt differs from a traditional sitemap because it doesn’t just list URLs from a single website—it can include URLs from both your main site and your documentation site. So, a single llms.txt file can provide a structured list of URLs spanning multiple domains. These URLs are formatted using Markdown, a widely used text formatting system. Since Markdown is already used in training language models, this new standard makes it easier for AI to consume content efficiently.

An llms.txt file should be placed in the root folder of your website, and it is now crawled regularly. For instance, GPTBot (among others) is already adopting it. Companies like Perplexity and Anthropic are also using llms.txt for their documentation and admin-facing sites, and its adoption continues to grow.
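
For illustration, a minimal llms.txt following the draft format might look like this—an H1 with the site name, a short block-quoted summary, and Markdown link lists grouped under H2 sections (all names and URLs here are placeholders):

  # Example Company
  > Example Company builds AI-powered SEO tooling. This file lists the pages most useful to language models at inference time.

  ## Documentation
  - [Getting started](https://docs.example.com/start.md): setup in five minutes
  - [API reference](https://docs.example.com/api.md): endpoints and parameters

  ## Blog
  - [Why structured data matters](https://example.com/blog/structured-data.md)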

To help people get started, I’ve developed a free tool that makes it easy to create an llms.txt file. It’s available for anyone who wants to experiment with this new standard.

The Need for SEOntology

Gianluca Fiorelli: The need for standards is always a challenge in emerging fields. I hesitate to call SEO an “industry” because it’s more of a revolution within the industry, constantly evolving with new elements. The real issue is how to establish and agree upon standards in this landscape.

Standardization has always been a chronic issue in SEO. Each tool operates differently—one might categorize features one way, another might present them as a long list, and no two tools seem to name the same feature in the same way. This inconsistency creates friction when connecting proprietary data from sources like Search Console or analytics with third-party tools. The process of aligning these different sets of information often requires significant effort.

Now, let’s get into an interesting topic you brought up in the industry not too long ago—just last year, in fact: the concept of SEOntology. The idea behind SEOntology is to create a structured framework that standardizes definitions across the SEO domain. Essentially, it aims to bring clarity and consistency to the language we use in SEO.

It’s funny because "it depends" is a classic meme of SEO. But how can SEOntology move beyond niche discussions and gain mainstream adoption? My main concern with standardization is that if it's only embraced by a small group, it never truly becomes a standard. I’d love to hear your thoughts—how can we push SEOntology into wider acceptance?

Andrea Volpini: I've always been a strong advocate for standardization, especially given the work we've done building on the foundations of W3C and later Schema.org. Schema.org is a standard that helps search engines understand the meaning of web pages, and it became widely adopted as soon as there was commercial interest from the major search engines at the time—Google, Yandex, Bing. They sat around the table to agree on a common vocabulary that could describe web pages and their content.

SEOntology is designed as an extension of Schema.org, specifically focused on concepts relevant to SEO. This is becoming increasingly important as agentic AI plays a bigger role in the field. In 2023, we introduced the first SEO agent—an AI that, for example, can generate authorship markup for your site or use graph-based retrieval-augmented generation (RAG) to help refine and expand existing content.

For these AI systems to function effectively—and ideally interoperate—we need a standardized way of exchanging data between different agentic workflows. This is similar to how Schema.org set a foundation for SEO tools to share structured data back in 2011. One of the core goals of SEOntology is to provide this interoperability, and we've been making good progress. Hopefully, in the next few months, we'll have more updates for the community on the next version of SEOntology.

That said, I haven’t focused much on interoperability just yet because we're still refining the model through testing. Everything in SEOntology is designed to be used by an AI agent. Why? Because language models are inherently stateless—they're trained once, their weights are frozen, and they generate responses based on that fixed knowledge. If we want to evolve an AI system built around a language model, we need to give it memory.

This ties into my interest in the "Thousand Brains Theory," which suggests that memory works through multiple cortical columns, each representing different perspectives on the same object. For example, if you're holding a glass, different columns process information from touch, sound, and other sensory inputs, then collectively "vote" to form a unified understanding of the object.

When I started working on SEOntology, the main goal was to standardize SEO data across different tools—since everyone in SEO tends to use their own terminology for the same concepts. But the bigger vision is to create a memory layer for agentic SEO systems. This allows an AI to recognize trends in search data, understand what "E-E-A-T" means in a given context, or determine the best approach to increasing topical authority across a content cluster.

Think of SEOntology as the language that enables AI agents to build memory for SEO decision-making. That’s the big picture.

Gianluca Fiorelli: Okay, so yeah, this is definitely a cool idea. And also, because—how do I put it?—it's essentially about memories, what they're made of. I agree with the definition of things in this context. It’s like the fundamental building block of everything we do. That part isn’t so hard to define. What’s tricky is figuring out how to interpret that definition in different contexts.

And this is where agents—SEO and intelligent agents—can help us bridge the gap between different interpretations. They can assist in finding common ground between various ways of understanding something.

Oh, and another thing—your really long post on Search Engine Journal, where you talked about ontology—I really liked it.

So, we're talking about agents, but there's another key player here: us. Humans. We're not just handing over 100% of the work to the agent. As professionals, we're not just guiding—we're validating, approving, or even shaping the agent's output. In a way, we collaborate with it, steering it through this chain of thought within SEO, marketing, and digital marketing in general.

Andrea Volpini: This is a crucial aspect of symbolic AI, which is what I’m really trying to focus on. And that’s reassuring, right? It highlights the role of humans in designing AI systems—especially in defining the language the system needs to use and agreeing on a set of definitions.

This foundational layer is essential for representing knowledge. For example, if we talk about a "primary keyword" for a webpage, we first need to agree on what a primary keyword actually is. It has to be defined in a way that ensures any agentic system understands and applies the same definition consistently.
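
That shared definition can be written down formally so every agent inherits it. The snippet below is an illustrative sketch in Turtle—not the published SEOntology vocabulary; the namespace and property name are placeholders—showing how a "primary keyword" could be modeled as an extension of Schema.org:

  @prefix rdf:    <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
  @prefix rdfs:   <http://www.w3.org/2000/01/rdf-schema#> .
  @prefix schema: <https://schema.org/> .
  @prefix seo:    <https://example.org/seontology/> .

  seo:primaryKeyword a rdf:Property ;
      rdfs:label "primary keyword" ;
      rdfs:comment "The single query a web page is primarily optimized to answer." ;
      schema:domainIncludes schema:WebPage ;
      schema:rangeIncludes schema:Text .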

Then, of course, there’s the next layer: once we’ve defined all the elements, how we implement them makes a difference. I want to preserve these distinctions while adding a layer of interoperability to the core definitions. For example, we use SEOntology to build dynamic internal links on large sites. This helps us provide agents with a memory layer that the workflow operates on, while also giving us control over that workflow. Since we’ve defined the terms, we can also determine how the workflow should adapt over time.

This is often very customer-specific because we may need to consider business metrics alongside SEO metrics. More broadly, when you look at symbolic AI—or neuro-symbolic AI—it's really about blending the semantic web with autonomous agents (or even deep learning). The goal is to keep human oversight in place: we need to control what should be agentic and what shouldn't.

For instance, an organization may have internal rules dictating which terms can or cannot be used. We can’t let the system operate entirely on its own, because if it's trained on vast amounts of external data, it might break these rules. To ensure reliable outputs, we need to blend both symbolic AI and modern deep learning models—the very models that created today’s language systems.

So, that’s also the idea behind SEOntology.

Structured Data in the AI Era

Gianluca Fiorelli: Shifting the topic a bit, one of the key milestones—and one of the most important aspects—of semantic web search has always been structured data. We also talk about ontology as, for instance, an extension of structured data. However, with the rapid advancements in natural language processing, as well as video and image recognition, more and more people are starting to claim that structured data is becoming less useful.

Let’s blame Google for that. Structured data has traditionally been seen as a way to obtain richer search results and improve CTR on search result pages. But I remember reading a fantastic article by Jono Alderson a few months ago, arguing that structured data might now be more valuable as a labeling system rather than just for search enhancements.

Maybe this has always been the true role of structured data—acting as a foundation for labeling content, defining the structure of a website, and clarifying the nature of a page. And now, with the rise of LLM-based systems, structured data might be more useful than ever.

What do you think? What are your thoughts on this shift?

Andrea Volpini: I've been deeply involved in structured data for a long time. In fact, I built a company around automating structured data—one of our core focus areas from the start. I've always advocated for creating structured data for your own use, which sounded almost futuristic years ago.

Because, why would you use structured data for yourself? Traditionally, structured data was designed for search engines—to help them understand the meaning of a webpage. So, what internal value can a website owner gain from their own structured data? At first, this idea seemed abstract, but coming from the linked data movement, I always saw its broader potential.

Search engines invested in structured data primarily to train their systems, improve their algorithms, and receive fresh, structured information at scale. Fast-forward to 2021, when transformer architectures became available in the open-source world. I remember the excitement of using the first GPT models and the impact that Google’s T5 had on the industry.

At that moment, my intuition—that we needed to consume structured data more effectively—was no longer just a hunch. It became something we could prove. That realization led me to work more on graph retrieval-augmented generation (RAG) and graph-based content generation. The idea was simple:

  1. You have structured data.

  2. You use a language model to generate unstructured content.

  3. You validate the generated content against your structured data in a knowledge graph.
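
A minimal sketch of that loop in Python (the llm_generate stub stands in for a real model call; the graph facts are invented for the example):

  # Sketch: generate text from graph facts, then validate the draft against them.
  product_graph = {
      "name": "Acme Trail Shoe",
      "price": "89.00",
      "priceCurrency": "EUR",
  }

  def llm_generate(facts: dict) -> str:
      """Placeholder for a real LLM call (an API request in practice)."""
      return (f"The {facts['name']} costs {facts['price']} "
              f"{facts['priceCurrency']} and ships in two days.")

  draft = llm_generate(product_graph)              # unstructured content
  missing = [k for k, v in product_graph.items() if v not in draft]
  if missing:                                      # every graph fact must survive
      raise ValueError(f"Draft omits or contradicts graph facts: {missing}")
  print(draft)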

Finally, structured data had a practical, large-scale application. We deployed this approach across hundreds of websites—generating content at scale and validating it using the very same structured data originally created for search engines.

So, is structured data still relevant in 2025?

Absolutely—more than ever. However, search engines can’t rely solely on structured data to understand webpages because adoption remains inconsistent. We failed to establish a baseline for structured data across the web. Some sites have rich, well-structured data that helps search engines interpret their content, while others have none at all.

Because of this inconsistency, it is hard to think of structured data as a ranking signal. It has never been a ranking signal. But it plays a crucial role in search. While working on the structured data chapter for the Web Almanac, I collaborated with industry leaders, including Google experts, and it became clear that structured data is key to reliably activating certain search features—especially for large e-commerce sites.

For example, I once questioned whether structured data was necessary if a site already had a merchant feed. After all, if we build a product knowledge graph, we can either output the data as a merchant feed or add structured data to webpages. But what I learned from Google’s own engineers is that merchant feeds aren’t always up-to-date. A webpage, on the other hand, reflects the latest information, making structured data a more reliable source.

So, structured data remains incredibly valuable for search engines—even if it can’t be the sole mechanism for understanding content. Great content still matters, and some major sites, like Amazon, thrive without structured data by leveraging their own product knowledge graphs to create rich product descriptions that rank well.

Structured data isn’t going anywhere—it’s just evolving in how it’s used.

Gianluca Fiorelli: Yeah. So, let's say this—the structure is built into the design. When you define the structure of data, it creates consistency in the design. That consistency is what search engines use to generate everything from product feeds to classic PDP pages, search results, and merchant feeds.

Having a well-designed structure makes this process much easier—just look at Amazon. Their structure has been highly standardized for years, and interestingly, it hasn't changed much over time. That's because it was so well-structured from the start.

Andrea Volpini: Yeah. Yeah.

Gianluca Fiorelli: It’s almost as if search engines are using design as a guiding principle. Take Amazon’s PDP (Product Detail Page) design—it’s become the gold standard. When other websites don’t follow that structure, we have to see if structured data can still provide the same level of organized information that Amazon and other major e-commerce players manage to deliver, even without structured data.

So really, structuring content comes first. Before even thinking about structured data, LLMs, or anything else, it’s a design-thinking approach at its core.

Andrea Volpini: Yeah, absolutely. As we mentioned at the beginning—what does it take for a crawler or an AI crawler to understand the content on your site? First, you need crawlability and speed, along with various other technical factors. But most importantly, you need structure.

This structure can take different forms—it might be defined by schema, the quality of your llms.txt, or the depth of information in your merchant feed. But at its core, structure is essential.

Beyond just helping crawlers, structure also plays a key role in creating better, more personalized content for your users, clients, or even an algorithmic audience. The quality of your structure is largely independent of how it’s visually represented. That said, interoperability is crucial.

For example, it wouldn’t make sense to describe a product’s price without using a schema attribute. LLMs and crawlers rely on standardized vocabulary to interpret pricing correctly. That’s why structure comes first—then, we ensure it’s represented using interoperable standards. This approach increases visibility and engagement with your end users.
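
The pricing case maps directly to Schema.org's Offer type. A minimal JSON-LD block (values are placeholders) looks like this:

  {
    "@context": "https://schema.org",
    "@type": "Product",
    "name": "Acme Trail Shoe",
    "offers": {
      "@type": "Offer",
      "price": "89.00",
      "priceCurrency": "EUR",
      "availability": "https://schema.org/InStock"
    }
  }

Because price and priceCurrency are standardized attributes, any crawler or language model that knows the vocabulary can interpret them without guessing at page layout.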

Monosemanticity & Clarity in Content

Gianluca Fiorelli: One of the things I’ve always appreciated about structure comes from my own past. Back when I was working—literally a century ago!—I was in the audiovisual industry in Italy. One of my projects involved creating collectible video encyclopedias about, well, everything. 

One of the biggest challenges in pre-production was structuring the series in a way that made sense. Once pre-production started, my job was to organize the content so it would fit into, say, 52 VHS tapes for a weekly release. That experience shaped how I think about structuring content—not just in terms of information, but in a broader, more conceptual way.

This idea of structure ties back to LLMs as well. It’s all about clarity—about ensuring that what we want to communicate is expressed in the clearest, least ambiguous way possible. And that brings me to something I want to touch on briefly: the concept of monosemanticity.

It’s a bit of a mouthful, but essentially, monosemanticity is a principle that was proposed—if I’m not mistaken—last year. It’s an Anthropic concept that, in simple terms, emphasizes the need for absolute clarity in communication. This is something we should be talking about, especially with writers and companies, because the content we create and publish should be as unambiguous as possible.

Monosemanticity is a way of saying, Let’s make things crystal clear. Sure, marketing language has its place, but the core content—what we actually want people (and machines) to understand—should follow strict clarity guidelines.

I won’t go into SEO copywriting because, frankly, I’ve never believed it truly exists. And I don’t think “AI copywriting optimization” will be a real thing either. But what is real, and increasingly important, is this need for structured, unambiguous content—whether we’re writing for humans, machines, or even when we’re creating code.

Andrea Volpini: Let me tell you how I started exploring monosemantic features.

Since we were working on generating text from data in a graph, I began noticing something interesting: when prompting the system, we were getting highly varied responses depending on the specific terms we used. In some cases, we could define these terms as named entities. Certain concepts seemed to trigger responses that were richer than the terms themselves—almost as if some tokens had a greater influence on the model.

As I dug deeper, I came across sparse autoencoders, which led me down an intriguing path. So, what exactly is a sparse autoencoder?

It's a small network built around an encoder-decoder architecture. The encoder takes in activation values from a language model. When a sentence is passed into the model, it behaves like a pinball machine—the tokens bounce through different pathways, influenced by a series of weights. These weights guide each token through the network layers, triggering specific neurons. The activation values of those neurons are then used to train the sparse autoencoder.

The sparse autoencoder essentially captures the representations generated by these activation layers. It encodes the activated neurons and then decodes them, aiming to reconstruct the original token. This process helps us understand how the model interprets different inputs.

The great thing is that sparse autoencoders are relatively simple to build—as long as we have access to the model's activations. That means we can apply this method to open-source models like GPT-2 or Google's Gemma models, but not to proprietary models like GPT-4, since their activation data isn’t accessible.
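
A minimal sketch of that setup in PyTorch, with random tensors standing in for cached activations from an open model such as GPT-2 (the dimensions and sparsity weight are illustrative):

  import torch
  import torch.nn as nn

  class SparseAutoencoder(nn.Module):
      """Reconstructs model activations through a wide, sparse feature layer."""
      def __init__(self, d_model: int, d_features: int):
          super().__init__()
          self.encoder = nn.Linear(d_model, d_features)  # activations -> features
          self.decoder = nn.Linear(d_features, d_model)  # features -> activations

      def forward(self, acts):
          features = torch.relu(self.encoder(acts))      # sparse feature code
          return self.decoder(features), features

  sae = SparseAutoencoder(d_model=768, d_features=768 * 8)
  opt = torch.optim.Adam(sae.parameters(), lr=1e-4)
  acts = torch.randn(64, 768)        # stand-in for a batch of cached activations

  recon, feats = sae(acts)
  sparsity = 1e-3 * feats.abs().sum(dim=-1).mean()   # L1 penalty keeps codes sparse
  loss = nn.functional.mse_loss(recon, acts) + sparsity
  loss.backward()
  opt.step()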

So why do we go through this process?

The goal is to "scan the brain" of a language model—to see which neurons activate for specific token sequences. Take a brand like Nike, for example. What happens inside the brain of a language model when it processes that word? By using a sparse autoencoder, we can observe two types of neurons:

  1. Polysemantic neurons – These neurons encode multiple meanings because there are far more possible meanings than available parameters in the model. One neuron might store a Python code snippet, a Korean word, and a historical fact about Rome all at once. This phenomenon is called superposition.

  2. Monosemantic neurons – Unlike polysemantic ones, these neurons activate only for specific terms or concepts. These are what we call monosemantic features.

So what exactly is a monosemantic feature?

It’s a concept that can often be linked to an entity in Wikidata. A great example is the Golden Gate Bridge, which was identified as a monosemantic feature in the famous Anthropic paper. By leveraging these features, we can actually influence how a language model responds.

Monosemantic features are fascinating because they give us deeper insight into how language models work—but more importantly, they open the door to controlling model outputs in a more precise way.

Gianluca Fiorelli: Yes, so it's essentially a way to avoid the classic hallucinations.

Andrea Volpini: No, not necessarily. It can actually increase hallucinations, but it also triggers a different type of response. At the same time, it's a way to help align the model.

For instance, we can use these monosemantic features to prevent the model from saying things we don’t want it to after the post-training phase. Say we want to align it so it doesn’t instruct people on how to make anthrax at home—that’s where these features come in.
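
Mechanically, steering with a feature is simple once a trained autoencoder exposes its decoder directions. A toy sketch, with random vectors standing in for real activations and a real feature direction:

  import numpy as np

  d_model = 768
  acts = np.random.randn(d_model)               # a residual-stream activation
  feature_dir = np.random.randn(d_model)        # stand-in decoder direction for
  feature_dir /= np.linalg.norm(feature_dir)    # one monosemantic feature

  # Add the direction to amplify the feature, subtract it to suppress it;
  # the scale is tuned empirically, as in the Golden Gate Bridge demonstration.
  steered = acts + 8.0 * feature_dir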

There are a lot of potential use cases once we identify which features can be leveraged. What I’m analyzing is how much these features overlap with our existing notions of entities in large databases like Wikidata.

The Proust Questionnaire

Gianluca Fiorelli: Okay, cool. I think we’ve been talking about all this stuff for almost an hour now. Before we wrap up, let’s get a little personal.

I’ve got something fun for you—a few quick-fire questions that I ask everyone who joins The Search Session. It’s called the "Proust Questionnaire."

The rules are simple: answer as quickly as possible, almost without thinking. It’s a great way for all of us to get a better sense of who you are.

So, do you want to play?

Andrea Volpini: I'm ready.

Gianluca Fiorelli: Okay, easy question. What is your favorite word?

Andrea Volpini: Favorite word?

Gianluca Fiorelli: Yes.

Andrea Volpini: Semantic.

Gianluca Fiorelli: What is your least favorite word?

Andrea Volpini: Entropy.

Gianluca Fiorelli: Oh, that's a cool word. We were talking about structure and structured data. So it seems quite logical that you don't like entropy. What turns you on?

Andrea Volpini: These days, I’m very much into DADL files. For me, it’s all about building the right agents—ones that are actually useful and don’t just pollute the web with junk. That’s what excites me.

Gianluca Fiorelli: And what turns you off?

Andrea Volpini: I think what turns me off about how we got here with AI is that, in a way, it was built on stolen content. I see that as AI’s original sin. It reminds me of Aaron Swartz—I'm not sure if you remember, but he died after being prosecuted for accessing data from a corporate network. And now, AI has been trained on vastly more content than what was on his laptop. That feels like an even greater act of taking without permission. This “original sin” of AI really bothers me because there has to be a better way to achieve what we're trying to do.

Gianluca Fiorelli: Ok, let's go sensorial. What sound do you love?

Andrea Volpini: I'm very much into industrial techno, something in that spectrum, I think.

Gianluca Fiorelli: And what sound do you hate?

Andrea Volpini: There’s something very classic in Rome at noon—the sound of the cannon. Originally, the Pope introduced it so that all the churches in the city could be synchronized.

Gianluca Fiorelli: The one on the Janiculum.

Andrea Volpini: Yes, exactly. And as much as I love the story, I also hate it, because every time I hear it, it's already noon and I should have done more than what I actually did.

Gianluca Fiorelli: What is your favorite curse word?

Andrea Volpini: Curse word. Oh, my God. I have no idea.

Gianluca Fiorelli: If you're saying something in Roman dialect, we are not going to translate it.

Andrea Volpini: I don't know. In general, I don't like swear words and curse words. I know that they can now deactivate AI Overviews on Google, so I'm getting more acquainted with them, but nothing special in that area, I would say.

Gianluca Fiorelli: If you weren’t doing what you do now, what kind of job or profession do you think you’d have? Or maybe, when you were a kid, did you have a classic dream—like wanting to be an astronaut? I don’t know.

Andrea Volpini: Yeah. I think that performing arts would be my next choice.

Gianluca Fiorelli: Cool, cool. Andrea Volpini could have been a guest at the Cannes Festival. And maybe, in some alternate parallel universe, I’d be interviewing him back when I worked for the movie channel on television. How fun would that be?

Okay, thank you so much, Andrea Volpini, for spending this hour with us and for all the information! Really grateful for it.

Andrea Volpini: Thanks to you!

Podcast Host

Gianluca Fiorelli

With almost 20 years of experience in web marketing, Gianluca Fiorelli is a Strategic and International SEO Consultant who helps businesses improve their visibility and performance in organic search. Gianluca has collaborated with clients from various industries and regions, such as Glassdoor, Idealista, Rastreator.com, Outsystems, Chess.com, SIXT Ride, Vegetables by Bayer, Visit California, Gamepix, James Edition and many others.

A very active member of the SEO community, Gianluca shares daily insights and best practices on SEO, content, search marketing strategy, and the evolution of search on social media channels such as X, Bluesky, and LinkedIn, and through the blog on his website, IloveSEO.net.
