In Search Podcast: How to Audit Content Like an SEO Data Analyst

Are you actively using Google Search Console data to get more traffic?

That’s what we’re going to be discussing today with a man who, when he’s not busy running calculations or preparing SEO strategies, is fighting the darkness of SEO misinformation on Twitter and LinkedIn. He’s an SEO specialist and data analyst focused on B2C content websites and publishers. A warm welcome to the In Search SEO podcast, Marco Giordano.

In this episode, Marco shares how to audit content like an SEO data analyst, including:

  • The setup
  • Defining goals
  • Data cleaning
  • Analysis
  • Insights and implementation

How to Audit Content Like an SEO Data Analyst

1. The setup

Marco: Thanks, David. The first and most important thing is the setup: having the correct setup requirements for your project. That is essentially your API access. When you’re working with Google data, it’s always better to use the APIs rather than the exports you get from the interfaces, because otherwise you’re getting less data. It’s also harder to work with manual processes. Ideally, if you have a development team, or if you can do it yourself, you can use the Search Console API, which is free, and set up some scripts that call this API to retrieve the data. This is the most important step because otherwise you have nothing to work with.

There are other factors to consider, like what you have to pull. A recommendation for big websites or for agencies is to create a cloud function on a cloud platform: essentially, a function that runs in the cloud on Google’s servers and sends this data to BigQuery, a data warehouse where you keep this data safe. You also have to define what you have to pull. Do you want only the US? Do you want all the countries? Do you want to limit by URL? It depends on your project.
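As a rough sketch of the "what to pull" decision, the request body for the Search Console API's Search Analytics query could be assembled as below. The helper name `build_request`, the dates, and the `/blog/` filter are hypothetical; the body fields (`startDate`, `endDate`, `dimensions`, `dimensionFilterGroups`, `rowLimit`, `startRow`) follow the API's request format. Authentication, the actual call, and the BigQuery load are left out.

```python
from typing import Optional


def build_request(start_date: str, end_date: str,
                  country: Optional[str] = None,
                  url_contains: Optional[str] = None,
                  start_row: int = 0) -> dict:
    """Assemble a Search Analytics query body (hypothetical helper)."""
    filters = []
    if country:
        # Countries are given as three-letter ISO 3166-1 alpha-3 codes.
        filters.append({"dimension": "country", "operator": "equals",
                        "expression": country})
    if url_contains:
        # Limit the pull to URLs containing a given string.
        filters.append({"dimension": "page", "operator": "contains",
                        "expression": url_contains})
    body = {
        "startDate": start_date,
        "endDate": end_date,
        "dimensions": ["page", "query"],
        "rowLimit": 25000,      # maximum rows per call
        "startRow": start_row,  # increase to page through the results
    }
    if filters:
        body["dimensionFilterGroups"] = [{"filters": filters}]
    return body


# Example: US only, blog URLs only (values are illustrative).
us_blog = build_request("2024-01-01", "2024-01-31",
                        country="usa", url_contains="/blog/")
```

With the google-api-python-client library, a body like this would typically be passed to `service.searchanalytics().query(siteUrl=..., body=us_blog).execute()`, repeating the call with a higher `startRow` until no more rows come back.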

D: That was step number one of five steps for using your Google Search Console data to get more traffic, which brings us to step two, defining goals.

2. Defining goals

M: Yes, you have to understand what to do with the data. I put it as number two and not as number one because, if you’re doing SEO, you always need to work with Search Console anyway: it’s your organic data. Defining goals means understanding what you want to do in the next steps. In terms of analysis, it means knowing what you want to analyze and why, because otherwise you’re just running in circles.

An example is if you’re analyzing a B2C content website, almost always you want to check how to increase traffic, and how to find good opportunities, in terms of content production. Or it can also be used for auditing. Or you can do it all together. The important thing is that you define clear goals, and you’re sure why you’re doing something in terms of analysis. You should ask questions about data and be confident in what to ask and what you are searching for. Because otherwise, it is just an exploration. Which is good, it can make sense. But I think that for SEO in most cases, you already know what you want.

D: So what’s an example of a good goal to set up?

M: If you have a B2C content website, a good question to ask would be how to improve some clusters in terms of traffic. Where do we have a margin of improvement? Not in terms of the homepage, but in terms of adding more content or finding new angles. How can I do it? Search Console data gives you this answer because you have a lot of queries. Even if you’re not ranking well for them, it’s still data that you need for research.

Another example is when you have to audit, you can ask what are the pages that have the highest potential. This can be a goal of finding the pages with the highest potential to get more money. But how do you define the highest potential? That is another question.

D: But the highest potential could be something like highest potential traffic, highest potential ranking increase?

M: If you’re a content website, you can merge your data with data from your ads provider, like Mediavine. It depends on what you’re doing on your website. If you’re doing affiliate marketing, traffic alone is useless if you’re not selling anything; you have to sell something. So in these cases, you can also integrate other data, try to understand where to look, or define your own metrics.

D: That’s number two, defining goals. That takes us up to number three, data cleaning.

3. Data cleaning

M: Before you start with an analysis, you need to do data cleaning, or munging, as it’s called. Essentially, you manipulate the data to work out what is useful for your analysis and what you have to remove. For example, if you have a content website on WordPress, you usually want to remove tag pages, category pages, and URLs with hashtags, because hashtag URLs are used for sitelinks and sitelinks are useless in terms of analysis: they’re going to get zero clicks and inflate your impressions. You’re going to remove ‘page’ URLs because pagination is not what you need. You just want articles, since you’re measuring how to improve your articles’ traffic. And also author pages. They’re not supposed to rank in terms of organic traffic; you don’t want to rank an author. You should have them indexed, but they’re not the goal of our analysis. We’re talking about content and not getting more money. So improving author pages is not going to help you because it’s not content in terms of articles and blog posts.

D: Understood, you can rank for author pages. And sometimes you can get traffic if you’ve got a relatively famous author. But at the end of the day, that traffic is not going to convert so that’s not what you’re measuring at the moment.

M: Yeah, it’s useless because you just want to find articles. This is not technical stuff, it’s more about strategy and content. This is an important point, you are not here to track tag pages or other stuff, just to find the articles that matter so you can improve them and get more traffic.
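A minimal sketch of this URL cleaning step, assuming WordPress-style paths such as `/tag/`, `/category/`, `/author/`, and `/page/` (the pattern and the example URLs are hypothetical and should be adapted to the site's own structure):

```python
import re
from urllib.parse import urlparse

# Assumed WordPress-style archive paths; adjust to the real URL structure.
NON_ARTICLE_PATH = re.compile(r"/(tag|category|author|page)/", re.IGNORECASE)


def is_article(url: str) -> bool:
    """Return True if the URL looks like an article rather than an archive page."""
    parsed = urlparse(url)
    if parsed.fragment:
        # URLs with a "#..." part, typically reported for sitelinks.
        return False
    # Normalize the trailing slash so "/blog/page/3" is also caught.
    path = parsed.path if parsed.path.endswith("/") else parsed.path + "/"
    return NON_ARTICLE_PATH.search(path) is None


urls = [
    "https://example.com/how-to-bake-bread/",
    "https://example.com/tag/bread/",
    "https://example.com/category/recipes/page/2/",
    "https://example.com/author/jane/",
    "https://example.com/how-to-bake-bread/#ingredients",
    "https://example.com/blog/page/3",
]
articles = [u for u in urls if is_article(u)]
```

Only the first URL survives the filter here. A pattern this broad would also drop an article whose slug happens to contain one of these path segments, so the result is worth a quick manual check.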

Once you’ve cleaned the URLs, you also have to clean other details, like queries. Usually, when you pull data via the API, even if your website is in English, you get foreign data, like queries in Japanese, or even in Arabic, which is quite common. So I remove them, especially if you have an English website, because they’re not of interest to your analysis. I also try to remove all the queries that are not useful. For example, if you’re sure that some queries are not related to your business, and they are just there because Google ranks you in position 70 for them, you can remove them. This part is manual. This is data science, it’s not SEO: you have to take care of your data and remove what’s not needed for your analysis.
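One possible way to sketch the query cleaning, assuming an English-language site: keep only queries whose letters are in the Latin script, then drop queries containing terms from a hand-maintained list. The rows and the `irrelevant_terms` list are invented for illustration; the manual judgment about what is irrelevant stays manual.

```python
import unicodedata


def is_latin_script(query: str) -> bool:
    """True if every letter in the query belongs to the Latin script."""
    return all(unicodedata.name(ch, "").startswith("LATIN")
               for ch in query if ch.isalpha())


rows = [
    {"query": "how to bake bread", "position": 4.2},
    {"query": "パン 作り方", "position": 35.0},
    {"query": "خبز", "position": 51.3},
    {"query": "café bread recipe", "position": 8.9},
    {"query": "cheap car insurance", "position": 70.0},
]

# Hypothetical, hand-maintained list of terms unrelated to the business.
irrelevant_terms = {"insurance"}

clean = [
    r for r in rows
    if is_latin_script(r["query"])
    and not any(term in r["query"] for term in irrelevant_terms)
]
```

Checking the script rather than plain ASCII keeps accented queries such as "café bread recipe", which matters for sites that also get traffic in other Latin-script languages.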

D: So that’s data cleaning, a lot of cleaning to be done. Let’s move on to number four, analysis.

4. Analysis

M: So during analysis, which is where the actual work happens, I have some processes, since I just work with content websites. It’s narrower compared to doing e-commerce, SaaS, or all of them together. If you only cover one type, it’s easier to have processes. I usually check the unique query count, which means I count how many unique queries a page is ranking for. So no repetition and no duplicates. This is to understand the weight of a page. Of course, you have to be careful because you can also rank for queries in low positions or you can rank for useless queries. But on average, a unique query count is a good indication of the potential of a page. If you’re not considering affiliate marketing or other monetary data, query count is the best way to measure something. Usually, you just want to understand where you can get more queries for B2C so you can expand your topic.
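The unique query count can be sketched in a few lines, assuming the cleaned data is a list of (page, query) pairs. The rows below are invented; real exports usually repeat the same pair across dates or countries, which is why a set is used.

```python
from collections import defaultdict

rows = [
    ("/bread/", "how to bake bread"),
    ("/bread/", "bread recipe"),
    ("/bread/", "how to bake bread"),  # repeated pair, e.g. from another date
    ("/pasta/", "fresh pasta recipe"),
]

# Collect the distinct queries seen for each page.
queries_by_page = defaultdict(set)
for page, query in rows:
    queries_by_page[page].add(query)

unique_query_count = {page: len(qs) for page, qs in queries_by_page.items()}

# Pages with the most unique queries first.
ranked = sorted(unique_query_count.items(), key=lambda kv: kv[1], reverse=True)
```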

Another thing to check is the number of pages with zero clicks. Why? Because if 40% of your website is zero-click pages, it’s probably not a good website. If you want organic traffic and you’re showing Google that almost half of your website is not worth it, Google will understand it and penalize you. A bad ratio between content that is getting clicks and content that is getting nothing does not help you. This is something you always have to monitor. I do it every month because for me it’s cheap.
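A small sketch of the zero-click check, assuming clicks have already been summed per page over the chosen period (the numbers are invented):

```python
clicks_by_page = {
    "/bread/": 120,
    "/pasta/": 0,
    "/pizza/": 45,
    "/old-news/": 0,
    "/soup/": 3,
}

# Pages that received no clicks in the period.
zero_click_pages = sorted(p for p, c in clicks_by_page.items() if c == 0)

# Share of pages with zero clicks, between 0 and 1.
zero_click_share = len(zero_click_pages) / len(clicks_by_page)
```

Search Console only reports pages that had at least one impression, so pages that never appeared in search would need to come from another source, such as a crawl or the sitemap, to be counted.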

D: Are you quite aggressive with getting rid of zero-click pages or redirecting them to something else?

M: No, I’m not aggressive with getting rid of them but I’m aggressive at finding them. You have to find them, then you decide. Because if those pages are used by other marketing channels, like social media and newsletters, you keep them, and you don’t have to delete them. If they have comments, if they have backlinks, I will never delete them. But if they’re not related to your business, even if they get traffic, but get no leads, those pages are more at risk. Or they are zero-click pages with thin content that can’t be updated. So if they have no chance of recovery, they are thin pages, they have weak content or no content at all, I prune them (delete them), but it’s the last resort. I don’t think about pruning as the first solution.

D: Let’s move on to number five, insights and implementation.

5. Insights and implementation

M: Number five is understanding how you can use this data. And this is the tricky part. Analysis is not easy, but finding the code is easy because it’s always the same. You can also ask ChatGPT, that is no problem. The problem is getting insights using this information. For instance, something I didn’t say in step four is that I usually also check the percentage of traffic that the top 10 pages by clicks get, to understand the risk of a website. If your website is reliant on 10 pages to get traffic, it’s a risk, because if even one page loses one position, you get huge losses. This matters in step five because you have to understand how to use what you just learned to propose a solution. I know this fact, so how can I improve the website or give an actionable solution to reduce risk? Or, knowing that the query count is high for these pages, how can I diversify this topic to get more sub-articles or subtopics? Or how can I make more money if I know that these pages have a high profit potential? How can I do it?
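The top-10 concentration check could be sketched like this, again assuming clicks summed per page. The function name and the numbers are invented for illustration.

```python
def top_n_click_share(clicks_by_page: dict, n: int = 10) -> float:
    """Share of total clicks that goes to the n pages with the most clicks."""
    total = sum(clicks_by_page.values())
    if total == 0:
        return 0.0
    top = sorted(clicks_by_page.values(), reverse=True)[:n]
    return sum(top) / total


# Invented numbers: 10 strong pages and 40 weak ones.
clicks = {f"/top-{i}/": 900 for i in range(10)}
clicks.update({f"/other-{i}/": 25 for i in range(40)})

share = top_n_click_share(clicks)
```

In this made-up case the top 10 pages account for 90% of all clicks, which is the kind of dependency the risk check is meant to surface.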

Once you have the information from step number four, you have to create a strategy (this is where SEO comes into play) that makes sense for what you have to do and that can bring results. This is the hardest and most challenging part. First, you have to clean the data properly. If you don’t, you can’t do the other steps. Then you have to understand what you just did, because doing it is not enough. You have to be really smart about it and understand during step two what you want or what you have to merge together, and so on.

D: I’m sure that 99% of SEOs listening to this will be thinking, “I can be doing a lot more with Google Search Console.” A lot of opportunity there.

The Pareto Pickle – Clustering

Let’s finish off with the Pareto Pickle. Pareto says that you can get 80% of your results from 20% of your efforts. What’s one SEO activity that you would recommend that provides incredible results for modest levels of effort?

M: Clustering. Full stop.

Clustering in keyword research, in terms of grouping your keywords together. There are some tools that can save a lot of time for content production.

D: Any particular tool you’d recommend?

M: Keyword Insights.

D: Okay, great. And then that, obviously, leads your content production strategy.

M: Of course, you still have some manual work to do. You don’t just take the tool and do it. But this process is a great companion because you can just pick a list of keywords, find out if there are some common domains in the SERPs, which is the best way to check if you have to create one or more articles, and you’ll get a list so you can understand what to do. This is the best SEO activity in terms of the trade-off of effort and cost. I mean, if you pay for a tool, it’s quite expensive, so it’s not for a freelancer or a small business. But considering how often you do keyword research, it’s not that expensive, because you do it once in a while, not every single day. So it’s feasible.
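To illustrate the idea of checking for common domains in the SERPs, here is a simple greedy sketch. It is not how Keyword Insights or any specific tool works; the function name, the threshold of three shared domains, and the SERP data are all assumptions made for the example.

```python
def cluster_by_serp_overlap(serps: dict, min_shared: int = 3) -> list:
    """Greedily group keywords whose results overlap with a cluster's first keyword."""
    clusters = []  # each cluster is a list of keywords; the first one is the seed
    for keyword, results in serps.items():
        for cluster in clusters:
            seed_results = serps[cluster[0]]
            if len(set(results) & set(seed_results)) >= min_shared:
                cluster.append(keyword)
                break
        else:
            # No existing cluster is similar enough, so start a new one.
            clusters.append([keyword])
    return clusters


# Invented SERP data: keyword -> domains seen in the top results.
serps = {
    "bread recipe": ["a.com", "b.com", "c.com", "d.com"],
    "how to bake bread": ["a.com", "b.com", "c.com", "e.com"],
    "sourdough starter": ["f.com", "g.com", "a.com", "h.com"],
}
clusters = cluster_by_serp_overlap(serps)
```

Here the first two keywords share three domains and land in one cluster, suggesting a single article, while "sourdough starter" gets its own. The manual review mentioned above still applies: the threshold is arbitrary and the groups need a human check.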

D: Once in a while, once a year, once a quarter?

M: It depends on the situation and the scope of the project. I sometimes do it every week, if it is a super dynamic project. But other times it’s every three months. There is no magic number. I would say the scope and budget of the project dictate the research.

D: I’ve been your host, David Bain. You can find Marco by searching Marco Giordano on Twitter or LinkedIn. Marco, thanks so much for being on the In Search SEO podcast.

M: No problem. Bye.

D: And thank you for listening. Check out all the previous episodes and sign up for a free trial of the Rank Ranger platform.

by Darrell Mordecai

Darrell creates SEO content for Similarweb, drawing on his deep understanding of SEO and Google patents.

This post is subject to Similarweb legal notices and disclaimers.
