Vanessa Fox – AMA Interview

March 24th, 2007

Interview Transcript

Announcer: Hi, and welcome to the official podcast of AMA Hot Topics Search Engine Marketing, where our conference chair Stephan Spencer, the founder and president of Netconcepts, interviews some of the industry’s top thought leaders, who also happen to be speakers at the Hot Topic events taking place this spring: April 20th in San Francisco, May 25th in New York City, and June 22nd in Chicago. Without any further ado, here’s Stephan Spencer.

Stephan Spencer: Hi, everyone. I’m Stephan Spencer, founder and president of Netconcepts, and chairperson of the AMA Hot Topics Search Engine Marketing Conference. I have here with me today Vanessa Fox, Product Manager at Google Webmaster Central. Welcome, Vanessa.

Vanessa Fox: It’s good to be here to talk to you.

Stephan: Well, thank you for joining us. So tell me a bit about Webmaster Central and the power and the benefits of it to webmasters.

Vanessa: What we found at Google is that webmasters are really interested in talking to us more about how their site is doing in the index and how it’s crawled, indexed and ranked, and if they’re having any problems, being able to do those things. And so we really wanted to be able to talk more with all the site owners on the web about that, but at Google we always have these scalability problems, because we want to be able to interact with every webmaster, and there’s millions and millions of them on the web, and only so many of us.

So what we did is we put together Webmaster Central, which is always expanding and evolving but it’s basically the place that webmasters can go to find out information about their site in the index. And so we have a number of things available at Webmaster Central, you know everything from educational types of content in our help center. We have a discussion area where site owners can talk to each other and also we’ll sometimes come in and post as well. We have a blog. And we also have the tools available, which are automated types of tools that really can give site owners some visibility into their site specifically.

Stephan: So I understand there are some tools for looking at links, at PageRank, at spidering activity, robots.txt, if there are any errors… Tell me a bit more about these reports.

Vanessa: We have all kinds of things available like that. I mean, what we found is that site owners obviously are most interested in seeing how their site comes up in the search results, and so if there is an issue there, if the site isn’t showing up in the search results at all, or maybe not in the place that the site owner would like, from our perspective there’s a number of reasons why that could be. So we wanted to make as much information available as we could, to really pinpoint for the webmasters where in the process the problem might be happening.

And then on the other side of it, we also wanted to give webmasters more just sort of statistical information about how their site could do in the search results, and then sort of the third part is letting the site owners give us information, so we can have more of a collaborative way of crawling and indexing their site.

So to kind of back up to the beginning of the pipeline, the first thing that we need in order to have a site show up in the search results, obviously, is for us to know the pages exist. You know that’s kind of the basic starting point and so historically, we’ve discovered pages throughout the web, and that works really well for us. That’s how we found the majority of the pages, but that’s sort of a process where we go out and discover information.

So we wanted there to be a way for site owners to tell us information, so that they didn’t have to wait for us to necessarily find their pages. And so that’s where Sitemaps started, the Sitemaps protocol, which is an XML file or a text file, or an RSS feed, where the site owners can say, “Hey, here’s all the pages of my site.” And so it gives us a comprehensive view all at once of the pages, and it also lets the site owners give us some other optional information, so that we do have more of a collaborative way of crawling their site.

For instance, they can tell us how often they modify the page, the last time that they modified the page; you know, those types of things. So that’s sort of the first part of the process, the Sitemap submission, although I should also mention that everything else available in the tools does not require submission of a Sitemap. So you can just go in and sign up, and say, “Here’s my site,” and you can use all of the rest of the tools without a Sitemap submission if you would like, as well.
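As a rough illustration of the XML form of a Sitemap described above, the following Python sketch builds a small file listing two pages, with the optional last-modified date and change frequency on one of them. The element names and namespace come from the public Sitemaps protocol; the `build_sitemap` helper and the example.com URLs are hypothetical and only for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the public Sitemaps protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Return Sitemap XML (as a string) for a list of page dicts.

    Each dict needs a "loc" key; "lastmod" and "changefreq" are optional hints.
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for page in pages:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = page["loc"]
        for field in ("lastmod", "changefreq"):
            if field in page:
                ET.SubElement(url, field).text = page[field]
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical pages, used only to show the shape of the output.
example_pages = [
    {"loc": "http://www.example.com/", "lastmod": "2007-03-01", "changefreq": "weekly"},
    {"loc": "http://www.example.com/products.html"},
]
sitemap_xml = build_sitemap(example_pages)
print(sitemap_xml)
```

The optional fields are hints to the crawler rather than commands, which matches the "collaborative" framing in the interview.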

Stephan: With the tools that are available, some of them you have to verify that you actually own the site, or that you are in control of the site. Is that correct?

Vanessa: Yes, absolutely. So there’s certainly a lot of information that we only want to give to site owners. We wouldn’t want everyone on the web to be able to see this information. For instance, if I was a site owner… you may not want the competition to know the queries that you rank for, you know those types of things.

So we do have a verification process. It’s very simple. We have two ways: you can either upload an HTML file or you can insert a Meta tag on your home page. And so those are very simple ways you can verify that you have write access to the site, and in that way we can assume that you’re a site owner. Another reason that that becomes more vital is that we’re starting to have more input options available, and so certainly we would only want the site owner to be able to tell us how to crawl their site.

Stephan: Once you’re verified, you can actually get informed if you’re being penalized. Is that correct?

Vanessa: We have sort of this start summary page, so you add your site, you verify it, go right to the summary page, and that’s the page that we really want to use to give you the most important information about your site. If there’s any kind of a problem with your site being indexed, we want to show it right there. So if there’s been a violation of the guidelines and so the site is not indexed because of that, you’ll see that information right there, along with a link to the reinclusion request form. If we think that your site has malware on it, we will show you that as well, and we’ll actually show you the individual pages that we think have malware on them. We show that right on the summary page.

It also just has other kinds of basic information, whether the site is indexed or not. If we can’t access the site… So say there’s not a violation type of a situation, but maybe the entire site is blocked with robots.txt, or the home page is redirecting to itself. You know, if there’s a fundamental problem with the site, we’ll show it right there.

And then the other thing that we show is sort of the summary of any kinds of crawl errors that we have on the site, and this leads me to the second phase of the process. Once we know about the pages, kind of the biggest reason a page may not end up being indexed is if we can’t access the page. And so that’s really where the crawl errors come in. If we can’t access any page for any reason, server errors, redirect errors, those types of things, we’ll show you that there, and we’ll show you what the error is and when we tried to access it. And so that can really help people pinpoint what might be wrong with their sites.

Stephan: So you say that there are some really useful reports available through Webmaster Central, and I’m quite interested in the whole linking side of things. So if the webmaster wanted to see, for example, their inbound links and PageRank scores of their internal site pages, there are reports that will show them that; is that right?

Vanessa: Right. So that kind of gets us away from the diagnostic side of things, and more into the statistical side of things, which can really help give people some insight into how and why their site is doing, and kind of where you might get the traffic.

So a good example would be the link reporting, which we just released within the last month. It’s been fairly recent. And what this shows you is all the links to the pages of your site. It’s not 100% of the links, but it’s most of the links, and we’re trying to get more and more information available. And it’s certainly quite a bit more than the link: operator, which people have historically used, would show you.

So we’ll show you the total number of links to your site, and then for each page of your site we’ll give you a list of the links to that page. And you can see all of the links from other sites; so kind of a list that does not include any of your own links. And then you can also see your internal linking structure separately, and you can export that as a CSV, up to a million links. You know, so that can be good; it can show you maybe what your most popular pages are, what pages are linked to the most, where you get the traffic, kind of how people are linking to you.

If you’re having some issues, say with ranking or PageRank, you know you can take a look at the pages and see maybe if they don’t have any links to them. This can kind of help you pinpoint issues you might be having there. The PageRank information that you were mentioning… we do have some information on PageRank. We show you your page with the highest PageRank by month, which can be somewhat interesting.

Sometimes people find that it’s not the page they expect. A lot of people expect it to be their home page. In many instances it is, but it isn’t always. And then we give you a graph that shows you the distribution of all of your pages, sort of the PageRank of them. I mean it’s not a utility… a lot of people want detailed information, and we’re looking to see what more information we can offer, but at this point we do have some information that can give you sort of a general sense of where your site sits, as far as PageRank.

Stephan: If for example you aren’t ranking well for a particular product or category that is important to your business, you could use these diagnostic or statistical reports to identify missed opportunities or mistakes that you’ve made. For example, perhaps you haven’t linked to that page that you want to rank well, from a page that is well endowed with PageRank, and you’ve only linked to it from very deep in your site. Now these reports would help you identify those opportunities. Is that right?

Vanessa: Yeah, there are lots of things like that. In a couple of the other reports that we have available (and all of these are available for export as CSV), we show you the queries that return your site in the search results most often. And we show you both the queries that return your site, which no one has clicked on, as well as the queries that return your site that people most often will click on.

And so that can show you some lost visits, in terms of if your site ranks really highly for the query but no one is clicking on it, you would never really know that from your logs, necessarily, that that was the case, and so you can kind of take a look at that and say, “Hey, you know my site’s ranking for this. Why don’t I get any visitors from it?” And you know maybe the title or the description in the search result isn’t great.

A good example of that is maybe someone has a Flash page, and so the description that shows up in the search results is “loading, loading, loading… Macromedia Flash.” And so someone might not know what that page was about then, if there’s not a description there, and so you can kind of take a look at those kinds of things and see if that’s happening.

And then, I think another thing that’s useful is we show you the words that most often appear on your site and the words that are used most often in links to your site from other sites, and that too can help you figure out why you might be ranking, or not ranking for particular words. You know if you think your site should be ranking for something, but then you take a look and it doesn’t show up as one of the common words on your site or as a link to your site, then you know you may need to optimize a little better for that particular phrase.

Stephan: If your search listing is not compelling, then your click through rate from the SERPs to your own website is going to be pretty low and these sets of tools will give you insight into those missed opportunities that you can’t identify, really any other way.

Vanessa: Oh, yeah, absolutely.

Stephan: A great example of a listing in a SERP that I thought was just horrendous and it was like this for a good year or two, was Starbucks’ home page. It said, “Cookies required.” That was the title of the page, and then the snippet underneath said something about, “Must have cookies enabled in order to view this page, blah, blah, blah…” and so that was a totally uncompelling search listing.

Vanessa: Right, and I do see things like that. And it can be difficult to know what shows up for your site, for each query. So I think it can be really helpful to use this report, to at least go through the keywords that your site ranks highly for, but no one’s clicking on, and just sort of evaluate what shows up in the search results for those. So yeah, that’s definitely a good thing to check out.

The robots.txt analysis tools that we have available also… this kind of goes back to the diagnostic information, another reason that sometimes pages don’t show up in the results is someone may be accidentally blocking them and they may not realize. And so there’s two main things that we have available for that.

One is just we have a report that shows you all the pages we tried to crawl that were blocked, and with that you can just take a look and in most instances it’ll be exactly what you want blocked and so that’s just confirmation that we’re not accessing the pages that we’re not supposed to. But you can also take a look and see is there anything on that report that you did want crawled and you didn’t mean to block.

And then you can use the analysis tool to take a look and make sure we’re able to access the file successfully. And then you can make edits to it and then type in those URLs and test that against the modified version of the file and we’ll say, “Yeah, this is blocked…. This is allowed.” So it just kind of gives you peace of mind before you upload a new version of the robots.txt, new file, you know that we’re reading the file the way that you’re expecting.
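The same "edit the file, then test URLs against the draft" workflow can be approximated offline. The sketch below uses Python’s standard `urllib.robotparser`, which is not Google’s parser and may differ from Googlebot on edge cases such as wildcards; the `check_urls` helper, the draft rules and the example.com URLs are hypothetical.

```python
from urllib.robotparser import RobotFileParser


def check_urls(robots_txt, urls, user_agent="Googlebot"):
    """Map each URL to True (allowed) or False (blocked) under a draft robots.txt."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {url: parser.can_fetch(user_agent, url) for url in urls}


# A draft file that has not been uploaded yet (hypothetical rules).
draft_robots_txt = """\
User-agent: *
Disallow: /cart/
Disallow: /search
"""

results = check_urls(
    draft_robots_txt,
    [
        "http://www.example.com/products/blue-pants.html",
        "http://www.example.com/cart/checkout",
    ],
)
for url, allowed in results.items():
    print("allowed" if allowed else "blocked", url)
```

Running a check like this before uploading can catch an accidental block, though the report in Webmaster Central remains the authority on what Googlebot actually did.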

The other big part of the tools is the input option. And there are a few examples of that. One is the crawl rate of your site, and how the Googlebot accesses your site. You can kind of see reports of over time how many pages a day the Googlebot accesses, how much data is downloaded. You can see how long it takes to download a page, which can be kind of helpful if you start to see that spiking, which could hurt in terms of conversion rates. If the page takes a long time to load your visitors might not stick around.

The other thing there is that if the Googlebot is crawling your site too much, we can slow it down, and then in some instances if you’d like the Googlebot to crawl more you can speed it up, and so that’s kind of handy.

We also have, you know there’s this issue of sites that like to be indexed with the www version of the URL and some sites like to be indexed under the non-www version of the URL. But people link to both versions, so we’ll index in terms of what is linked to, since we follow links, and so sometimes sites will end up with like half their pages indexed with www and half indexed without, and they worry that they’re splitting their PageRank or whatever, so you can just go right into the tool and specify, “Hey, I want my site indexed with the www,” or “without”, and we’ll basically rewrite all of the URLs and make sure that we index that way.

Stephan: Yep, that’s a really important point. I believe you guys refer to it as “canonicalization.” So if you have a page that is indexed at www, and you have a version of that same exact site that’s indexed at the root domain without the www, you have, in effect, a duplicate site, and PageRank is being split across the two different sites, so if you have two copies of your home page indexed, well some of the votes are going to one version and some of the votes are going to another version, instead of all of it aggregating to one single version of the page.

Vanessa: Right. And we tell people to do a redirect in that case, but not every site is able to do that. Maybe they don’t have access to an .htaccess file or whatever to put in a redirect. So you know this is just an easy way, especially for people who don’t have a lot of a technical background. They can just hit a button and let us know.
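For sites that can add a redirect themselves, the www/non-www choice comes down to rewriting the host and sending visitors to the preferred form. A minimal Python sketch of that logic follows; the `canonical_url` and `redirect_for` helpers are hypothetical names, and a real site would usually express the same rule in its web server configuration instead.

```python
from urllib.parse import urlsplit, urlunsplit


def canonical_url(url, prefer_www=True):
    """Rewrite the host of `url` to the preferred www or non-www form."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    bare = host[4:] if host.startswith("www.") else host
    wanted = "www." + bare if prefer_www else bare
    return urlunsplit((parts.scheme, wanted, parts.path, parts.query, parts.fragment))


def redirect_for(url, prefer_www=True):
    """Return (301, target) if the URL is not canonical, otherwise None."""
    target = canonical_url(url, prefer_www)
    return None if target == url else (301, target)


print(redirect_for("http://example.com/page?x=1"))
print(redirect_for("http://www.example.com/page?x=1"))
```

The permanent (301) status is used here because the preferred host is meant to replace the other one for good.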

Stephan: Speaking of ‘redirect’, I think it’s important to note for our listeners that there are two types of redirects and one of them is treated quite differently from the other, in terms of PageRank flow and the votes counting towards the destination URL that you’re forwarding the visitor to, or the search engine spider. So there’s the permanent style of redirect that’s a 301 and the temporary style of redirect, which is the standard one that most people use. That’s the 302. Would you like to explain that a bit further as to which one is better and why?

Vanessa: Sure. So if you are redirecting permanently, or if you have two versions of a URL that are the same and so you want us to go to one particular one, you should always use the 301, absolutely. Because a 302, you know like you say, just tells us that it’s temporary, hence we don’t treat that in the same way.

So the 301 is definitely the way to go if you’re, say, moving a site, if you have multiple versions of a URL. You know a good example is the www/non www, you know for all of those types of things I would use the permanent, the 301. The temporary is really just for temporary situations. If you have a page that’s not going to be available for a week and you’re going to send everyone to another page on sort of a temporary basis, that sort of tells us, “No, the original page is still what you want indexed, so keep that original page indexed.”
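The permanent-versus-temporary distinction can be written down as a small rule table. In this Python sketch, the paths and the `resolve` helper are hypothetical; only the status codes themselves (301 Moved Permanently, 302 Found) come from HTTP.

```python
from http import HTTPStatus

# Hypothetical redirect rules: old path -> (target path, is the move permanent?)
REDIRECTS = {
    "/old-catalog.html": ("/catalog.html", True),       # page moved for good
    "/sale.html": ("/sale-returns-soon.html", False),   # page back next week
}


def resolve(path):
    """Return (status code, headers) for a redirected path, or None if no rule applies."""
    rule = REDIRECTS.get(path)
    if rule is None:
        return None
    target, permanent = rule
    status = HTTPStatus.MOVED_PERMANENTLY if permanent else HTTPStatus.FOUND
    return int(status), {"Location": target}


print(resolve("/old-catalog.html"))
print(resolve("/sale.html"))
```

With the 302 rule, the original URL is the one a search engine is being asked to keep indexed, which is the behaviour described above.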

Stephan: So a great example of that would be if I’d go to your home page and then a session ID is assigned, a redirect happens, that session ID then is put into the URL. A 302 would be the perfect sort of redirect to use in that case; because you don’t want a whole lot of session ID based URLs to end up in Google’s index.

Vanessa: Yeah. That’s an excellent example, absolutely.

Stephan: Let’s actually switch topics a bit and talk about, there’s probably something that I’m guessing is not going to be available for a little while in Webmaster Tools, but correct me if I’m wrong. And that is the ability to see which pages are in the supplemental index versus the main index. Supplemental has been known as “supplemental hell” by many SEOs, because if you’re in the supplemental index, your pages typically will not rank well. So could you elaborate a bit further on supplemental index and whether that ability to see which pages are in supplemental and which pages are in the main index is something that you have planned for the tools?

Vanessa: Sure. Well, you can see what pages are supplemental now, not through the tool but just in the search results. We label them, all of the URLs that are supplemental. So that is one way of looking. You could do a site search for your site and then see what’s labeled as supplemental. However, I would say that the supplemental results aren’t as bad as all that. The last year we’ve put a lot of work into our infrastructure and we really made a lot of enhancements behind the scenes.

And so the freshness of our supplemental pages is much, much better than it’s ever been before. And so I think people with pages from their sites in the supplemental results will find that we’re accessing and indexing those pages more often. And I think that you’ll also start to find that those pages may be ranking better for queries as well.

You know we’re making a lot of enhancements all the time, to make sure that they’re the most relevant results possible. I know that the supplemental results kind of have a bad name, but I’m sort of a behind the scenes… like a cheerleader of supplemental results [laughs], but they’re not as bad as people think. And I think over this year site owners will start to see much better results with their pages in supplemental. I think they’ll be much happier with them.

Stephan: So you mentioned there’s a way to see in the regular Google search results that pages are supplemental, and indeed that’s true. You can see the label that says “supplemental result”. There’s a hack out there that’s been talked about of being able to do a query on Google and see all the pages that are supplemental. So you could do a site colon (site:) search and then add some additional keywords and you do star star star after the site colon (site:***) and whatever your site name is… so you do *** and then you do a minus (-) and then some garbage characters, just some arbitrary like a s d f, (site:*** -asdf) whatever, and that will return the supplemental pages but not the main index pages from your site.

Vanessa: Ah, interesting.

Stephan: But it doesn’t actually work if you type it in as a query. You actually have to put it into the URL. There’s a blog post, I read about it at, and so that’s quite cool to be able to do that, and it would be even nicer if that could be built right into Webmaster Central, to be able to say, “Show me all the pages in supplemental, of my site.”

Vanessa: We have heard that a little bit and you know we’re always looking into seeing if we can do the various things that people ask us for and so that’s certainly one that we’ll take a look at. Although again the only thing that would make me hesitate there is I just would hate for people to see that distinction that we show, you know, “Here’s your pages in the main index and here’s your pages in the supplemental results,” have that really be a distinction that makes them think, “Oh, this is really bad.” Like I say, I don’t think it’s as bad as all that.

Stephan: Well it is a diagnostic sort of issue though, so for example you have some duplicate content, or duplicate pages. So let’s say that you have two versions of the same page on the same site. One has a flag in the query string and one does not. And the one with the flag in the URL is in supplemental and the other is not in supplemental. This could help you diagnose that, “Ah, I have two versions, and that’s not a good thing.”

Vanessa: So yeah, we’re always looking at those kinds of things, so there certainly may be valid reasons to kind of make that more available to people, so I’ll make sure it’s on our list.

Stephan: So tell me a bit more about the tools relating to keywords, in terms of Webmaster Central. So if I could get a report showing keyword popularity data from Webmaster Central, and particularly ones that relate to relevant keywords to my business….

What are your plans there? What do you have available currently? I know you can see the keywords that people are searching for you under, but can you see the historic trends of the keyword popularity of those keywords? Can you get related synonyms that you aren’t ranking well for?

Vanessa: We certainly heard that people would like a few things that we don’t have available now that you’ve mentioned. One would be historical type information and the other being information about the keywords that you care about the most. We have heard that a lot and so we’ve certainly looked into seeing if anything like that is possible.

What we show now, as you said, are the queries that have the most amount of traffic. I did a blog post. At one point, people weren’t quite sure how we came up with the data. And basically it’s by volume, so it’s the queries that happen most often that you rank most highly for. As opposed to, say, something like the click through rate.

And what is kind of cool about that is you can do a breakdown of it. So you can see the aggregate information, but then if you want to sort of drill into it more, you can look into it on the basis of a specific property and even on the basis of a location. So, for instance, you could say, “I only want to see web search for Germany” or “I only want to see mobile search for France.” And so that’s kind of cool because you can kind of do it either way and you can see what the differences are.

A lot of times you might see a difference with image search. Image search has become a good way to get the traffic to your site if you’re a retail site. A lot of times people are looking for pictures of products they’re interested in. So if you have pictures on your site with really good descriptions, that can drive the traffic to your sites. It can be kind of interesting to see the queries from image search, for instance, if you’re a retailer. And so that’s the type of information that we have available now.

But like I say, we have looked into seeing what other ways we can expand that information, what are the ways that can be most useful for people to make it actionable. And so we’re really looking at that closely.

Stephan: Well that brings up an interesting topic there with image search. Online retailers really do get quite a low conversion rate off of traffic coming in from Google Images. So what sort of diagnostics could an online retailer do to help with solving any issues they have with poor conversion rate with Google Images?

Vanessa: The biggest thing with images is that they’re images, right? We’re a text based search engine. There are things you can do on your site to make sure your images are being described as accurately as possible so that when someone does a search for something specific your images are coming up for the right thing as opposed to something that’s more general.

So I would make the information about the images as specific as possible, use alt tags for sure, make sure that you have a description around your image, and make sure that the description for each image is unique. Say you’re a clothing site and so instead of each image being “pants” for every single one, you might have something that’s unique about each one, the brand name or the color, to make it as specific as possible so that when someone does a search the right type of thing is coming up.
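The advice about specific, unique image descriptions lends itself to a quick self-audit. The Python sketch below scans a page’s HTML for images with no alt text and for images that share the same alt text; the `AltCollector` class, the `audit_alt_text` helper, the file names and the "Acme" brand are all hypothetical, and the check says nothing about how Google itself evaluates images.

```python
from collections import Counter
from html.parser import HTMLParser


class AltCollector(HTMLParser):
    """Collect (src, alt) pairs for every <img> tag in a document."""

    def __init__(self):
        super().__init__()
        self.images = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            attr = dict(attrs)
            self.images.append((attr.get("src", ""), (attr.get("alt") or "").strip()))


def audit_alt_text(html):
    """Return (images missing alt text, images whose alt text is reused)."""
    collector = AltCollector()
    collector.feed(html)
    counts = Counter(alt.lower() for _, alt in collector.images if alt)
    missing = [src for src, alt in collector.images if not alt]
    duplicated = [src for src, alt in collector.images if alt and counts[alt.lower()] > 1]
    return missing, duplicated


sample_html = """
<img src="a.jpg" alt="pants">
<img src="b.jpg" alt="pants">
<img src="c.jpg" alt="Acme navy wool dress pants">
<img src="d.jpg">
"""
missing, duplicated = audit_alt_text(sample_html)
print("missing alt:", missing)
print("duplicated alt:", duplicated)
```

In the sample, the two images labelled only "pants" are the kind of case the interview warns about, while the third carries a brand and a color.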

The other thing, of course, that you can do with Webmaster tools is opt into having your images used in the image labeler, which is a great way to have other people do your work for you. We’ll add all the images from your site into the image labeler, which is, I don’t know if you’ve used it, but it’s very addictive.

You just go online, you get paired up with someone, you get shown an image, you type in as many descriptions of that image as you can and anytime you match your anonymous partner, you get points. And people really like seeing how many points they can get. But basically all that is metadata that helps us label those images. And so it can help have your images be returned for more of the queries than they otherwise would have. So it can help get you more traffic for sure that way. So I always recommend people do that.

Stephan: That game actually originated from Carnegie Mellon University, is that right?

Vanessa: Yes, absolutely. I actually met the professors who worked with students at the Grace Hopper conference last year. And Anna, she was all excited. She was like, “How do people like it? We just love it!”

Stephan: Let’s switch topics for a moment. Let’s talk about the One Box, which is where Google results from other Google vertical search engines like Froogle and Google Maps and so forth are brought into the main web results and precede the main, natural results. So what are the tips or tricks that you would share with our listeners as to getting more visibility in the One Box?

Vanessa: The whole point behind those types of things is that we’re always looking to get the most relevant and useful results possible for users. We want them to get on to our home page and off the search results page on to wherever they want to go as quickly as possible. And so this is obviously one way of doing that.

So what I would do is take a look at what your site’s about, do the searches, see what types of One Box results come back. For instance, I think a good example of this is local types of results. If you’re a local business, we have a local One Box. And how we get that is from our local business center, so if you aren’t in that, then you’ll never show up in the One Box.

And so a very easy thing people can do is just go to the local business center, verify that you’re the owner of the business, put your information in and that’ll help you pop up there. So take a look and see what types of One Box results come up and how could you add your site there. From the local perspective, that’s really easily overlooked by a lot of businesses and it’s so easy and it can get really great results.

Stephan: So, speaking of local results, you can actually verify your business address and get some additional benefit out of Google Local that way. Do you have any thoughts on that?

Vanessa: Everyone in local business should put as much information in there as possible.

Stephan: Let me go back to keyword research for a moment. There are a couple of tools I like from you guys that give insight into which keywords are popular and which ones are not: one of them is Google Suggest and the other is Google Trends. So Google Suggest will give you some keyword popularity insight. If you have, let’s say, the two search terms that you’re comparing that start with the same word, those are in order of popularity, those recommendations that are delivered by Google Suggest, correct?

Vanessa: You know, I don’t actually know, because I don’t work on that site specifically, but that would make sense.

Stephan: Then there’s Google Trends. Have you played around with that much?

Vanessa: I have. I have; that’s pretty fun.

Stephan: So you can overlay one set of keyword popularity trend data, like let’s say “digital camera” and see the historic popularity of that search term over the past 12 months and then you can overlay on the same graph, let’s say, “digital cameras”. So what sort of insight do you think this could provide, let’s say, an online retailer, if they’re trying to ascertain what keyword markets to go after?

Vanessa: Yeah, I think it can be very interesting when you’re doing your research and you think you want to go after a certain term to go ahead and pop it into Trends. And one thing that you can do is ask around people that you know, “How would you search for these various items?” and see what kind of words they come up with.

Because I do find sometimes that people are optimizing for a particular word but that’s not a word anyone actually searches for. You certainly want to optimize for the words that people are searching for. So ask around, see how people would do a search, then go ahead and pop that into Trends and see what actually does have the most volume of traffic.

And going back to the local type things, it gives you regional results too. You know, you can see where the traffic is coming from most often. And I think that can help you do a certain amount of targeting if you’re a site where that would be relevant, like, say, a travel site.

Stephan: Let’s move on to duplicate content for a few moments. So, there’s been talk of a duplicate content penalty and a duplicate content filter, so my understanding from speaking with various people within Google is that there is a filter for duplicate content, not so much a penalty, though. It’s a rare occurrence to penalize a site for duplicate content; it’s more of a query time filter so that it tries to eliminate search results that look very similar to each other to improve the search experience for the user. Is that correct?

Vanessa: Yeah, absolutely right. If you go back to the goal of the Google search result, it’s to give the best result to the searcher. So, obviously, showing a bunch of pages with the same stuff on it isn’t going to be as useful to the searcher as showing content that has more variety in it.

People are really worried about this because they think, “Oh, well, there’s a lot of information in my site that might be sort of duplicate. Am I going to get this penalty and get taken out of the index?” And, as you say, that’s not the case at all. It’s just that as we go through to return the search results, if we come across pages of your site that seem to be about the query but they also seem to be like each other, we’re just going to pick one and show it.

If that’s OK with you, then you don’t have to do anything because our algorithms are smart and so they’re going to pick a good page and return a good result. So most people shouldn’t even worry, especially people who have a website but they don’t know a lot about the technical details, they’re not an SEO or whatever, they just have their site. It’s no problem. We’ll return good results from that site.

If, though, you are more interested in having more of the control over what’s returned, there’s a number of things that you can do. Take a look at the pages. What I tell people to do is look at the page, look at the text of the page that is unique so, for instance, not the navigational element, not the footer, not the images, not the stuff that might be on every page. Really isolate the text of the page that’s unique.

How much is there? I think a lot of times when you’re looking at a page with the whole layout of the site, you don’t really always notice when a page doesn’t have a lot of unique information on it. Because you’re looking at it in this context of this larger page. So once you really isolate what is the content of this page, that can help you see when you might have instances where there’s a lot of duplicate information.

There is, of course, the example that I’ve used before, which is that if you have two pages, one about Boulder and one about Denver, and they basically have the same information on them because really both pages only talk about Colorado, just say, “Hey, this is a page about Colorado; it has Denver, it has Boulder” and then have the information there. You can just have one page. And if you really want two pages, just find ways to have each page have unique information on it.

So that’s sort of the duplicate within your site.
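Editor's note: the advice above about isolating the unique text of a page can be tried out with a short script. What follows is only a rough sketch, not anything Google uses. It assumes the site wraps its shared navigation and footer in `nav`, `header`, and `footer` elements, and the class and function names are made up for illustration.

```python
from html.parser import HTMLParser

# Assumption for this sketch: these elements hold site-wide boilerplate
# (navigation, footer, scripts, styles), not page-specific content.
BOILERPLATE_TAGS = {"nav", "header", "footer", "script", "style"}


class UniqueTextExtractor(HTMLParser):
    """Collects the words of a page that sit outside boilerplate elements."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0  # > 0 while inside a boilerplate element
        self.words = []

    def handle_starttag(self, tag, attrs):
        if tag in BOILERPLATE_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in BOILERPLATE_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self.words.extend(data.lower().split())


def main_text_words(html):
    """Return the lowercased words of the page's non-boilerplate text."""
    parser = UniqueTextExtractor()
    parser.feed(html)
    parser.close()
    return parser.words


def overlap_ratio(html_a, html_b):
    """Jaccard similarity of the two pages' main-text word sets (0.0 to 1.0)."""
    a, b = set(main_text_words(html_a)), set(main_text_words(html_b))
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)
```

A ratio close to 1.0 for a "Boulder" page and a "Denver" page would suggest the two pages say the same thing once the layout is stripped away, which is the situation described above.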

There’s the other issue of having multiple versions of a URL that point to the same page. And again, the same thing will happen. We’ll just pick a page. But if you care about which version of the URL that we show, then there’s certainly things that you could do.

A redirect is nice, because that way, if people link to multiple versions of the page, you can have all of the PageRank and everything ultimately go to one place, which is the page that we’ll index. So that’s always a nice way to go if you can. So yeah, I wouldn’t really worry as much as people do about that.
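Editor's note: a minimal sketch of the redirect idea, under some loud assumptions. The host `www.example.com` and the rule that `/index.html` collapses to `/` are hypothetical choices for illustration; the interview does not say which version of a URL should be the preferred one. The function only decides whether a permanent (301) redirect is needed and where it should point. Wiring it into a real web server is left out.

```python
from urllib.parse import urlsplit, urlunsplit

CANONICAL_HOST = "www.example.com"  # hypothetical preferred host name


def canonical_url(url):
    """Map any version of a URL onto the single preferred version."""
    parts = urlsplit(url)
    path = parts.path or "/"
    # Assumption: "/index.html" and "/" are the same page on this site.
    if path.endswith("/index.html"):
        path = path[: -len("index.html")]
    return urlunsplit((parts.scheme, CANONICAL_HOST, path, parts.query, ""))


def redirect_for(url):
    """Return (301, target) if the URL should redirect, or None if it is canonical."""
    target = canonical_url(url)
    if target == url:
        return None
    return 301, target
```

With this in place, links to `http://example.com/index.html` and `http://example.com` would both end up at `http://www.example.com/`, so the links to every version are consolidated on one page.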

Stephan: Let’s say that hypothetically you have an e-commerce site, and you’re going through a rewriting process of your URLs to make them more search engine friendly, eliminating some of the parameters in the URL, like session IDs or user IDs or superfluous flags and so forth, just trying to simplify the URL structure.

And you also have these versions of the pages that are already out there with more complex URLs, and you haven’t changed all the internal links within your site to point to the new URL structure, so you end up with quite a lot of duplicate pages: the same page, but at two different URLs. So what you’re saying is the online retailer is not going to get penalized for that situation. Is that right?

Vanessa: Right. We’re just probably not going to return both pages, just because it wouldn’t make for a useful experience. But that’s OK with you, because if both pages are the same, you wouldn’t want both pages returned anyway. We’re going to return the one page, and that’s going to get you your traffic. So yeah, it’s not going to cause us to say, “This whole site’s bad!” [laughs] So don’t get too stressed out.
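Editor's note: the URL simplification Stephan describes (dropping session IDs, user IDs, and similar parameters) might look something like the sketch below. The parameter names `sessionid` and `userid` are hypothetical; a real site would substitute whatever its own platform appends.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical names of parameters that do not change the page content.
SUPERFLUOUS_PARAMS = {"sessionid", "userid"}


def simplify_url(url):
    """Drop superfluous query parameters, keeping the rest in their original order."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in SUPERFLUOUS_PARAMS
    ]
    # The fragment is dropped as well, since it is never sent to the server.
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))
```

Running the old, parameter-heavy URLs through a function like this gives the simplified address that internal links (and any redirects) could point to.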

Stephan: Another situation that I think people get stressed out over is when they have a blog. And of course, the way that a blog is set up, you have the permalink page with the blog post on it, but you have all these other pages where that same blog post appears. So it’s on the archives by date for that month. It’s in that categories page on the blog. It’s in various tag pages, if you’ve got a tag plugin installed on your blog.

So it’s all these different copies of the same content in different places on your blog. That’s not something that a blogger should really worry about either. Is that right?

Vanessa: No, because again, we’re probably just going to return one of the pages. The one that we return is probably going to be based on the query. Like if you have a query that pertains to one post, we might return the post, and if you have a query that pertains to multiple posts that are all on one archive page (you know how sometimes you might have multiple posts there), then we might return that page.

Certainly, if a blogger is worried about that, you could block different versions with a robots file, if you really care about which version is returned, like you might say, “Oh, I don’t want my archived pages returning, and I want the permalink version returning and that’s it.” But I wouldn’t worry. We’re not going to do any kind of a penalization.
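Editor's note: as an illustration of blocking the non-permalink versions with a robots file, here is a small sketch that uses Python's standard `urllib.robotparser` to check a candidate robots.txt before publishing it. The paths `/archives/`, `/category/`, and `/tag/`, and the host `blog.example.com`, are assumptions about how a blog might be laid out; they do not come from the interview.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for a blog whose date, category, and tag listings
# live under these paths, while permalinks live elsewhere.
ROBOTS_TXT = """\
User-agent: *
Disallow: /archives/
Disallow: /category/
Disallow: /tag/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def is_crawlable(path, agent="Googlebot"):
    """Report whether the rules above allow the given crawler to fetch the path."""
    return parser.can_fetch(agent, "http://blog.example.com" + path)
```

Checking a permalink such as `/2007/03/my-post/` alongside a listing such as `/tag/seo/` is a quick way to confirm that the rules block only the copies and not the posts themselves.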

Stephan: But if you want the maximum opportunity to rank well, you’d want to aggregate the various versions together, as far as having one definitive version, with all votes, all the links, internally and externally, pointing to that one single version.

Let’s talk for just a moment about Flash and JavaScript and Ajax. So, what does Google think of a Flash-based website or Flash-based navigation? And how are you dealing with JavaScript and Ajax, as far as crawling and counting the content that’s embedded within JavaScript or Ajax?

Vanessa: We’re always looking at better and newer ways of crawling the content of the web, as the web itself evolves. However, it still is the case that Google is a text-based search engine, right? People type in words and we match that up. So things like Flash and images are just hard to bring up for a text-based search, just overall.

So what I always tell people is to use Flash and images in a useful way. I see some sites that put everything in Flash, so all the text is in Flash, all the navigation is in Flash. And there’s not really necessarily value that’s added there. Certainly, things like an interactive type of a demo, that you would want to be in Flash, right? But for just text, you’re not really adding a lot of value.

So I would only use Flash, say, for those types of things where it really makes a lot of sense. Try to keep navigation out of Flash. I see pages that say “Enter,” and then you’ll find in the search results that the snippet for your search result is “Enter.” So I would just recommend against that.

If possible, try to put the text in HTML. Have an HTML version, if possible. I mean, it’s not just for search engines; it’s really for visitors, too. I was going to a site a couple of days ago, and I just got a new laptop and I apparently didn’t have the right version of Macromedia installed to view this website.

So it’s like, oh, it wanted me to install the new version and it wanted me to close my browser. But I have like 50 tabs open, [laughs] and I’m in the middle of doing something in each of them, so I just closed the website, because it just wasn’t, at the time, worth it for me to go through all of the trouble. So I think it’s nice to give your visitors an option other than Flash, beyond the value it gives from a search engine’s perspective.

In terms of JavaScript and Ajax, kind of the same thing. At this point, what does your site look like with JavaScript turned off? Not everyone’s going to have it. Search engines aren’t going to be able to crawl it. You’ve got people on screen readers, mobile devices. I have a Blackberry that I use all the time, and you don’t know how many sites that I really want to access, and I get this error: I don’t have the right version of JavaScript on my Blackberry, so I can’t access the site.

When we say these things that will help search engines, it’s really not just for search engines; it’s for a lot of your visitors as well. There’s just so many different types of browsers and so many types of experiences out there, you don’t want to turn anyone away. So, make the content available in those types of situations.

Stephan: Specifically with Ajax, that relies on JavaScript. And a way that Ajax is oftentimes used is to pull in content without having to reload the page. So you click on a little triangle thing and it loads some additional content in underneath, and that’s sometimes quite useful and cool. But from a search engine’s standpoint, that’s not really an optimal scenario, because that content that is just brought in, it’s not available to the search engines. Is that right?

Vanessa: Right. We’re not going to see it. Probably the best thing to do is to just make sure the content around it accurately describes the page. So we’ve got to have enough content on the rest of the page to know what the page is about, in order to index and show it in search results. I just would be aware of that as you’re making sort of Ajax-y pages, that what’s in those little Ajax-y parts may not be indexed as well as the rest of the site.

Stephan: Right. Or alternatively, perhaps there’s a way you can use JavaScript or Ajax to not display that content, but it’s part of the HTML, until you click on the icon. In that case, that content is available. It’s just that if you have it load in from another place on the web once you click on it, rather than show it on the page, that content is going to be unavailable to the spiders.

Vanessa: Yeah. There’s a variety of things that you can do. And I would just take a look and see what the pages look like with JavaScript turned off, and can you access the content of the page. It’s really kind of a good diagnostic view of the page.
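Editor's note: the "turn JavaScript off and look" check can be approximated with a script that reads only the raw HTML, the way a client that does not run scripts would. This is a sketch under that assumption; the class and function names are invented, and it makes no claim about how any search engine actually processes pages.

```python
from html.parser import HTMLParser


class NoScriptText(HTMLParser):
    """Collects the text present in the raw HTML, ignoring script and style bodies."""

    def __init__(self):
        super().__init__()
        self._in_code = False  # True while inside <script> or <style>
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._in_code = True

    def handle_endtag(self, tag):
        if tag in ("script", "style"):
            self._in_code = False

    def handle_data(self, data):
        if not self._in_code and data.strip():
            self.chunks.append(data.strip())


def text_without_js(html):
    """Return the text available to a client that never executes JavaScript."""
    parser = NoScriptText()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


# Content already in the HTML, merely hidden until the visitor clicks.
IN_PAGE = ('<a href="#" onclick="toggle()">Details</a>'
           '<div id="details" style="display:none">Ships in two days.</div>')

# Content fetched from elsewhere by a script only after the click.
LOADED_LATER = ('<a href="#" onclick="load()">Details</a><div id="details"></div>'
                '<script>function load() { /* would fetch the details */ }</script>')
```

For the two sample pages, `text_without_js(IN_PAGE)` includes the shipping sentence, while `text_without_js(LOADED_LATER)` contains only the link text, which mirrors the distinction Stephan draws above.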

Stephan: Progressive enhancement is a way, with Flash and just the web in general, to have the page load and display maybe not as perfectly as you’d like it to but still work on a handheld device or a Blackberry, while on a recent browser version it will have the full functionality. So you can use progressive enhancement for Flash. You can use progressive enhancement, from a web standards standpoint, in CSS. So any thoughts on that? Is that something that Google likes or dislikes or is ambivalent about? Another name for it is graceful degradation.

Vanessa: It would depend on the particular instance and the intent of it, but I think, generally, that can work pretty well, as long as you’re serving up sort of the same results for any Blackberry user; you come and you see the content.

One example, going back to when JavaScript is turned off or Flash isn’t available: is there an HTML version available? If you’re serving that up to anyone who comes to the site, and not just to a search engine that visits the site, then that can be a good workaround. Of course, it would depend on the specific way it’s implemented, but overall, if you’re serving the same thing to your users as to search engines, then you’ll be OK.

What we care about is that when someone clicks through the search results to your page that they get to the same thing that we told them they were going to get to.

Stephan: OK. Well, any final words that you’d like to share with our listeners?

Vanessa: Well, I would just say I really enjoy the collaboration that we’ve started, kind of helping site owners improve their sites and having site owners kind of help us improve our search results. So I think it’s awesome, and I look forward to a lot more of it in the future. It’s really exciting!

Stephan: Yeah. Well, you’ve been such a wonderful help to the webmaster community. And I’ve actually seen you on the Google Groups for Webmaster Central, chiming in, offering help and assistance to a lady in Europe who was just going crazy over the Christmas holiday. And here you were taking time out of your holiday to respond to someone who just was not listening.

Vanessa: Well, people have different levels of experience with the web and with search engines, and so I try to keep that in mind as I’m helping people, that sometimes they do get a little panicked if they haven’t really been around a lot and don’t know things that seem normal to us or that we’ve just known for so long. People who are newer don’t really know all the things that we take for granted, so I try to help out when I can.

Stephan: And that Google Webmaster Central blog is a fabulous resource. Can you share the URL with our listeners?

Vanessa: Sure. Well, just go to is Webmaster Central. And there’s where you’ll find the link to the discussion forum, the blog, the help center, and then all of the tools. That’s probably the best URL to give out.

Stephan: All right. Well, thanks very much, Vanessa. Really appreciate your time.

Vanessa: Yeah, thanks for talking with me. It’s been fun.

Stephan: OK. And thanks, everyone, for listening.

Announcer: You’ve been listening to the official podcast of AMA Hot Topics: Search Engine Marketing, where conference chair and Net Concepts President Stephan Spencer interviews the gurus who will be speaking at our conferences in San Francisco on April 20th, New York City on May 25th, and Chicago on June 22nd. To register, or for more information, visit the conference website at Hope to see you there.