How does MSN Search stack up to Google and Yahoo!

February 8th, 2005

Originally published in MarketingProfs

With Microsoft throwing its hat into the ring alongside Google and Yahoo!, consumers as well as search marketers have more choices.

Choices and competition are good for the marketplace. But, for search marketers, along with more choices comes potential confusion — about what works and what doesn’t work on each engine.

I’ve assembled this side-by-side comparison of the top three search engines to serve as a quick reference sheet about each engine’s strengths, weaknesses, acceptance of search engine optimization best practices, and tolerance levels for SEO worst practices:

Recommended Practices

| Practice | MSN | Google | Yahoo! |
| --- | --- | --- | --- |
| Keyword-rich title tags | Yes | Yes | Yes |
| Keyword-rich body copy | Yes | Yes | Yes |
| Links from “important” sites | Yes | Yes | Yes |
| Keyword-rich link text | Yes | Yes | Yes |

Practices to Avoid

| Practice | MSN | Google | Yahoo! |
| --- | --- | --- | --- |
| Complex URLs | >5 parameters | >6 parameters | Unknown, but less tolerant than Google and MSN Search |
| Frames | To a degree (but orphan frames are common) | To a degree (but orphan frames are common) | To a degree (but orphan frames are common) |
| JavaScript-based links | No | To a degree | To a degree |
| Too many links per page | Unknown | >100 links | Unknown |
| Page is too many clicks away from home page | >7 clicks | Unknown | Unknown |
| Automated queries | Currently not discouraged | Excessive queries from an IP address can result in the IP being blocked; violates Google’s ToS unless using Google’s API | Excessive queries from an IP address are ignored |

Vital Statistics

| Statistic | MSN | Google | Yahoo! |
| --- | --- | --- | --- |
| Amount of page indexed | 150K | 101K | 150K (500K for PDFs) |
| Default no. of results per page | 10 | 10 | 20 |
| Size of index | 5 billion documents | 8 billion documents | Undisclosed |
| Market share | 15% | 45% | 32% |

Querying

| Feature | MSN | Google | Yahoo! |
| --- | --- | --- | --- |
| inurl: | No | Yes | Yes |
| filetype: | No | Yes | Yes |
| link: | Yes | Yes (but results are only a sampling) | Yes |
| linkdomain: | No | No | Yes |
| Boolean logic (+, OR, -) | Yes | Yes | Yes |
| Local search | “Near Me” (currently very limited) | Google Local | Yahoo! Local |
| Reordering the search results | Yes (by freshness, popularity, exactness of match) | No | No |
| Query word limit | No limit | 32 words | No limit |
| Results via RSS | Yes | No | Yes (Yahoo! News only) |

Keyword-Rich Title Tags

The text within your page title (also known as the title tag) is given more weight by the search engines than any other text on the page. The keywords at the beginning of the title tag are given the most weight. So, by leading with keywords that are relevant to your business and popular with searchers, you make your page appear more relevant to those keywords in a search.

Keyword-Rich Body Copy

Relevant and popular keywords should also be included in the page’s body copy, particularly near the top of the HTML, as they will then be weighted more heavily by the search engines. Be careful not to go overboard — that is, to the point that your copy doesn’t read well. Ideally, incorporate at least 200 to 250 words on each page so the search engines have enough content to determine the theme of the page.
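As a rough way to check a page against the 200-to-250-word guideline, the sketch below counts the visible words in an HTML document using Python's standard library. The choice to skip script, style, and title text, and to count whitespace-separated tokens as words, is an assumption made for illustration; it says nothing about how any search engine actually counts.

```python
from html.parser import HTMLParser

# Tags whose text is not body copy (an assumption for this sketch).
SKIPPED_TAGS = {"script", "style", "title"}


class VisibleWordCounter(HTMLParser):
    """Collects whitespace-separated words outside the skipped tags."""

    def __init__(self):
        super().__init__()
        self.words = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in SKIPPED_TAGS and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.words.extend(data.split())


def count_body_words(html):
    """Return the number of visible words in the given HTML text."""
    parser = VisibleWordCounter()
    parser.feed(html)
    parser.close()
    return len(parser.words)


sample_page = (
    "<html><head><title>Ignore me</title><style>p{color:red}</style></head>"
    "<body><p>one two three</p><script>var x = 1;</script>"
    "<p>four five</p></body></html>"
)
print(count_body_words(sample_page), "visible words (guideline: at least 200)")
```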

Links From ‘Important’ Sites

“Link popularity” — the number of links that point to your site — is a key criterion that search engines use for ranking pages, but with an important twist: there’s a weighting factor placed on each link to take into account the importance of the page linking to you. Think of each link as a “vote.” But this isn’t a democracy: a “vote” for your site from CNN.com is worth much more than a “vote” from Jim-Bob’s personal home page. Google has given a trademarked name to this importance-scoring algorithm: PageRank. Both MSN Search and Yahoo! appear to use similar importance scoring systems. Accordingly, links from “important” sites are of critical importance to your search visibility.
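To make the weighted "vote" idea concrete, here is a simplified sketch in the spirit of the published PageRank concept. It is not the actual algorithm used by Google, MSN Search, or Yahoo!, whose details are not public. The damping factor of 0.85, the iteration count, and the tiny link graph are all assumptions chosen for illustration.

```python
def importance_scores(links, damping=0.85, iterations=50):
    """Score pages so that a link from a high-scoring page counts for more.

    links maps each page name to the list of pages it links to.
    """
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    score = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        new_score = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its voting power evenly among its links.
                share = damping * score[page] / len(targets)
                for target in targets:
                    new_score[target] += share
            else:
                # A page with no outgoing links spreads its vote over all pages.
                for target in pages:
                    new_score[target] += damping * score[page] / n
        score = new_score
    return score


# Hypothetical graph: both sites have exactly one inbound link, but
# "your-site" is linked from a page that many others link to.
links = {
    "big-news-site": ["your-site"],
    "hub-a": ["big-news-site"],
    "hub-b": ["big-news-site"],
    "hub-c": ["big-news-site"],
    "personal-page": ["rival-site"],
    "your-site": [],
    "rival-site": [],
}
scores = importance_scores(links)
print(scores["your-site"] > scores["rival-site"])
```

In this toy graph the single link from the well-linked page outweighs the single link from the obscure one, which is the point of the "not a democracy" analogy.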

Keyword-Rich Text in the Links From These Sites

Search engines, MSN Search included, treat the anchor text of a hyperlink as highly relevant to the page being linked to. So use good keywords in the link text to help the engine better ascertain the theme of the page you are linking to. Keep the link text relatively succinct and tightly focused on just one keyword or key phrase. The longer the link text, the more diluted the overall theme being conveyed.

Complex URLs

Search engines are cautious of pages that are dynamically generated, because the search engine spider could end up in an infinite loop known as a “spider trap.” The search engine can tell if the page is dynamic by the appearance of a question mark (?) or the text “cgi-bin” in the URL.

The part of the URL after the question mark is known as the “query string,” which can consist of one or many parameters. Each parameter consists of a name and a value, separated by an equal sign (=). Each parameter is separated by an ampersand (&). The more parameters in a URL, the more likely that the search engine will end up in a “spider trap”; that is, the URLs keep changing but the content is the same as other pages that have already been spidered. MSN Search’s tolerance level is six parameters — in other words, five ampersands.
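The anatomy described above can be checked with a few lines of Python's standard library. This is a minimal sketch: the example URL is hypothetical, and the functions only count parameters and look for the two "dynamic page" signals mentioned earlier. They do not reproduce any engine's real crawling rules.

```python
from urllib.parse import urlsplit, parse_qsl


def count_query_parameters(url):
    """Return the number of name=value parameters in the URL's query string."""
    query = urlsplit(url).query
    # keep_blank_values=True so that "a=&b=2" still counts as two parameters.
    return len(parse_qsl(query, keep_blank_values=True))


def looks_dynamic(url):
    """Apply the two signals from the text: a question mark or "cgi-bin"."""
    return "?" in url or "cgi-bin" in url


# Hypothetical URL with six parameters, i.e. five ampersands.
url = ("http://www.example.com/catalog.asp"
       "?cat=12&item=34&sess=abc&sort=price&page=2&lang=en")
print(count_query_parameters(url), "parameters; dynamic:", looks_dynamic(url))
```

Running a site's URLs through a check like this is a quick way to spot pages whose parameter count approaches the thresholds listed in the comparison table.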

Frames

Search engines have problems crawling sites that use frames (i.e., part of the page moves when you scroll but other parts stay stationary). According to Google, “Frames tend to cause problems with search engines, bookmarks, emailing links and so on, because frames don’t fit the conceptual model of the Web (every page corresponds to a single URL).”

Furthermore, if a frame does get indexed, searchers clicking through to it from search results will often find an “orphaned page”: a frame without the content it framed, or content without the associated navigation links in the frame it was intended to display with. Often, they will simply find an error page.

JavaScript-Based Links

If you expect search engine spiders to execute JavaScript code (or Flash or Java) to access links to further pages within your site, you’ll often be disappointed with the results. Google and Yahoo! have a limited ability to deal with JavaScript, but relying on it is still not a search-engine-friendly approach.

Too Many Links Per Page

Google cautions: “Keep the links on a given page to a reasonable number (fewer than 100).” What’s the consequence of not heeding Google’s advice? No one’s entirely sure, but it could include a significant discounting of the PageRank voting power of the page containing all the links. In any event, it’s best to keep to fewer than 100 links per page just for usability/readability alone.
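A page can be audited against the 100-link guideline with a short script. The sketch below counts anchor tags that carry an href attribute; treating every such tag as one link is an assumption, since the engines do not document how they count.

```python
from html.parser import HTMLParser


class LinkCounter(HTMLParser):
    """Counts <a> tags that have a non-empty href attribute."""

    def __init__(self):
        super().__init__()
        self.count = 0

    def handle_starttag(self, tag, attrs):
        if tag == "a" and any(name == "href" and value for name, value in attrs):
            self.count += 1


def count_links(html):
    """Return the number of hyperlinks found in the given HTML text."""
    parser = LinkCounter()
    parser.feed(html)
    parser.close()
    return parser.count


# Hypothetical page with 120 links, above the guideline.
crowded_page = "".join('<a href="/p%d">Page %d</a>' % (i, i) for i in range(120))
total = count_links(crowded_page)
print(total, "links;", "over" if total >= 100 else "within", "the guideline")
```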

Page Is Too Many Clicks Away From Home Page

If MSNbot (MSN’s search engine spider) has to traverse eight pages on your site to reach pages that nobody but you links to, it may choose not to index those pages.
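Click depth can be measured with a breadth-first search over a site's internal links. The sketch below assumes you already have a map of which page links to which (the site_links dictionary and its page names are hypothetical), and it flags pages more than seven clicks from the home page, the threshold given for MSN Search in the comparison table.

```python
from collections import deque


def click_depths(site_links, home):
    """Return the minimum number of clicks from home to each reachable page."""
    depths = {home: 0}
    queue = deque([home])
    while queue:
        page = queue.popleft()
        for target in site_links.get(page, []):
            if target not in depths:
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths


# Hypothetical site where each page links only to the next one in a chain.
site_links = {"/p%d" % i: ["/p%d" % (i + 1)] for i in range(9)}
depths = click_depths(site_links, "/p0")
too_deep = sorted(page for page, depth in depths.items() if depth > 7)
print("More than 7 clicks from home:", too_deep)
```

Adding a link from the home page or a site map to the deep pages shortens the path and brings them back within reach.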

Automated Queries

Automated queries cram the search engine’s servers with useless searches and distort search data. Conducting automated queries on Google violates Google’s terms of service unless the querying tool goes through Google’s API (which allows a maximum of 1,000 queries per day). Note that the API’s results are not very reliable (they tend to differ from the regular Google search results).

Amount of Page Indexed

The search engines will index only a finite amount of content on any Web page. For example, a two megabyte Web page will get only a fraction of its content included in a search engine’s index.
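The arithmetic behind that example is easy to sketch using the per-engine figures from the Vital Statistics table. The code assumes "K" means 1,024 bytes and that a megabyte is 1,024 × 1,024 bytes; the article does not specify either.

```python
# Per-engine limits from the comparison table, assuming 1K = 1,024 bytes.
INDEX_LIMIT_BYTES = {
    "MSN Search": 150 * 1024,
    "Google": 101 * 1024,
    "Yahoo!": 150 * 1024,
}


def indexed_fraction(page_size_bytes, engine):
    """Return the share of a page (0.0 to 1.0) that fits under the limit."""
    if page_size_bytes <= 0:
        raise ValueError("page size must be positive")
    limit = INDEX_LIMIT_BYTES[engine]
    return min(1.0, limit / page_size_bytes)


two_megabytes = 2 * 1024 * 1024
for engine in INDEX_LIMIT_BYTES:
    share = indexed_fraction(two_megabytes, engine)
    print("%s: about %.1f%% of the page" % (engine, share * 100))
```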

Default Number of Results Per Page

The search results page displays a certain number of search results by default. This number can be changed by the search user under the “Preferences” or “Settings” page.

Size of Index

Simply, index size is the number of documents in the search engine’s database of Web pages, known as an index. Put another way, it’s a measure of how all-encompassing a given search engine is of the entire Web.

Certainly, no search engine today is anywhere near comprehensive. Indeed, it’s estimated that the “Invisible Web” — the pages from database driven Web sites that create Web pages on demand and are not easily accessible to search engine spiders — comprises a whopping 91,850 terabytes of data (Source: Berkeley’s “How Much Information” study).

Market Share

For the purposes of this article, the market is defined as the total number of search queries conducted across all search engines — not the number of search users.

Inurl:

Use the inurl: operator to restrict the search results to pages that contain a particular word in the Web address. This can be especially useful if you want to view all the pages within a particular directory on a particular site, such as inurl:downloads site:www.bigfootinteractive.com or all the pages with a particular script name, such as inurl:ToolPage site:www.vfinance.com.

Filetype:

You can restrict your search to Word documents, to Excel documents, to PDF files, or to PowerPoint files by adding filetype:doc, filetype:xls, filetype:pdf, or filetype:ppt, respectively, to your search query.

Want a great PowerPoint presentation on email marketing that you can repurpose for a meeting? Simply search for email marketing filetype:ppt. Need a marketing plan template? Since the template would most likely be a Word document, cut through the Web page clutter with a search on marketing plan template filetype:doc. (Sidenote: Don’t link to your own marketing plans if you don’t want them showing up in the search engines.)

Google allows any extension to be entered in conjunction with the filetype: operator, including htm, txt, php, asp, jsp, swf, etc. Google then matches on your desired extension after the filename in the URL.

Link:

The link: operator displays a list of pages that link to the specified Web page. Follow this operator with a Web address, such as link:www.marketingprofs.com, to find pages that link to the MarketingProfs home page. On Yahoo!, you’ll need to include http:// after link: (for example: link:http://www.marketingprofs.com). Both Yahoo! and MSN Search, but not Google, allow you to append further refinements onto this operator, such as excluding links within the same site (for example: link:www.marketingprofs.com -site:www.marketingprofs.com on MSN Search; or link:http://www.marketingprofs.com -site:www.marketingprofs.com on Yahoo!).
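The per-engine differences described above can be captured in a small helper that builds the query string. This is only a sketch of the syntax rules stated in this section; the function name and the engine labels are hypothetical, and the helper does not send any queries.

```python
def inbound_link_query(engine, host, exclude_own_site=True):
    """Build a link: query for the given engine, following the rules above."""
    if engine == "Google":
        # Google does not allow further refinements alongside link:.
        return "link:" + host
    if engine == "MSN Search":
        query = "link:" + host
    elif engine == "Yahoo!":
        # Yahoo! needs http:// after link:.
        query = "link:http://" + host
    else:
        raise ValueError("unknown engine: " + engine)
    if exclude_own_site:
        query += " -site:" + host
    return query


for engine in ("Google", "MSN Search", "Yahoo!"):
    print(engine + ":", inbound_link_query(engine, "www.marketingprofs.com"))
```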

Linkdomain:

The linkdomain: operator is a superior link-checking tool to the link: operator, as it shows pages that link to any and all pages of the specified site. You can append further refinements onto this operator, such as excluding links within the same site (for example: linkdomain:www.marketingprofs.com -site:www.marketingprofs.com).

Boolean Logic (+, OR, -)

Boolean logic, named after George Boole, is the logic of sets that makes use of the logical operators AND, OR and NOT to create additional sets. You may find that you want to match on both the singular and plural forms of a word. In that case, you can use the OR search operator, as in direct marketing consultant OR consultants; you can also group the words with their alternatives together using parentheses. For instance, a search query of female car (buyer OR buyers OR shopper OR shoppers) statistics would match on any of the four phrases plus the word statistics. Note that the OR needs to be capitalized to distinguish it from or as a keyword.

In addition to the OR operator, there is an AND operator. However, it is not necessary to specify it, because it is automatically implied. So don’t bother with it.

The search engines also offer an exclusion operator: the minus sign (-). On MSN Search you can also use NOT. The operator works as you might expect, eliminating from the search results the subsequent word or quote-encapsulated exact phrase. For example, confidential (“business plan” OR “marketing plan”) -template will not return pages in the results if they mention the word template, thus effectively eliminating the sample templates from the results and displaying a much higher percentage of actual business plans and marketing plans. (As an example of a query with a phrase negated instead of a single word, consider “marketing plan” -“business plan”).

The AND and OR operators can be abbreviated as a plus sign (+) and the pipe symbol (|), respectively. Thus, the previous search query can be entered as confidential (“business plan” | “marketing plan”) -template.
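The operators in this section can be assembled programmatically. The helper below is a minimal sketch (the function and parameter names are hypothetical) that quotes multi-word phrases, groups alternatives in parentheses, and prefixes exclusions with a minus sign, reproducing the example queries above.

```python
def quote(term):
    """Wrap multi-word terms in double quotes so they match as a phrase."""
    return '"%s"' % term if " " in term else term


def build_query(required, any_of=(), excluded=(), use_symbols=False):
    """Combine required terms, OR alternatives, and exclusions into one query."""
    or_operator = " | " if use_symbols else " OR "
    parts = [quote(term) for term in required]
    if any_of:
        parts.append("(" + or_operator.join(quote(term) for term in any_of) + ")")
    parts.extend("-" + quote(term) for term in excluded)
    return " ".join(parts)


alternatives = ["business plan", "marketing plan"]
print(build_query(["confidential"], alternatives, ["template"]))
print(build_query(["confidential"], alternatives, ["template"], use_symbols=True))
```

The AND operator is left out on purpose, since it is implied between terms.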

Local Search

Local search allows the searcher to restrict the search results to within a certain geographical region. Google offers Google Local for this (local.google.com), Yahoo! offers Yahoo! Local (local.yahoo.com), and MSN Search offers the “Near Me” button (next to the “Search” button).

Both Google Local and Yahoo! Local provide a box to enter the geographic location next to the search box; MSN does not. MSN Search determines your location automatically through one of four means: your IP address, your default location as you defined it under Settings, your search terms (e.g., marketing consultants seattle will recommend a revised search with Near Me set to Seattle, WA), or by selecting from the Try Near list in location-based search results.

Reordering the Search Results

MSN Search offers a unique and useful feature for users: namely, one can reorganize the set of search results by freshness, by popularity or by exactness of match. This is accomplished either through using the “Search Builder” advanced search functionality then clicking on “Results Ranking” and using the slider bars, or by using the following three operators within your search query: frsh, popl and mtch.

If using the latter, follow the operator with an equal sign then a number from 0 to 100. All this must then be encapsulated in curly brackets {}.

The frsh operator allows you to emphasize sites that are fresher (i.e., more recently added to MSN Search’s index). On the freshness scale, 100 gives the most emphasis to recently updated sites, 0 to the least recently updated. Use it like so: branding {frsh=100}. The popl operator reorders results by popularity. For example, branding {popl=100} would place the most popular sites at the top.

Finally, the mtch operator allows you to specify how precisely to match the search results to your query words. For example, branding {mtch=0} emphasizes the exact matches at the top of the search results. Note that these three operators default to 50 if not specified.
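The operator syntax described above is simple enough to generate with a helper. The sketch below validates the 0-to-100 range and wraps each setting in curly brackets. Whether MSN Search accepts several of these operators in one query is an assumption here; the examples in this section use only one at a time.

```python
def with_ranking(query, frsh=None, popl=None, mtch=None):
    """Append MSN Search ranking operators such as {frsh=100} to a query."""
    settings = (("frsh", frsh), ("popl", popl), ("mtch", mtch))
    parts = [query]
    for name, value in settings:
        if value is None:
            continue  # Left unspecified, so the engine's default applies.
        if not 0 <= value <= 100:
            raise ValueError("%s must be between 0 and 100" % name)
        parts.append("{%s=%d}" % (name, value))
    return " ".join(parts)


print(with_ranking("branding", frsh=100))
print(with_ranking("branding", popl=100))
```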

Query Word Limit

Longer search queries generally return more relevant search results than shorter queries. However, there is a limit. In the case of Google, that limit is 32 words. Any word after the thirty-second is ignored. You can work around this limitation to some extent by clever use of the asterisk wildcard character. Specifically, consider replacing the most common words with asterisks.

Neither MSN Search nor Yahoo! has a query word limit. This is especially handy if you are restricting your results to a group of sites and the number in the group causes you to exceed Google’s word limit (i.e., when using the site: operator).
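Before sending a long query to Google, it can help to see which words fall past the limit. The sketch below assumes that every whitespace-separated token counts as one word; how Google actually tallies operators and punctuation is not covered in this article.

```python
GOOGLE_WORD_LIMIT = 32  # Limit stated above.


def split_at_limit(query, limit=GOOGLE_WORD_LIMIT):
    """Return (the part of the query that is kept, the list of ignored words)."""
    words = query.split()
    return " ".join(words[:limit]), words[limit:]


# Hypothetical query restricted to a large group of sites.
sites = ["site:www.example%d.com" % i for i in range(1, 18)]
long_query = "email marketing " + " OR ".join(sites)
kept, ignored = split_at_limit(long_query)
print(len(long_query.split()), "words in query;", len(ignored), "would be ignored")
```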

Results via RSS

RSS is an XML format designed for syndicating headlines and other Web content to other Web sites. It has evolved into a popular means for individuals to keep up with the latest articles and musings across favorite Web sites. RSS readers can be standalone programs, built into Web browsers like Firefox (e.g., the Sage Firefox extension), integrated into email clients like Outlook (e.g., NewsGator) or aggregated on sites like BlogLines, My Yahoo!, or MyMSN. RSS offers an unspammable continuous stream of information from the Web sites we choose to follow.

MSN Search offers the ability to subscribe to an RSS feed of search results for continuous monitoring of those results. At the bottom of each search results page, you’ll find an RSS graphic that links to the URL that you should supply to your favorite Web-based RSS aggregator (e.g., My Yahoo!, MyMSN) or newsreader program (e.g., NewsGator, Sage).
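For readers who would rather monitor such a feed with a script than with an aggregator, the sketch below reads the items out of an RSS 2.0 document with Python's standard library. The sample feed is made up for illustration and is not real MSN Search output; in practice you would fetch the feed from the URL behind the RSS graphic.

```python
import xml.etree.ElementTree as ET

# Made-up RSS 2.0 document standing in for a search results feed.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Sample search results</title>
    <item><title>First result</title><link>http://www.example.com/one</link></item>
    <item><title>Second result</title><link>http://www.example.com/two</link></item>
  </channel>
</rss>"""


def feed_items(rss_text):
    """Return a list of (title, link) pairs, one per <item> in the feed."""
    root = ET.fromstring(rss_text)
    return [
        (item.findtext("title", default=""), item.findtext("link", default=""))
        for item in root.iter("item")
    ]


for title, link in feed_items(SAMPLE_FEED):
    print(title, "->", link)
```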