Unlocking Google’s Hidden Potential as a Research Tool (Part 1 of 5)

August 3rd, 2004

Originally published in MarketingProfs

If you’re like me, you use Google every day to find things: news, technical support, events, tips, research documents and more.

Were you to master Google’s powerful search-refinement operators and lesser-known features, over a year’s time you could save days otherwise spent scouring irrelevant results. Perhaps even more enticing is the promise of elusive nuggets of market research and competitive intelligence out there waiting to be discovered. This five-part series will show you how to find what you need quickly and with laser-like accuracy.

With over 6 billion documents in its index, Google is a veritable treasure trove of information. Yet finding just the right document out of those billions, the one that answers your question, can be daunting. There’s good news for you, however. The pages you seek are about to rise to the top of the results, thanks to some of Google’s search-refinement operators that I’ll talk about here, in part one, titled “15 Ingredients to More Refined Searches.”

Next week, I’ll introduce you to the world of Google’s advanced search operators, such as filetype:, intitle:, inurl:, site:, and daterange:. In part three, we will address various features available from Google’s interface, such as Search Within Results, Similar Pages, SafeSearch filtering, spelling corrections, “I’m Feeling Lucky” and the Advanced Search page.

Part four will cover Google’s many other search properties, including Google News, Google Local, Google Personalized, Froogle, Google Directory, Google Catalogs, Google Groups and Google Images.

Finally, in the fifth and final part you can look forward to learning all about third-party tools and resources that enhance Google searching in various amazing ways.

15 Ingredients to More Refined Searches

If your search yields millions of results, your query is probably too broad. Rather than wading through pages and pages of search results, use these 15 ingredients to refine your search:

Multiple words: Avoid making one-word queries.

Case insensitivity: There’s no need to capitalize.

Stop words: Drop overly common words.

Exact phrase: Put quotes around phrases.

Word order: Order your words in the order you think they would appear in the documents you’re looking for.

Singular versus plural: Use plural if you think the word will appear in that form in the documents you’re looking for.

Proximity: Words close together in your search will favor documents with those words close together in the text.

Wildcard: * can substitute for a whole word in an exact phrase search.

Number range: .. between numbers will match on numbers within that range.

Punctuation: A hyphenated search word will also yield pages with the un-hyphenated version. Not so with apostrophes.

Accents: Don’t incorporate accents into search words if you don’t think they’ll appear in the documents you’re looking for.

Boolean logic: Use OR, NOT, (), |, and - to fine-tune your search.

Stemming: Google may also match on variations of your search word unless you tell it otherwise by preceding the word with +.

Synonyms: ~ in front of a word will also match on other words that Google considers to be synonymous or related.

Query length: A Google query can contain at most 10 words.

1. Multiple Words

The first key to refined searches is a multiple-word query. A one-word search query isn’t going to give you as targeted a search result. Searching for ohio car buyer statistics instead of statistics will obviously yield a smaller and more specific set of search results.

2. Case Insensitivity

Searches are case insensitive, so capitalizing the word Ohio in the above example is unnecessary, as it would return the same results.

3. Stop Words

Overly common words like the, an, of, in, where, who, and is are omitted from your query. Such words are known as “stop words.” Google will advise you on the search results page when it has left out a stop word from your query.

Avoid formulating your query as a question. A search like how many female consumers in ohio buy cars? is not an effective query, for two reasons. First, questions invariably contain stop words (how in this case). Second, the query will include other superfluous words that probably won’t appear in the text of the documents you are searching for (such as the word many or in). Thus, a large number of useful documents will have been eliminated.
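
To make the point concrete, here is a minimal Python sketch that trims a question down to its keywords before you search. The stop-word list contains only the words named in this article; the list Google actually uses is not given here, and the function name strip_stop_words is hypothetical.

```python
# Stop words named in this article only; Google's real list is an unknown.
STOP_WORDS = {"the", "an", "of", "in", "where", "who", "is", "how"}


def strip_stop_words(query: str) -> str:
    """Return the query lowercased, without stop words or question marks."""
    words = query.lower().replace("?", " ").split()
    kept = [word for word in words if word not in STOP_WORDS]
    return " ".join(kept)


print(strip_stop_words("How many female consumers in Ohio buy cars?"))
# many female consumers ohio buy cars
```

Note that superfluous words such as many survive this filter; pruning those is still a judgment call you have to make yourself.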

4. Exact Phrases

If you’re looking for a phrase rather than a collection of words interspersed in the document, put quotes around your search query. Enclosing a query in quotes ensures that Google will match those words only if they occur within an exact phrase. Otherwise, Google will return pages where the words appear in any order, anywhere on the page. For example, a market research query returns many more (but less useful) results than “market research” would.

When stop words are included in an exact phrase search, Google doesn’t ignore them as it normally does. For example, a search for “to be or not to be” will match all words as the phrase, even though nearly all the words are stop words.

You can include multiple phrases in the same query, such as “market research” consultants “new jersey”; such a query would match documents that contain the word consultants either before or after the phrase market research, but would give preference to pages where consultants appears after market research.

Be careful not to create queries that should not be phrases. In the example of “market research” consultants “new jersey” you might be tempted to simply put one set of quotes around the whole set of words (like so: “market research consultants new jersey”). Such a search would return a nearly empty results set, however, because it’s not a likely order of words used in natural language.
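
If you assemble queries in a script, the same rule can be sketched in a few lines of Python: quote each multi-word phrase separately and leave single keywords bare. The helper name build_query is hypothetical and the code only builds a query string; it does not contact Google.

```python
def build_query(*terms: str) -> str:
    """Join terms into one query, quoting any term that contains a space."""
    parts = []
    for term in terms:
        term = term.strip()
        # Only multi-word terms are phrases; single keywords stay unquoted.
        parts.append(f'"{term}"' if " " in term else term)
    return " ".join(parts)


print(build_query("market research", "consultants", "new jersey"))
# "market research" consultants "new jersey"
```

Passing the whole string as a single term would reproduce the over-quoted query warned against above, so keep each phrase as its own argument.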

5. Word Order

Thus, it’s important to consider the order of the words you use in your search query because, although word order doesn’t affect the number of results, it does affect the relative rankings of those results. Priority is given to pages where those words or phrases appear in the order given in your search query.

6. Singular Versus Plural

Consider whether the pages you seek are more likely to contain the singular form or the plural form of a given keyword, and then use that form in your search query. For example, a search for car buyers females statistics does not return nearly as good a set of results as car buyers female statistics.

7. Proximity

The proximity of keywords to each other is another factor that influences the positions of the search results. The closer together the words you have juxtaposed in your query appear on a page, the higher that page will rank.

8. Wildcard

The asterisk acts as a wildcard character and allows you to space out words from each other if you want Google to give preference to pages that space your keywords apart from each other by a particular number of words.

For example, if you wish to learn more about marketing your own books, you’d be better off with a search for marketing * books than marketing books, as the latter would return more results discussing books about marketing.

Asterisks can be used as a substitute only for an entire word, not for a part of a word.

The asterisk is even more helpful when used within an exact phrase search. For example, “standards * marketing” would match pages containing the phrases standards for marketing, standards in marketing, and standards and marketing, to name a few.
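
One way to picture the whole-word wildcard is to emulate it locally with a regular expression. The Python sketch below is an illustration only, not how Google matches pages; the function name phrase_to_regex is hypothetical, and it assumes each asterisk stands for exactly one word.

```python
import re


def phrase_to_regex(phrase: str) -> re.Pattern:
    """Compile a phrase in which each * stands for exactly one whole word."""
    tokens = [r"\w+" if token == "*" else re.escape(token)
              for token in phrase.split()]
    return re.compile(r"\b" + r"\s+".join(tokens) + r"\b", re.IGNORECASE)


pattern = phrase_to_regex("standards * marketing")
samples = [
    "standards for marketing",         # one word in the gap: matches
    "Standards in Marketing",          # case is ignored: matches
    "standards marketing",             # no word in the gap: no match
    "standards for direct marketing",  # two words in the gap: no match
]
for text in samples:
    print(text, "->", bool(pattern.search(text)))
```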

9. Number Range

Your Google search can span a numerical range; you indicate the range by using two dots between two numbers, which could be years, dollar amounts, or any other numerical value.

For example, a search for confidential business plan 2001..2004 will find documents that mention 2001 or 2002 or 2003 or 2004. The query confidential business plan $2000000..$5000000 will match documents that mention dollar figures anywhere in the range of $2 million to $5 million, even if commas are present in the numbers.
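
To see what a range means in practice, this small Python sketch mimics the behavior described above on a piece of text: it pulls out the numbers, drops commas and dollar signs, and keeps those inside the range. It is a local illustration under those assumptions, not Google’s implementation, and the function name numbers_in_range is hypothetical.

```python
import re
from typing import List


def numbers_in_range(text: str, range_expr: str) -> List[int]:
    """Return the integers in text that fall inside a low..high expression."""
    low_text, high_text = range_expr.replace("$", "").split("..")
    low, high = int(low_text), int(high_text)
    # A number is a digit followed by any run of digits and commas.
    found = [int(match.replace(",", ""))
             for match in re.findall(r"\d[\d,]*", text)]
    return [number for number in found if low <= number <= high]


sample = "Projected revenue of $3,500,000 in 2003"
print(numbers_in_range(sample, "2001..2004"))          # [2003]
print(numbers_in_range(sample, "$2000000..$5000000"))  # [3500000]
```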

10. Punctuation

Other than these special characters (wildcard and range indicators), most punctuation gets ignored. An important exception is the hyphen. A search query of on-site consulting will be interpreted as onsite consulting OR on-site consulting OR on site consulting.

Another important exception is the apostrophe, which is matched exactly if contained within the word. So, marketer’s toolkit will return different results from marketers’ toolkit, but the latter will be equivalent to marketers toolkit (i.e., without the apostrophe).
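
The hyphen rule can be spelled out with a short Python sketch that lists the three variants a hyphenated query is said to cover. This only restates the behavior described in this section; the helper name hyphen_variants is hypothetical.

```python
from typing import List


def hyphen_variants(query: str) -> List[str]:
    """List the joined, hyphenated, and spaced forms of a hyphenated query."""
    if "-" not in query:
        return [query]
    return [query.replace("-", ""), query, query.replace("-", " ")]


print(hyphen_variants("on-site consulting"))
# ['onsite consulting', 'on-site consulting', 'on site consulting']
```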

11. Accents

Accents are yet another exception. A search for internet cafés manhattan will yield a different, and much smaller, set of results than internet cafes manhattan. So, for a search on cafés, more English-language documents would exclude the accent than include it; in that case, it would be advisable not to incorporate the accent into the search.

12. Boolean Logic

You may find that you want to match on both the singular and plural forms of a word. In that case, you can use the OR search operator, as in “direct marketing consultant OR consultants”; you can also group the words with their alternatives together using parentheses. For instance, a search query of “female car (buyer OR buyers OR shopper OR shoppers)” statistics would match on any of the four phrases plus the word statistics. Note that the OR needs to be capitalized to distinguish it from or as a keyword (which is, of course, a stop word and would therefore be ignored).

You may be wondering, since there is an OR operator, whether there is an AND operator as well. Indeed there is. However, it is not necessary to specify it, because it is automatically implied. So don’t bother with it.

Google also offers the NOT operator. It works as you might expect, eliminating from the search results the subsequent word or quote-encapsulated exact phrase. For example, confidential (“business plan” OR “marketing plan”) NOT template will not return pages in the results if they mention the word template, thus effectively eliminating the sample templates from the results and displaying a much higher percentage of actual business plans and marketing plans. (As an example of a query with a phrase negated instead of a single word, consider “marketing plan” NOT “business plan”.)

The AND, OR, and NOT operators can be abbreviated as a plus sign (+), the pipe symbol (|) and the minus sign (-), respectively. Thus, the previous search query can be fed to Google as confidential (“business plan” | “marketing plan”) -template.
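
For anyone who builds queries programmatically, the grouping and negation described in this section can be sketched in Python as follows. The function name boolean_query and its parameters are hypothetical; the code only assembles a query string using the OR and minus-sign forms shown above.

```python
def boolean_query(required, any_of=(), exclude=()):
    """Build a query from required terms, one OR group, and excluded terms."""
    def quote(term):
        # Multi-word terms become exact phrases; single words stay bare.
        return f'"{term}"' if " " in term else term

    parts = [quote(term) for term in required]
    if any_of:
        parts.append("(" + " OR ".join(quote(term) for term in any_of) + ")")
    parts.extend("-" + quote(term) for term in exclude)
    return " ".join(parts)


print(boolean_query(["confidential"],
                    any_of=["business plan", "marketing plan"],
                    exclude=["template"]))
# confidential ("business plan" OR "marketing plan") -template
```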

13. Stemming

Sometimes, Google automatically matches on variations of a word. This is called “stemming.” Google does this by matching words that are based on the same stem as the keyword entered as a search term.

So, for the query electronics distributing market research, Google will match pages that don’t mention the word distributing but instead a variation on the stem distribut: e.g., the keywords distributor, distributors and distribution.

You can disable the automatic stemming of a word by preceding the word with a plus sign. For instance, electronics +distributing market research will not match on distribution, distributors, distributor, and so on.

14. Synonyms

You can expand your search beyond stemming to incorporate various synonyms too, using the tilde (~) operator. For instance, marketing research data ~grocery will also include pages in the results that mention foods, shopping or supermarkets, rather than grocery.

15. Query Length

Longer search queries are generally better than shorter queries. However, there is a limit. In the case of Google, that limit is 10 words. Any word after the tenth is ignored.

You can work around this limitation to some extent by clever use of the asterisk wildcard characters. Specifically, consider replacing the most common words with asterisks.

The query shakespeare OR hamlet “to be or not to be that is the question” has two words too many; thus, the words the and question would be ignored. This query would produce better results: shakespeare OR hamlet “to be or not to be that * * question” (note that the OR search operator doesn’t count as a keyword).
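
A quick way to check a long query before running it is to count its words the way this section describes. The Python sketch below assumes, as stated above, that the limit is 10 words and that neither the OR operator nor asterisk wildcards count toward it; the function names are hypothetical.

```python
import re

WORD_LIMIT = 10  # the limit stated in this article


def counted_words(query: str) -> list:
    """List the words that count toward the limit (no OR, |, or * tokens)."""
    # Split on whitespace, straight or curly quotes, and parentheses.
    tokens = re.findall(r'[^\s"“”()]+', query)
    return [token for token in tokens if token not in {"OR", "|", "*"}]


def ignored_words(query: str) -> list:
    """Return the words that fall beyond the limit and would be ignored."""
    return counted_words(query)[WORD_LIMIT:]


long_query = 'shakespeare OR hamlet "to be or not to be that is the question"'
short_query = 'shakespeare OR hamlet "to be or not to be that * * question"'
print(ignored_words(long_query))   # ['the', 'question']
print(ignored_words(short_query))  # []
```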