Google SiteMaps - Get ALL Your Pages Listed!
by Robert Fuess
Published on this site: July 18th, 2005

Google has a new trick! Don't get left behind! It's the hottest
thing since RSS.
I was working on my Google AdWords campaign for http://www.SchoolAndTeacher.com
and I noticed that Google had a new feature called "Google
Sitemap". It was a perfect fit with my needs. I was concerned
with how deep Google would be able to get into my site. Teachers
can post homework daily on my site, but will Google see the
daily changes? Now it can. When a new teacher signs up, how
many months will it be before Google recognizes the new page?
With Google Sitemap, Google can be told quickly about new pages
and about when each page last changed.
If you have a website and want Google to know about ALL the
pages in your site, build a Google Sitemap. That's it. Keep
reading, since I will show you how to do so, and where it
proves most useful.
Webmasters (like me) have been frustrated by the slow, methodical
way in which Google gradually finds the pages in a site.
If your site starts out big and has a lot of dynamic pages,
then this is unacceptable. Hurray for Google! They are listening.
Many webmasters like using DHTML menus, but are concerned
about the search engines finding all the pages. This will
help Google to find them. (In reality, I would still recommend
having a regular site-map to help the other search engines
find the rest of your pages. Google may drive the most traffic
but customers coming in through any search engine are welcome.
Don't throw this away.)
DO WE KEEP OUR OLD SITE MAPS?
Yes! The Google Sitemap is not useful to your normal human
users, only to Google. (It is in XML format, not HTML.) Also
remember that other search engines (like Yahoo and MSN) don't
use this type of site map yet. Hopefully they will.
WHAT DOES A GOOGLE SITEMAP LOOK LIKE?
It is XML. For XML gurus, here is the schema: http://www.google.com/schemas/sitemap/0.84/sitemap.xsd.
But if you are new to XML, here is a sample. I will go through
it. Don't let the tags scare you.
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.google.com/schemas/sitemap/0.84"
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:schemaLocation="http://www.google.com/schemas/sitemap/0.84
  http://www.google.com/schemas/sitemap/0.84/sitemap.xsd">
  <url>
    <loc>http://www.SchoolAndTeacher.com/OneClass.aspx?ClassId=1</loc>
    <lastmod>2005-07-04</lastmod>
    <changefreq>monthly</changefreq>
    <priority>0.7</priority>
  </url>
  <url>
    <loc>http://www.SchoolAndTeacher.com/OneClass.aspx?ClassId=2</loc>
    <lastmod>2005-07-01</lastmod>
    <changefreq>monthly</changefreq>
    <priority>0.5</priority>
  </url>
</urlset>
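If your pages come out of a database, you probably don't want to type this by hand. Here is a minimal sketch in Python of how a file like the sample could be generated. The function name build_sitemap and the dictionary keys are my own inventions for illustration, not anything Google provides, and the sketch leaves out the optional xsi:schemaLocation attribute.

```python
import xml.etree.ElementTree as ET
from datetime import date

# Namespace copied from the sample above.
SITEMAP_NS = "http://www.google.com/schemas/sitemap/0.84"


def build_sitemap(pages):
    """Return sitemap XML (as bytes) for an iterable of page dicts.

    Each dict needs "loc"; "lastmod" (a datetime.date), "changefreq",
    and "priority" (a float) are optional. These keys are hypothetical.
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for page in pages:
        url = ET.SubElement(urlset, "url")
        # ElementTree escapes characters such as & in the URL for us.
        ET.SubElement(url, "loc").text = page["loc"]
        if "lastmod" in page:
            # isoformat() zero-pads the month and day: YYYY-MM-DD.
            ET.SubElement(url, "lastmod").text = page["lastmod"].isoformat()
        if "changefreq" in page:
            ET.SubElement(url, "changefreq").text = page["changefreq"]
        if "priority" in page:
            ET.SubElement(url, "priority").text = f"{page['priority']:.1f}"
    # xml_declaration requires Python 3.8 or later.
    return ET.tostring(urlset, encoding="UTF-8", xml_declaration=True)


if __name__ == "__main__":
    sample = [
        {"loc": "http://www.SchoolAndTeacher.com/OneClass.aspx?ClassId=1",
         "lastmod": date(2005, 7, 4), "changefreq": "monthly", "priority": 0.7},
    ]
    print(build_sitemap(sample).decode("utf-8"))
```

Building the file with an XML library rather than by gluing strings together means special characters in your URLs get escaped properly.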
EXPLANATION OF TAGS:
<?xml version="1.0" encoding="UTF-8"?>
This should be at the top of the document. It states that
this is an XML document and what version of XML is being used.
<urlset xmlns="http://www . . . ">
This tag is the wrapper for all the URLs in the sitemap.
Just copy it from the full example above. If you are new to
XML, don't fret. Just recognize that this is how GOOGLE wants
it.
<url> .. </url>
This wraps the set of XML elements (pair of opening and closing
XML tags) for each URL you want to tell Google about.
<loc>YourUrl</loc>
Here is where you put the URL to the webpage you want Google
to know about. Try to avoid extra spaces.
<changefreq>monthly</changefreq>
This tells Google how often you expect there to be changes
in a web page. It may be hourly, daily, weekly, monthly, yearly,
or never. Be honest with your real expectations.
<lastmod>YYYY-MM-DD</lastmod>
This tells Google when you last modified your web page. This
is a really important tag. Use two digits for the month and
the day (2005-07-04, not 2005-7-4). (You can put the time in
also if you feel the need to.)
<priority>#.#</priority>
This tells Google what you feel is the most important to
crawl. These are relative numbers from 0.0 to 1.0, with 1.0 being
the most important. The default value is 0.5 (even if you
leave the tag out). THIS HAS NO IMPACT ON PAGERANK!!! This
is just a relative weight for Google to crawl YOUR site. If
you have all of them at 0.9, it would be no different than all
of them being 0.1.
Look at it this way. If Google was super busy one day and
had time to crawl only 3 pages in your site, which ones should
it crawl? I would want them to crawl the three with the highest
priority (to me) that have changed recently.
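If you generate your sitemap automatically, it is worth checking each entry against the rules above before you upload it. The following is only a rough sketch in Python. The function name check_entry is made up for this article, the list of changefreq values is just the one given here (check Google's documentation for any others), and the lastmod test only looks at the date part.

```python
import re
from urllib.parse import urlparse

# Values listed in this article; Google's documentation may allow more.
ALLOWED_CHANGEFREQ = {"hourly", "daily", "weekly", "monthly", "yearly", "never"}
# Matches a leading YYYY-MM-DD; an optional time after it is not checked.
DATE_PREFIX = re.compile(r"\d{4}-\d{2}-\d{2}")


def check_entry(loc, lastmod=None, changefreq=None, priority=None):
    """Return a list of problems found in one <url> entry (empty if none)."""
    problems = []
    if loc != loc.strip():
        problems.append("loc has extra spaces")
    parsed = urlparse(loc.strip())
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        problems.append("loc is not an absolute http(s) URL")
    if lastmod is not None and not DATE_PREFIX.match(lastmod):
        problems.append("lastmod does not start with YYYY-MM-DD")
    if changefreq is not None and changefreq not in ALLOWED_CHANGEFREQ:
        problems.append("changefreq is not an allowed value")
    if priority is not None and not 0.0 <= priority <= 1.0:
        problems.append("priority is outside 0.0 to 1.0")
    return problems


if __name__ == "__main__":
    # A date without zero padding is reported as a problem.
    print(check_entry("http://www.example.com/page", lastmod="2005-7-4"))
```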
BUT I DON'T WANT TO USE XML
You can just provide a text document with the list of URLs.
This will help Google find your pages, but will not help Google
efficiently decide on what to spider. I would strongly encourage
you to build an XML version of the Google Sitemap.
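If you do go the text route anyway, the file is easy to produce. This small Python sketch assumes the text format is simply one URL per line (check Google's documentation to confirm). The function name urls_to_text is my own.

```python
def urls_to_text(urls):
    """Return a text sitemap body: one URL per line, no blanks or repeats."""
    seen = set()
    lines = []
    for url in urls:
        url = url.strip()  # avoid extra spaces around the URL
        if url and url not in seen:
            seen.add(url)
            lines.append(url)
    return "".join(line + "\n" for line in lines)


if __name__ == "__main__":
    print(urls_to_text([
        "http://www.example.com/",
        "http://www.example.com/about.html",
    ]), end="")
```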
BENEFITS OF USING THE XML VERSION:
- Google can know what pages have changed and not have
to re-crawl those pages that haven't changed.
- If you have a lot of pages and Google doesn't have time
to crawl all of your pages right away, it will focus
on the ones that changed according to your priority.
- Google is your friend. If they want information on how
to efficiently crawl your site, give it to them.
WHAT TO SUBMIT?
You may submit a sitemap, or an index of your sitemaps. Either
will do. Google has documentation on both. (Don't worry about
the index yet. That is addressed at the end of this article.)
I BUILT ONE - NOW HOW DO I SUBMIT IT?
First TELL GOOGLE IT EXISTS
- Upload your Sitemap to the highest (root) folder
of your website.
- Sign into Google Sitemaps with your Google Account. (Use
this link: https://www.google.com/webmasters/sitemaps/login)
- Click on "Add a Sitemap" link.
- Type the URL to your Sitemap location.
Congratulations! Google now knows about it!
Now tell Google whenever something changes. It will check
this sitemap to see where the changes are.
Quick and Easy Way:
Type the following into the address section of your browser:
http://www.google.com/webmasters/sitemaps/ping?sitemap=http://www.YourDomain.com/YourSitemap.xml
Of course, you should replace the YourDomain.com with your
domain and YourSitemap.xml with your sitemap.
If you have a dynamically built site, then you would want
to automate this by having your code request that URL for you
(screen scraping techniques can do this).
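Here is a rough Python sketch of what that automation could look like. The ping address is the one given in this article and may change or go away. The function names are my own, and percent-encoding the sitemap address is my assumption about the safest way to pass it, which matters mostly if your sitemap URL has a query string of its own.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Ping address as given in this article; not guaranteed to stay available.
PING_ENDPOINT = "http://www.google.com/webmasters/sitemaps/ping"


def build_ping_url(sitemap_url):
    """Return the ping URL with the sitemap address percent-encoded."""
    return PING_ENDPOINT + "?" + urlencode({"sitemap": sitemap_url})


def ping_google(sitemap_url, timeout=10):
    """Request the ping URL and return the HTTP status code."""
    with urlopen(build_ping_url(sitemap_url), timeout=timeout) as response:
        return response.status


if __name__ == "__main__":
    # Only builds and prints the URL; call ping_google() to actually send it.
    print(build_ping_url("http://www.YourDomain.com/YourSitemap.xml"))
```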
HOW OFTEN TO SUBMIT?
Ideally, it should be submitted when changes are made. Personally,
I would avoid doing so more than once per day. However, we
will look to Google as they may provide further guidance on
this. Search engines are our friends, and we should avoid
abusing any service they provide or making them process
things needlessly.
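If your submissions are automated, a simple guard keeps you from overdoing it. This Python sketch encodes my personal once-per-day rule, not any limit published by Google. The function name should_ping and the idea of storing the last ping time are assumptions for illustration.

```python
from datetime import datetime, timedelta

# My own rule of thumb from above, not a Google requirement.
MIN_INTERVAL = timedelta(days=1)


def should_ping(last_ping, last_change, now):
    """Return True if the site changed since the last ping and a day has passed.

    last_ping is None when the sitemap has never been submitted.
    """
    if last_ping is None:
        return True
    if last_change <= last_ping:
        return False  # nothing new to tell Google about
    return now - last_ping >= MIN_INTERVAL


if __name__ == "__main__":
    pinged = datetime(2005, 7, 18, 9, 0)
    changed = datetime(2005, 7, 18, 10, 0)
    # Two hours after the last ping is too soon, even though the site changed.
    print(should_ping(pinged, changed, datetime(2005, 7, 18, 11, 0)))
```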
WILL THIS IMPROVE MY GOOGLE RANKING?
Google doesn't make any promises of this. This is mainly
a way for Google to find your pages, and to efficiently know
what pages need to be re-crawled on your website. If your
site makes frequent changes, this feature helps Google to
know about them more quickly. It won't have to spider through
your whole site to find the changes.
HOW MANY URLs CAN I HAVE IN A GOOGLE SITEMAP?
According to Google documentation, you may have up to 50,000.
If you anticipate more than this, then you should build several
sitemaps and use a Google Sitemap Index. This Index will point
to the several sitemaps. If you want more information on the
Sitemap Index, go to:
http://www.google.com/webmasters/sitemaps/docs/en/protocol.html#sitemapFileRequirements
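To give you a feel for it, here is a Python sketch that splits a long list of URLs into groups of 50,000 and builds an index pointing at the resulting files. The element names sitemapindex, sitemap, and loc, and the reuse of the 0.84 namespace, are my understanding of the index format. Treat them as assumptions and confirm against the Google documentation linked above. The function names are my own.

```python
import xml.etree.ElementTree as ET

# Assumed to be the same namespace as the regular sitemap.
SITEMAP_NS = "http://www.google.com/schemas/sitemap/0.84"
MAX_URLS = 50000  # per-sitemap limit quoted in this article


def split_urls(urls, size=MAX_URLS):
    """Split a list of URLs into chunks of at most `size` entries."""
    return [urls[i:i + size] for i in range(0, len(urls), size)]


def build_index(sitemap_urls):
    """Return sitemap-index XML (as bytes) pointing at each sitemap file."""
    index = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for sitemap_url in sitemap_urls:
        entry = ET.SubElement(index, "sitemap")
        ET.SubElement(entry, "loc").text = sitemap_url
    # xml_declaration requires Python 3.8 or later.
    return ET.tostring(index, encoding="UTF-8", xml_declaration=True)


if __name__ == "__main__":
    urls = [f"http://www.example.com/page{i}.html" for i in range(120001)]
    chunks = split_urls(urls)
    # Hypothetical file names, one sitemap file per chunk.
    names = [f"http://www.example.com/sitemap{n}.xml"
             for n in range(1, len(chunks) + 1)]
    print(build_index(names).decode("utf-8"))
```

Each chunk would then be written out as its own sitemap file, and only the index needs to be submitted.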
For more information, please refer to Google Documentation
at http://www.google.com/webmasters/sitemaps/docs/en/about.html

Robert Fuess is a website designer, who focuses on
developing SEO friendly, database-driven websites (such as
shopping carts). If you are interested in a dynamic website
that has a Google Sitemap automatically generated and submitted
to Google, call me at (805) 720-0789 or email me at: [email protected].
http://www.spiderweblogic.com/
