
Thread: sitemap_gen.py

  1. #1
    Builder
    Join Date
    Nov 2002
    Location
    Illinois
    Posts
    2,088
    Rep Power
    19

    sitemap_gen.py

    Well this is interesting. For at least 3 years sitemap_gen.py hasn't successfully created a sitemap.xml file on my site. There was some talk here around that time about this and it seemed others had moved on to other ways of creating that file. Here's one thread:
    http://forum.powweb.com/showthread.php?t=83822

    Out of laziness, I suppose, I left sitemap_gen.py and the config file on my site and left the scheduled job in OPS as well. Lo and behold, it worked this past Sunday morning (30 June) and a new sitemap.xml file was created! Actually, it may have worked before that; I haven't been paying attention.

    I had noticed a couple of odd things recently, such as Googlebot trying to find files in places it should never have known about just by following links: session_id files for my calendar script, for example. Also, my SERP position for one term jumped up 4 places overnight, and I'm No. 1 on a variant of that term. Googlebot also triggered a few 500 errors on a couple of Perl files it shouldn't have known about, since they are support files for something else (i.e. "subroutines").

    Google Webmaster Tools claimed that a sitemap hadn't been submitted, but Googlebot must have read sitemap.xml to find some of those files; there's no other way it could have known about them. I "submitted" the file manually and got a bunch of warnings. Most of them came from a mismatch: my robots.txt denies all bots access to certain folders, yet the sitemap still listed those folders and their files. I manually edited those entries out and the sitemap was accepted with no errors or warnings.
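
    For anyone who would rather not do that cleanup by hand, here is a rough sketch of the same idea in Python: drop every sitemap entry that robots.txt disallows. This is not part of sitemap_gen.py; the function name filter_sitemap and the example.com URLs are made up for illustration, and it assumes a plain urlset sitemap (not a sitemap index).

```python
import xml.etree.ElementTree as ET
from urllib.robotparser import RobotFileParser

# Namespace used by standard sitemap.xml files.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def filter_sitemap(sitemap_xml, robots_txt, user_agent="*"):
    """Return sitemap XML with robots.txt-disallowed <url> entries removed.

    Hypothetical helper: both arguments are the file contents as strings.
    """
    rules = RobotFileParser()
    rules.parse(robots_txt.splitlines())

    # Keep the default namespace unprefixed when serializing again.
    ET.register_namespace("", SITEMAP_NS)
    root = ET.fromstring(sitemap_xml)

    # Copy the list first, because entries are removed while looping.
    for url in list(root.findall(f"{{{SITEMAP_NS}}}url")):
        loc = url.findtext(f"{{{SITEMAP_NS}}}loc", default="").strip()
        if not rules.can_fetch(user_agent, loc):
            root.remove(url)

    return ET.tostring(root, encoding="unicode")


# Made-up sample data, only to show the call.
sample_robots = "User-agent: *\nDisallow: /private/\n"
sample_sitemap = (
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
    "<url><loc>http://example.com/index.html</loc></url>"
    "<url><loc>http://example.com/private/subs.pl</loc></url>"
    "</urlset>"
)
cleaned = filter_sitemap(sample_sitemap, sample_robots)
```

    With the sample data, the /private/ entry is dropped and the index.html entry is kept. Real robots.txt files with per-bot sections may need a specific user_agent instead of "*".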

    So FYI, Croc's instructions apparently work again, or at least partly. This can save you a bunch of work or bandwidth, depending on how you generate sitemap.xml. The caveat is that the script doesn't seem to successfully submit the file to Google, but Googlebot seems to read it anyway.

    I'll check on it tomorrow (if I can remember) and report back.

    Kevin
    A good friend will come and bail you out of jail...
    but a true friend will be sitting next to you saying,
    "Damn... that was fun!"

  2. #2
    Rick
    Join Date
    May 2002
    Location
    Minneapolis, MN
    Posts
    1,753
    Rep Power
    19
    Once you've set the location for your sitemap in Webmaster Tools (or included it in your robots.txt file), there's no need to re-submit it. Google will re-crawl the sitemap file periodically on its own, just as it does the other pages of your site. If you make a major change to the structure of your site or modify some important pages, then it's a good idea to re-submit the sitemap to notify Google of those changes so that they can start absorbing those changes as soon as possible.
    Rick Trethewey
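
    As a side note on the robots.txt option mentioned above: the sitemap location goes on a "Sitemap:" line in that file. A small sketch for checking which sitemap URLs a robots.txt actually advertises follows; the function name sitemap_urls and the example.com address are hypothetical, and the parsing is deliberately simplistic.

```python
def sitemap_urls(robots_txt):
    """Return the URLs listed on 'Sitemap:' lines of a robots.txt string.

    Hypothetical helper; the field name is matched case-insensitively.
    """
    urls = []
    for line in robots_txt.splitlines():
        # Strip trailing comments (simplification: assumes no '#' in the URL).
        line = line.split("#", 1)[0].strip()
        # Split on the first colon only, so the URL's own colons survive.
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() == "sitemap":
            urls.append(value.strip())
    return urls


# Made-up robots.txt content, only to show the call.
sample = (
    "User-agent: *\n"
    "Disallow: /private/\n"
    "Sitemap: http://example.com/sitemap.xml\n"
)
found = sitemap_urls(sample)
```

    If the list comes back empty, the sitemap is not declared in robots.txt and crawlers would have to learn about it some other way, such as Webmaster Tools.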

  3. #3
    Builder
    Good to know, Rick, thanks!

    To update: the script ran early this morning and updated my sitemap.xml. Why it didn't work for 3 years, and why it suddenly started working again, I have no clue. I did without a sitemap for that period because putting one together by hand was too much effort and the other solutions out there weren't as convenient. I'm a happy camper!

    Kevin
