Question about the Bot Mod

Posted:
Wed May 31, 2006 1:00 am
Author: computerskillz
In the ACP it says this: Bots (also known as crawlers) are automated agents most commonly used to index information on the internet. Very few of these bots support sessions and can therefore fail to index your site correctly. Here you can define the assigning of session ids to these bots to solve this problem.
My question is: is this all that the mod does, i.e. assign session IDs?
Because if assigning session IDs is all it does, then that doesn't really address the problem of dynamic URLs, which the search engines don't like. Could it be that the bot requests pages after acquiring a session ID, only to reject the pages it's served?
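To make the dynamic-URL concern concrete, here is a minimal sketch of why session IDs in links are awkward for a crawler: the same topic shows up under a different URL on every visit. phpBB carries the session in a `sid` query parameter; the function name `strip_session_id` and the example.com URLs are only illustrative, and this is not what the bot mod itself does.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode


def strip_session_id(url, param="sid"):
    """Return the URL with the session-ID query parameter removed.

    `param` defaults to "sid", the parameter phpBB uses for sessions.
    """
    parts = urlsplit(url)
    # Keep every query parameter except the session ID, preserving order.
    kept = [(key, value)
            for key, value in parse_qsl(parts.query, keep_blank_values=True)
            if key != param]
    return urlunsplit(parts._replace(query=urlencode(kept)))


# Two visits to the same topic, each with its own (made-up) session ID.
first_visit = "http://example.com/viewtopic.php?t=42&sid=abc123"
second_visit = "http://example.com/viewtopic.php?t=42&sid=def456"

# As raw strings the URLs differ, so a crawler may treat them as two pages.
print(first_visit == second_visit)
# Without the session ID they collapse to one address.
print(strip_session_id(first_visit) == strip_session_id(second_visit))
```

Even with the session ID gone, the URL is still dynamic (`viewtopic.php?t=42`), which is the part of the question that session handling alone does not touch.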
Sometimes my server says 100% for Googlebot, but I'm curious whether those pages are actually getting indexed or getting rejected.

Posted:
Mon Jun 05, 2006 5:40 am
Author: Simon N
They are getting rejected.
I too have 100% of my site apparently indexed by Google, yet finding anything other than the index page is often near impossible.
I am not sure what's to blame: Google's way of indexing or phpBB itself.

Posted:
Tue Jun 06, 2006 2:29 am
Author: sanji
Simon N wrote: They are getting rejected.
I wouldn't be so sure about that.
I asked many questions about bots in the previous forum because, for example, I had Google visiting 5 times per day for months and still none of my pages were indexed.
And one day, after five and a half months, all pages were searchable via Google! So I would say you just need to wait... Also, submitting a sitemap of your pages can help a lot in getting the whole site indexed.
sanji

Posted:
Tue Jun 06, 2006 2:33 am
Author: Simon N
Wait??? It's been 3 years lol
I am top of the search engine when typing free-riders, which is fine, but the posts are not indexed and neither are the pictures. Google, it seems, hates the session IDs, and the bots mod doesn't seem to allocate what it should.

Posted:
Tue Jun 06, 2006 4:06 am
Author: sanji
You can try an easy thing to check if Google "sees" your web site...
- download a free website-downloading tool such as WebReaper
- set the user agent in that program to "googlebot"
- go to the ACP of your forum and add your IP address to the list of Google's IPs
- ensure that you are not connected as a user on your forum
- start downloading
IntegraMOD will treat you as if you were Googlebot, and you will see which pages are downloaded...
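If you would rather script the check than install a downloader, the user-agent part of the steps above can be sketched in a few lines of Python. This is only a rough sketch under some assumptions: the helper names (`build_bot_request`, `fetch_as_bot`, `has_session_links`) are made up for illustration, the forum URL is a placeholder, and adding your IP to the bot list in the ACP still has to be done by hand.

```python
import re
import urllib.request

# User-agent string that Googlebot identifies itself with.
GOOGLEBOT_UA = ("Mozilla/5.0 (compatible; Googlebot/2.1; "
                "+http://www.google.com/bot.html)")


def build_bot_request(url, user_agent=GOOGLEBOT_UA):
    """Build a request that announces itself with the given user agent."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def fetch_as_bot(url):
    """Download one page while presenting the Googlebot user agent."""
    with urllib.request.urlopen(build_bot_request(url), timeout=10) as resp:
        return resp.status, resp.read().decode("utf-8", errors="replace")


def has_session_links(html):
    """True if any link in the page still carries a phpBB-style sid parameter."""
    return re.search(r"[?&;]sid=[0-9a-fA-F]+", html) is not None


# Example use (placeholder URL, replace with your own forum):
#   status, html = fetch_as_bot("http://example.com/forum/index.php")
#   print(status, has_session_links(html))
```

If `has_session_links` still reports session IDs in the links while you are recognised as a bot, that suggests the pages a crawler receives are not the clean ones you expect.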
Hope this helps,
sanji

Posted:
Tue Jun 06, 2006 7:16 pm
Author: Teelk
Google definitely does not like sessions or dynamic URLs at all. This is unfortunate for them, as this is the way the web-based world is developing. They'll catch up eventually, and indeed have already implemented some changes.
Meanwhile, the only way around the problem seems to be dynamic rewriting. That bot MOD doesn't do this, though... I don't know precisely what it does accomplish other than tracking when the bots are visiting. And the percentage numbers seem really ambiguous to me... I really don't know what they mean. You can have 2500 pages, Google will index 1 page, and the percentage will be at 100%, which obviously makes no sense.
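For anyone unsure what dynamic rewriting means in practice, here is a rough sketch of the idea: the forum publishes static-looking addresses, and incoming requests for them are translated back to the real dynamic script. The `/topic-42.html` scheme and both function names are assumptions for illustration only, not the scheme any particular rewrite MOD uses, and on a real site the reverse mapping is normally done by the web server's rewrite module (mod_rewrite on Apache) rather than in application code.

```python
import re

# Hypothetical URL scheme, chosen only for this example.
DYNAMIC_PATTERN = re.compile(r"^/viewtopic\.php\?t=(\d+)$")
STATIC_PATTERN = re.compile(r"^/topic-(\d+)\.html$")


def to_static(dynamic_url):
    """Turn a dynamic topic URL into a static-looking one for outgoing links."""
    match = DYNAMIC_PATTERN.match(dynamic_url)
    return f"/topic-{match.group(1)}.html" if match else dynamic_url


def to_dynamic(static_path):
    """Map an incoming static-looking path back to the real dynamic URL."""
    match = STATIC_PATTERN.match(static_path)
    return f"/viewtopic.php?t={match.group(1)}" if match else static_path


print(to_static("/viewtopic.php?t=42"))
print(to_dynamic("/topic-42.html"))
```

The links a crawler sees then contain neither a query string nor a session ID, while the forum still serves the page through the usual dynamic script.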
Why doesn't the IntegraMOD team include a dynamic rewrite MOD with the package, you might ask? The answer is simple: dynamic rewriting is not universally supported by web hosting companies. Some of the better, more expensive hosts allow dynamic rewriting, while most shared or cheap hosts have turned this function off. I'm not entirely sure of the consequences of using a dynamic rewrite MOD if your host doesn't support it, and this may need further investigation. If there are no ill effects, then it may be worthwhile to include a dynamic rewrite MOD in a future package.
For now, once my computer issues are in order, I can update webmedic's dynamic rewrite MOD to work with IM 1.4.0. I had it working on my site and I was constantly indexed by a number of different search engines. Even as I type this there are at least 2 bots indexing my site, and the site has been inactive for months...