Question about the Bot Mod

Support for IntegraMOD 140

PostAuthor: computerskillz » Wed May 31, 2006 1:00 am

In the ACP it says this: Bots (also known as crawlers) are automated agents most commonly used to index information on the internet. Very few of these bots support sessions and can therefore fail to index your site correctly. Here you can define the assigning of session ids to these bots to solve this problem.

My question is: is this all that the mod does, i.e. assign session IDs?

Because if assigning session IDs is all that it does, then that really doesn't address the problem of dynamic URLs, which the search engines don't like. Could it be that the bot requests pages as a result of acquiring a session ID, only to reject the pages it's served?
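To illustrate the concern about session IDs in URLs, here is a minimal Python sketch. It is not what the Bot Mod does; it only shows why a crawler may see one topic as many different URLs. The `sid` parameter name and the example.com addresses are assumptions for illustration.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode


def strip_session_id(url, param="sid"):
    """Return the URL with the session parameter removed from its query string.

    The parameter name "sid" is an assumption made for this sketch.
    """
    parts = urlsplit(url)
    # Keep every query pair except the session parameter.
    query = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True)
             if k != param]
    return urlunsplit((parts.scheme, parts.netloc, parts.path,
                       urlencode(query), parts.fragment))


# Two visits to the same topic, each with a different session ID.
first_visit = "http://example.com/viewtopic.php?t=42&sid=aaa111"
second_visit = "http://example.com/viewtopic.php?t=42&sid=bbb222"

# Without the session ID both visits point at the same page.
same_page = strip_session_id(first_visit) == strip_session_id(second_visit)
```

If a crawler is handed a fresh session ID on every visit, each page can show up under a new address, which is one plausible reason the pages are requested but never settle into the index.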

Sometimes my server says 100% for Googlebot, but I'm curious whether those pages are actually getting indexed or are getting rejected.

PostAuthor: Simon N » Mon Jun 05, 2006 5:40 am

They are getting rejected.

I too have 100% of my site apparently indexed via Google, yet trying to find anything other than the index page is often near impossible.

I am not sure what's to blame: Google's way of indexing or phpBB itself.

PostAuthor: sanji » Tue Jun 06, 2006 2:29 am

Simon N wrote: They are getting rejected.


I wouldn't be so sure about that.

I asked many questions about bots in the previous forum, because Google, for example, had been coming 5 times per day for months and still none of my pages were indexed.

And one day, after five and a half months, all pages were searchable via Google! So I would say you just need to wait... Also, submitting a sitemap of your pages can help a lot in getting the whole site indexed.
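As a rough idea of what such a sitemap looks like, here is a minimal Python sketch that writes one in the sitemaps.org XML format. The `build_sitemap` helper and the example.com URLs are made up for illustration; a real forum would list its own topic pages, ideally without session IDs.

```python
from xml.sax.saxutils import escape


def build_sitemap(urls):
    """Return a sitemap XML document (as a string) listing the given URLs."""
    lines = [
        '<?xml version="1.0" encoding="UTF-8"?>',
        '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
    ]
    for url in urls:
        # escape() turns "&" into "&amp;" so the XML stays well-formed.
        lines.append("  <url><loc>%s</loc></url>" % escape(url))
    lines.append("</urlset>")
    return "\n".join(lines)


# Hypothetical forum pages used only as sample input.
sitemap_xml = build_sitemap([
    "http://example.com/forum/index.php",
    "http://example.com/forum/viewtopic.php?t=1&start=15",
])
```

The resulting text can be saved as a file such as sitemap.xml and submitted through the search engine's webmaster tools.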

sanji

PostAuthor: Simon N » Tue Jun 06, 2006 2:33 am

Wait??? It's been 3 years lol

I am top of the search engine when typing free-riders, which is fine, but the posts are not indexed and neither are the pictures. Google, it seems, hates the session IDs, and the bots mod doesn't seem to allocate what it should.

PostAuthor: sanji » Tue Jun 06, 2006 4:06 am

You can try an easy thing to check whether Google "sees" your web site...

- download a free website-downloading tool such as WebReaper
- set the user agent in that program to "googlebot"
- go into the ACP of your forum, and add your IP address to the list of Google's IPs
- make sure that you are not logged in as a user on your forum
- start downloading

IntegraMOD will then treat you as if you were, in fact, Googlebot, and you will see which pages are downloaded...
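The same check can be sketched in a few lines of Python instead of a download tool. This is only an illustration under assumptions: the helper names are made up, the forum address in the usage comment is a placeholder, and your IP still has to be added to the bot list in the ACP as described above. The script sends no cookies, so it is never logged in.

```python
import re
import urllib.request

# Googlebot's published user agent string.
GOOGLEBOT_UA = ("Mozilla/5.0 (compatible; Googlebot/2.1; "
                "+http://www.google.com/bot.html)")


def build_bot_request(url, user_agent=GOOGLEBOT_UA):
    """Build a request that identifies itself with a Googlebot user agent."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def fetch_as_bot(url, timeout=10):
    """Download one page as the fake bot and return its HTML as text."""
    with urllib.request.urlopen(build_bot_request(url), timeout=timeout) as resp:
        return resp.read().decode("utf-8", errors="replace")


def links_with_session_id(html):
    """Return the href values that still carry a sid= parameter.

    Matches ?sid=, &sid= and &amp;sid= (the last one ends in ';').
    """
    return re.findall(r'href="([^"]*[?&;]sid=[0-9a-fA-F]+[^"]*)"', html)


# Usage (placeholder address, not run here):
#   html = fetch_as_bot("http://example.com/forum/index.php")
#   print(links_with_session_id(html))
```

If the returned list is empty, the forum is serving clean links to what it believes is Googlebot; if it is full of `sid=` links, the bot is being handed session IDs in its URLs.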

Hope this helps,

sanji

PostAuthor: Teelk » Tue Jun 06, 2006 7:16 pm

Google definitely does not like sessions or dynamic URLs at all. This is unfortunate for them, as this is the way that the web-based world is developing. They'll catch up eventually, and indeed have already implemented some changes.

Meanwhile, the only way to get around the problem seems to be to use dynamic rewriting. That bot MOD doesn't accomplish this, though... I don't know precisely what it does accomplish other than being able to track when the bots are visiting. And the percentage numbers seem really ambiguous to me... I really don't know what they mean. You can have 2500 pages and Google will index 1 page and the percentage will be at 100%, which obviously makes no sense.
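For anyone unsure what dynamic rewriting means in practice, here is a minimal Python sketch of the idea: a dynamic address is published under a static-looking name, and incoming static names are mapped back to the real script. The `topic-<id>.html` scheme and both function names are hypothetical, not taken from any existing rewrite MOD, and on a real host the mapping back is normally done by the web server's rewrite rules rather than by application code.

```python
import re
from urllib.parse import urlsplit, parse_qs

# Hypothetical static naming scheme: /topic-<id>.html
STATIC_TOPIC = re.compile(r"^/topic-(\d+)\.html$")


def to_static(dynamic_url):
    """Turn /viewtopic.php?t=<id>... into /topic-<id>.html, else None."""
    parts = urlsplit(dynamic_url)
    if parts.path != "/viewtopic.php":
        return None
    topic_ids = parse_qs(parts.query).get("t")
    if not topic_ids or not topic_ids[0].isdigit():
        return None
    # Everything else in the query string, including any sid, is dropped.
    return "/topic-%s.html" % topic_ids[0]


def to_dynamic(static_path):
    """Map /topic-<id>.html back to the script that serves it, else None."""
    match = STATIC_TOPIC.match(static_path)
    if match is None:
        return None
    return "/viewtopic.php?t=%s" % match.group(1)


static_name = to_static("/viewtopic.php?t=42&sid=abc123")
real_target = to_dynamic(static_name)
```

The search engine only ever sees the static-looking address, which has no query string and no session ID in it.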

Well, why doesn't the IntegraMOD team include a dynamic rewrite MOD with the package, you might ask? The answer is simple: dynamic rewriting is not universally supported by web hosting companies. Some of the better, more expensive hosting companies will allow dynamic rewriting, while most shared or cheap hosts have turned this function off. I'm not entirely sure of the consequences of using a dynamic rewrite MOD if your host doesn't support it, and this may need further investigation. If there are no ill effects, then it may be worthwhile to include a dynamic rewrite MOD in a future package.

For now, once my computer issues are in order, I can update webmedic's dynamic rewrite MOD to work with IM 1.4.0. I had it working on my site and I was constantly indexed by a number of different search engines. Even as I type this there are at least 2 bots indexing my site, and the site has been inactive for months...
