Mod Release: GoogleBot Detector (1.4.1)

PostAuthor: ZacFields » Mon Apr 09, 2007 11:26 am

GoogleBot Spider Detector

I think many people will enjoy this mod. It detects the Google spider's presence on your forums and logs the EXACT URLs that it is visiting. This serves several purposes, such as:

1. Identifying a possible googlebot problem (googlebot has been known to hit websites too heavily and ignore crawl delays set in the robots.txt file).

2. Seeing specifically which pages on your site Google is indexing.

3. Or just plain seeing how much and how often Google visits your site.

Five minutes after I installed this mod I had several google hits already. Overnight Google had hit over 800 pages on my site. Very interesting information to see and this mod is very simple to install.

This mod works with 1.4.1. I have not tested it with 1.4.0, but it should work there too.

Here is the download link:
http://www.brokencar.net/im_mods/googlebot.zip

Zac

PostAuthor: .QUACK.Major.Pain » Mon Apr 09, 2007 3:43 pm

Sounds cool.

Got it working, but I think it's not registering all of them.

Usually when I go on our site, there are 2-4 googlebot IPs in the ACP index.
I will have to check the count tomorrow.

Does it show a duplicate IP if the same one returns at another time?

So far googlebot has hit me for: 7 pages, 23 visits.
Lycos hits me way more per day: 424 pages, 93 visits.

PostAuthor: ZacFields » Mon Apr 09, 2007 10:22 pm

^Seems to be registering all of them for me... or at least I hope so, with well over 700 hits between 2am and 11am this morning. lol

From all I could tell in the source code, the mod just searches for the name "googlebot". It does not look for any specific IP address or IP range.
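For anyone curious, the kind of check described above can be sketched roughly like this. This is not the mod's actual code (the mod itself is PHP); it is a minimal Python sketch that assumes the detector simply looks for "googlebot" in the visitor's reverse-DNS hostname. The function names are made up for illustration.

```python
import socket


def is_googlebot_hostname(hostname: str) -> bool:
    """Return True if the hostname contains "googlebot" (case-insensitive).

    This mirrors the assumed behaviour of the detector: a plain
    substring match, with no IP address or IP range check.
    """
    return "googlebot" in hostname.lower()


def looks_like_googlebot(ip: str) -> bool:
    """Reverse-resolve an IP address and apply the substring check.

    Returns False when the address has no reverse DNS record.
    """
    try:
        hostname, _aliases, _addresses = socket.gethostbyaddr(ip)
    except OSError:
        # socket.herror and socket.gaierror are both OSError subclasses.
        return False
    return is_googlebot_hostname(hostname)
```

A plain substring match like this is easy to fool, because whoever controls reverse DNS for an IP can put "googlebot" in the name. A stricter check would also resolve the hostname forward and confirm it maps back to the same IP.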

Zac

PostAuthor: .QUACK.Major.Pain » Tue Apr 10, 2007 4:49 am

I checked last night before I went to bed, and the ACP index showed the googlebot ip logged in.
When I checked this morning, there was no record of any googlebots.
I'm sure your site is older, which is probably why it gets more hits.

PostAuthor: Whisky » Tue Apr 10, 2007 6:38 am

2 minutes after having installed this I already have 15 records lol

Do you think it's possible to display the bot in the users online box?

PostAuthor: ZacFields » Tue Apr 10, 2007 10:49 am

^That is something I'd like to work on. But as of right now it's not possible because that mod has not yet been ported to IM.

I am also working on porting a mod that will let you see how many results each search engine returns when you search for your site name. You'd be able to access this with a single click from your ACP. As of right now only 2 of the 6 search engines are working, so I'm trying to update the mod to make it work. However, it is a somewhat older mod, so it might not work out.

Zac

PostAuthor: jtadmin » Tue Apr 10, 2007 11:36 am

Does anyone know if this works with 1.4.0?

PostAuthor: ZacFields » Tue Apr 10, 2007 11:46 am

jt: I haven't actually tested it, but I am almost 100% confident that it will work with 1.4.0.

Give it a try, there are only a couple of file edits. It should only take about 5 minutes. Let us know if it works.

That being said, according to my logs there have been 1,980 hits from googlebot on my site since 1:00 yesterday, so in approximately 24 hours.

The problem I've noticed with googlebot is that it can get around your robots.txt restrictions by sending more than one googlebot IP to your site. Sometimes I have 5-10 Google IPs on my site, so even though I have a 120 second crawl delay it doesn't make a difference when there are so many different bots on.
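To put rough numbers on that, here is a small back-of-the-envelope sketch. It assumes each crawler IP waits out the full crawl delay on its own and does not coordinate with the others, which is my reading of the situation and not something confirmed here. The function names are made up for illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def max_hits_per_day(crawl_delay_seconds: float, crawler_count: int = 1) -> float:
    """Upper bound on daily requests if every crawler honors the delay
    between its own requests but ignores what the other crawlers do."""
    return crawler_count * SECONDS_PER_DAY / crawl_delay_seconds


def implied_crawler_count(hits_per_day: float, crawl_delay_seconds: float) -> float:
    """Average number of independent crawlers needed to produce the
    observed hit count under the same assumption."""
    return hits_per_day * crawl_delay_seconds / SECONDS_PER_DAY


print(max_hits_per_day(120))                    # one crawler, 120 s delay
print(max_hits_per_day(120, crawler_count=10))  # ten crawlers
print(implied_crawler_count(1980, 120))         # from the logged 1,980 hits
```

Under that assumption a single crawler with a 120 second delay tops out at 720 hits a day, so 1,980 hits in 24 hours works out to an average of about 2.75 crawlers running at the same time.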

Zac

PostAuthor: jtadmin » Tue Apr 10, 2007 12:15 pm

ZacFields wrote:
The problem I've noticed with googlebot is that it can get around your robots.txt restrictions by sending more than one googlebot IP to your site. Sometimes I have 5-10 Google IPs on my site, so even though I have a 120 second crawl delay it doesn't make a difference when there are so many different bots on.

Zac

Is this going to affect the performance of my website? Today I had to figure out how to manually remove over 600 pending bots and sessions from the database. I was wondering what this add-on will give me over what is currently in place for bot management.

PostAuthor: ZacFields » Tue Apr 10, 2007 12:21 pm

The only thing this mod is really good for is telling you specifically which URLs are being visited by the googlebot.

It gives you the date/time and then the exact URL so you can see which topics googlebot has already spidered.

I haven't noticed any performance difference after installing this modification. It is useful to me to be able to see when googlebot is hitting my site too hard. When your forum is running slow, the log is a pretty easy way to tell whether googlebot is the cause.

Zac

PostAuthor: .QUACK.Major.Pain » Tue Apr 10, 2007 2:10 pm

My forum is only a couple of months old, but in the last 23+ hours I haven't had any googlebots logged.
But I have had 28 Lycos bots.
As I am writing this, googlebot is on my ACP index. It was also on this morning when I looked. This leads me to think that this mod and the bot management don't register all of them. If it's a bot I have already added from the pending bots, I could understand that it probably passes over already added ones.

PostAuthor: ZacFields » Tue Apr 10, 2007 2:17 pm

Can't imagine why it's not working for you. I'm getting about a 100% success rate. All it does is search for "googlebot" in the hostname, which all the googlebots have.

Did you remember to perform the SQL query from the instructions? I would assume you'd get an error if you hadn't, but it's just an idea.

Zac

PostAuthor: .QUACK.Major.Pain » Tue Apr 10, 2007 2:27 pm

I did that.
I checked my database and there was a phpbb_googlebot thingy in there. (I don't recall the proper name, but it was there.)

I got 3 or 4 in the first 20 minutes after installing, but nothing since.

PostAuthor: ZacFields » Tue Apr 10, 2007 2:31 pm

That's rather odd. I'd say just leave it for a few more days and see if anything turns up. It could be some sort of compatibility issue, or something with your PHP version.

Zac

PostAuthor: tekguru » Tue Apr 10, 2007 10:31 pm

417 pages here in 6 hours! Does this indicate that the site is getting over googled?

PostAuthor: ZacFields » Tue Apr 10, 2007 10:56 pm

Well, the only way you can really be over-googled is if you're either:

A. Experiencing page load time problems on your site

B. Having problems with exceeding your monthly bandwidth.

Otherwise, the more the merrier! Tek, the home page on your site is actually a PR6 right now, so I would assume Google never leaves your site. At a PR4 (which is what my index.php page is) Google is supposed to spider your site approximately once every 24 hours. If you have several thousand (or hundreds of thousands of) pages, it will take Google more than 24 hours to scan your site, so it will essentially never leave.

Google never leaves my site. You can easily tell that from my google logs from this mod on my forum.

Remember, if Google visits your site a lot like it does mine and tekguru's, it would be a good idea to clear those logs at least once a week. As of right now the script does not do it by itself.
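If you wanted to automate that weekly cleanup, the idea is just "delete every log row older than seven days". Here is a minimal sketch of that idea using Python's built-in sqlite3 module so it runs on its own. The table name googlebot_log and its columns are hypothetical and do not come from the mod, whose real table layout may differ.

```python
import sqlite3
import time

SECONDS_PER_WEEK = 7 * 24 * 60 * 60


def create_log_table(conn: sqlite3.Connection) -> None:
    """Create the hypothetical log table: one row per crawler hit."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS googlebot_log ("
        "hit_time INTEGER NOT NULL, url TEXT NOT NULL)"
    )


def log_hit(conn: sqlite3.Connection, url: str, hit_time=None) -> None:
    """Record the time (Unix seconds) and exact URL of one hit."""
    if hit_time is None:
        hit_time = int(time.time())
    conn.execute(
        "INSERT INTO googlebot_log (hit_time, url) VALUES (?, ?)",
        (hit_time, url),
    )


def prune_old_hits(conn: sqlite3.Connection, now=None,
                   max_age_seconds: int = SECONDS_PER_WEEK) -> int:
    """Delete hits older than max_age_seconds; return how many were removed."""
    if now is None:
        now = int(time.time())
    cursor = conn.execute(
        "DELETE FROM googlebot_log WHERE hit_time < ?",
        (now - max_age_seconds,),
    )
    conn.commit()
    return cursor.rowcount
```

Running something like prune_old_hits from a weekly scheduled job would keep the table from growing without bound on a heavily crawled site.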

Zac

PostAuthor: tekguru » Wed Apr 11, 2007 8:59 am

Zac, I'll stop worrying then. Question though: how do you find out the PR rating, and what does it mean?

PostAuthor: ZacFields » Wed Apr 11, 2007 11:59 am

TekGuru:

I saw your Google PR rating using the google toolbar I have installed on my internet explorer, but you can also check it at sites like this one: http://www.prchecker.net/check.php

Basically your Google PR rating is compiled from how many people link to you and the Google PR rating of the sites that link to you. For instance, if a bunch of PR8 and PR7 sites had links to your site, that would affect your PR rating in a positive way.
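To make the "links from well-ranked pages count for more" idea concrete, here is a small sketch of the simplified textbook PageRank iteration. It is not whatever Google actually runs, and its raw scores are not the 0-10 toolbar number. The damping value of 0.85 and the page names in the example are assumptions for illustration only.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank.

    links maps each page name to the list of pages it links to.
    Returns a dict of page name -> score; the scores sum to 1.
    """
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    count = len(pages)

    rank = {page: 1.0 / count for page in pages}
    for _ in range(iterations):
        # Every page starts each round with the same small base score.
        new_rank = {page: (1.0 - damping) / count for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its own score among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its score evenly.
                share = damping * rank[page] / count
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical example: "popular" is linked from two pages, "quiet" from one.
example = {"a": ["popular"], "b": ["popular"], "c": ["quiet"]}
scores = pagerank(example)
print(scores["popular"] > scores["quiet"])
```

In this toy graph the page with two incoming links ends up with the higher score, which matches the description above: more links, and links from better-ranked pages, push the rating up.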

Your site having a PR rating of 6 tells me a lot of things. For one it tells me that your site is very well established and that you probably get a lot of people coming to your site from search engines. Basically it means that google hits your site very hard and probably places you very high on search results relevant to your site.

Another important thing it tells me, and that you should be aware of, is that a single text link on your site is worth upwards of $40 per month, because being linked from a PR6 site is very good for sites that lack a PR rating.

In short, you could get on textlinkads.com and sell text links on your index page for $30-$40 for a month's worth each. If you placed a block on your homepage and sold 10 text links a month, you could bring in $300-$400 easy.

So congratulations to you Tekguru. You've done an outstanding job with your website by google standards and you stand to be able to make a lot of money based simply on your google PR rating.

Zac

PostAuthor: .QUACK.Major.Pain » Wed Apr 11, 2007 2:48 pm

Just a question regarding what was mentioned above about showing googlebot in the online box.

Could you inject or insert a new user in the user database?

I would assume the user database would have info such as:

username: me
ip: a.b.c.d

Could you insert in the database:

username: googlebot
ip: 66.249.x.x

Would that recognize and register in the online box? Or am I out to lunch on this thought? LOL

PostAuthor: tekguru » Wed Apr 11, 2007 2:53 pm

Cheers for that, Zac. One lives and occasionally learns.

Very interested in the text ads, as we only just break even with the Google banners.

I thought, though, that one cannot have any conflicting ads when one is running Google advertising?

Oh, and did you mean http://www.text-link-ads.com/ for the text ads?

If this is viable, it could let us make a bit of money at last!

PostAuthor: ZacFields » Wed Apr 11, 2007 10:23 pm

Yeah. For some reason they (the text-link-ads.com site) didn't accept my site. I think it's because my portal.php page is only a PR2 even though my index.php page is a PR4.

Basically, on text-link-ads.com people pay you just for the link, not for clicks or anything like that. Google ads perform very poorly on forum sites due to the large number of return visitors (return visitors generally won't click on ads).

With a PR6, you will be able to monetize that site very easily (and btw Google shouldn't mind your text links). If text-link-ads doesn't get you what you want, you should try listing an auction on eBay. I've seen some good text links sell on eBay from time to time, so you could throw it out there at $30 for a text link on your website for one month, and promise them you are only going to sell 10 links or something. You should be able to pull that off.

Good luck to you tekguru... with a little research you can definitely make some money on that site.

Zac

PostAuthor: sanji » Wed Apr 11, 2007 10:30 pm

Working fine... I got 3000 pages viewed by google in less than 24 hours!

It would be great, as someone suggested, to also have google bot indicated as "google" on the who is online box on the portal. Is that feasible?

sanji

PostAuthor: ZacFields » Wed Apr 11, 2007 11:33 pm

.=QUACK=.Major.Pain wrote:
Just a question regarding what was mentioned above about showing googlebot in the online box.

Could you inject or insert a new user in the user database?

I would assume the user database would have info such as:

username: me
ip: a.b.c.d

Could you insert in the database:

username: googlebot
ip: 66.249.x.x

Would that recognize and register in the online box? Or am I out to lunch on this thought? LOL


That wouldn't be of any use, because the googlebot would first have to log on to the site in order to be shown, so you'd have to bypass that. There is a modification that allows the googlebot and other bots to be shown in your users online. The problem is that I don't believe it has been ported to IntegraMOD yet.

If somebody finds me the link to the specific modification I can try to look at it and see if it will work. I just don't know what it's called right off-hand.

Zac

PostAuthor: sanji » Thu Apr 12, 2007 2:04 am

PostAuthor: tekguru » Thu Apr 12, 2007 9:15 am

Cheers Zac, will give it a go!

PostAuthor: .QUACK.Major.Pain » Thu Apr 12, 2007 1:34 pm

I finally got mine working. I had to remove google and add it again for it to work.

The only thing now is that I have just 1 Google IP that continues to come back (66.249.65.110), never any other Google IPs.

PostAuthor: .QUACK.Major.Pain » Wed May 23, 2007 2:53 pm

Hey ZacFields, your link is no good anymore.
Do you have another location?