New property ServerController.SessionOptions.AllowSearchEngines
#1
Hi,

I already use my own installed content handler for search engines along with the options
ServerController.SearchEngineOptions->RedirectToContentHandler and
ServerController.SearchEngineOptions->ContentHandlerPath.

Will that still work if I set ServerController.SessionOptions.AllowSearchEngines = false?

What is the content of the built-in robots.txt file, and where will it be located?

Regards
JuergenS
Reply
#2
Hi,

The feature was introduced in one of the most recent INTRAWEB versions, but I could not find any further information about it. Does anyone at ATOZED have more details?

Regards
JuergenS
Reply
#3
(08-21-2019, 06:51 AM)JuergenS Wrote: Hi,

I already use my own installed content handler for search engines along with the options
ServerController.SearchEngineOptions->RedirectToContentHandler and
ServerController.SearchEngineOptions->ContentHandlerPath.

Will that still work if I set ServerController.SessionOptions.AllowSearchEngines = false?

What is the content of the built-in robots.txt file, and where will it be located?

Regards
JuergenS

Hi Juergen,

SessionOptions.AllowSearchEngines := False

will block search engines completely. Your content handler won't work because the request from the search engine will be intercepted before it reaches the handler.

Built-in Robots.txt contains this:

User-agent: *
Disallow: /

However, it will only be used if AllowSearchEngines is False. Otherwise, you can use your own Robots.txt file in the wwwroot folder.

If you want it to behave like previous versions, just set AllowSearchEngines to True and the new feature will be completely disabled.
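To see what the two-line Robots.txt above means for a compliant crawler, here is a minimal sketch that uses Python's standard `urllib.robotparser`. It only evaluates the rules quoted above and does not contact an IntraWeb server. The helper name `is_allowed` and the `example.com` URLs are placeholders chosen for illustration, not anything from IntraWeb.

```python
from urllib.robotparser import RobotFileParser

# The built-in rules quoted in the post above.
BUILT_IN_ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url.

    Hypothetical helper for illustration; it parses the text locally
    instead of downloading /robots.txt from a server.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Every path is disallowed for every crawler under the built-in rules.
    for path in ("/", "/index.html", "/some/page"):
        url = "http://example.com" + path  # placeholder host
        print(path, is_allowed(BUILT_IN_ROBOTS_TXT, "Googlebot", url))
```

With these rules every path is reported as disallowed, which is consistent with the description that search engines are blocked completely when AllowSearchEngines is False.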
Reply
#4
Thanks for the help!
Reply
#5
Hi,

Although I have set AllowSearchEngines = true, I have not received any requests from search engines for some time now.
That's why I tested my web application with a search bot simulator (http://www.botsimulator.com/).
Unfortunately, the simulator reports HTTP status code 405 for my web application.
It looks like INTRAWEB is no longer responding correctly to search engine requests. Or do I need to change anything besides the AllowSearchEngines property?

Regards
Juergen
Reply
#6
Can you check which requests are being answered with 405?
Some requests are blocked by default.
Reply
#7
Hi,

Apart from the HTTP status code, all I have is the complete HTML output of the simulator, which I could send you.
How can I find out which requests were blocked?
Does INTRAWEB log any message for this?
I also tested other commercial websites with this simulator, and none of them returned an error.

Regards
Juergen
Reply
#8
I'll try the simulator and let you know, but IW should not be blocking anything if AllowSearchEngines is set to True.
Reply
#9
I'm afraid that http://www.botsimulator.com/botsim_send.php doesn't actually work as expected.

The site receives a 405 (Method Not Allowed) error code for any request. However, the only reason for IW to return this error code is that the request method is not one of the allowed methods (GET and POST are always allowed). Googlebot uses GET, so there is something wrong with the simulator's request.

So I decided to use another page which simulates Googlebot: https://www.dnsqueries.com/en/googlebot_simulator.php

This works correctly. Can you please try some other simulator and see if it works?
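If you prefer not to depend on a third-party simulator, a rough sketch of the same check in Python is a plain GET request that carries a Googlebot-style User-Agent header. The thread does not say how IW recognizes a search engine, so treating the User-Agent header as sufficient is an assumption here. The function names and the `your-server.example` URL are hypothetical.

```python
import urllib.error
import urllib.request

# User-Agent string commonly sent by Googlebot. Whether IW relies on this
# header to detect search engines is an assumption, not stated in the thread.
GOOGLEBOT_UA = (
    "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
)


def build_bot_request(url: str, method: str = "GET") -> urllib.request.Request:
    """Build a request that resembles what a crawler would send (GET by default)."""
    return urllib.request.Request(
        url, method=method, headers={"User-Agent": GOOGLEBOT_UA}
    )


def fetch_status(request: urllib.request.Request, timeout: float = 10.0) -> int:
    """Send the request and return the HTTP status code, including 4xx/5xx codes."""
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urlopen raises for error statuses such as 405; report the code instead.
        return err.code


# Example usage (placeholder URL, replace with your own application):
#   status = fetch_status(build_bot_request("http://your-server.example/"))
#   print(status)
```

A 200 from this GET while the first simulator still reports 405 would point at the simulator's request method rather than at the server.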
Reply
#10
Hi,

I have also tried the other simulator, and it does return HTTP status code 200.
I can also see the request in the server trace.
Unfortunately, my installed bot content handler (a descendant of TContentBase) is no longer being executed, and that is the problem.
This still worked in IW 15.0.24, but it no longer works in IW 15.1.4.

By the way, for https://www.atozed.com both simulators return HTTP status code 200.

Regards
Juergen
Reply

