public abstract class BaseRobotsParser
extends java.lang.Object
implements java.io.Serializable
Constructor and Description |
---|
BaseRobotsParser() |
Modifier and Type | Method and Description |
---|---|
abstract BaseRobotRules | failedFetch(int httpStatusCode): The fetch of robots.txt failed, so return rules appropriate for the given HTTP status code. |
abstract BaseRobotRules | parseContent(java.lang.String url, byte[] content, java.lang.String contentType, java.lang.String robotName): Parse the robots.txt file in content. |
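
The two abstract methods summarized above could be implemented along the following lines. This is a minimal sketch, not the library's own parser: `SketchRules`, `SketchBaseParser`, `SketchParser`, and `RobotsParserSketchDemo` are hypothetical stand-ins defined here so the example compiles on its own, and only the two method signatures are taken from this page. The status-code policy (3xx and 4xx allow everything, anything else allows nothing) and the merged handling of `*` and named user-agent groups are simplifying assumptions.

```java
import java.io.Serializable;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical stand-in for BaseRobotRules, whose members are not shown on this page.
class SketchRules implements Serializable {
    final boolean allowAll;
    final boolean allowNone;
    final List<String> disallowedPrefixes;

    SketchRules(boolean allowAll, boolean allowNone, List<String> disallowedPrefixes) {
        this.allowAll = allowAll;
        this.allowNone = allowNone;
        this.disallowedPrefixes = disallowedPrefixes;
    }

    boolean isAllowed(String path) {
        if (allowAll) return true;
        if (allowNone) return false;
        for (String prefix : disallowedPrefixes) {
            if (path.startsWith(prefix)) return false;
        }
        return true;
    }
}

// Mirrors the two abstract signatures documented for BaseRobotsParser.
abstract class SketchBaseParser implements Serializable {
    abstract SketchRules failedFetch(int httpStatusCode);
    abstract SketchRules parseContent(String url, byte[] content, String contentType, String robotName);
}

class SketchParser extends SketchBaseParser {
    @Override
    SketchRules failedFetch(int httpStatusCode) {
        // The documented contract says the code is a failure status (NOT 2xx).
        if (httpStatusCode >= 200 && httpStatusCode < 300) {
            throw new IllegalArgumentException("Not a failure status: " + httpStatusCode);
        }
        // Assumed policy: redirects and client errors mean "no usable robots.txt", so allow all.
        if (httpStatusCode >= 300 && httpStatusCode < 500) {
            return new SketchRules(true, false, new ArrayList<>());
        }
        // Assumed policy: server errors and anything else mean "be conservative", so allow none.
        return new SketchRules(false, true, new ArrayList<>());
    }

    @Override
    SketchRules parseContent(String url, byte[] content, String contentType, String robotName) {
        // Simplification: always decode as UTF-8 and ignore contentType; url is unused here.
        String text = new String(content, StandardCharsets.UTF_8);
        String name = robotName.toLowerCase(Locale.ROOT);
        List<String> disallowed = new ArrayList<>();
        boolean applies = false;      // does the current group apply to this robot?
        boolean inAgentLines = false; // are we inside a run of consecutive User-agent lines?
        for (String raw : text.split("\\r?\\n")) {
            int hash = raw.indexOf('#');
            String line = (hash >= 0 ? raw.substring(0, hash) : raw).trim();
            int colon = line.indexOf(':');
            if (colon < 0) continue;
            String field = line.substring(0, colon).trim().toLowerCase(Locale.ROOT);
            String value = line.substring(colon + 1).trim();
            if (field.equals("user-agent")) {
                String agent = value.toLowerCase(Locale.ROOT);
                boolean match = agent.equals("*") || agent.equals(name);
                applies = inAgentLines ? (applies || match) : match;
                inAgentLines = true;
            } else {
                inAgentLines = false;
                if (field.equals("disallow") && applies && !value.isEmpty()) {
                    disallowed.add(value);
                }
            }
        }
        return new SketchRules(false, false, disallowed);
    }
}

class RobotsParserSketchDemo {
    public static void main(String[] args) {
        SketchParser parser = new SketchParser();
        byte[] body = "User-agent: *\nDisallow: /private\n".getBytes(StandardCharsets.UTF_8);
        SketchRules rules = parser.parseContent("http://example.com/robots.txt", body, "text/plain", "mybot");
        System.out.println("/private/a allowed? " + rules.isAllowed("/private/a"));
        System.out.println("/public allowed? " + rules.isAllowed("/public"));
        System.out.println("after a 503: " + parser.failedFetch(503).isAllowed("/public"));
    }
}
```

A caller would typically choose between the two methods based on the HTTP response: `parseContent` when the robots.txt fetch returned a 2xx status with a body, `failedFetch` otherwise.
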
public abstract BaseRobotRules parseContent(java.lang.String url, byte[] content, java.lang.String contentType, java.lang.String robotName)
url - URL that content was fetched from (for reporting purposes)
content - raw bytes from the site's robots.txt file
contentType - HTTP response header (mime-type)
robotName - name of crawler, to be used when processing file contents (just the name portion, w/o version or other details)
public abstract BaseRobotRules failedFetch(int httpStatusCode)
httpStatusCode - a failure status code (NOT 2xx)
Copyright © 2012 Bixo Labs