Package | Description |
---|---|
bixo.fetcher | |
bixo.operations | |
bixo.pipes | |
bixo.robots | |
Modifier and Type | Class and Description |
---|---|
class | LoggingFetcher |
class | SimpleHttpFetcher |
Constructor and Description |
---|
FetchTask(IFetchMgr fetchMgr, BaseFetcher httpFetcher, java.util.List<ScoredUrlDatum> items, java.lang.String ref) |
Modifier and Type | Method and Description |
---|---|
static BaseFetcher | UrlLengthener.makeFetcher(int maxThreads, UserAgent userAgent): Return a SimpleHttpFetcher that's appropriate for lengthening URLs. |
Constructor and Description |
---|
FetchBuffer(BaseFetcher fetcher) |
FilterAndScoreByUrlAndRobots(BaseFetcher fetcher, BaseRobotsParser parser, BaseScoreGenerator scorer) |
ProcessRobotsTask(java.lang.String protocolAndDomain, BaseScoreGenerator scorer, java.util.Queue<GroupedUrlDatum> urls, BaseFetcher fetcher, BaseRobotsParser parser, cascading.tuple.TupleEntryCollector collector, com.scaleunlimited.cascading.LoggingFlowProcess flowProcess) |
ResolveRedirectsTask(java.lang.String url, BaseFetcher fetcher, cascading.tuple.TupleEntryCollector collector, cascading.flow.FlowProcess flowProcess) |
UrlLengthener(BaseFetcher fetcher) |
UrlLengthener(BaseFetcher fetcher, cascading.tuple.Fields resultField) |
Constructor and Description |
---|
FetchPipe(cascading.pipe.Pipe urlProvider, BaseScoreGenerator scorer, BaseFetcher fetcher, BaseFetcher robotsFetcher, BaseRobotsParser parser, BaseFetchJobPolicy fetchJobPolicy, int numReducers) |
FetchPipe(cascading.pipe.Pipe urlProvider, BaseScoreGenerator scorer, BaseFetcher fetcher, int numReducers): Generate an assembly that will fetch all of the UrlDatum tuples coming out of urlProvider. |
Modifier and Type | Method and Description |
---|---|
static BaseFetcher | RobotUtils.createFetcher(BaseFetcher fetcher) |
static BaseFetcher | RobotUtils.createFetcher(UserAgent userAgent, int maxThreads) |
Modifier and Type | Method and Description |
---|---|
static BaseFetcher | RobotUtils.createFetcher(BaseFetcher fetcher) |
static BaseRobotRules | RobotUtils.getRobotRules(BaseFetcher fetcher, BaseRobotsParser parser, java.net.URL robotsUrl): Externally visible, static method for use in tools and for testing. |
Copyright © 2012 Bixo Labs