public class DefaultFetchJobPolicy extends BaseFetchJobPolicy
BaseFetchJobPolicy.FetchSetInfo
DEFAULT_CRAWL_DELAY, UNSET_CRAWL_DELAY
Constructor and Description |
---|
DefaultFetchJobPolicy() |
DefaultFetchJobPolicy(FetcherPolicy policy) |
DefaultFetchJobPolicy(int maxUrlsPerSet, int maxUrlsPerServer, long defaultCrawlDelay) |
Modifier and Type | Method and Description |
---|---|
BaseFetchJobPolicy.FetchSetInfo | endFetchSet() |
protected int | getMaxUrlsPerServer(ScoredUrlDatum scoredDatum): Return max URLs per fetch job for the server indicated by the URL in the given datum. |
protected int | getMaxUrlsPerSet(ScoredUrlDatum scoredDatum): Return max URLs per fetch set for the server indicated by the URL in the given datum. |
BaseFetchJobPolicy.FetchSetInfo | nextFetchSet(ScoredUrlDatum scoredDatum) |
static long | nextSortKey(java.util.Random rand, long divisor, long curRequestTime): Time to move the request time forward. |
void | startFetchSet(java.lang.String groupingKey, long crawlDelay) |
getDefaultCrawlDelay, setDefaultCrawlDelay
public DefaultFetchJobPolicy()
public DefaultFetchJobPolicy(FetcherPolicy policy)
public DefaultFetchJobPolicy(int maxUrlsPerSet, int maxUrlsPerServer, long defaultCrawlDelay)
public void startFetchSet(java.lang.String groupingKey, long crawlDelay)
startFetchSet
in class BaseFetchJobPolicy
public BaseFetchJobPolicy.FetchSetInfo nextFetchSet(ScoredUrlDatum scoredDatum)
nextFetchSet
in class BaseFetchJobPolicy
public BaseFetchJobPolicy.FetchSetInfo endFetchSet()
endFetchSet
in class BaseFetchJobPolicy
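The three methods above suggest a start / next / end calling sequence per grouping key. The following is a minimal sketch of that sequence, not Bixo's implementation: `FetchSetSketch` is a hypothetical stand-in that uses plain strings instead of `ScoredUrlDatum` and a `List<String>` instead of `FetchSetInfo`, and it assumes that `maxUrlsPerSet` caps the size of one fetch set while `maxUrlsPerServer` caps the total accepted for one server.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for illustration only; this is not Bixo's implementation. */
final class FetchSetSketch {
    private final int maxUrlsPerSet;    // assumed: cap on URLs in a single fetch set
    private final int maxUrlsPerServer; // assumed: cap on URLs accepted per server
    private List<String> pending = new ArrayList<>();
    private int accepted;

    FetchSetSketch(int maxUrlsPerSet, int maxUrlsPerServer) {
        if (maxUrlsPerSet < 1 || maxUrlsPerServer < 1) {
            throw new IllegalArgumentException("limits must be positive");
        }
        this.maxUrlsPerSet = maxUrlsPerSet;
        this.maxUrlsPerServer = maxUrlsPerServer;
    }

    /** Reset state before the URLs of one grouping key (one server) arrive. */
    void startFetchSet() {
        pending = new ArrayList<>();
        accepted = 0;
    }

    /** Offer one URL; returns a completed set once the per-set cap is hit, else null. */
    List<String> nextFetchSet(String url) {
        if (accepted >= maxUrlsPerServer) {
            return null; // per-server cap reached: this sketch simply drops the URL
        }
        pending.add(url);
        accepted++;
        return pending.size() >= maxUrlsPerSet ? drain() : null;
    }

    /** Flush the partially filled set, or return null if nothing is pending. */
    List<String> endFetchSet() {
        return pending.isEmpty() ? null : drain();
    }

    /** Hand back the pending URLs and start a fresh, empty list. */
    private List<String> drain() {
        List<String> done = pending;
        pending = new ArrayList<>();
        return done;
    }

    public static void main(String[] args) {
        FetchSetSketch policy = new FetchSetSketch(2, 3);
        policy.startFetchSet();
        for (String url : new String[] {"/a", "/b", "/c", "/d"}) {
            List<String> set = policy.nextFetchSet(url);
            if (set != null) {
                System.out.println("full set: " + set);
            }
        }
        System.out.println("final set: " + policy.endFetchSet());
    }
}
```

With limits of 2 per set and 3 per server, the demo prints `full set: [/a, /b]` followed by `final set: [/c]`; the fourth URL is dropped because the assumed per-server cap has been reached.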
protected int getMaxUrlsPerServer(ScoredUrlDatum scoredDatum)
scoredDatum
- datum containing URL to server
protected int getMaxUrlsPerSet(ScoredUrlDatum scoredDatum)
scoredDatum
- datum containing URL to server
public static long nextSortKey(java.util.Random rand, long divisor, long curRequestTime)
rand
-
divisor
- What slice of remaining time range to consume (randomly)
curRequestTime
- Current time (actually offset)
Copyright © 2012 Bixo Labs
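The parameter descriptions for `nextSortKey` suggest that the method advances an offset by a random portion of the range that remains. One way this could work is sketched below; it is an illustration under assumptions, not the Bixo source. The class name `SortKeySketch`, the upper bound `RANGE_END`, and the argument checks are all hypothetical.

```java
import java.util.Random;

/** Hypothetical sketch for illustration only; this is not Bixo's implementation. */
final class SortKeySketch {
    // Assumption: offsets live in [0, Long.MAX_VALUE]; the real bound is not documented here.
    static final long RANGE_END = Long.MAX_VALUE;

    static long nextSortKey(Random rand, long divisor, long curRequestTime) {
        if (divisor < 1 || curRequestTime < 0) {
            throw new IllegalArgumentException("divisor must be >= 1 and offset >= 0");
        }
        long remaining = RANGE_END - curRequestTime; // range still available
        long slice = remaining / divisor;            // largest step allowed for this call
        // Random step in [0, slice]; the clamp guards against floating-point rounding.
        long step = Math.min(slice, (long) (rand.nextDouble() * slice));
        return curRequestTime + step;
    }

    public static void main(String[] args) {
        Random rand = new Random(42L);
        long key = 0L;
        for (int i = 0; i < 3; i++) {
            key = nextSortKey(rand, 10L, key); // each call consumes part of what is left
            System.out.println(key);
        }
    }
}
```

In this sketch a larger `divisor` means smaller steps, so successive keys never decrease and never pass `RANGE_END`.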