Package | Description |
---|---|
bixo.fetcher | |
bixo.parser | |

Modifier and Type | Method and Description |
---|---|
FetchedDatum | SimpleHttpFetcher.get(ScoredUrlDatum scoredUrl) |
FetchedDatum | LoggingFetcher.get(ScoredUrlDatum datum) |
abstract FetchedDatum | BaseFetcher.get(ScoredUrlDatum scoredUrl) |

Modifier and Type | Method and Description |
---|---|
protected java.lang.String | BaseParser.getCharset(FetchedDatum datum)<br>Extract encoding from content-type. If a charset is returned, then it's a valid/normalized charset name that's supported on this platform. |
protected java.net.URL | BaseParser.getContentLocation(FetchedDatum fetchedDatum)<br>Figure out the right base URL to use, for when we need to resolve relative URLs. |
protected java.lang.String | BaseParser.getLanguage(FetchedDatum fetchedDatum, java.lang.String charset)<br>Extract language from (first) explicit header. |
ParsedDatum | SimpleParser.parse(FetchedDatum fetchedDatum) |
abstract ParsedDatum | BaseParser.parse(FetchedDatum fetchedDatum) |
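
The descriptions of getCharset and getContentLocation above can be illustrated with a minimal, standalone sketch. It is not Bixo's implementation and does not use FetchedDatum: the class ParserHelperSketch and its two methods are hypothetical names that work on plain strings, assuming the content-type value and the base URL have already been pulled out of the fetched response. The first helper returns a charset name only if the JVM supports it, normalized to its canonical form; the second resolves a relative link against a base URL.

```java
import java.net.URI;
import java.nio.charset.Charset;
import java.util.Locale;

public class ParserHelperSketch {

    // Hypothetical helper: extract the charset parameter from a content-type
    // value such as "text/html; charset=utf-8". Returns the canonical charset
    // name if this platform supports it, otherwise null.
    public static String charsetFromContentType(String contentType) {
        if (contentType == null) {
            return null;
        }
        // Parameters follow the media type, separated by ';'.
        String[] parts = contentType.split(";");
        for (int i = 1; i < parts.length; i++) {
            String param = parts[i].trim();
            if (param.toLowerCase(Locale.ROOT).startsWith("charset=")) {
                String name = param.substring("charset=".length()).trim();
                // Strip optional surrounding double quotes.
                if (name.length() >= 2 && name.startsWith("\"") && name.endsWith("\"")) {
                    name = name.substring(1, name.length() - 1);
                }
                try {
                    if (Charset.isSupported(name)) {
                        // name() yields the canonical (normalized) form.
                        return Charset.forName(name).name();
                    }
                } catch (IllegalArgumentException e) {
                    // Empty or syntactically illegal charset name: treat as absent.
                }
                return null;
            }
        }
        return null;
    }

    // Hypothetical helper: resolve a possibly relative link against a base URL.
    public static String resolveAgainstBase(String baseUrl, String link) {
        return URI.create(baseUrl).resolve(link).toString();
    }

    public static void main(String[] args) {
        System.out.println(charsetFromContentType("text/html; charset=utf-8"));
        // prints "UTF-8"
        System.out.println(charsetFromContentType("text/html; charset=no-such-charset"));
        // prints "null"
        System.out.println(resolveAgainstBase("http://example.com/a/b.html", "../img/x.png"));
        // prints "http://example.com/img/x.png"
    }
}
```

Returning null for a missing or unsupported charset leaves the fallback decision (for example, a default encoding or content sniffing) to the caller. How Bixo itself handles that case is not stated on this page.
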
Copyright © 2012 Bixo Labs