java.lang.Object
  org.htmlparser.parserapplications.StringExtractor
public class StringExtractor
Extract plaintext strings from a web page.
Illustrative program to gather the textual contents of a web page.
Uses a StringBean to accumulate the user-visible text (what a browser would display) into a single string.
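To make "user-visible text" concrete, the following is a minimal JDK-only sketch of the idea. It is not the library's implementation and does not use StringBean: the class name `VisibleTextSketch`, the regex-based approach, and the square-bracket link format are all assumptions made for illustration. How the library formats links in its own output is not stated on this page.

```java
import java.util.regex.Pattern;

// Hypothetical stand-in, NOT part of HTML Parser: approximates "what a browser
// would display" with regular expressions instead of a real HTML parser.
public class VisibleTextSketch {
    // Script and style bodies are never shown to the reader, so drop them whole.
    private static final Pattern HIDDEN = Pattern.compile(
            "<(script|style)\\b[^>]*>.*?</\\1\\s*>",
            Pattern.CASE_INSENSITIVE | Pattern.DOTALL);
    // Opening anchor tag with a double-quoted href; group 1 is the link target.
    private static final Pattern ANCHOR = Pattern.compile(
            "<a\\s[^>]*?href=\"([^\"]*)\"[^>]*>",
            Pattern.CASE_INSENSITIVE);
    // Any remaining tag.
    private static final Pattern TAG = Pattern.compile("<[^>]+>");

    /**
     * Returns the text of the given HTML as a single string.
     * If links is true, each link target is kept, in square brackets
     * (an arbitrary format chosen for this sketch).
     * Limitations: character entities are not decoded and unquoted
     * attribute values are not recognized.
     */
    public static String extractStrings(String html, boolean links) {
        String text = HIDDEN.matcher(html).replaceAll(" ");
        if (links) {
            text = ANCHOR.matcher(text).replaceAll(" [$1] ");
        }
        text = TAG.matcher(text).replaceAll(" ");
        // Collapse runs of whitespace so the result reads as one string.
        return text.replaceAll("\\s+", " ").trim();
    }

    public static void main(String[] args) {
        String html = "<html><head><style>p{color:red}</style></head>"
                + "<body><p>Hello <a href=\"http://example.com/\">world</a></p></body></html>";
        System.out.println(extractStrings(html, false)); // prints: Hello world
        System.out.println(extractStrings(html, true));  // prints: Hello [http://example.com/] world
    }
}
```

With the HTML Parser jar on the classpath, the corresponding library call, going by the signatures documented on this page, would be `new StringExtractor(resource).extractStrings(true)`, wrapped in a handler for `ParserException`.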
Constructor Summary | |
---|---|
StringExtractor(String resource) | Construct a StringExtractor to read from the given resource. |
Method Summary | |
---|---|
String | extractStrings(boolean links): Extract the text from a page. |
static void | main(String[] args): Mainline. |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Constructor Detail |
---|
public StringExtractor(String resource)
Parameters:
resource - Either a URL or a file name.

Method Detail |
---|
public String extractStrings(boolean links) throws ParserException
Parameters:
links - if true, include hyperlinks in output.
Throws:
ParserException - If a parse error occurs.

public static void main(String[] args)
Parameters:
args - The command line arguments.
© 2005 Derrick Oswald Jun 10, 2006
HTML Parser is an open source library released under LGPL.