StringExtractor (HTML Parser 1.6)

org.htmlparser.parserapplications
Class StringExtractor

java.lang.Object
  org.htmlparser.parserapplications.StringExtractor

public class StringExtractor
extends Object

Extracts plain-text strings from a web page. This illustrative program gathers the textual contents of a page, using a StringBean to accumulate the user-visible text (what a browser would display) into a single string.

Constructor Summary
`StringExtractor(String resource)` Construct a StringExtractor to read from the given resource.

Method Summary
`String`	`extractStrings(boolean links)` Extract the text from a page.
`static void`	`main(String[] args)` Mainline.

Methods inherited from class java.lang.Object
`clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait`

Constructor Detail

StringExtractor

public StringExtractor(String resource)

Construct a StringExtractor to read from the given resource.

Parameters: resource - Either a URL or a file name.

Method Detail

extractStrings

public String extractStrings(boolean links)
                      throws ParserException

Extract the text from a page.

Parameters: links - If true, include hyperlinks in the output.
Returns: The textual contents of the page.
Throws: ParserException - If a parse error occurs.
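
The constructor and extractStrings can be combined as in the following minimal sketch. It assumes the HTML Parser 1.6 jar is on the classpath and that ParserException is the class in org.htmlparser.util. The class name ExtractTextExample and the helper textOf are hypothetical names used only for illustration.

```java
import org.htmlparser.parserapplications.StringExtractor;
import org.htmlparser.util.ParserException; // assumed package for ParserException

public class ExtractTextExample {

    /**
     * Hypothetical helper: returns the visible text of the given resource
     * (a URL or a file name), or null if a parse error occurs.
     */
    static String textOf(String resource, boolean links) {
        StringExtractor extractor = new StringExtractor(resource);
        try {
            // links == true asks for hyperlinks to be included in the output
            return extractor.extractStrings(links);
        } catch (ParserException e) {
            System.err.println("Could not parse " + resource + ": " + e.getMessage());
            return null;
        }
    }

    public static void main(String[] args) {
        if (args.length == 0) {
            // No resource given: print a hint instead of guessing one
            System.err.println("usage: ExtractTextExample <url-or-file>");
            return;
        }
        String text = textOf(args[0], false);
        if (text != null) {
            System.out.println(text);
        }
    }
}
```

Catching ParserException in one place keeps callers simple. A real application might prefer to propagate the exception rather than return null.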

main

public static void main(String[] args)

Mainline. Entry point when the class is run from the command line.

Parameters: args - The command line arguments.