org.htmlparser.Parser.extractAllNodesThatMatch Java code examples

Parser parser = new Parser(url);
NodeList movies = parser.extractAllNodesThatMatch(
    new AndFilter(
        new TagNameFilter("div"),
        new HasAttributeFilter("class", "movie")));

/**
 * Convenience method to extract all nodes of a given class type.
 * Equivalent to
 * <code>extractAllNodesThatMatch (new NodeClassFilter (nodeType))</code>.
 * @param nodeType The class of the nodes to collect.
 * @throws ParserException If a parse error occurs.
 * @return A list of nodes which have the class specified.
 * @deprecated Use extractAllNodesThatMatch (new NodeClassFilter (cls)).
 * @see #extractAllNodesThatAre
 */
public Node [] extractAllNodesThatAre (Class nodeType)
  throws
    ParserException
{
  NodeList ret;
  ret = extractAllNodesThatMatch (new NodeClassFilter (nodeType));
  return (ret.toNodeArray ());
}

Parser parser = Parser.createParser(comment.getText(), "UTF-8");
// Start with an empty list so a failed parse cannot cause a NullPointerException below.
NodeList htmlAnchorNodes = new NodeList();
try {
  htmlAnchorNodes = parser
      .extractAllNodesThatMatch(new TagNameFilter("a"));
} catch (ParserException e) {
  e.printStackTrace();
}

int size = htmlAnchorNodes.size();

public static List<String> getLinks(String url) throws ParserException {
  Parser htmlParser = new Parser(url);
  List<String> links = new LinkedList<String>();
  NodeList tagNodeList = htmlParser.extractAllNodesThatMatch(new NodeClassFilter(LinkTag.class));
  for (int m = 0; m < tagNodeList.size(); m++) {
    LinkTag loopLinks = (LinkTag) tagNodeList.elementAt(m);
    String linkName = loopLinks.getLink();
    links.add(linkName);
  }
  return links;
}

int size;

{
  Parser parser = Parser.createParser(comment.getText(), "UTF-8");
  // Start with an empty list so a failed parse cannot cause a NullPointerException below.
  NodeList htmlAnchorNodes = new NodeList();
  try {
    htmlAnchorNodes = parser
        .extractAllNodesThatMatch(new TagNameFilter("a"));
  } catch (ParserException e) {
    e.printStackTrace();
  }
  size = htmlAnchorNodes.size();
}

try {
  list = mParser.extractAllNodesThatMatch (filter);
} catch (ParserException e) {
  e.printStackTrace (); // assumption: the original handler is cut off in this excerpt
}

NodeList list = parser.extractAllNodesThatMatch(new TagNameFilter("P"));

/**
 * Extracts the title from the given HTML.
 *
 * @return never null, just an empty string if not parsable.
 */
public static String extractTitle(String html) throws ParserException {
  String title = "";
  Parser parser = new Parser(html);
  NodeList matches = parser.extractAllNodesThatMatch(TITLE_FILTER);
  SimpleNodeIterator it = matches.elements();
  while (it.hasMoreNodes()) {
    TitleTag node = (TitleTag) it.nextNode();
    title = node.getTitle().trim();
  }
  return title;
}

NodeList matches = parser.extractAllNodesThatMatch(LINK_FILTER);
SimpleNodeIterator it = matches.elements();
while (it.hasMoreNodes()) {
  Node node = it.nextNode(); // per-node handling is not shown in this excerpt
}

list = parser.extractAllNodesThatMatch (filter);
for (int i = 0; i < list.size (); i++)
  System.out.println (list.elementAt (i).toHtml ());

Node[] tables = parser.extractAllNodesThatMatch( new TagNameFilter( "table" ) ).toNodeArray();

try {
  list = parser.extractAllNodesThatMatch(filter);
} catch (ParserException e) {
  reporter.incrCounter(LinkCounter.PARSER_FAILED, 1);
}

NodeList links = new NodeList ();
parser = createParserParsingAnInputString(output);
links = parser.extractAllNodesThatMatch(filter);

NodeList links = new NodeList ();
parser = createParserParsingAnInputString(output);
links = parser.extractAllNodesThatMatch(filterLink);

Javadoc

Extract all nodes matching the given filter.
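
One way to picture "matching the given filter": a filter answers accept or reject for a single node, and extraction walks the whole node tree and collects every accepted node. The sketch below is a simplified model of that idea using only the Java standard library; it is not the library's implementation. ToyNode, extractAll, tagName, hasAttribute and sampleTree are hypothetical names invented for illustration, standing in loosely for parsed nodes, extractAllNodesThatMatch, TagNameFilter and HasAttributeFilter.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

public class FilterSketch {

    /** Hypothetical stand-in for a parsed tag; not an org.htmlparser class. */
    static final class ToyNode {
        final String name;
        final Map<String, String> attributes;
        final List<ToyNode> children;

        ToyNode(String name, Map<String, String> attributes, ToyNode... children) {
            this.name = name;
            this.attributes = attributes;
            this.children = Arrays.asList(children);
        }
    }

    /** Collects, in document order, every node in the tree that the filter accepts. */
    static List<ToyNode> extractAll(List<ToyNode> roots, Predicate<ToyNode> filter) {
        List<ToyNode> matches = new ArrayList<>();
        for (ToyNode root : roots) {
            collectInto(root, filter, matches);
        }
        return matches;
    }

    /** Tests the node itself, then recurses into its children. */
    private static void collectInto(ToyNode node, Predicate<ToyNode> filter, List<ToyNode> out) {
        if (filter.test(node)) {
            out.add(node);
        }
        for (ToyNode child : node.children) {
            collectInto(child, filter, out);
        }
    }

    /** Accepts nodes whose tag name equals the given name, ignoring case. */
    static Predicate<ToyNode> tagName(String name) {
        return node -> node.name.equalsIgnoreCase(name);
    }

    /** Accepts nodes that carry the given attribute with exactly the given value. */
    static Predicate<ToyNode> hasAttribute(String key, String value) {
        return node -> value.equals(node.attributes.get(key));
    }

    /** Builds a small tree: body > [div.movie, div.ad, p > div.movie, span.movie]. */
    static List<ToyNode> sampleTree() {
        Map<String, String> movie = Collections.singletonMap("class", "movie");
        Map<String, String> ad = Collections.singletonMap("class", "ad");
        Map<String, String> none = Collections.emptyMap();
        ToyNode body = new ToyNode("body", none,
                new ToyNode("div", movie),
                new ToyNode("div", ad),
                new ToyNode("p", none, new ToyNode("div", movie)),
                new ToyNode("span", movie));
        return Collections.singletonList(body);
    }

    public static void main(String[] args) {
        // Both conditions must hold, so only the two div.movie nodes are collected.
        Predicate<ToyNode> movieDivs = tagName("div").and(hasAttribute("class", "movie"));
        System.out.println(extractAll(sampleTree(), movieDivs).size()); // prints 2
    }
}
```

Combining the two predicates with and plays the role that AndFilter plays in the first snippet on this page; the nested div is found because the walk descends into child nodes rather than stopping at the top level.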

Popular methods of Parser

<init>
Construct a parser using the provided lexer and feedback object. This would be used to create a pars
parse
Parse the given resource, using the filter provided. This can be used to extract information from sp
visitAllNodesWith
Apply the given visitor to the current page. The visitor is passed to the accept() method of each n
createParser
Creates the parser on an input string.
setNodeFactory
Set the current node factory.
setLexer
Set the lexer for this parser. The current NodeFactory is transferred to (set on) the given lexer, s
elements
Returns an iterator (enumeration) over the html nodes. org.htmlparser.nodes can be of three main typ
reset
Reset the parser to start from the beginning again. This assumes support for a reset from the underl
getConnection
Return the current connection.
setInputHTML
Initializes the parser with the given input HTML String.
setURL
Set the URL for this parser. This method creates a new Lexer reading from the given URL. Trying to s
getConnectionManager
Get the connection manager all Parsers use.

How to use the extractAllNodesThatMatch method in org.htmlparser.Parser
