Class TagNode

  • All Implemented Interfaces:
    BaseToken, HtmlNode

    public class TagNode
    extends TagToken
    implements HtmlNode

    XML node tag: the basic node of the cleaned HTML tree. It also represents a start tag token after the HTML parsing phase and before the cleaning phase. After the cleaning process, the remaining tree structure contains tag nodes (TagNode class), content (text nodes, ContentNode), comments (CommentNode) and optionally a doctype node (DoctypeToken).

    • Constructor Detail

      • TagNode

        public TagNode​(java.lang.String name)
    • Method Detail

      • getName

        public java.lang.String getName()
        Overrides:
        getName in class TagToken
      • getAttributeByName

        public java.lang.String getAttributeByName​(java.lang.String attName)
        Parameters:
        attName - Name of the attribute to look up
        Returns:
        Value of the specified attribute, or null if this tag doesn't contain it.
      • getAttributes

        public java.util.Map<java.lang.String,​java.lang.String> getAttributes()
        Returns the attributes of the tagnode.
        Returns:
        Map instance containing all attribute name/value pairs.
      • getAttributesInLowerCase

        public java.util.Map<java.lang.String,​java.lang.String> getAttributesInLowerCase()
        Returns the attributes of the tag node, with attribute names in lower case.
        Returns:
        Map instance containing all attribute name/value pairs, with attribute names transformed to lower case
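        The key transformation described above can be pictured with plain java.util maps. This is only a minimal sketch: the class LowerCaseAttributesSketch and the helper lowerCaseKeys are hypothetical names, not part of HtmlCleaner, and the sketch assumes that only names change while values are kept as they are.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

class LowerCaseAttributesSketch {
    // Hypothetical helper (not an HtmlCleaner method): copies the map,
    // lower-casing each attribute name and leaving each value untouched.
    static Map<String, String> lowerCaseKeys(Map<String, String> attributes) {
        Map<String, String> result = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : attributes.entrySet()) {
            result.put(entry.getKey().toLowerCase(Locale.ROOT), entry.getValue());
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> attributes = new LinkedHashMap<>();
        attributes.put("HREF", "index.html");
        attributes.put("Class", "Nav");
        // Names are lower-cased; values such as "Nav" keep their case.
        System.out.println(lowerCaseKeys(attributes));
    }
}
```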
      • setAttributes

        public void setAttributes​(java.util.Map<java.lang.String,​java.lang.String> attributes)
        Replace the current set of attributes with a new set.
        Parameters:
        attributes - New attribute name/value pairs
      • hasAttribute

        public boolean hasAttribute​(java.lang.String attName)
        Checks existence of specified attribute.
        Parameters:
        attName - Name of the attribute to check
        Returns:
        true if this TagNode has the specified attribute
      • addAttribute

        public void addAttribute​(java.lang.String attName,
                                 java.lang.String attValue)
        Adds specified attribute to this tag or overrides existing one.
        Parameters:
        attName - Name of the attribute to add or override
        attValue - Value of the attribute
      • removeAttribute

        public void removeAttribute​(java.lang.String attName)
        Removes specified attribute from this tag.
        Parameters:
        attName - Name of the attribute to remove
      • getChildren

        @Deprecated
        public java.util.List<TagNode> getChildren()
        Deprecated.
        Use getChildTagList() instead; this method will be refactored and possibly removed in future versions. TODO: This method should be refactored because it does not match the commonly used Java getter/setter convention.
        Returns:
        List of child TagNode objects.
      • setChildren

        public void setChildren​(java.util.List<? extends BaseToken> children)
      • getAllChildren

        public java.util.List<? extends BaseToken> getAllChildren()
      • getChildTagList

        public java.util.List<TagNode> getChildTagList()
        Returns:
        List of child TagNode objects.
      • hasChildren

        public boolean hasChildren()
        Returns:
        Whether this node has child elements or not.
      • getChildTags

        public TagNode[] getChildTags()
        Returns:
        An array of child TagNode instances.
      • getText

        public java.lang.CharSequence getText()
        Returns:
        Text content of this node and its subelements.
      • getChildIndex

        public int getChildIndex​(HtmlNode child)
        Parameters:
        child - Child to find index of
        Returns:
        Index of the specified child node inside this node's children, or -1 if the node is not a child of this node
      • insertChild

        public void insertChild​(int index,
                                HtmlNode childToAdd)
        Inserts the specified node at the specified position in the list of children.
        Parameters:
        index - Position at which to insert the node
        childToAdd - Node to be inserted
      • insertChildBefore

        public void insertChildBefore​(HtmlNode node,
                                      HtmlNode nodeToInsert)
        Inserts specified node in the list of children before specified child
        Parameters:
        node - Child before which to insert new node
        nodeToInsert - Node to be inserted at specified position
      • insertChildAfter

        public void insertChildAfter​(HtmlNode node,
                                     HtmlNode nodeToInsert)
        Inserts specified node in the list of children after specified child
        Parameters:
        node - Child after which to insert new node
        nodeToInsert - Node to be inserted at specified position
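        The relationship between insertChildBefore, insertChildAfter and an index-based insert can be sketched with a plain list. This is a minimal illustration, not HtmlCleaner's implementation: InsertChildSketch, insertBefore and insertAfter are hypothetical names, and the sketch assumes that nothing is inserted when the reference child is not found.

```java
import java.util.ArrayList;
import java.util.List;

class InsertChildSketch {
    // Hypothetical helper: inserts toInsert directly before the existing child.
    // Assumption: the list is left unchanged if the existing child is not found.
    static <T> void insertBefore(List<T> children, T existing, T toInsert) {
        int index = children.indexOf(existing);
        if (index >= 0) {
            children.add(index, toInsert);
        }
    }

    // Hypothetical helper: inserts toInsert directly after the existing child.
    static <T> void insertAfter(List<T> children, T existing, T toInsert) {
        int index = children.indexOf(existing);
        if (index >= 0) {
            children.add(index + 1, toInsert);
        }
    }

    public static void main(String[] args) {
        List<String> children = new ArrayList<>(List.of("head", "body"));
        insertBefore(children, "body", "comment");
        insertAfter(children, "body", "footer");
        System.out.println(children);
    }
}
```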
      • setDocType

        public void setDocType​(DoctypeToken docType)
      • addChild

        public void addChild​(java.lang.Object child)
      • addChildren

        public void addChildren​(java.util.List newChildren)
        Add all elements from specified list to this node.
        Parameters:
        newChildren - List of child nodes to add
      • getElementList

        public java.util.List<? extends TagNode> getElementList​(ITagNodeCondition condition,
                                                                boolean isRecursive)
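        Gets all elements in the tree that satisfy the specified condition.
        Parameters:
        condition - Condition that matching elements must satisfy
        isRecursive - Whether to search nested elements as well as direct children
        Returns:
        List of TagNode instances that satisfy the specified condition.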
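        The effect of the isRecursive flag on a condition-based search can be sketched with stand-in types. Everything here is hypothetical and self-contained: Node stands in for TagNode, a java.util.function.Predicate stands in for ITagNodeCondition, and the sketch assumes the search starts at the children of the node it is called on.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

class ElementListSketch {
    // Hypothetical stand-in for a tag node: a name plus child nodes.
    static final class Node {
        final String name;
        final List<Node> children = new ArrayList<>();

        Node(String name) {
            this.name = name;
        }

        Node add(Node child) {
            children.add(child);
            return this;
        }
    }

    // Collects children of parent that satisfy the condition; when isRecursive
    // is true, nested descendants are searched as well (assumed semantics).
    static List<Node> elementList(Node parent, Predicate<Node> condition, boolean isRecursive) {
        List<Node> result = new ArrayList<>();
        for (Node child : parent.children) {
            if (condition.test(child)) {
                result.add(child);
            }
            if (isRecursive) {
                result.addAll(elementList(child, condition, true));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Node body = new Node("body")
                .add(new Node("div").add(new Node("a")))
                .add(new Node("a"));
        Predicate<Node> isLink = node -> node.name.equals("a");
        System.out.println(elementList(body, isLink, false).size()); // direct children only
        System.out.println(elementList(body, isLink, true).size());  // whole subtree
    }
}
```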
      • getAllElementsList

        public java.util.List<? extends TagNode> getAllElementsList​(boolean isRecursive)
      • getAllElements

        public TagNode[] getAllElements​(boolean isRecursive)
      • findElementByName

        public TagNode findElementByName​(java.lang.String findName,
                                         boolean isRecursive)
      • getElementListByName

        public java.util.List<? extends TagNode> getElementListByName​(java.lang.String findName,
                                                                      boolean isRecursive)
      • getElementsByName

        public TagNode[] getElementsByName​(java.lang.String findName,
                                           boolean isRecursive)
      • findElementHavingAttribute

        public TagNode findElementHavingAttribute​(java.lang.String attName,
                                                  boolean isRecursive)
      • getElementListHavingAttribute

        public java.util.List<? extends TagNode> getElementListHavingAttribute​(java.lang.String attName,
                                                                               boolean isRecursive)
      • getElementsHavingAttribute

        public TagNode[] getElementsHavingAttribute​(java.lang.String attName,
                                                    boolean isRecursive)
      • findElementByAttValue

        public TagNode findElementByAttValue​(java.lang.String attName,
                                             java.lang.String attValue,
                                             boolean isRecursive,
                                             boolean isCaseSensitive)
      • getElementListByAttValue

        public java.util.List<? extends TagNode> getElementListByAttValue​(java.lang.String attName,
                                                                          java.lang.String attValue,
                                                                          boolean isRecursive,
                                                                          boolean isCaseSensitive)
      • getElementsByAttValue

        public TagNode[] getElementsByAttValue​(java.lang.String attName,
                                               java.lang.String attValue,
                                               boolean isRecursive,
                                               boolean isCaseSensitive)
      • evaluateXPath

        public java.lang.Object[] evaluateXPath​(java.lang.String xPathExpression)
                                         throws XPatherException
        Evaluates an XPath expression on the given node.
        This is not a fully featured XPath parser and evaluator. The examples below show supported elements:
        • //div//a
        • //div//a[@id][@class]
        • /body/*[1]/@type
        • //div[3]//a[@id][@href='r/n4']
        • //div[last() >= 4]//./div[position() = last()])[position() > 22]//li[2]//a
        • //div[2]/@*[2]
        • data(//div//a[@id][@class])
        • //p/last()
        • //body//div[3][@class]//span[12.2
        • data(//a['v' < @id])
        Parameters:
        xPathExpression - XPath expression to evaluate
        Returns:
        result of XPather evaluation.
        Throws:
        XPatherException
      • removeFromTree

        public boolean removeFromTree()
        Remove this node from the tree.
        Returns:
        True if the element is removed, which is possible only if it is not the root node.
      • removeChild

        public boolean removeChild​(java.lang.Object child)
        Remove specified child element from this node.
        Parameters:
        child - Child element to remove
        Returns:
        True if child object existed in the children list.
      • removeAllChildren

        public void removeAllChildren()
        Removes all children (subelements and text content).
      • setAutoGenerated

        public void setAutoGenerated​(boolean autoGenerated)
        Parameters:
        autoGenerated - the value of the autoGenerated flag to set
      • isAutoGenerated

        public boolean isAutoGenerated()
        Returns:
        the value of the autoGenerated flag
      • isPruned

        public boolean isPruned()
        Returns:
        true, if node was marked to be pruned.
      • setPruned

        public void setPruned​(boolean pruned)
      • isEmpty

        public boolean isEmpty()
      • addNamespaceDeclaration

        public void addNamespaceDeclaration​(java.lang.String nsPrefix,
                                            java.lang.String nsURI)
        Adds namespace declaration to the node
        Parameters:
        nsPrefix - Namespace prefix
        nsURI - Namespace URI
      • getNamespaceDeclarations

        public java.util.Map<java.lang.String,​java.lang.String> getNamespaceDeclarations()
        Returns:
        Map of namespace declarations for this node
      • makeCopy

        public TagNode makeCopy()
      • isCopy

        public boolean isCopy()
      • traverse

        public void traverse​(TagNodeVisitor visitor)
        Traverses the tree and performs the visitor's action on each node. Traversal stops when the whole tree has been visited or when the visitor returns false.
        Parameters:
        visitor - TagNodeVisitor implementation
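        The stopping rule described above can be sketched with stand-in types. This is a minimal illustration, not HtmlCleaner's code: Node and Visitor are hypothetical stand-ins for TagNode and TagNodeVisitor, and the sketch assumes a depth-first traversal that visits a node before its children.

```java
import java.util.ArrayList;
import java.util.List;

class TraverseSketch {
    // Hypothetical stand-in for a visitor: returning false stops the traversal.
    interface Visitor {
        boolean visit(Node parent, Node node);
    }

    // Hypothetical stand-in for a tag node.
    static final class Node {
        final String name;
        final List<Node> children = new ArrayList<>();

        Node(String name) {
            this.name = name;
        }

        Node add(Node child) {
            children.add(child);
            return this;
        }
    }

    // Visits node first, then its children in order (assumed order).
    // Returns false as soon as the visitor asks to stop.
    static boolean traverse(Node parent, Node node, Visitor visitor) {
        if (!visitor.visit(parent, node)) {
            return false;
        }
        for (Node child : node.children) {
            if (!traverse(node, child, visitor)) {
                return false;
            }
        }
        return true;
    }

    // Records visited node names, stopping once a node named stopName is seen.
    static List<String> visitedUntil(Node root, String stopName) {
        List<String> visited = new ArrayList<>();
        traverse(null, root, (parent, node) -> {
            visited.add(node.name);
            return !node.name.equals(stopName);
        });
        return visited;
    }

    public static void main(String[] args) {
        Node root = new Node("html")
                .add(new Node("head"))
                .add(new Node("body").add(new Node("p")));
        System.out.println(visitedUntil(root, "body"));
    }
}
```

        In this sketch the node named "p" is never visited when the visitor stops at "body", which mirrors the early exit described for traverse.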
      • isForeignMarkup

        public boolean isForeignMarkup()
        Returns:
        the value of the isForeignMarkup flag
      • setForeignMarkup

        public void setForeignMarkup​(boolean isForeignMarkup)
        Parameters:
        isForeignMarkup - the value of the isForeignMarkup flag to set
      • isTrimAttributeValues

        public boolean isTrimAttributeValues()
        Returns:
        the value of the isTrimAttributeValues flag
      • setTrimAttributeValues

        public void setTrimAttributeValues​(boolean isTrimAttributeValues)
        Parameters:
        isTrimAttributeValues - the value of the isTrimAttributeValues flag to set