ICEpdf / PDF-603

XObject text is not being passed to parent shapes

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 5.0.2
    • Fix Version/s: 5.0.3
    • Component/s: Core/Parsing
    • Labels:
      None
    • Environment:
      any

      Description

      This one came in through the forums. The document in question has two content streams, the second being initialized from an XObject. The XObject contains all of the document's content. During text extraction an XObject should pass its text up to the parent content stream, as was the case in 4.x.

      When optional content (layers) was added in 5.x, an error was made in the implementation. The method PageText.getPageLines() was changed to return a copy of the pageLines array, and that copy was filtered so that it only contained text that was actually visible. Because callers received the copy, text added by a child XObject went into the copy rather than into the original pageLines data, so it never reached the parent.
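
      The failure mode described above can be illustrated with a minimal sketch. The classes below are hypothetical stand-ins, not the ICEpdf source: CopyingPageText only assumes that the getter returns a copy, as the description states, and uses plain strings in place of real line objects.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; shows a merge attempted through the copying getter.
class CopyBugDemo {
    // Tries to pass the child's lines to the parent via getPageLines(),
    // then reports how many lines the parent actually holds.
    static int mergeThroughGetter(CopyingPageText parent, CopyingPageText child) {
        // addAll() mutates the temporary copy, which is discarded afterwards.
        parent.getPageLines().addAll(child.getPageLines());
        return parent.getPageLines().size();
    }

    public static void main(String[] args) {
        CopyingPageText parent = new CopyingPageText();
        CopyingPageText child = new CopyingPageText();
        child.addLine("text drawn by the XObject");
        // The parent still has no lines after the merge attempt.
        System.out.println(mergeThroughGetter(parent, child)); // prints 0
    }
}

// Hypothetical, simplified stand-in for PageText (assumption, not ICEpdf code).
class CopyingPageText {
    private final List<String> pageLines = new ArrayList<>();

    // Assumed 5.x behaviour: hand back a copy rather than the backing list.
    List<String> getPageLines() {
        return new ArrayList<>(pageLines);
    }

    void addLine(String line) {
        pageLines.add(line);
    }
}
```

      The child's text is never lost inside the child itself; it simply never lands in the parent's backing list, which matches the symptom of missing text during extraction.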

        Activity

        Patrick Corless created issue -
        Patrick Corless added a comment -

        I've added a new method PageText.addPageLines(ArrayList<LineText> pageLines) which will always add the text to the instance's pageLines array and not the copy.
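
        A minimal sketch of how such a method might behave follows. Only the method name addPageLines and its ArrayList parameter come from the comment above; SketchPageText, SketchLineText, and the method bodies are assumptions made for illustration and are not the committed ICEpdf code.

```java
import java.util.ArrayList;

// Hypothetical demo class; passes a child's lines to its parent.
class AddPageLinesDemo {
    public static void main(String[] args) {
        SketchPageText parent = new SketchPageText();
        SketchPageText child = new SketchPageText();

        ArrayList<SketchLineText> xObjectLines = new ArrayList<>();
        xObjectLines.add(new SketchLineText("text drawn by the XObject"));
        child.addPageLines(xObjectLines);

        // Hand the child's lines to the parent through the dedicated method.
        parent.addPageLines(child.getPageLines());
        System.out.println(parent.getPageLines().size()); // prints 1
    }
}

// Hypothetical stand-in for LineText (assumption).
class SketchLineText {
    final String text;

    SketchLineText(String text) {
        this.text = text;
    }
}

// Hypothetical stand-in for PageText (assumption).
class SketchPageText {
    private final ArrayList<SketchLineText> pageLines = new ArrayList<>();

    // The getter still returns a copy, as in the 5.x behaviour described.
    ArrayList<SketchLineText> getPageLines() {
        return new ArrayList<>(pageLines);
    }

    // Appends to the instance's own list, so the change is never lost in a copy.
    void addPageLines(ArrayList<SketchLineText> lines) {
        if (lines != null) {
            pageLines.addAll(lines);
        }
    }
}
```

        Keeping the copying getter for readers and routing writes through a separate method means callers cannot accidentally mutate a temporary list.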

        Repository: ICEsoft Public SVN Repository
        Revision: #36212
        Date: Thu Jun 13 11:21:24 MDT 2013
        User: patrick.corless
        Message: PDF-603 fixed issue where xobject text was not correctly being added to the parent text array, breaking text extraction in some circumstances.
        Files Changed:
        MODIFY /icepdf/branches/icepdf-5.0.1/icepdf/core/src/org/icepdf/core/pobjects/graphics/text/PageText.java
        MODIFY /icepdf/branches/icepdf-5.0.1/icepdf/core/src/org/icepdf/core/util/content/AbstractContentParser.java
        Patrick Corless added a comment -

        Fix has been checked into the 5.0.1 branch and trunk.

        Patrick Corless made changes -
        Field         Original Value    New Value
        Status        Open [ 1 ]        Resolved [ 5 ]
        Resolution                      Fixed [ 1 ]
        Repository: ICEsoft Public SVN Repository
        Revision: #36213
        Date: Thu Jun 13 11:23:45 MDT 2013
        User: patrick.corless
        Message: PDF-603 fixed issue where xobject text was not correctly being added to the parent text array, breaking text extraction in some circumstances.
        Files Changed:
        MODIFY /icepdf/trunk/icepdf/core/src/org/icepdf/core/pobjects/graphics/text/PageText.java
        MODIFY /icepdf/trunk/icepdf/core/src/org/icepdf/core/util/content/AbstractContentParser.java
        Patrick Corless made changes -
        Status        Resolved [ 5 ]    Closed [ 6 ]

          People

          • Assignee:
            Patrick Corless
          • Reporter:
            Patrick Corless
          • Votes:
            0
          • Watchers:
            1

            Dates

            • Created:
              Updated:
              Resolved: