ICEpdf / PDF-841

OContentParser concatenates content streams incorrectly

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 5.1.1
    • Fix Version/s: 5.1.2
    • Component/s: Core/Parsing
    • Labels:
      None
    • Environment:
      N/A

      Description

      I have a PDF in which the page's Contents entry contains two streams. The first one consists of
      {code}
      /Basemap_Form Do
      {code}
      the second one starts with
      {code}
      q
      1.0 0.0 0.0 1.0 103.680000 51.840000 cm
      /LGIT:W Do
      Q
      {code}

      OContentParser#parse receives these streams as two byte[] objects, which are then joined by a ByteDoubleArrayInputStream that presents the byte[]s as a single concatenated InputStream. This causes incorrect parsing.

      During parsing, the parser sees the following sequence of tokens:
      {code}
      /Basemap_Form
      Doq
      1.0
      0.0
      ...
      {code}
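The effect can be reproduced outside ICEpdf with a minimal sketch. Everything here is illustrative: the class NaiveConcatDemo and its methods are hypothetical, the tokenizer is a crude whitespace splitter rather than ICEpdf's lexer, and the second stream is assumed to contain only the four lines quoted above.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class, not part of ICEpdf.
public class NaiveConcatDemo {

    /** Joins the streams back to back, with nothing in between. */
    static byte[] concat(byte[]... streams) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (byte[] s : streams) {
            out.write(s, 0, s.length);
        }
        return out.toByteArray();
    }

    /** Crude stand-in for a lexer: splits on runs of whitespace only. */
    static List<String> tokens(byte[] content) {
        String text = new String(content, StandardCharsets.US_ASCII).trim();
        return Arrays.asList(text.split("\\s+"));
    }

    public static void main(String[] args) {
        // Assumption: the first stream has no trailing whitespace.
        byte[] first = "/Basemap_Form Do".getBytes(StandardCharsets.US_ASCII);
        byte[] second = "q\n1.0 0.0 0.0 1.0 103.680000 51.840000 cm\n/LGIT:W Do\nQ\n"
                .getBytes(StandardCharsets.US_ASCII);
        // The second token comes out as "Doq" because the stream boundary is lost.
        System.out.println(tokens(concat(first, second)));
    }
}
```

Because the first stream ends directly after Do and the second begins directly with q, nothing marks where one operator ends and the next begins.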

      The Doq token is the result of concatenating the two streams without a white-space character to separate the Do and q tokens. The PDF specification's description of the page Contents entry states that the division between streams may occur only at a lexical token boundary, so the parser must treat the junction between two streams as a token boundary, for example by inserting a white-space character there.

      Using org.icepdf.core.io.SequenceInputStream with a ' ' separator character resolves the parsing problem.
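The separator idea can be sketched with the JDK alone. This is not the ICEpdf class named above, whose constructor is not shown in this report; it uses the standard java.io.SequenceInputStream, and the ContentStreamJoiner class and its join method are hypothetical names chosen for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.SequenceInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not part of ICEpdf.
public class ContentStreamJoiner {

    /**
     * Presents the given content streams as one InputStream, with a single
     * space between consecutive streams so that tokens cannot fuse.
     */
    static InputStream join(byte[]... streams) {
        List<InputStream> parts = new ArrayList<>();
        for (int i = 0; i < streams.length; i++) {
            if (i > 0) {
                // Separator only between streams, never before the first one.
                parts.add(new ByteArrayInputStream(new byte[] {' '}));
            }
            parts.add(new ByteArrayInputStream(streams[i]));
        }
        return new SequenceInputStream(Collections.enumeration(parts));
    }

    public static void main(String[] args) throws IOException {
        byte[] first = "/Basemap_Form Do".getBytes(StandardCharsets.US_ASCII);
        byte[] second = "q\n1.0 0.0 0.0 1.0 103.680000 51.840000 cm\n/LGIT:W Do\nQ\n"
                .getBytes(StandardCharsets.US_ASCII);
        try (InputStream in = join(first, second)) {
            // readAllBytes requires Java 9 or later.
            System.out.print(new String(in.readAllBytes(), StandardCharsets.US_ASCII));
        }
    }
}
```

An extra space is harmless when a stream already ends in white space, since content stream syntax treats any run of white space as a single delimiter between operators and operands.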

        Activity

        Patrick Corless added a comment -

        Added the extra space while we assemble the streams[]. Marking as resolved.


          People

          • Assignee:
            Patrick Corless
            Reporter:
            Pepijn Van Eeckhoudt
          • Votes:
            0
            Watchers:
            2

            Dates

            • Created:
              Updated:
              Resolved: