ICEpdf / PDF-419

getPageImage() has fatal usability issues with attached PDF

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 4.3.2
    • Fix Version/s: 4.3.3
    • Component/s: Core/Parsing
    • Labels:
      None
    • Environment:
      ICEpdf PRO 4.3.2, Java Vendor: Sun J2SE 1.6.0_26, Sun J2SE 1.6.0_25

      Description

      Images generated from PDF using getPageImage() have fatal usability issues including:

      (1) Blank page generated for page 2
      (2) Black boxes behind the text
      (3) Text rendering as white

      1. Chevron_2010.pdf
        5.14 MB
        Evgheni Sadovoi
      1. Region capture 1.png
        824 kB
      2. Region capture 2.png
        313 kB

        Issue Links

          Activity

          Evgheni Sadovoi created issue -
          Evgheni Sadovoi made changes -
          Field Original Value New Value
          Salesforce Case [5007000000LGhi5]
          Evgheni Sadovoi made changes -
          Attachment Chevron_2010.pdf [ 14273 ]
          Evgheni Sadovoi made changes -
          Attachment Region capture 1.png [ 14274 ]
          Attachment Region capture 2.png [ 14275 ]
          Evgheni Sadovoi added a comment -

          Log feedback during review of the attached PDF in the ICEpdf Viewer:

          org.icepdf.core.pobjects.Name convertHexChars
          WARNING: Error parsing hexadecimal characters.

          Patrick Corless added a comment -

          1.) There is an error in the image decode and conversion to RGB; the error comes out of the type 3 function handling.
          2.) This appears to be a clipping issue, one that we've seen quite a bit of lately.
          3.) Either a parser error or related to the colour issue in 1.

          The PDF in question is very odd, with no mention of its source producer. I think all the issues can be solved, but it will take some time to figure out all the root causes. My hunch is that this PDF isn't well formed.

          Patrick Corless made changes -
          Fix Version/s 4.3.3 [ 10333 ]
          Repository Revision Date User Message
          ICEsoft Public SVN Repository #28849 Fri Apr 27 07:58:10 MDT 2012 patrick.corless PDF-419 addition of corrective code for type 3 functions that don't have bounds specified.
          Files Changed
          Commit graph MODIFY /icepdf/trunk/icepdf/core/src/org/icepdf/core/pobjects/functions/Function_3.java
          Repository Revision Date User Message
          ICEsoft Public SVN Repository #28850 Fri Apr 27 07:59:45 MDT 2012 patrick.corless PDF-419 separation nameColour will be used for black, red, green and blue to avoid some tint and alternative colour problems as suggested by the specification even for a additive colour model.
          Files Changed
          Commit graph MODIFY /icepdf/trunk/icepdf/core/src/org/icepdf/core/pobjects/graphics/Separation.java
          Patrick Corless added a comment -

          I have fixes for one and three.

          The function error is technically a malformed PDF: the function definition is lacking a bounds attribute, which is needed to properly execute the function. There isn't anything in the spec to say what should be done when this error occurs. I've touched up the code so that it will return a valid set of numbers instead of null. The strange part about the problem is that it doesn't affect the visible appearance of the page. So I'm guessing the PDF encoder has inadvertently left some junk in, and my change is more of a workaround.
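
One way such a fallback could look is sketched below. This is a minimal illustration, not the code committed to Function_3.java: the class name StitchingFunctionSketch, its constructor and the evenBounds helper are hypothetical, and splitting the domain evenly when Bounds is missing is an assumption, since the spec does not define the behaviour.

```java
import java.util.function.DoubleUnaryOperator;

/** Hypothetical stand-in for a PDF type 3 (stitching) function; not ICEpdf's Function_3. */
class StitchingFunctionSketch {
    private final double domainMin;
    private final double domainMax;
    private final DoubleUnaryOperator[] functions; // the k sub-functions
    private final double[] bounds;                 // k-1 interior boundaries
    private final double[] encode;                 // 2k values, one input range per sub-function

    public StitchingFunctionSketch(double domainMin, double domainMax,
                                   DoubleUnaryOperator[] functions,
                                   double[] bounds, double[] encode) {
        if (encode == null || encode.length != 2 * functions.length) {
            throw new IllegalArgumentException("Encode must hold 2k values");
        }
        this.domainMin = domainMin;
        this.domainMax = domainMax;
        this.functions = functions.clone();
        this.encode = encode.clone();
        // Assumption: when Bounds is missing or has the wrong length,
        // divide the domain evenly instead of failing with null.
        this.bounds = (bounds != null && bounds.length == functions.length - 1)
                ? bounds.clone()
                : evenBounds(domainMin, domainMax, functions.length);
    }

    /** Returns k-1 evenly spaced boundaries inside [min, max]. */
    public static double[] evenBounds(double min, double max, int k) {
        double[] b = new double[Math.max(k - 1, 0)];
        for (int i = 0; i < b.length; i++) {
            b[i] = min + (max - min) * (i + 1) / k;
        }
        return b;
    }

    public double calculate(double x) {
        // Clip the input to the domain, then find the sub-domain it falls in.
        x = Math.max(domainMin, Math.min(domainMax, x));
        int i = 0;
        while (i < bounds.length && x >= bounds[i]) {
            i++;
        }
        double lo = (i == 0) ? domainMin : bounds[i - 1];
        double hi = (i == bounds.length) ? domainMax : bounds[i];
        double e0 = encode[2 * i];
        double e1 = encode[2 * i + 1];
        // Map x from [lo, hi] onto the sub-function's encoded input range.
        double t = (hi == lo) ? e0 : e0 + (x - lo) * (e1 - e0) / (hi - lo);
        return functions[i].applyAsDouble(t);
    }
}
```

With a single sub-function the fallback yields an empty bounds array, so the whole domain maps onto that one function.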

          The colour error for 3 is related to the Separation colour space. This colour space is a bit odd; after reviewing the spec, our code seemed to be following the rules, but the generated colour was still wrong. I did find a note in the latest PDF specification stating that, regardless of additive or subtractive device, if the named colour can be represented in the colour space it should be used as is, and the alternative colour and tint should be avoided. I made a small modification so that if black, red, green or blue is detected as the colour name then we just use it verbatim. I'll have to see if anything strange comes out of QA as a result of the change.
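
The named-colour shortcut described above could be sketched roughly as follows. The class SeparationSketch and its resolve method are hypothetical and do not reflect the actual Separation.java change; matching the colourant name case-insensitively and passing the tint transform in as a function are assumptions made for illustration.

```java
import java.awt.Color;
import java.util.Locale;
import java.util.function.DoubleFunction;

/** Hypothetical sketch of a Separation colour lookup; not ICEpdf's Separation class. */
class SeparationSketch {

    /**
     * Returns the colour to paint with. If the colourant name is one of the
     * four directly representable colours, it is used verbatim and the tint
     * is ignored; otherwise the tint transform into the alternate colour
     * space decides the result.
     */
    public static Color resolve(String colourantName, double tint,
                                DoubleFunction<Color> tintTransform) {
        // Assumption: names are compared case-insensitively.
        switch (colourantName.toLowerCase(Locale.ROOT)) {
            case "black":
                return Color.BLACK;
            case "red":
                return Color.RED;
            case "green":
                return Color.GREEN;
            case "blue":
                return Color.BLUE;
            default:
                return tintTransform.apply(tint);
        }
    }
}
```

Any other colourant name, such as a spot ink, still goes through the tint transform unchanged.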

          Repository Revision Date User Message
          ICEsoft Public SVN Repository #28864 Fri Apr 27 11:44:21 MDT 2012 patrick.corless PDF-419 separation nameColour will be used for black, red, green and blue to avoid some tint and alternative colour problems as suggested by the specification even for a additive colour model.
          Files Changed
          Commit graph MODIFY /icepdf/trunk/icepdf/core/src/org/icepdf/core/util/ContentParser.java
          Commit graph MODIFY /icepdf/trunk/icepdf/core/src/org/icepdf/core/pobjects/graphics/Shapes.java
          Patrick Corless added a comment -

          The last issue related to this document is not related to marked content but to text rendering mode 7, "add text to path for clipping". I've looked back through svn and cvs and it appears that we never supported this text rendering mode. According to the spec we are to keep the outlines of any glyph that uses this mode, and when the ET token is encountered we apply the summation of the outlines as the current clip. In theory this is quite a simple task, but implementing it has proven more difficult.

          The rendering core uses affine transforms to move the plotter needle, if you will, before each shape is painted. So when the ET token is encountered the needle has moved to the end of the text line, at which point all the outlines are drawn at the same location. My first attempt at solving the problem was implemented in the contentParser, but after quite a bit of analysis, outline clipping can't be done there: we need to know the exact location where the outline is to be drawn, and we don't know that until shapes.paint() is called.

          So I added a new marker object, OutlineTextClip, which stores the outline of a textSprite and is added to the shapes vector. Once an ET is encountered, a new shapes command, TextOutlineClip, is put on the stack. The idea is that when the page is painted the text outlines can be added together and then used as a clip. This is working, but for some reason I can't seem to translate the x,y of the outline as the text is written out. I still have to figure out how to correctly transform an area to the correct location.
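
The positioning problem can be illustrated with plain Java2D. The sketch below is not the OutlineTextClip or TextOutlineClip code from this issue; TextClipAccumulator is a hypothetical class, and it assumes each glyph outline is available together with the transform that was in effect when that glyph was placed.

```java
import java.awt.Shape;
import java.awt.geom.AffineTransform;
import java.awt.geom.Area;

/** Hypothetical accumulator for clip-mode glyph outlines; not ICEpdf code. */
class TextClipAccumulator {
    private Area clip; // union of all outlines seen since the text block began

    /**
     * Records one glyph outline. The outline is transformed immediately with
     * the transform that applies to this glyph, so it keeps its own position
     * even after the "needle" has moved on to later glyphs.
     */
    public void addGlyph(Shape outline, AffineTransform glyphTransform) {
        Area placed = new Area(glyphTransform.createTransformedShape(outline));
        if (clip == null) {
            clip = placed;
        } else {
            clip.add(placed);
        }
    }

    /**
     * Called when ET is reached: returns the summed outlines (or null if no
     * glyph contributed) and resets the accumulator for the next text block.
     */
    public Area endText() {
        Area result = clip;
        clip = null;
        return result;
    }
}
```

A renderer would then pass the returned area to Graphics2D.clip(Shape), which intersects it with the clip already in effect. Transforming each outline at the time it is recorded, rather than at ET, is what keeps the outlines from piling up at one location.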

          Repository Revision Date User Message
          ICEsoft Public SVN Repository #28929 Fri May 04 08:23:45 MDT 2012 patrick.corless PDF-419 separation nameColour will be used for black, red, green and blue to avoid some tint and alternative colour problems as suggested by the specification even for a additive colour model.
          Files Changed
          Commit graph MODIFY /icepdf/branches/icepdf-4.3.2/icepdf/core/src/org/icepdf/core/pobjects/graphics/Shapes.java
          Commit graph MODIFY /icepdf/branches/icepdf-4.3.2/icepdf/core/src/org/icepdf/core/util/ContentParser.java
          Repository Revision Date User Message
          ICEsoft Public SVN Repository #28930 Fri May 04 08:24:16 MDT 2012 patrick.corless PDF-419 separation nameColour will be used for black, red, green and blue to avoid some tint and alternative colour problems as suggested by the specification even for a additive colour model.
          Files Changed
          Commit graph MODIFY /icepdf/branches/icepdf-4.3.2/icepdf/core/src/org/icepdf/core/pobjects/graphics/Separation.java
          Repository Revision Date User Message
          ICEsoft Public SVN Repository #28931 Fri May 04 08:24:40 MDT 2012 patrick.corless PDF-419 addition of corrective code for type 3 functions that don't have bounds specified.
          Files Changed
          Commit graph MODIFY /icepdf/branches/icepdf-4.3.2/icepdf/core/src/org/icepdf/core/pobjects/functions/Function_3.java
          Patrick Corless added a comment -

          Another user is experiencing the same clipping issue.

          http://jforum.icesoft.org/JForum/posts/list/0/20892.page#73736

          Patrick Corless made changes -
          Link This issue blocks PDF-422 [ PDF-422 ]
          Patrick Corless added a comment -

          I managed to cut in a new object, GlyphOutlineClip, that is updated as the text block is processed, and added support for font modes 4, 5, 6 and 7. The cut-out effect renders correctly but it's a bit slow and jagged; I'll need to see if there is some way to improve both of these issues. I still have to apply the changes to the OS font handling.
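
For reference, the eight text rendering modes defined by the PDF specification (the operand of the Tr operator) can be summarised as a small enum. This is only an illustrative sketch: the enum name, constant names and fromOperand method are hypothetical and are not taken from the ICEpdf source.

```java
/** Hypothetical summary of the PDF text rendering modes 0 to 7; not ICEpdf code. */
enum TextRenderingMode {
    FILL(true, false, false),            // 0: fill text
    STROKE(false, true, false),          // 1: stroke text
    FILL_STROKE(true, true, false),      // 2: fill, then stroke text
    INVISIBLE(false, false, false),      // 3: neither fill nor stroke
    FILL_CLIP(true, false, true),        // 4: fill and add to path for clipping
    STROKE_CLIP(false, true, true),      // 5: stroke and add to path for clipping
    FILL_STROKE_CLIP(true, true, true),  // 6: fill, stroke and add to path for clipping
    CLIP(false, false, true);            // 7: add to path for clipping only

    public final boolean fills;
    public final boolean strokes;
    public final boolean addsToClip;

    TextRenderingMode(boolean fills, boolean strokes, boolean addsToClip) {
        this.fills = fills;
        this.strokes = strokes;
        this.addsToClip = addsToClip;
    }

    /** Maps the integer operand of the Tr operator to a mode. */
    public static TextRenderingMode fromOperand(int tr) {
        TextRenderingMode[] all = values();
        if (tr < 0 || tr >= all.length) {
            throw new IllegalArgumentException("Unknown text rendering mode: " + tr);
        }
        return all[tr];
    }
}
```

Modes 4 to 7 are the ones that contribute glyph outlines to the clip applied at the end of the text object.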

          The file in the forum posting above as well as the attached file now render; however, PDF-422 is still not rendering correctly.

          Repository Revision Date User Message
          ICEsoft Public SVN Repository #30018 Tue Jul 17 09:42:59 MDT 2012 patrick.corless PDF-419 updated the content parser, fonts and graphic shapes stack to handle text rendering modes, 4, 5, 6 and seven with regards to glyph outline clipping.
          Files Changed
          Commit graph MODIFY /icepdf/trunk/icepdf/core/src/org/icepdf/core/pobjects/graphics/TextSprite.java
          Commit graph MODIFY /icepdf/trunk/icepdf/core/src/org/icepdf/core/util/ContentParser.java
          Commit graph MODIFY /icepdf/trunk/icepdf/core/src/org/icepdf/core/pobjects/fonts/FontFile.java
          Commit graph MODIFY /icepdf/trunk/icepdf/core/src/org/icepdf/core/pobjects/fonts/ofont/OFont.java
          Commit graph MODIFY /icepdf/trunk/icepdf/core/src/org/icepdf/core/pobjects/graphics/GraphicsState.java
          Commit graph ADD /icepdf/trunk/icepdf/core/src/org/icepdf/core/pobjects/graphics/GlyphOutlineClip.java
          Commit graph MODIFY /icepdf/trunk/icepdf/core/src/org/icepdf/core/pobjects/graphics/Shapes.java
          Patrick Corless added a comment -

          Updated the content parser, fonts and graphic shapes stack to handle text rendering modes 4, 5, 6 and 7 with regard to glyph outline clipping. Also updated the pro font library.

          The overall clipping effect is not anti-aliased, which makes for a less than perfect cutout effect. If time permits I'll try to circle back to see if there is a better way to apply the clip.
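
One general Java2D technique for softening clip edges, offered here only as a possible direction and not as anything adopted in ICEpdf, is to paint the clip shape with anti-aliasing into a transparent image and then draw the content through an AlphaComposite.SrcAtop composite. The class SoftClipSketch and its paintClipped method below are hypothetical names for a minimal sketch of that idea.

```java
import java.awt.AlphaComposite;
import java.awt.Color;
import java.awt.Graphics2D;
import java.awt.RenderingHints;
import java.awt.Shape;
import java.awt.image.BufferedImage;
import java.util.function.Consumer;

/** Hypothetical soft-clipping helper built on standard Java2D calls; not ICEpdf code. */
class SoftClipSketch {

    /**
     * Paints content so that it is only visible inside clipShape, with the
     * shape's anti-aliased edge coverage acting as the mask.
     */
    public static BufferedImage paintClipped(int width, int height, Shape clipShape,
                                             Consumer<Graphics2D> painter) {
        // A new ARGB image starts out fully transparent.
        BufferedImage image = new BufferedImage(width, height, BufferedImage.TYPE_INT_ARGB);
        Graphics2D g = image.createGraphics();
        try {
            g.setRenderingHint(RenderingHints.KEY_ANTIALIASING,
                    RenderingHints.VALUE_ANTIALIAS_ON);
            // Step 1: lay down the mask; edge pixels receive partial alpha.
            g.setComposite(AlphaComposite.Src);
            g.setColor(Color.WHITE);
            g.fill(clipShape);
            // Step 2: SrcAtop keeps the destination alpha, so whatever is
            // painted next shows only where the mask is, softened at the edges.
            g.setComposite(AlphaComposite.SrcAtop);
            painter.accept(g);
        } finally {
            g.dispose();
        }
        return image;
    }
}
```

The resulting image would then be drawn onto the page. The extra off-screen buffer has a memory and speed cost, so whether this suits a rendering core that is already described as slow for this case is an open question.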

          Patrick Corless made changes -
          Status Open [ 1 ] Resolved [ 5 ]
          Resolution Fixed [ 1 ]
          Patrick Corless made changes -
          Link This issue blocks PDF-463 [ PDF-463 ]
          Patrick Corless made changes -
          Link This issue blocks PDF-693 [ PDF-693 ]
          Patrick Corless made changes -
          Status Resolved [ 5 ] Closed [ 6 ]

            People

            • Assignee:
              Patrick Corless
              Reporter:
              Evgheni Sadovoi
            • Votes:
              0
              Watchers:
              0

              Dates

              • Created:
                Updated:
                Resolved: