[PDF-44] Page 2 of PDF does not display - Class cast exception thrown - ICEsoft JIRA Issue Tracker

Details

Type: Bug
Status: Closed
Priority: Major
Resolution: Fixed
Affects Version/s: 3.0
Fix Version/s: 3.1
Component/s: Core/Parsing
Labels:
None
Environment:
-

Description

When using the customer's custom HTML 2 PDF converter on the PDF in question, page 2 does not display and the following exception is thrown:

FINE: Error initializing Page.
java.lang.ClassCastException: java.lang.String cannot be cast to java.lang.StringBuffer
at org.icepdf.core.pobjects.fonts.nfont.CMap.<init>(Unknown Source)
at org.icepdf.core.pobjects.fonts.nfont.CMap.<init>(Unknown Source)
at org.icepdf.core.pobjects.fonts.nfont.Font.init(Unknown Source)
at org.icepdf.core.pobjects.Resources.getFont(Resources.java:186)
at org.icepdf.core.util.ContentParser.consume_Tf(ContentParser.java:1994)
at org.icepdf.core.util.ContentParser.parseText(ContentParser.java:1114)
at org.icepdf.core.util.ContentParser.parse(ContentParser.java:276)
at org.icepdf.core.pobjects.Page.init(Page.java:353)
at org.icepdf.core.views.swing.PageViewComponentImpl$PageInitilizer.run(PageViewComponentImpl.java:1111)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.lang.Thread.run(Unknown Source)
1-Sep-2009 10:38:50 PM org.icepdf.core.util.Library printObjectDebug

Activity

Patrick Corless added a comment - 10/Sep/09 2:52 PM

Updated the PostScript tokenizer so that it doesn't cast the StringBuffer to String. The CMap parser expects a StringBuffer. It's interesting that this is the first time this code path has been executed.
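
For illustration only, here is a minimal sketch of how a consumer could avoid this kind of ClassCastException by accepting any text-like token instead of casting blindly. The class and method names are hypothetical and are not part of ICEpdf; this is not the change that was committed.

```java
// Hypothetical helper, not ICEpdf code: normalizes a tokenizer result
// to a StringBuffer without an unchecked cast.
final class TokenSketch {

    static StringBuffer asStringBuffer(Object token) {
        if (token instanceof StringBuffer) {
            // Already the type the parser expects.
            return (StringBuffer) token;
        }
        if (token instanceof CharSequence) {
            // Covers String and StringBuilder: copy the characters.
            return new StringBuffer((CharSequence) token);
        }
        String type = (token == null) ? "null" : token.getClass().getName();
        throw new IllegalArgumentException("Unexpected token type: " + type);
    }
}
```

With a helper like this, a String token is copied into a new StringBuffer, whereas a direct (StringBuffer) cast on it would fail as in the stack trace above.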

Patrick Corless added a comment - 10/Sep/09 2:52 PM

Still have to push through QA, but all should be good; closing.

Patrick Corless added a comment - 11/Sep/09 6:58 AM

QA frame showed some issue with the "quick fix", so I dug a little deeper. The cmap being parsed is being used for toUnicode and thus only used for text extraction and for font substitution in the open source version. The cmap parser is puking because the cmap in question is malformed.

12 dict begin
begincmap
/CIDSystemInfo << /Registry (Adobe) /Ordering (UCS) /Supplement 0 >> def
/CMapName /Adobe-Identity-UCS def
/CMapType 2 def
1 begincodespacerange
<0000> <FFFF>
endcodespacerange
2 beginbfrange
<0001> <0001> <0020>
endbfrange
endcmap
CMapName currentdict /CMap defineresource pop
end
end

The "2 beginbfrange" is the problem: two entries are declared but only one is defined. I'm going to try to tweak the parser and make sure the error doesn't prevent the PDF from being rendered.
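
As a rough sketch of what a more tolerant parser could look like, the code below reads bfrange entries until it hits endbfrange (or runs out of well-formed triples) instead of trusting the declared count. All names here are hypothetical, the input is assumed to be whitespace-separated tokens, and only the simple <lo> <hi> <dest> form is handled; it is not the actual ICEpdf implementation.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not ICEpdf code.
final class BfRangeCountSketch {

    /** One simple bfrange entry: source codes lo..hi start mapping at dest. */
    static final class Range {
        final int lo, hi, dest;
        Range(int lo, int hi, int dest) { this.lo = lo; this.hi = hi; this.dest = dest; }
    }

    static boolean isHex(String token) {
        return token.matches("<[0-9A-Fa-f]+>");
    }

    static int hex(String token) {
        // "<0020>" -> 0x0020
        return Integer.parseInt(token.substring(1, token.length() - 1), 16);
    }

    /** Returns the integer written just before "beginbfrange", or -1 if absent. */
    static int declaredCount(String cmapText) {
        String[] t = cmapText.trim().split("\\s+");
        for (int i = 1; i < t.length; i++) {
            if (t[i].equals("beginbfrange") && t[i - 1].matches("\\d+")) {
                return Integer.parseInt(t[i - 1]);
            }
        }
        return -1;
    }

    /** Reads triples until a non-hex token (e.g. "endbfrange") is reached. */
    static List<Range> parse(String cmapText) {
        String[] t = cmapText.trim().split("\\s+");
        List<Range> ranges = new ArrayList<>();
        int i = 0;
        while (i < t.length && !t[i].equals("beginbfrange")) {
            i++;
        }
        i++; // step past the keyword
        while (i + 2 < t.length && isHex(t[i]) && isHex(t[i + 1]) && isHex(t[i + 2])) {
            ranges.add(new Range(hex(t[i]), hex(t[i + 1]), hex(t[i + 2])));
            i += 3;
        }
        return ranges;
    }
}
```

For the cmap above, declaredCount reports 2 while parse yields a single range, so a caller could log the mismatch and carry on rendering.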

Patrick Corless added a comment - 07/Oct/09 8:16 PM

Turns out this was a similar case to PDF-17. The CMap parser was having problems with the beginbfrange notation <src1> <srcn> [<dest1> <dest2> ...], as it wasn't implemented by our CMap parser. Also, the PDF in question was malformed in that it was reporting two entries when there was only one, which resulted in a parsing error. I updated the parser to be a little more robust.
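
To make the two beginbfrange notations concrete, here is a small sketch that expands both the simple form and the array form into a code-to-text map. The names are hypothetical, the input is assumed to be only the well-formed entries between beginbfrange and endbfrange, and destinations are assumed to be UTF-16BE hex strings; this is an illustration, not the ICEpdf parser.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch, not ICEpdf code.
final class BfRangeArraySketch {

    static int hex(String token) {
        // "<0010>" -> 0x0010
        return Integer.parseInt(token.substring(1, token.length() - 1), 16);
    }

    /** Decodes a destination such as "<00660069>" as UTF-16BE code units. */
    static String decodeDest(String token) {
        String digits = token.substring(1, token.length() - 1);
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i + 4 <= digits.length(); i += 4) {
            sb.append((char) Integer.parseInt(digits.substring(i, i + 4), 16));
        }
        return sb.toString();
    }

    static Map<Integer, String> parseBody(String body) {
        // Pad brackets so "[<0041>" splits into "[" and "<0041>".
        String[] t = body.replace("[", " [ ").replace("]", " ] ").trim().split("\\s+");
        Map<Integer, String> map = new LinkedHashMap<>();
        int i = 0;
        while (i + 2 < t.length) {
            int lo = hex(t[i]);
            int hi = hex(t[i + 1]);
            i += 2;
            if (t[i].equals("[")) {
                // Array form: the k-th destination belongs to source code lo + k.
                i++;
                int code = lo;
                while (i < t.length && !t[i].equals("]")) {
                    if (code <= hi) {
                        map.put(code, decodeDest(t[i]));
                    }
                    code++;
                    i++;
                }
                i++; // step past "]"
            } else {
                // Simple form: the last code unit of dest is incremented per source code.
                String base = decodeDest(t[i]);
                if (!base.isEmpty()) {
                    int last = base.length() - 1;
                    for (int code = lo; code <= hi; code++) {
                        char c = (char) (base.charAt(last) + (code - lo));
                        map.put(code, base.substring(0, last) + c);
                    }
                }
                i++;
            }
        }
        return map;
    }
}
```

For example, parseBody("<0001> <0002> [<0041> <00660069>]") maps code 1 to "A" and code 2 to "fi", while "<0010> <0012> <0061>" maps codes 0x10 through 0x12 to "a", "b" and "c".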

Patrick Corless added a comment - 09/Oct/09 3:00 PM

ICEpdf 3.1.0 has been released, closing issues.

People

Assignee:

Patrick Corless

Reporter:

Tyler Johnson

Votes:

0

Watchers:

0

Dates

Created:

09/Sep/09 2:20 PM

Updated:

09/Oct/09 3:00 PM

Resolved:

07/Oct/09 8:16 PM