[PDF-17] ICEpdf OS CMap mapping error - ICEsoft JIRA Issue Tracker

Details

Type: Bug
Status: Closed
Priority: Major
Resolution: Fixed
Affects Version/s: 3.0
Fix Version/s: 3.1
Component/s: Core/Parsing
Labels:
None
Environment:
ICEpdf OS; the PRO version is OK.

Assignee Priority:
P1
ICEsoft Forum Reference:
http://www.icefaces.org/JForum/posts/list/0/13027.page

Description

The PDF in question has a CID font and a corresponding ToUnicode CMap. A small amount of text is not being mapped correctly from CID to Unicode. Further investigation is needed to find the cause of the mapping issue. My guess is that the CMap parser is incomplete.

Attachments

Test document for decoding issue.pdf

20/May/09 8:48 AM

125 kB

Patrick Corless

Activity

Patrick Corless created issue - 20/May/09 8:47 AM

Patrick Corless added a comment - 20/May/09 8:48 AM

CID test file

Patrick Corless made changes - 20/May/09 8:48 AM

Field	Original Value	New Value
Attachment		Test document for decoding issue.pdf [ 11749 ]

Patrick Corless made changes - 20/May/09 8:48 AM

Salesforce Case		[]
Fix Version/s		3.1 [ 10181 ]

Ken Fyten made changes - 05/Oct/09 11:33 AM

Salesforce Case		[]
Assignee Priority		P1

Repository	Revision	Date	User	Message
ICEsoft Public SVN Repository	#19368	Wed Oct 07 17:58:57 MDT 2009	patrick.corless	PDF-17 - fixed issues with parsing multiple entries for beginbfchar and beginbfrange definitions in the same cmap definition and added support for the cmap beginbfrange notation <src1> <srcn> [<dest1> <dest2> ...]
				Files Changed
				MODIFY /icepdf/trunk/icepdf/core/src/org/icepdf/core/pobjects/fonts/ofont/CMap.java

Patrick Corless made changes - 07/Oct/09 7:54 PM

Status	Open [ 1 ]	In Progress [ 3 ]

Patrick Corless added a comment - 07/Oct/09 8:13 PM

This turned out to be a very interesting bug. The file in question exposed a couple of issues with our CMap parsers in both the Pro and OS versions of ICEpdf.

Neither the Pro nor the OS version of the CMap parser correctly handled beginbfrange entries in the format <src1> <srcn> [<dest1> <dest2> ...].

Also, the OS version did not correctly handle multiple beginbfchar and beginbfrange sections in the same CMap definition.

As a result of the fix, both the OS and PRO versions now do a much better job at text extraction and font substitution. I've increased the severity of the CMap parsing errors so that they are more visible when they occur; hopefully this will help identify any future issues.
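
For readers unfamiliar with the two cases described above, here is a minimal sketch of how a ToUnicode CMap reader might handle them. It is not the code from CMap.java; the class name ToUnicodeSketch and its helpers are hypothetical, and it assumes well-formed input with even-length hex strings and a matching end keyword for every begin keyword. It accepts any number of beginbfchar and beginbfrange sections, and both the scalar form <src1> <srcn> <dest> and the array form <src1> <srcn> [<dest1> <dest2> ...].

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only; not the ICEpdf implementation.
final class ToUnicodeSketch {

    // A token is a hex string <...>, an array bracket, or a bare word/number.
    private static final Pattern TOKEN =
            Pattern.compile("<[0-9A-Fa-f\\s]*>|\\[|\\]|[^\\s<>\\[\\]]+");

    static Map<Integer, String> parse(String cmap) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(cmap);
        while (m.find()) {
            tokens.add(m.group());
        }

        Map<Integer, String> map = new LinkedHashMap<>();
        int i = 0;
        // Keep scanning after each section ends, so repeated
        // beginbfchar/beginbfrange sections are all picked up.
        while (i < tokens.size()) {
            String t = tokens.get(i++);
            if (t.equals("beginbfchar")) {
                while (!tokens.get(i).equals("endbfchar")) {
                    int src = code(tokens.get(i++));
                    map.put(src, text(tokens.get(i++)));
                }
                i++; // skip endbfchar
            } else if (t.equals("beginbfrange")) {
                while (!tokens.get(i).equals("endbfrange")) {
                    int lo = code(tokens.get(i++));
                    int hi = code(tokens.get(i++));
                    if (tokens.get(i).equals("[")) {
                        // Array form: one destination string per source code.
                        i++;
                        for (int c = lo; !tokens.get(i).equals("]"); c++) {
                            map.put(c, text(tokens.get(i++)));
                        }
                        i++; // skip ]
                    } else {
                        // Scalar form: increment the last UTF-16 code unit.
                        char[] chars = text(tokens.get(i++)).toCharArray();
                        int last = chars.length - 1;
                        char base = chars[last];
                        for (int c = lo; c <= hi; c++) {
                            chars[last] = (char) (base + (c - lo));
                            map.put(c, new String(chars));
                        }
                    }
                }
                i++; // skip endbfrange
            }
        }
        return map;
    }

    // Strips the angle brackets and any whitespace inside a hex string.
    private static String hexBody(String token) {
        return token.substring(1, token.length() - 1).replaceAll("\\s", "");
    }

    // Source character code as an integer.
    private static int code(String token) {
        return Integer.parseInt(hexBody(token), 16);
    }

    // Destination hex string decoded as UTF-16BE text.
    private static String text(String token) {
        String hex = hexBody(token);
        byte[] bytes = new byte[hex.length() / 2];
        for (int k = 0; k < bytes.length; k++) {
            bytes[k] = (byte) Integer.parseInt(hex.substring(2 * k, 2 * k + 2), 16);
        }
        return new String(bytes, StandardCharsets.UTF_16BE);
    }
}
```

The array branch matters because a destination can be more than one character (a ligature, for example), which the "increment the last code unit" rule of the scalar form cannot express.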

Patrick Corless made changes - 07/Oct/09 8:13 PM

Status	In Progress [ 3 ]	Resolved [ 5 ]
Resolution		Fixed [ 1 ]

Patrick Corless added a comment - 09/Oct/09 3:00 PM

ICEpdf 3.1.0 has been released, closing issues.

Patrick Corless made changes - 09/Oct/09 3:00 PM

Status	Resolved [ 5 ]	Closed [ 6 ]

People

Assignee:

Patrick Corless

Reporter:

Patrick Corless

Votes:

1

Watchers:

1

Dates

Created:

20/May/09 8:47 AM

Updated:

09/Oct/09 3:00 PM

Resolved:

07/Oct/09 8:13 PM