Details
-
Type: Improvement
-
Status: Closed
-
Priority: Major
-
Resolution: Fixed
-
Affects Version/s: 6.1.2
-
Fix Version/s: 6.1.3
-
Component/s: Core/Parsing
-
Labels: None
-
Environment: OS/PRO common rendering core
Description
OCR programs do a pretty good job of capturing text, but the layout can be a little different from that of a document that was typeset for print. If the page is slightly skewed when scanned, the text coordinates will reflect the skew.
Our code for detecting spaces and line breaks wasn't designed for text that might slowly move vertically from the start of a line to the end.
This issue will capture the changes needed to improve word and line detection and word ordering.
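To illustrate the kind of drift described above, here is a minimal sketch of skew-tolerant line and space detection. It is not the ICEpdf implementation: the `SkewTolerantLines` class, the `Glyph` type, and the `tolerance` and `spaceFraction` parameters are all hypothetical names chosen for this example. The idea is to compare each glyph's vertical position with the previous glyph rather than with the start of the line, so a slow vertical drift across a skewed scan does not trigger a false line break.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; none of these names come from ICEpdf.
final class SkewTolerantLines {

    /** Minimal glyph box: left edge, vertical centre, width and height in page units. */
    static final class Glyph {
        final String text;
        final double x, centerY, width, height;

        Glyph(String text, double x, double centerY, double width, double height) {
            this.text = text;
            this.x = x;
            this.centerY = centerY;
            this.width = width;
            this.height = height;
        }
    }

    /**
     * Groups glyphs (assumed to be in reading order) into lines.
     * A new line starts only when a glyph's centre is further than
     * tolerance * glyph height from the PREVIOUS glyph's centre, so
     * gradual drift along a skewed line is accepted.
     */
    static List<List<Glyph>> groupIntoLines(List<Glyph> glyphs, double tolerance) {
        List<List<Glyph>> lines = new ArrayList<>();
        List<Glyph> current = new ArrayList<>();
        Glyph previous = null;
        for (Glyph g : glyphs) {
            if (previous != null) {
                double drift = Math.abs(g.centerY - previous.centerY);
                double limit = tolerance * Math.max(g.height, previous.height);
                if (drift > limit) {
                    lines.add(current);
                    current = new ArrayList<>();
                }
            }
            current.add(g);
            previous = g;
        }
        if (!current.isEmpty()) {
            lines.add(current);
        }
        return lines;
    }

    /** Joins one line, inserting a space when the horizontal gap exceeds spaceFraction of the previous glyph's width. */
    static String lineToText(List<Glyph> line, double spaceFraction) {
        StringBuilder sb = new StringBuilder();
        Glyph previous = null;
        for (Glyph g : line) {
            if (previous != null) {
                double gap = g.x - (previous.x + previous.width);
                if (gap > spaceFraction * previous.width) {
                    sb.append(' ');
                }
            }
            sb.append(g.text);
            previous = g;
        }
        return sb.toString();
    }
}
```

With a `tolerance` of 0.5, a line whose glyphs each sit a fraction of a unit lower than the one before stays in one group even when the total drift exceeds half a glyph height, while a jump of more than half a glyph height between neighbours still starts a new line. The threshold values are assumptions and would need tuning against real OCR output such as the attached test documents.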
Activity
Patrick Corless
created issue -
Christoph Keimel
made changes -
Field | Original Value | New Value |
---|---|---|
Attachment | 2 B 3.16_09-07-2013.pdf [ 22285 ] | |
Attachment | 2 B 3.16_09-07-2016.pdf [ 22286 ] | |
Attachment | 2 B 3.16_09-09-2016.pdf [ 22287 ] |
Patrick Corless
made changes -
Fix Version/s | 6.1.3 [ 13086 ] |
Repository | Revision | Date | User | Message |
ICEsoft Public SVN Repository | #49323 | Wed Sep 28 00:37:45 MDT 2016 | patrick.corless | |
Files Changed | ||||
MODIFY /icepdf/branches/icepdf-6.1.0/icepdf/core/src/org/icepdf/core/pobjects/graphics/text/GlyphText.java
MODIFY /icepdf/branches/icepdf-6.1.0/icepdf/core/src/org/icepdf/core/pobjects/graphics/text/WordText.java
Patrick Corless
made changes -
Summary | Improve text selection ordering for OCR's documetns. | Improve text selection ordering for OCR's documents. |
Patrick Corless
made changes -
Status | Open [ 1 ] | Resolved [ 5 ] |
Resolution | Fixed [ 1 ] |
Repository | Revision | Date | User | Message |
ICEsoft Public SVN Repository | #49330 | Thu Sep 29 23:08:36 MDT 2016 | patrick.corless | |
Files Changed | ||||
MODIFY /icepdf/branches/icepdf-6.1.0/icepdf/core/src/org/icepdf/core/pobjects/graphics/text/PageText.java
Repository | Revision | Date | User | Message |
ICEsoft Public SVN Repository | #49503 | Tue Nov 08 11:11:15 MST 2016 | patrick.corless | |
Files Changed | ||||
MODIFY /icepdf/trunk/icepdf/core/src/org/icepdf/core/pobjects/graphics/text/PageText.java
Patrick Corless
made changes -
Status | Resolved [ 5 ] | Closed [ 6 ] |
Attached 3 documents to test the effect