ICEpdf / PDF-1022

Improve text selection ordering for OCR'd documents.

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 6.1.2
    • Fix Version/s: 6.1.3
    • Component/s: Core/Parsing
    • Labels:
      None
    • Environment:
      OS/PRO common rendering core

      Description

      OCR programs do a pretty good job of capturing text, but the layout can be a little different than in a document that was typeset for print. If the page was slightly skewed when scanned, the text coordinates will reflect the skew.

      Our code for detecting spaces and line breaks wasn't designed for text that might slowly move vertically from the start of a line to the end.

      This issue will capture the changes needed to improve word and line detection and word ordering.
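
      To illustrate the skew problem described above, here is a minimal sketch of one way to group words into lines when the baseline drifts. It is not the ICEpdf implementation; the `SkewTolerantLines` and `Word` types, the `tolerance` parameter, and the assumption that y grows upward (as in default PDF user space) are all hypothetical and chosen only for this example. The idea is to compare each word's baseline with the previous word on the candidate line rather than with the first word of that line, so a gradual vertical drift does not trigger a false line break.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Illustrative sketch only: these types are hypothetical and are not ICEpdf classes.
final class SkewTolerantLines {

    /** A word box: its text, left edge x, baseline y, and glyph height (page units). */
    static final class Word {
        final String text;
        final double x;
        final double y;
        final double height;

        Word(String text, double x, double y, double height) {
            this.text = text;
            this.x = x;
            this.y = y;
            this.height = height;
        }
    }

    /**
     * Groups words into lines. A word joins the line whose most recently added
     * word has the closest baseline, provided the gap is at most
     * tolerance * (the larger of the two glyph heights). Otherwise it starts a new line.
     */
    static List<List<Word>> groupIntoLines(List<Word> words, double tolerance) {
        // Process words left to right so every line grows from its start to its end.
        List<Word> sorted = new ArrayList<>(words);
        sorted.sort(Comparator.comparingDouble((Word w) -> w.x));

        List<List<Word>> lines = new ArrayList<>();
        for (Word w : sorted) {
            List<Word> best = null;
            double bestDist = Double.MAX_VALUE;
            for (List<Word> line : lines) {
                // Compare against the previous word on the line, not the line start,
                // so slow drift accumulates without exceeding the threshold.
                Word last = line.get(line.size() - 1);
                double dist = Math.abs(w.y - last.y);
                double limit = tolerance * Math.max(w.height, last.height);
                if (dist <= limit && dist < bestDist) {
                    best = line;
                    bestDist = dist;
                }
            }
            if (best == null) {
                best = new ArrayList<>();
                lines.add(best);
            }
            best.add(w);
        }

        // Assumption: y increases upward, so the line with the largest y comes first.
        lines.sort(Comparator.comparingDouble((List<Word> l) -> l.get(0).y).reversed());
        return lines;
    }

    /** Joins the words of each line with single spaces, in reading order. */
    static List<String> toText(List<List<Word>> lines) {
        List<String> out = new ArrayList<>();
        for (List<Word> line : lines) {
            StringBuilder sb = new StringBuilder();
            for (Word w : line) {
                if (sb.length() > 0) {
                    sb.append(' ');
                }
                sb.append(w.text);
            }
            out.add(sb.toString());
        }
        return out;
    }
}
```

      With a tolerance of 0.5 and glyphs 10 units high, a line whose baseline falls from 700 to 694 over four words stays together, even though its last word is more than 5 units below its first. A fixed comparison against the line start would have split it.
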
      1. 2 B 3.16_09-07-2013.pdf
        173 kB
        Christoph Keimel
      2. 2 B 3.16_09-07-2016.pdf
        4.04 MB
        Christoph Keimel
      3. 2 B 3.16_09-09-2016.pdf
        74 kB
        Christoph Keimel

          People

          • Assignee:
            Patrick Corless
            Reporter:
            Patrick Corless
          • Votes:
            0
            Watchers:
            2
