ICEpdf / PDF-624

Text extraction is not correctly mapping GIDs to valid Unicode values.

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 5.0.2
    • Fix Version/s: 5.0.3
    • Component/s: Core/Parsing
    • Labels:
      None
    • Environment:
      ny

      Description

      The PDF in question (support drive) produces garbage when text is extracted from the second page. I've verified that valid Unicode information is available in the file, but for some reason the PRO version is having difficulty getting at it.

      Further investigation is needed.

        Activity

        Patrick Corless created issue -
        Patrick Corless added a comment -

        I've fixed a bug in the Encoding class for NFont; the fix ensures that the encoding differences array is properly parsed and stored.
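
        For readers unfamiliar with the differences array mentioned above, the following is a minimal sketch of how a PDF /Differences array is conventionally applied to a base encoding: each integer sets the next character code, and each glyph name that follows is assigned to that code, which then increments by one. This is not the ICEpdf or NFont code; the class name DifferencesSketch, the method applyDifferences, and the use of plain Integer and String entries are assumptions made for illustration only.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not the ICEpdf Encoding class.
final class DifferencesSketch {

    // Returns a copy of the base encoding with the differences applied.
    // Integers set the next character code; each following glyph name is
    // stored at that code, and the code is then incremented by one.
    static Map<Integer, String> applyDifferences(Map<Integer, String> base,
                                                 List<?> differences) {
        Map<Integer, String> encoding = new HashMap<>(base);
        int code = -1; // no starting code seen yet
        for (Object entry : differences) {
            if (entry instanceof Integer) {
                code = (Integer) entry;
            } else if (entry instanceof String) {
                if (code < 0) {
                    throw new IllegalArgumentException(
                            "glyph name before any character code: " + entry);
                }
                encoding.put(code, (String) entry);
                code++;
            } else {
                throw new IllegalArgumentException(
                        "unexpected entry in differences array: " + entry);
            }
        }
        return encoding;
    }

    public static void main(String[] args) {
        Map<Integer, String> base = new HashMap<>();
        base.put(65, "A");
        base.put(97, "a");

        // Equivalent to /Differences [39 /quotesingle 96 /grave /agrave]
        List<Object> differences =
                Arrays.asList(39, "quotesingle", 96, "grave", "agrave");

        Map<Integer, String> encoding = applyDifferences(base, differences);
        System.out.println(encoding.get(39));
        System.out.println(encoding.get(96));
        System.out.println(encoding.get(97));
        System.out.println(encoding.get(65));
    }
}
```

        In this sketch, code 97 is overridden by the second name after 96, while code 65 keeps its base value. A text extractor would still need a further step, not shown here, to map each glyph name to a Unicode value.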

        Patrick Corless added a comment -

        Updated the 5.0.1 branch and trunk.

        Patrick Corless made changes -
        Field         Original Value    New Value
        Status        Open [ 1 ]        Resolved [ 5 ]
        Resolution                      Fixed [ 1 ]
        Patrick Corless made changes -
        Fix Version/s 5.0.3 [ 11070 ]
        Fix Version/s 5.0.4 [ 11072 ]
        Patrick Corless made changes -
        Status Resolved [ 5 ] Closed [ 6 ]

          People

          • Assignee:
            Patrick Corless
            Reporter:
            Patrick Corless
          • Votes:
            0
            Watchers:
            1

            Dates

            • Created:
              Updated:
              Resolved: