Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.0
    • Fix Version/s: 3.1
    • Component/s: Core/Parsing
    • Labels:
      None
    • Environment:
      ICEpdf OS, PRO version if OK.

      Description

      The PDF in question has a CID font and a corresponding ToUnicode CMap. A small amount of text is not being mapped correctly from CID to Unicode. Further investigation is needed to find the cause of the mapping issue. My guess is that the CMap parser is incomplete.

        Activity

        Patrick Corless added a comment -

        CID test file

        Patrick Corless added a comment -

        This turned out to be a very interesting bug. The file in question exposed a couple of issues with our cmap parsers in both the PRO and OS versions of ICEpdf.

        Neither the PRO nor the OS version of the cmap parser correctly handled beginbfrange entries whose values are in the format <src1> <srcn> [<dest1> <dest2> ...]

        Also, the OS version did not correctly handle multiple entries for beginbfchar and beginbfrange definitions.

        As a result, both the OS and PRO versions now do a much better job at text extraction and font substitution. I've increased the severity of the cmap parsing errors so that they will be more visible when they occur; hopefully this will help identify any future issues.
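
        For readers unfamiliar with the two cmap constructs mentioned above, here is a minimal sketch of how a ToUnicode parser might handle them. It is not ICEpdf's code (ICEpdf is written in Java); the names parse_tounicode, _hex_to_text and SAMPLE are hypothetical, and the sample CMap body is made up for illustration. It assumes that "multiple entries" means several beginbfchar/beginbfrange sections (or several lines within one), that destinations are UTF-16BE hex strings, and that the input is well formed.

```python
import re


def _hex_to_text(hex_digits):
    """Decode a ToUnicode destination (UTF-16BE hex digits) into a str."""
    return bytes.fromhex(hex_digits).decode("utf-16-be")


def parse_tounicode(cmap_text):
    """Return {character code: unicode string} for a ToUnicode CMap body."""
    mapping = {}

    # A CMap may contain several beginbfchar sections; visit every one of them.
    for section in re.findall(r"beginbfchar(.*?)endbfchar", cmap_text, re.S):
        strings = re.findall(r"<([0-9A-Fa-f]+)>", section)
        for src, dst in zip(strings[0::2], strings[1::2]):
            mapping[int(src, 16)] = _hex_to_text(dst)

    # Likewise for beginbfrange sections, each of which may hold many entries.
    for section in re.findall(r"beginbfrange(.*?)endbfrange", cmap_text, re.S):
        tokens = re.findall(r"<[0-9A-Fa-f]+>|\[|\]", section)
        i = 0
        while i + 2 < len(tokens):
            lo = int(tokens[i][1:-1], 16)
            hi = int(tokens[i + 1][1:-1], 16)
            i += 2
            if tokens[i] == "[":
                # Array form: <src1> <srcn> [<dest1> <dest2> ...]
                # Each destination maps to the next source code in turn.
                i += 1
                code = lo
                while i < len(tokens) and tokens[i] != "]":
                    mapping[code] = _hex_to_text(tokens[i][1:-1])
                    code += 1
                    i += 1
                i += 1  # skip the closing "]"
            else:
                # Simple form: <src1> <srcn> <dest>
                # The destination value is incremented once per source code.
                dst = tokens[i][1:-1]
                width = len(dst) // 2
                start = int(dst, 16)
                for offset in range(hi - lo + 1):
                    raw = (start + offset).to_bytes(width, "big")
                    mapping[lo + offset] = raw.decode("utf-16-be")
                i += 1

    return mapping


# Made-up CMap body: two bfchar sections plus both bfrange forms.
SAMPLE = """
2 beginbfchar
<0003> <0020>
<0004> <0041>
endbfchar
1 beginbfchar
<0005> <0042>
endbfchar
2 beginbfrange
<0010> <0012> <0061>
<0020> <0021> [<0078> <00660069>]
endbfrange
"""

table = parse_tounicode(SAMPLE)
```

        In the sample, code 0x0021 comes from the array form and maps to the two-character string "fi", which is the kind of entry a parser that only understands the simple <src1> <srcn> <dest> form would miss.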

        Patrick Corless added a comment -

        ICEpdf 3.1.0 has been released, closing issues.


          People

          • Assignee:
            Patrick Corless
            Reporter:
            Patrick Corless
          • Votes:
            1
            Watchers:
            1

            Dates

            • Created:
              Updated:
              Resolved: