Details

    • Type: New Feature
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 4.0 - Beta
    • Fix Version/s: 4.0
    • Component/s: Core/Parsing
    • Labels:
      None
    • Environment:
      Windows, Mac

      Description

      The PDF specification says that strings inside dictionaries can be encoded either in PDFDocEncoding or in 16-bit big-endian (BE) Unicode. To complicate things, certain dictionary strings may be in UTF-8; that has to be investigated. Right now our parser just makes strings directly from the bytes, which means we only handle ASCII correctly. Accented characters that use the high (8th) bit are not necessarily handled correctly, because Java defaults to the platform encoding: WinAnsi on Windows and MacRoman on the Mac. We still have to check what the default is on Linux. Some documentation shows PDFDocEncoding to be similar to, if not the same as, Latin-1. We have to investigate whether the specification provides a way to override the PDFDocEncoding default with a specific encoding. Then the parser needs to use the correct encoding when creating the Java strings, so that we do not corrupt the input.
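A minimal sketch of the decoding step described above, not the actual ICEpdf parser code. The class and method names are hypothetical. It assumes that a 16-bit BE Unicode string is marked by a leading FE FF byte-order mark, and it uses ISO-8859-1 (Latin-1) as a stand-in for PDFDocEncoding, which is only an approximation since the two are similar rather than confirmed identical. The possible UTF-8 case is left out because it still has to be investigated.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper; not part of the ICEpdf API.
public class PdfTextStringSketch {

    /**
     * Decodes the raw bytes of a dictionary string with an explicit charset,
     * so the result no longer depends on the platform default encoding.
     */
    public static String decode(byte[] raw) {
        // Assumption: a leading FE FF byte-order mark signals 16-bit BE Unicode.
        if (raw.length >= 2
                && (raw[0] & 0xFF) == 0xFE
                && (raw[1] & 0xFF) == 0xFF) {
            // Skip the two marker bytes and decode the rest as UTF-16BE.
            return new String(raw, 2, raw.length - 2, StandardCharsets.UTF_16BE);
        }
        // Assumption: Latin-1 approximates PDFDocEncoding. A real fix would
        // need a dedicated PDFDocEncoding table for the code points that differ.
        return new String(raw, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // 'c', 'a', 'f', then 0xE9 (e with acute accent in Latin-1).
        byte[] singleByte = {0x63, 0x61, 0x66, (byte) 0xE9};
        // FE FF marker, then 'c' and e-acute as 16-bit big-endian code units.
        byte[] bigEndian = {(byte) 0xFE, (byte) 0xFF, 0x00, 0x63, 0x00, (byte) 0xE9};

        System.out.println(decode(singleByte));
        System.out.println(decode(bigEndian));
    }
}
```

Passing an explicit charset to the `String` constructor is what removes the dependence on WinAnsi or MacRoman: the same bytes decode to the same Java string on every platform.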

        Issue Links

          Activity

          Mark Collette created issue -
          Mark Collette made changes -
          Field Original Value New Value
          Link This issue blocks PDF-72 [ PDF-72 ]
          Mark Collette made changes -
          Link This issue blocks PDF-97 [ PDF-97 ]
          Patrick Corless made changes -
          Status Open [ 1 ] Resolved [ 5 ]
          Fix Version/s 4.0 [ 10222 ]
          Resolution Fixed [ 1 ]
          Patrick Corless made changes -
          Status Resolved [ 5 ] Closed [ 6 ]

            People

            • Assignee:
              Patrick Corless
              Reporter:
              Mark Collette
            • Votes:
              0
              Watchers:
              1

              Dates

              • Created:
                Updated:
                Resolved: