Details
- Type: Bug
- Status: Closed
- Priority: Major
- Resolution: Fixed
- Affects Version/s: 5.0.7
- Fix Version/s: 5.1
- Component/s: Core/Parsing
- Labels: None
- Environment: Pro content parser
- Support Case References: Support Case #13007 - https://icesoft.my.salesforce.com/5007000000l0lA9
Description
A client has given us a few sample PDFs that were generated from a PowerPoint document. The background tiling used in the file has been translated to PostScript using thousands of inline images, rather than using XObjects and shared resources or a proper tiling pattern.
This bug will be used to investigate whether we can reduce the amount of memory used and the time taken to parse the file.
The sample file, as well as a few others we have in QA, suffers from a large number of embedded images. The test file, for example, has 32,813 inline image definitions, which is why it is slow to parse. When we introduce a cache keyed on inline image streams that are less than 256 bytes, we end up with only 3,820 unique images. That is still a lot of images, but as a result we use about 75 MB less memory to render the page.
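The caching idea described above can be sketched roughly as follows. This is a minimal illustration only, not the actual parser code: the class name `InlineImageCache`, the `getOrDecode` method, and the `decoder` callback are all hypothetical, and the sketch assumes the bytes passed in cover the whole inline image definition (dictionary plus data), so that identical bytes always decode to the same image. Only the 256-byte threshold comes from the description.

```java
import java.nio.ByteBuffer;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/**
 * Hypothetical sketch: share decoded inline images whose raw stream
 * bytes are identical, so repeated background tiles are decoded once.
 */
final class InlineImageCache<T> {

    /** Streams of this size or larger bypass the cache (threshold from the ticket). */
    static final int MAX_CACHED_STREAM_BYTES = 256;

    // ByteBuffer compares and hashes by content, so it works as a byte[] key.
    private final Map<ByteBuffer, T> cache = new HashMap<>();

    /**
     * Returns the cached image for these bytes, decoding only on a miss.
     * The decoder is an assumed callback that turns raw bytes into an image.
     */
    T getOrDecode(byte[] streamBytes, Function<byte[], T> decoder) {
        if (streamBytes.length >= MAX_CACHED_STREAM_BYTES) {
            // Larger images are unlikely to repeat; decode them directly.
            return decoder.apply(streamBytes);
        }
        // Copy the bytes so later changes to the caller's array cannot corrupt the key.
        ByteBuffer key = ByteBuffer.wrap(streamBytes.clone());
        return cache.computeIfAbsent(key, k -> decoder.apply(streamBytes));
    }

    /** Number of distinct small images currently held. */
    int uniqueCount() {
        return cache.size();
    }
}
```

With a cache of this shape, the parser would call `getOrDecode` once per inline image definition, and `uniqueCount()` would report the number of distinct small images seen (3,820 of 32,813 in the test file, per the figures above).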