Details
Type: New Feature
Status: Closed
Priority: Major
Resolution: Fixed
Affects Version/s: 3.0
Fix Version/s: 5.0.0 alpha1, 5.0
Component/s: Core/Parsing
Labels: None
Environment: ICEpdf
Description
Note from customer:
Please notice that the test program uses two threads to convert two identical PDF files here. If I use a single thread to convert the two PDF files sequentially, the comparison will succeed (the image files are the same). If I save the converted images and open them in Microsoft Paint, I will see that one of the images is missing a chart.
I've been testing this issue with a similar application that tries to initialize or extract text using multiple threads. After numerous days of testing and debugging, I think I have found a couple of hot spots.
The first area of concern is the thread access mechanism in the implementations of SeekableInput. There are two implementations, RandomAccessFileInputStream and SeekableByteArrayInputStream. The SeekableInput implementations use a wait/notify mechanism for starting and ending thread access. On a micro scale this locking works pretty well, but it becomes problematic when a thread is dealing with more than one input stream. For example:
Font.init (Thread 1)
font.getFontDescriptor (new inputstream)
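To make the concern concrete, here is a minimal sketch of a wait/notify gate of this kind. It is not ICEpdf's actual code: `ThreadAccessGate` and `GateInterleavingDemo` are hypothetical names, and only the `startThreadAccess`/`endThreadAccess` method names are taken from this ticket. The point it illustrates is that each bracketed read is safe on its own, but the gate is free between the two reads that make up one logical operation, so another thread is allowed in.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the thread-access gate; not ICEpdf's actual code.
class ThreadAccessGate {
    private Thread owner; // thread currently allowed to read, or null if free

    // Block until no other thread owns the gate, then take ownership.
    synchronized void startThreadAccess() {
        Thread current = Thread.currentThread();
        try {
            while (owner != null && owner != current) {
                wait();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for access", e);
        }
        owner = current;
    }

    // Release ownership and wake any waiting threads.
    synchronized void endThreadAccess() {
        if (owner == Thread.currentThread()) {
            owner = null;
            notifyAll();
        }
    }
}

class GateInterleavingDemo {
    // Returns the order in which the logical steps touched the shared stream.
    static List<String> run() {
        ThreadAccessGate gate = new ThreadAccessGate();
        List<String> log = new ArrayList<>();

        // Thread 1, first bracketed read (e.g. inside Font.init).
        gate.startThreadAccess();
        log.add("T1: Font.init read");
        gate.endThreadAccess();

        // The gate is free here, so a second thread can enter mid-operation.
        Thread other = new Thread(() -> {
            gate.startThreadAccess();
            log.add("T2: unrelated read");
            gate.endThreadAccess();
        });
        other.start();
        try {
            other.join(); // join makes the ordering deterministic for the demo
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }

        // Thread 1, second bracketed read (e.g. getFontDescriptor).
        gate.startThreadAccess();
        log.add("T1: getFontDescriptor read");
        gate.endThreadAccess();
        return log;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

In the demo the second thread's read lands between the two reads of the first thread, which is the interleaving that per-call locking cannot rule out.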
The problem seems to happen more often than not in the SequenceInputStream when parsing through a page content stream that has more than one content stream. When SequenceInputStream closes one stream and goes on to the next, it makes it possible for another thread to jump in, which usually causes problems.
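The hand-over between streams can be seen in a small sketch. This one uses the JDK's `java.io.SequenceInputStream` as a stand-in, on the assumption that the class referred to above behaves similarly at stream boundaries (it may differ). `LoggingStream` and `SequenceBoundaryDemo` are hypothetical names introduced for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.SequenceInputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical in-memory stream that records when it is closed.
class LoggingStream extends ByteArrayInputStream {
    private final String name;
    private final StringBuilder log;

    LoggingStream(String name, String content, StringBuilder log) {
        super(content.getBytes(StandardCharsets.US_ASCII));
        this.name = name;
        this.log = log;
    }

    @Override
    public void close() {
        log.append("[close ").append(name).append("]");
    }
}

class SequenceBoundaryDemo {
    // Reads two chained streams byte by byte and records reads and closes in order.
    static String run() {
        StringBuilder log = new StringBuilder();
        InputStream first = new LoggingStream("1", "ab", log);
        InputStream second = new LoggingStream("2", "cd", log);
        try (InputStream seq = new SequenceInputStream(first, second)) {
            int b;
            while ((b = seq.read()) != -1) {
                log.append((char) b);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return log.toString();
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints "ab[close 1]cd[close 2]"
    }
}
```

The first stream is closed in the middle of the overall read, before any byte of the second stream is returned. If closing a stream also ends its thread access, that boundary is a window in which another thread can take over.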
Something that is not clear to me is where we use the SeekableInput, because it inherits from InputStream and so it is hard to find all the access points. I'm pretty sure there are still some SeekableInput usages that don't use startThreadAccess and endThreadAccess and thus mess things up.
There is one last problem that seems to show up due to shared PDF resources. For example, it is possible for a Font object to initialize some other object from the document file, which will cause the current thread to lose access to the stream, at which point a new thread gets access and will try to initialize the same font, resulting in a livelock. The initial font object will be stuck waiting for thread access and the second thread will be stuck waiting for access to the font init method.
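The circular wait described above can be reproduced in isolation. The sketch below is not ICEpdf code: it models the font init method and the stream thread access as two `ReentrantLock` objects (an assumption made so that the demo can use timed `tryLock` calls and terminate instead of hanging), and `FontInitCycleDemo` is a hypothetical name.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantLock;

class FontInitCycleDemo {
    // Returns how many of the two threads failed to obtain their second lock.
    static int run() {
        ReentrantLock fontInit = new ReentrantLock();     // stands in for the font init method
        ReentrantLock streamAccess = new ReentrantLock(); // stands in for stream thread access
        CountDownLatch bothHolding = new CountDownLatch(2);
        CountDownLatch bothTried = new CountDownLatch(2);
        AtomicInteger blocked = new AtomicInteger();

        // Thread 1 is inside font init and needs the stream;
        // thread 2 holds the stream and needs font init.
        Thread t1 = new Thread(() -> attempt(fontInit, streamAccess, bothHolding, bothTried, blocked));
        Thread t2 = new Thread(() -> attempt(streamAccess, fontInit, bothHolding, bothTried, blocked));
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return blocked.get();
    }

    private static void attempt(ReentrantLock first, ReentrantLock second,
                                CountDownLatch bothHolding, CountDownLatch bothTried,
                                AtomicInteger blocked) {
        first.lock();
        try {
            // Wait until both threads hold their first lock.
            bothHolding.countDown();
            bothHolding.await();
            // A timed attempt stands in for the wait that would never return.
            if (second.tryLock(100, TimeUnit.MILLISECONDS)) {
                second.unlock();
            } else {
                blocked.incrementAndGet();
            }
            // Keep the first lock until the other thread has also tried.
            bothTried.countDown();
            bothTried.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            first.unlock();
        }
    }

    public static void main(String[] args) {
        System.out.println("threads blocked: " + run()); // prints "threads blocked: 2"
    }
}
```

Both threads time out because each holds the resource the other needs. With untimed waits, as in the scenario above, neither would ever proceed.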