Further work has been done to compensate for these types of PDFs. The page contains a main set of text, as one would expect, but it also includes optional content that is drawn later, which causes problems with our text sorting. We generally just append the optional content to the end of the page text and then sort as usual. This works fine for most documents because the optional content represents logical blocks of text. In the documents in question, however, the optional content contains only one or two words that are not part of a logical block of text. As a result, I've added code that tries to insert this text into the correct line. This significantly smooths out text selection.
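To illustrate the idea of placing stray optional-content words into an existing line, here is a minimal sketch. It is not the code from the patch: the class StrayWordInsertion, the Word type, and the matching rule (vertical centre of the word falls inside the line's vertical extent, then order by x) are all assumptions made for illustration. A word that matches no line is appended as its own line at the end, which mirrors the older append-at-end behaviour.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch, not ICEpdf's actual implementation.
final class StrayWordInsertion {

    /** A word with the left edge (x) and vertical extent of its bounding box. */
    static final class Word {
        final String text;
        final double x, yMin, yMax;

        Word(String text, double x, double yMin, double yMax) {
            this.text = text;
            this.x = x;
            this.yMin = yMin;
            this.yMax = yMax;
        }

        double centerY() {
            return (yMin + yMax) / 2.0;
        }
    }

    /** Builds a mutable line from the given words. */
    static List<Word> line(Word... words) {
        return new ArrayList<>(Arrays.asList(words));
    }

    /**
     * Inserts the stray word into the first line whose vertical extent
     * contains the word's vertical centre, keeping that line ordered by x.
     * If no line matches, the word becomes a new line at the end.
     */
    static void insertStrayWord(List<List<Word>> lines, Word stray) {
        double center = stray.centerY();
        for (List<Word> line : lines) {
            if (line.isEmpty()) {
                continue;
            }
            // Compute the vertical extent of the whole line.
            double lineMin = Double.MAX_VALUE;
            double lineMax = -Double.MAX_VALUE;
            for (Word w : line) {
                lineMin = Math.min(lineMin, w.yMin);
                lineMax = Math.max(lineMax, w.yMax);
            }
            if (center >= lineMin && center <= lineMax) {
                // Find the first word that starts to the right of the stray word.
                int index = 0;
                while (index < line.size() && line.get(index).x <= stray.x) {
                    index++;
                }
                line.add(index, stray);
                return;
            }
        }
        // No matching line: fall back to appending at the end of the page text.
        lines.add(line(stray));
    }

    /** Joins the words of a line with single spaces. */
    static String lineText(List<Word> line) {
        StringBuilder sb = new StringBuilder();
        for (Word w : line) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(w.text);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        List<List<Word>> lines = new ArrayList<>();
        lines.add(line(new Word("The", 0, 100, 110),
                       new Word("quick", 30, 100, 110),
                       new Word("fox", 90, 100, 110)));
        lines.add(line(new Word("jumps", 0, 80, 90),
                       new Word("over", 40, 80, 90)));

        // "brown" arrives late as optional content but belongs in the first line.
        insertStrayWord(lines, new Word("brown", 60, 100, 110));
        System.out.println(lineText(lines.get(0))); // prints "The quick brown fox"
    }
}
```

A real implementation would likely need some tolerance on the vertical match and would have to deal with rotated or overlapping text, which this sketch ignores.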
Overall, the text selection experience has improved, but further work will be done in the future to include the notion of a paragraph. For the time being, the following system properties should be used with the patch release:
-Dorg.icepdf.core.views.page.text.preserveColumns=false
-Dorg.icepdf.core.views.page.text.spaceFraction=1
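If passing -D flags on the command line is not convenient, the same two properties can be set programmatically. The sketch below assumes the library reads these properties when its text classes are first loaded, so apply() would need to run before any ICEpdf class is touched; the class name TextSelectionProperties is hypothetical.

```java
// Hypothetical helper; only the two property names and values come from the notes above.
final class TextSelectionProperties {

    static final String PRESERVE_COLUMNS =
            "org.icepdf.core.views.page.text.preserveColumns";
    static final String SPACE_FRACTION =
            "org.icepdf.core.views.page.text.spaceFraction";

    /** Sets both properties to the values recommended for the patch release. */
    static void apply() {
        System.setProperty(PRESERVE_COLUMNS, "false");
        System.setProperty(SPACE_FRACTION, "1");
    }

    public static void main(String[] args) {
        // Assumption: call this before any ICEpdf class is loaded.
        apply();
        System.out.println(PRESERVE_COLUMNS + "=" + System.getProperty(PRESERVE_COLUMNS));
        System.out.println(SPACE_FRACTION + "=" + System.getProperty(SPACE_FRACTION));
    }
}
```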
I've attached 3 documents for testing the effect.