[PDF-1073] Consolidate Page text extraction sorting calls - ICEsoft JIRA Issue Tracker

Details

Type: Bug
Status: Closed
Priority: Major
Resolution: Fixed
Affects Version/s: 6.1.3
Fix Version/s: 6.2
Component/s: API, Core/Parsing
Labels:
None
Environment:
any

ICEsoft Forum Reference:
http://www.icesoft.org/JForum/posts/list/0/23237.page

Description

A community member who is migrating from 4.x to 6.x has run into a few regressions in the expected results of the page text extraction calls. After a little digging, it appears that the document.getPageText() method call does not execute the same extraction algorithms as the page.getPageText() call.

This bug is a placeholder to review the text extraction API and make sure the non-visual page extraction calls have the same sorting calls as the visual page extraction calls.

Activity


Patrick Corless added a comment - 12/Jan/17 9:10 AM

I've reviewed our code and things seem to be in order. The sorting and formatting take place in the PageText method ArrayList<LineText> getPageLines(). The Document and Page calls getPageText() and getPageViewText() work as the javadoc suggests; that is, they change the parser config, and getPageText() can be a lot faster for straight-up extraction with no page image capture.

I've also touched up the viewer RI text extraction calls and the extraction examples to use the fontProperties manager to speed up the start time of the examples.
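
To illustrate why routing both extraction paths through one sorting call keeps their output consistent, here is a minimal, self-contained sketch. It does not use ICEpdf: the Word class, the method names extractFast() and extractWithView(), and the 2.0 line tolerance are all hypothetical stand-ins, and the grouping logic is an assumption for illustration, not the algorithm inside getPageLines().

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical illustration only; none of these types are ICEpdf classes. */
final class ExtractionSortSketch {

    /** A word and the position of its lower-left corner (y grows upward, as in PDF page space). */
    static final class Word {
        final String text;
        final double x;
        final double y;

        Word(String text, double x, double y) {
            this.text = text;
            this.x = x;
            this.y = y;
        }
    }

    /** The single shared sorting step: lines top-to-bottom, words left-to-right. */
    static List<String> toSortedLines(List<Word> words, double lineTolerance) {
        List<Word> sorted = new ArrayList<>(words);
        // Highest y first, so the top of the page comes out first.
        sorted.sort(Comparator.comparingDouble((Word w) -> -w.y));

        // Group words whose baseline is within lineTolerance of the line's first word.
        List<List<Word>> lines = new ArrayList<>();
        for (Word w : sorted) {
            List<Word> last = lines.isEmpty() ? null : lines.get(lines.size() - 1);
            if (last != null && Math.abs(last.get(0).y - w.y) <= lineTolerance) {
                last.add(w);
            } else {
                List<Word> line = new ArrayList<>();
                line.add(w);
                lines.add(line);
            }
        }

        // Order each line by x and join its words with single spaces.
        List<String> result = new ArrayList<>();
        for (List<Word> line : lines) {
            line.sort(Comparator.comparingDouble((Word w) -> w.x));
            StringBuilder sb = new StringBuilder();
            for (Word w : line) {
                if (sb.length() > 0) {
                    sb.append(' ');
                }
                sb.append(w.text);
            }
            result.add(sb.toString());
        }
        return result;
    }

    /** Hypothetical "non-visual" entry point: delegates to the shared sort. */
    static List<String> extractFast(List<Word> words) {
        return toSortedLines(words, 2.0);
    }

    /** Hypothetical "visual" entry point: delegates to the same shared sort. */
    static List<String> extractWithView(List<Word> words) {
        return toSortedLines(words, 2.0);
    }
}
```

Because both entry points delegate to toSortedLines(), any difference between them would have to come from the words the parser hands over, not from the ordering step.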


Patrick Corless added a comment - 12/Jan/17 9:33 AM

Marking as fixed.


People

Assignee:

Patrick Corless

Reporter:

Patrick Corless

Votes:

0

Watchers:

1

Dates

Created:

01/Dec/16 8:50 AM

Updated:

25/Jan/18 12:58 PM

Resolved:

12/Jan/17 9:33 AM