
tableParser

Parse Tabled Content to Text Vector and Extract Statistical Standard Results

v1.0.5 · Apr 9, 2026 · GPL-3

Description

'tableParser' extracts tabled content from NISO-JATS-coded XML, any native HTML or HML file, DOCX, and PDF documents, and collapses it into human-readable text by mimicking the actions of a screen reader. Tables in PDF documents are extracted with the 'tabulapdf' package, and their captions and footnotes cannot be extracted, so results for PDF tables should be considered less precise. The function table2matrix() returns the tables within a document as a list of character matrices. table2text() collapses the matrix content into a list of character strings by imitating the behavior of a screen reader. The textual representation of characters and numbers can be unified with unifyMatrix() before parsing. The function table2stats() extracts the tabled statistical test results from the collapsed text with the function standardStats() from the 'JATSdecoder' package and, if activated, checks the reported and coded p-values for consistency. Because table structures vary widely and can be complex, parsing accuracy may vary. A detailed description of how 'tableParser' works is provided here: <doi:10.48550/arXiv.2603.19756>.
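
To illustrate the two ideas in the description (collapsing a character matrix into screen-reader-like text, and checking a reported p-value against the test statistic), here is a minimal base-R sketch. It does not call 'tableParser' and is not the package's implementation. The helper names collapse_table() and p_is_consistent(), the "header: value" output format, the example matrix, and the tolerance of 0.005 are all assumptions made for illustration.

```r
# Hypothetical helper (not part of 'tableParser'): collapse a character
# matrix into one string per data row. The first row is treated as the
# column header and the first column as the row label, roughly as a
# screen reader might announce each cell together with its header.
collapse_table <- function(m) {
  stopifnot(is.matrix(m), is.character(m), nrow(m) >= 2, ncol(m) >= 2)
  header <- m[1, -1]
  vapply(seq_len(nrow(m))[-1], function(i) {
    cells <- paste0(header, ": ", m[i, -1])
    paste0(m[i, 1], ", ", paste(cells, collapse = ", "))
  }, character(1))
}

# Hypothetical helper: recompute a two-sided p-value from a t statistic
# and its degrees of freedom, then compare it with the reported p-value.
# The tolerance is an assumed allowance for rounding in the report.
p_is_consistent <- function(t_value, df, reported_p, tol = 0.005) {
  computed_p <- 2 * stats::pt(-abs(t_value), df)
  abs(computed_p - reported_p) <= tol
}

# Made-up example table, in the shape of a list element that a function
# returning character matrices could produce.
m <- rbind(
  c("Predictor", "t", "df", "p"),
  c("Age", "2.10", "48", "0.04"),
  c("Income", "0.50", "48", "0.30")
)

rows <- collapse_table(m)
print(rows)

# Check each data row: columns 2 to 4 hold t, df, and the reported p.
checks <- mapply(
  p_is_consistent,
  t_value = as.numeric(m[-1, 2]),
  df = as.numeric(m[-1, 3]),
  reported_p = as.numeric(m[-1, 4])
)
print(checks)
```

In this made-up table the first row passes the check, while the second row is flagged because a t value of 0.50 with 48 degrees of freedom does not correspond to a two-sided p-value near 0.30. Real tables are far less regular (merged cells, multi-row headers, statistics embedded in text), which is why the description notes that parsing accuracy may vary.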

Downloads

Last 30 days: 434 (8426th)

Last 90 days: 1.2K

Last year: 1.6K

Trend: -20.8% (30d vs prior 30d)

CRAN Check Status

13 OK
Flavor Status
r-devel-linux-x86_64-debian-clang OK
r-devel-linux-x86_64-debian-gcc OK
r-devel-linux-x86_64-fedora-clang OK
r-devel-linux-x86_64-fedora-gcc OK
r-devel-windows-x86_64 OK
r-oldrel-macos-arm64 OK
r-oldrel-macos-x86_64 OK
r-oldrel-windows-x86_64 OK
r-patched-linux-x86_64 OK
r-release-linux-x86_64 OK
r-release-macos-arm64 OK
r-release-macos-x86_64 OK
r-release-windows-x86_64 OK

Check History

OK 12 OK · 0 NOTE · 0 WARNING · 0 ERROR · 0 FAILURE Apr 27, 2026
ERROR 11 OK · 0 NOTE · 0 WARNING · 1 ERROR · 0 FAILURE Apr 26, 2026
ERROR r-devel-linux-x86_64-debian-gcc

examples

Running examples in ‘tableParser-Ex.R’ failed
The error most likely occurred in:

> base::assign(".ptime", proc.time(), pos = "CheckExEnv")
> ### Name: table2matrix
> ### Title: table2matrix
> ### Aliases: table2matrix
> 
> ### ** Examples
> 
> ## - 
...[truncated]...
n/tableExamples.pdf': HTTP status was '502 Bad Gateway'
Error in download.file(p, paste0(tempdir(), "/", "tableExamples.pdf")) : 
  cannot open URL 'https://github.com/ingmarboeschen/tableParser/raw/refs/heads/main/tableExamples.pdf'
Execution halted
OK 14 OK · 0 NOTE · 0 WARNING · 0 ERROR · 0 FAILURE Apr 22, 2026
ERROR 13 OK · 0 NOTE · 0 WARNING · 1 ERROR · 0 FAILURE Apr 18, 2026
ERROR r-devel-linux-x86_64-debian-clang

examples

Running examples in ‘tableParser-Ex.R’ failed
The error most likely occurred in:

> base::assign(".ptime", proc.time(), pos = "CheckExEnv")
> ### Name: get.HTML.tables
> ### Title: get.HTML.tables
> ### Aliases: get.HTML.tables
> 
> ### ** Examples
>
...[truncated]...
 'https://en.wikipedia.org/wiki/R_(programming_language)': Timeout of 60 seconds was reached
Error in file(con, "r") : 
  cannot open the connection to 'https://en.wikipedia.org/wiki/R_(programming_language)'
Calls: readLines -> file
Execution halted
OK 14 OK · 0 NOTE · 0 WARNING · 0 ERROR · 0 FAILURE Apr 6, 2026
ERROR 13 OK · 0 NOTE · 0 WARNING · 1 ERROR · 0 FAILURE Apr 3, 2026
ERROR r-devel-linux-x86_64-debian-gcc

package dependencies

Packages required but not available: 'JATSdecoder', 'tabulapdf'

See section ‘The DESCRIPTION file’ in the ‘Writing R Extensions’
manual.
OK 14 OK · 0 NOTE · 0 WARNING · 0 ERROR · 0 FAILURE Mar 10, 2026

Dependency Network

Dependencies of tableParser: JATSdecoder, tabulapdf

Version History

updated 1.0.5 ← 1.0.4 Apr 9, 2026
updated 1.0.4 ← 1.0.3 Mar 30, 2026
new 1.0.3 Mar 10, 2026
updated 1.0.3 ← 1.0.2 Feb 19, 2026
updated 1.0.2 ← 1.0.1 Feb 1, 2026
new 1.0.1 Jan 26, 2026