dataprep

Efficient and Flexible Data Preprocessing Tools

v0.1.5 · Jan 15, 2022 · GPL (>= 2)

Description

Efficiently and flexibly preprocess data using a set of tools for data filtering, deletion, and interpolation. The methods build on the principles of completeness, accuracy, thresholding, and linear interpolation, and rely on constraint conditions, time completion and recovery, and fast calculation and grouping. Key preprocessing steps include deletion of variables and observations, outlier removal, and interpolation of missing values (NA), each depending on how incomplete and dispersed the raw data are. Compared with ordinary methods, they clean data more accurately, keep more samples, and add no outliers after interpolation. Consecutive NA values are identified automatically through run-length based grouping, which is used in observation deletion, outlier removal, and NA interpolation so that interpolation does not generate new outliers. A conditional extremum is proposed to realize point-by-point weighted outlier removal, which saves non-outliers from being removed. In addition, time series interpolation that uses reference values within short periods further ensures reliable interpolation. These methods are based on and improved from the reference: Liang, C.-S., Wu, H., Li, H.-Y., Zhang, Q., Li, Z. & He, K.-B. (2020) <doi:10.1016/j.scitotenv.2020.140923>.
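
The run-length idea in the description can be illustrated with base R alone. The sketch below is not the package's implementation and does not use any dataprep function: `fill_short_gaps` and its `max_gap` parameter are hypothetical names chosen for illustration. It groups consecutive NA values with `rle()`, then fills only the short runs by linear interpolation with `approx()`, so filled values always lie between their known neighbours and longer gaps stay missing.

```r
# Hypothetical helper (NOT part of dataprep): fill only short NA runs by
# linear interpolation and leave longer gaps untouched.
fill_short_gaps <- function(x, max_gap = 3) {
  # Run-length encoding of the NA mask groups consecutive NA values
  runs <- rle(is.na(x))
  # For every element, the length of the run it belongs to
  run_len <- rep(runs$lengths, runs$lengths)

  idx <- seq_along(x)
  known <- !is.na(x)
  # approx() needs at least two known points to interpolate between
  if (sum(known) < 2) return(x)

  # Linear interpolation between known neighbours; rule = 1 keeps
  # leading and trailing NA as NA instead of extrapolating
  interp <- approx(idx[known], x[known], xout = idx, rule = 1)$y

  # Fill only NA positions that sit in a run no longer than max_gap
  fillable <- is.na(x) & run_len <= max_gap
  x[fillable] <- interp[fillable]
  x
}

x <- c(1, NA, 3, NA, NA, NA, NA, 8, NA)
filled <- fill_short_gaps(x, max_gap = 2)
print(filled)
```

With `max_gap = 2`, the single NA between 1 and 3 is filled with 2, the run of four NA values is left as is because it exceeds the limit, and the trailing NA stays missing because there is no later value to refer to.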

Downloads

330

Last 30 days

9631st

635

Last 90 days

635

Last year

Trend: +8.2% (30d vs prior 30d)

CRAN Check Status

2 NOTE
12 OK
Flavor Status
r-devel-linux-x86_64-debian-clang NOTE
r-devel-linux-x86_64-debian-gcc NOTE
r-devel-linux-x86_64-fedora-clang OK
r-devel-linux-x86_64-fedora-gcc OK
r-devel-macos-arm64 OK
r-devel-windows-x86_64 OK
r-oldrel-macos-arm64 OK
r-oldrel-macos-x86_64 OK
r-oldrel-windows-x86_64 OK
r-patched-linux-x86_64 OK
r-release-linux-x86_64 OK
r-release-macos-arm64 OK
r-release-macos-x86_64 OK
r-release-windows-x86_64 OK
Check details (2 non-OK)
NOTE r-devel-linux-x86_64-debian-clang

CRAN incoming feasibility

Maintainer: ‘Chun-Sheng Liang <liangchunsheng@lzu.edu.cn>’

No Authors@R field in DESCRIPTION.
Please add one, modifying
  Authors@R: c(person(given = "Chun-Sheng",
                      family = "Liang",
                      role = c("aut", "cre"),
                      email = "liangchunsheng@lzu.edu.cn"),
               person(given = "Hao",
                      family = "Wu",
                      role = "aut"),
               person(given = "Hai-Yan",
                      family = "Li",
                      role = "aut"),
               person(given = "Qiang",
                      family = "Zhang",
                      role = "aut"),
               person(given = "Zhanqing",
                      family = "Li",
                      role = "aut"),
               person(given = "Ke-Bin",
                      family = "He",
                      role = "aut"),
               person(given = "Lanzhou",
                      family = "University",
                      role = "aut"),
               person(given = "Tsinghua",
                      family = "University",
                      role = "aut"))
as necessary.
NOTE r-devel-linux-x86_64-debian-gcc

CRAN incoming feasibility

Same note as for r-devel-linux-x86_64-debian-clang above: no Authors@R field in DESCRIPTION.

Check History

NOTE 12 OK · 2 NOTE · 0 WARNING · 0 ERROR · 0 FAILURE Mar 9, 2026

Dependency Network

Packages in the network: ggplot2, scales, foreach, doParallel, dplyr, reshape2, data.table, zoo, dataprep (legend: dependencies, reverse dependencies)

Version History

new 0.1.5 Mar 9, 2026