
SimCorrMix

Simulation of Correlated Data with Multiple Variable Types Including Continuous and Count Mixture Distributions

v0.1.1 · Jul 1, 2018 · GPL-2

Description

Generate continuous (normal, non-normal, or mixture distributions), binary, ordinal, and count (regular or zero-inflated, Poisson or Negative Binomial) variables with a specified correlation matrix, or one continuous variable with a mixture distribution. This package can be used to simulate data sets that mimic real-world clinical or genetic data sets (i.e., plasmodes, as in Vaughan et al., 2009 <DOI:10.1016/j.csda.2008.02.032>). The methods extend those found in the 'SimMultiCorrData' R package. Standard normal variables with an imposed intermediate correlation matrix are transformed to generate the desired distributions.

Continuous variables are simulated using either Fleishman's (1978) third-order <DOI:10.1007/BF02293811> or Headrick's (2002) fifth-order <DOI:10.1016/S0167-9473(02)00072-5> polynomial transformation method (the power method transformation, PMT). Non-mixture distributions require the user to specify mean, variance, skewness, standardized kurtosis, and standardized fifth and sixth cumulants. Mixture distributions require these inputs for the component distributions plus the mixing probabilities. Simulation occurs at the component level for continuous mixture distributions, and the target correlation matrix is specified in terms of correlations with the components of continuous mixture variables. These components are transformed into the desired mixture variables using random multinomial variables based on the mixing probabilities. However, the package provides functions to approximate expected correlations with continuous mixture variables given target correlations with the components.

Binary and ordinal variables are simulated using a modification of ordsample() in package 'GenOrd'. Count variables are simulated using the inverse CDF method. There are two simulation pathways, which calculate intermediate correlations involving count variables differently. Correlation Method 1 adapts Yahav and Shmueli's 2012 method <DOI:10.1002/asmb.901> and performs best with large count variable means and positive correlations or small means and negative correlations. Correlation Method 2 adapts Barbiero and Ferrari's 2015 modification of the 'GenOrd' package <DOI:10.1002/asmb.2072> and performs best under the opposite scenarios. The optional error loop may be used to improve the accuracy of the final correlation matrix.

The package also contains functions to calculate the standardized cumulants of continuous mixture distributions, check parameter inputs, calculate feasible correlation boundaries, and summarize and plot simulated variables.
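To make the third-order power method transformation concrete, here is a minimal Python sketch of Fleishman's polynomial Y = a + bZ + cZ^2 + dZ^3 with a = -c, together with the variance, skewness, and standardized kurtosis that a set of constants (b, c, d) implies. It is an illustration only, not the package's R code: the function names `fleishman_transform` and `fleishman_moments` are hypothetical, and solving for the constants that hit given targets (which the package does numerically) is not shown.

```python
def fleishman_transform(z, b, c, d):
    """Apply Y = a + b*Z + c*Z**2 + d*Z**3 to a standard normal value z.

    The constant a is set to -c so that E[Y] = 0.
    """
    a = -c
    return a + b * z + c * z**2 + d * z**3


def fleishman_moments(b, c, d):
    """Return (variance, skewness, standardized kurtosis) implied by (b, c, d).

    These are Fleishman's (1978) moment equations. The skewness and kurtosis
    expressions are the standardized values only when the variance equals 1.
    """
    variance = b**2 + 6 * b * d + 2 * c**2 + 15 * d**2
    skewness = 2 * c * (b**2 + 24 * b * d + 105 * d**2 + 2)
    kurtosis = 24 * (
        b * d
        + c**2 * (1 + b**2 + 28 * b * d)
        + d**2 * (12 + 48 * b * d + 141 * c**2 + 225 * d**2)
    )
    return variance, skewness, kurtosis


if __name__ == "__main__":
    # Sanity check: b = 1, c = d = 0 leaves a standard normal unchanged.
    print(fleishman_moments(1, 0, 0))
    print(fleishman_transform(2.0, 1, 0, 0))
```

A root finder would search for (b, c, d) such that these three expressions equal 1, the target skewness, and the target standardized kurtosis.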
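The component-level treatment of continuous mixture variables can be sketched in the same spirit. The Python example below computes the mean and variance of a mixture from its mixing probabilities and component parameters, and draws mixture values by first selecting a component at random according to those probabilities. The names `mixture_mean_var` and `sample_mixture` are hypothetical, and the components are assumed normal for simplicity, whereas the package generates non-normal components with the PMT and also handles higher-order cumulants.

```python
import random


def mixture_mean_var(pis, mus, sigmas):
    """Mean and variance of a mixture with mixing probabilities pis,
    component means mus, and component standard deviations sigmas."""
    mean = sum(p * m for p, m in zip(pis, mus))
    # E[X^2] of the mixture is the weighted sum of component second moments.
    second_moment = sum(p * (s**2 + m**2) for p, m, s in zip(pis, mus, sigmas))
    return mean, second_moment - mean**2


def sample_mixture(n, pis, mus, sigmas, seed=None):
    """Draw n mixture values: pick a component index with probability pis
    (a multinomial draw), then sample from that component.
    Assumption: components are normal, unlike the package's PMT components."""
    rng = random.Random(seed)
    components = rng.choices(range(len(pis)), weights=pis, k=n)
    return [rng.gauss(mus[k], sigmas[k]) for k in components]


if __name__ == "__main__":
    pis, mus, sigmas = [0.4, 0.6], [-2.0, 2.0], [1.0, 1.0]
    print(mixture_mean_var(pis, mus, sigmas))
    draws = sample_mixture(10_000, pis, mus, sigmas, seed=1)
    print(sum(draws) / len(draws))  # should be close to the mixture mean
```

Because correlations are imposed on the components rather than on the mixture itself, the correlation observed with the final mixture variable generally differs from the component-level target, which is why approximation functions for the expected correlation are useful.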
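The inverse CDF method for count variables can be illustrated as follows. A standard normal value is mapped to a uniform value through the normal CDF, and the count is the smallest integer whose cumulative probability reaches that value. This Python sketch covers only the Poisson case with an optional zero-inflation probability; the names `poisson_quantile` and `normal_to_count` and the parameter `p_zero` are hypothetical and are not the package's interface.

```python
import math
from statistics import NormalDist


def poisson_quantile(u, lam, p_zero=0.0):
    """Smallest k whose (zero-inflated) Poisson CDF is at least u.

    p_zero is the probability of a structural zero; p_zero = 0 gives a
    regular Poisson distribution with mean lam.
    """
    if not 0.0 <= u < 1.0:
        raise ValueError("u must lie in [0, 1)")
    k = 0
    pmf = math.exp(-lam)  # Poisson probability of k = 0
    pois_cdf = pmf
    cdf = p_zero + (1.0 - p_zero) * pois_cdf
    # Stop once the CDF reaches u, or once the pmf underflows to zero.
    while cdf < u and pmf > 0.0:
        k += 1
        pmf *= lam / k  # recursive update of the Poisson pmf
        pois_cdf += pmf
        cdf = p_zero + (1.0 - p_zero) * pois_cdf
    return k


def normal_to_count(z, lam, p_zero=0.0):
    """Transform a standard normal value z into a count via the inverse CDF."""
    u = NormalDist().cdf(z)
    return poisson_quantile(u, lam, p_zero)


if __name__ == "__main__":
    for z in (-1.5, 0.0, 1.5):
        print(z, normal_to_count(z, lam=2.0), normal_to_count(z, lam=2.0, p_zero=0.5))
```

Applying this transformation to correlated standard normal variables yields correlated counts, although the resulting correlation differs from the one imposed on the normals; the two correlation methods described above are different ways of choosing the intermediate correlation to compensate.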

Downloads

Last 30 days: 199 (rank: 16764th)

Last 90 days: 199

Last year: 199

CRAN Check Status

7 NOTE
7 OK
Flavor Status
r-devel-linux-x86_64-debian-clang NOTE
r-devel-linux-x86_64-debian-gcc NOTE
r-devel-linux-x86_64-fedora-clang NOTE
r-devel-linux-x86_64-fedora-gcc NOTE
r-devel-macos-arm64 OK
r-devel-windows-x86_64 OK
r-oldrel-macos-arm64 NOTE
r-oldrel-macos-x86_64 NOTE
r-oldrel-windows-x86_64 NOTE
r-patched-linux-x86_64 OK
r-release-linux-x86_64 OK
r-release-macos-arm64 OK
r-release-macos-x86_64 OK
r-release-windows-x86_64 OK
Check details (14 flavors, 7 non-OK)
NOTE r-devel-linux-x86_64-debian-clang

CRAN incoming feasibility

Maintainer: ‘Allison Cynthia Fialkowski <allijazz@uab.edu>’

No Authors@R field in DESCRIPTION.
Please add one, modifying
  Authors@R: person(given = c("Allison", "Cynthia"),
                    family = "Fialkowski",
                    role = c("aut", "cre"),
                    email = "allijazz@uab.edu")
as necessary.

Found the following (possibly) invalid file URIs:
  URI: corr_bounds.html
    From: inst/doc/method_comp.html
  URI: variable_types
    From: inst/doc/workflow.html
  URI: errorloop.html
    From: inst/doc/workflow.html

Found the following \keyword or \concept entries
which likely give several index terms:
  File ‘intercorr_cont.Rd’:
    \keyword{Fleishman,}
    \keyword{continuous,}
    \keyword{correlation,}
NOTE r-devel-linux-x86_64-debian-gcc

CRAN incoming feasibility

Maintainer: ‘Allison Cynthia Fialkowski <allijazz@uab.edu>’

No Authors@R field in DESCRIPTION.
Please add one, modifying
  Authors@R: person(given = c("Allison", "Cynthia"),
                    family = "Fialkowski",
                    role = c("aut", "cre"),
                    email = "allijazz@uab.edu")
as necessary.

Found the following (possibly) invalid file URIs:
  URI: corr_bounds.html
    From: inst/doc/method_comp.html
  URI: variable_types
    From: inst/doc/workflow.html
  URI: errorloop.html
    From: inst/doc/workflow.html

Found the following \keyword or \concept entries
which likely give several index terms:
  File ‘intercorr_cont.Rd’:
    \keyword{Fleishman,}
    \keyword{continuous,}
    \keyword{correlation,}
NOTE r-devel-linux-x86_64-fedora-clang

dependencies in R code

Namespaces in Imports field not imported from:
  ‘MASS’ ‘grid’
  All declared Imports should be used.
NOTE r-devel-linux-x86_64-fedora-gcc

dependencies in R code

Namespaces in Imports field not imported from:
  ‘MASS’ ‘grid’
  All declared Imports should be used.
OK r-devel-macos-arm64
OK r-devel-windows-x86_64
NOTE r-oldrel-macos-arm64

LazyData

  'LazyData' is specified without a 'data' directory
NOTE r-oldrel-macos-x86_64

LazyData

  'LazyData' is specified without a 'data' directory
NOTE r-oldrel-windows-x86_64

LazyData

  'LazyData' is specified without a 'data' directory
OK r-patched-linux-x86_64
OK r-release-linux-x86_64
OK r-release-macos-arm64
OK r-release-macos-x86_64
OK r-release-windows-x86_64

Check History

Mar 9, 2026: NOTE (7 OK · 7 NOTE · 0 WARNING · 0 ERROR · 0 FAILURE)

Dependency Network

Dependencies: SimMultiCorrData, BB, nleqslv, MASS, mvtnorm, Matrix, VGAM, triangle, ggplot2

Version History

new 0.1.1 Mar 9, 2026