Title: | Family of Lasso Regression |
---|---|
Description: | Provide the implementation of a family of Lasso variants including Dantzig Selector, LAD Lasso, SQRT Lasso and Lq Lasso for estimating high dimensional sparse linear models. We adopt the alternating direction method of multipliers and convert the original optimization problem into a sequence of L1 penalized least squares minimization problems, which can be efficiently solved by the linearization algorithm. A multi-stage screening approach is adopted for further acceleration. Besides the sparse linear model estimation, we also provide the extension of these Lasso variants to sparse Gaussian graphical model estimation including TIGER and CLIME using either L1 or adaptive penalty. Missing values can be tolerated for Dantzig selector and CLIME. The computation is memory-optimized using the sparse matrix output. For more information, please refer to <https://www.jmlr.org/papers/volume16/li15a/li15a.pdf>. |
Authors: | Xingguo Li [aut, cre], Tuo Zhao [aut], Lie Wang [aut], Xiaoming Yuan [aut], Han Liu [aut] |
Maintainer: | Xingguo Li <[email protected]> |
License: | GPL-2 |
Version: | 1.7.0.2 |
Built: | 2025-02-03 03:39:01 UTC |
Source: | https://github.com/cran/flare |
The package "flare" provides the implementation of a family of novel regression methods (Lasso, Dantzig Selector, LAD Lasso, SQRT Lasso, Lq Lasso) and their extensions to sparse precision matrix estimation (TIGER and CLIME using L1) in high dimensions. We adopt the alternating direction method of multipliers and convert the original optimization problem into a sequence of L1-penalized least square minimization problems with the linearization method and multi-stage screening of variables. Missing values can be tolerated for Dantzig selector in the design matrix and response vector, and CLIME in the data matrix. The computation is memory-optimized using the sparse matrix output. In addition, we also provide several convenient regularization parameter selection and visulaization tools.
Package: | flare |
Type: | Package |
Version: | 1.7.0 |
Date: | 2020-11-28 |
License: | GPL-2 |
Xingguo Li, Tuo Zhao, Lie Wang, Xiaoming Yuan and Han Liu
Maintainer: Xingguo Li <[email protected]>
1. E. Candes and T. Tao. The Dantzig selector: Statistical estimation when p is much larger than n. Annals of Statistics, 2007.
2. A. Belloni, V. Chernozhukov and L. Wang. Pivotal recovery of sparse signals via conic programming. Biometrika, 2012.
3. L. Wang. L1 penalized LAD estimator for high dimensional linear regression. Journal of Multivariate Analysis, 2012.
4. J. Liu and J. Ye. Efficient L1/Lq Norm Regularization. Technical Report, 2010.
5. T. Cai, W. Liu and X. Luo. A constrained minimization approach to sparse precision matrix estimation. Journal of the American Statistical Association, 2011.
6. S. Boyd, N. Parikh, E. Chu, B. Peleato, and J. Eckstein, Distributed Optimization and Statistical Learning via the Alternating Direction Method of Multipliers. Foundations and Trends in Machine Learning, 2011.
7. H. Liu and L. Wang. TIGER: A tuning-insensitive approach for optimally estimating large undirected graphs. Technical Report, 2012.
8. B. He and X. Yuan. On non-ergodic convergence rate of Douglas-Rachford alternating direction method of multipliers. Technical Report, 2012.
"slim"
Extract estimated regression coefficient vectors from the solution path.
## S3 method for class 'slim'
coef(object, lambda.idx = c(1:3), beta.idx = c(1:3), ...)
object | An object with S3 class "slim". |
lambda.idx | The indices of the regularization parameters in the solution path to be displayed. The default values are c(1:3). |
beta.idx | The indices of the estimated regression coefficient vectors in the solution path to be displayed. The default values are c(1:3). |
... | Arguments to be passed to methods. |
Xingguo Li, Tuo Zhao, Lie Wang, Xiaoming Yuan and Han Liu
Maintainer: Xingguo Li <[email protected]>
slim and flare-package.
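A short usage sketch (not part of the original manual; the simulated design mirrors the slim example given later in this manual):
library(flare)
## simulate a sparse linear model and fit the default SQRT Lasso (method = "lq", q = 2)
set.seed(123)
n = 50
d = 100
X = matrix(rnorm(n*d), n, d)
beta = c(3, 2, 0, 1.5, rep(0, d-4))
Y = X %*% beta + rnorm(n)
out = slim(X = X, Y = Y, nlambda = 5, lambda.min.ratio = 0.3)
## extract the first three estimated coefficient vectors at the first three lambdas
coef(out, lambda.idx = c(1:3), beta.idx = c(1:3))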
Gene expression data (200 gene probes for 120 samples) from the microarray experiments of mammalian eye tissue samples of Scheetz et al. (2006).
data(eyedata)
The format is a list containing a matrix and a vector: 1. x - a 120 by 200 matrix, representing the expression data of 200 gene probes for 120 rats. 2. y - a 120-dimensional vector, representing the expression level of the TRIM32 gene.
This data set contains 120 samples with 200 predictors.
Xingguo Li, Tuo Zhao, Lie Wang, Xiaoming Yuan and Han Liu
Maintainer: Xingguo Li <[email protected]>
1. T. Scheetz, k. Kim, R. Swiderski, A. Philp, T. Braun, K. Knudtson, A. Dorrance, G. DiBona, J. Huang, T. Casavant, V. Sheffield, E. Stone .Regulation of gene expression in the mammalian eye and its relevance to eye disease. Proceedings of the National Academy of Sciences of the United States of America, 2006.
data(eyedata)
image(x)
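As a further sketch (not taken from the original manual), the loaded objects x and y can be passed directly to slim; the tuning choices below are illustrative only:
library(flare)
data(eyedata)
## sparse regression of TRIM32 expression on the 200 gene probes (SQRT Lasso)
out = slim(X = x, Y = y, method = "lq", q = 2, nlambda = 5, lambda.min.ratio = 0.3)
print(out)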
Internal flare functions
sugm.likelihood(Sigma, Omega)
sugm.tracel2(Sigma, Omega)
sugm.cv(obj, loss=c("likelihood", "tracel2"), fold=5)
part.cv(n, fold)
sugm.clime.ladm.scr(Sigma, lambda, nlambda, n, d, maxdf, rho, shrink, prec, max.ite, verbose)
sugm.tiger.ladm.scr(data, n, d, maxdf, rho, lambda, shrink, prec, max.ite, verbose)
slim.lad.ladm.scr.btr(Y, X, lambda, nlambda, n, d, maxdf, rho, max.ite, prec, intercept, verbose)
slim.sqrt.ladm.scr(Y, X, lambda, nlambda, n, d, maxdf, rho, max.ite, prec, intercept, verbose)
slim.dantzig.ladm.scr(Y, X, lambda, nlambda, n, d, maxdf, rho, max.ite, prec, intercept, verbose)
slim.lq.ladm.scr.btr(Y, X, q, lambda, nlambda, n, d, maxdf, rho, max.ite, prec, intercept, verbose)
slim.lasso.ladm.scr(Y, X, lambda, nlambda, n, d, maxdf, max.ite, prec, intercept, verbose)
Sigma | Covariance matrix. |
Omega | Inverse covariance matrix. |
obj | An object with S3 class "sugm" returned from sugm. |
loss | Type of loss function for cross validation. |
fold | The number of folds for cross validation. |
n | The number of observations (sample size). |
d | Dimension of the data. |
maxdf | Maximal degrees of freedom. |
lambda | Grid of non-negative values for the regularization parameter lambda. |
nlambda | The number of regularization parameters lambda. |
shrink | Shrinkage of the regularization parameter based on the precision of estimation. |
rho | Value of the augmented Lagrangian multiplier. |
prec | Stopping criterion. |
max.ite | Maximal number of iterations. |
data | The n by d data matrix. |
Y | Dependent variables in linear regression. |
X | Design matrix in linear regression. |
q | The vector norm used for the loss term. |
intercept | Whether an intercept term is included in the model. |
verbose | Tracing information printing is disabled if verbose = FALSE. |
These functions are not intended to be called by users directly.
Xingguo Li, Tuo Zhao, Lie Wang, Xiaoming Yuan and Han Liu
Maintainer: Xingguo Li <[email protected]>
sugm, slim and flare-package.
Plot the ROC curve for an object with S3 class "roc"
## S3 method for class 'roc'
plot(x, ...)
x | An object with S3 class "roc". |
... | System reserved (No specific usage). |
Xingguo Li, Tuo Zhao, Lie Wang, Xiaoming Yuan and Han Liu
Maintainer: Xingguo Li <[email protected]>
sugm.roc, sugm and flare-package.
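A minimal sketch (not from the original manual) that builds a "roc" object with sugm.roc and plots it; the graph size and number of regularization parameters are illustrative:
library(flare)
## generate data, fit a graph path and compute its ROC curve against the truth
L = sugm.generator(d = 30, graph = "random", prob = 0.1)
out = sugm(L$data, nlambda = 10)
Z = sugm.roc(out$path, L$theta)
## dispatches to plot.roc
plot(Z)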
Plot the optimal graph by model selection.
## S3 method for class 'select'
plot(x, ...)
x | An object with S3 class "select". |
... | System reserved (No specific usage). |
Xingguo Li, Tuo Zhao, Lie Wang, Xiaoming Yuan and Han Liu
Maintainer: Xingguo Li <[email protected]>
sugm and sugm.select.
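A minimal sketch (not from the original manual; it reuses the simulated setting of the sugm.select example later in this manual):
library(flare)
## generate data, fit a graph path and select the regularization parameter by cross validation
L = sugm.generator(d = 10, graph = "hub")
out = sugm(L$data)
sel = sugm.select(out, criterion = "cv")
## dispatches to plot.select and shows the selected optimal graph
plot(sel)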
Visualize the covariance matrix, the empirical covariance matrix, the adjacency matrix and the graph pattern of the true graph structure.
## S3 method for class 'sim'
plot(x, ...)
x | An object with S3 class "sim". |
... | Arguments to be passed to methods. |
Xingguo Li, Tuo Zhao, Lie Wang, Xiaoming Yuan and Han Liu
Maintainer: Xingguo Li <[email protected]>
sugm.generator, sugm and flare-package.
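A minimal sketch (not from the original manual) of plotting a simulated "sim" object:
library(flare)
## generate band-graph data and visualize the true model
L = sugm.generator(n = 100, d = 20, graph = "band", g = 1)
## dispatches to plot.sim
plot(L)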
Visualize the solution path of regression estimates corresponding to the regularization parameters.
## S3 method for class 'slim'
plot(x, ...)
x | An object with S3 class "slim". |
... | Arguments to be passed to methods. |
Xingguo Li, Tuo Zhao, Lie Wang, Xiaoming Yuan and Han Liu
Maintainer: Xingguo Li <[email protected]>
slim and flare-package.
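A minimal sketch (not from the original manual); the Lasso fit below is illustrative only:
library(flare)
## simulate data and fit the standard Lasso over 5 regularization parameters
set.seed(123)
n = 50
d = 100
X = matrix(rnorm(n*d), n, d)
Y = X %*% c(3, 2, 0, 1.5, rep(0, d-4)) + rnorm(n)
out = slim(X = X, Y = Y, nlambda = 5, lambda.min.ratio = 0.3, method = "lasso")
## dispatches to plot.slim and draws the coefficient paths
plot(out)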
Plot sparsity level information and 3 typical sparse graphs from the graph path.
## S3 method for class 'sugm'
plot(x, align = FALSE, ...)
x | An object with S3 class "sugm". |
align | Whether the plotted graphs are aligned in one figure. The default value is FALSE. |
... | Arguments to be passed to methods. |
Xingguo Li, Tuo Zhao, Lie Wang, Xiaoming Yuan and Han Liu
Maintainer: Xingguo Li <[email protected]>
sugm and flare-package.
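A minimal sketch (not from the original manual) using a CLIME graph path:
library(flare)
## generate band-graph data and fit a CLIME path
L = sugm.generator(n = 50, d = 30, graph = "band", g = 1)
out = sugm(L$data, method = "clime")
## dispatches to plot.sugm: the sparsity curve plus three typical graphs
plot(out, align = TRUE)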
"slim"
Predicting responses of the given design data.
## S3 method for class 'slim'
predict(object, newdata, lambda.idx = c(1:3), Y.pred.idx = c(1:5), ...)
object | An object with S3 class "slim". |
newdata | An optional data frame in which to look for variables with which to predict. If omitted, the training data of the object are used. |
lambda.idx | The indices of the regularization parameters in the solution path to be displayed. The default values are c(1:3). |
Y.pred.idx | The indices of the predicted response vectors in the solution path to be displayed. The default values are c(1:5). |
... | Arguments to be passed to methods. |
predict.slim produces the predicted responses for newdata from the estimated beta values (and intercepts) in the object, i.e. Y.pred = intercept + newdata %*% beta.
Y.pred | The predicted response vectors based on the estimated models. |
Xingguo Li, Tuo Zhao, Lie Wang, Xiaoming Yuan and Han Liu
Maintainer: Xingguo Li <[email protected]>
slim and flare-package.
## load library
library(flare)
## generate data
set.seed(123)
n = 100
d = 200
d1 = 10
rho0 = 0.3
lambda = c(3:1)*sqrt(log(d)/n)
Sigma = matrix(0, nrow = d, ncol = d)
Sigma[1:d1, 1:d1] = rho0
diag(Sigma) = 1
mu = rep(0, d)
X = mvrnorm(n = 2*n, mu = mu, Sigma = Sigma)
X.fit = X[1:n, ]
X.pred = X[(n+1):(2*n), ]
eps = rt(n = n, df = n-1)
beta = c(rep(sqrt(1/3), 3), rep(0, d-3))
Y.fit = X.fit %*% beta + eps
## Regression with "lq" (q = 1, i.e. LAD Lasso)
out = slim(X = X.fit, Y = Y.fit, lambda = lambda, method = "lq", q = 1)
## Display results
Y = predict(out, X.pred)
"roc"
Print the information about true positive rates, false positive rates, the area under curve and maximum F1 score
## S3 method for class 'roc'
print(x, ...)
x | An object with S3 class "roc". |
... | Arguments to be passed to methods. |
Xingguo Li, Tuo Zhao, Lie Wang, Xiaoming Yuan and Han Liu
Maintainer: Xingguo Li <[email protected]>
sugm.roc, sugm and flare-package.
"select"
Print the information about the model usage, graph dimension, model selection criterion, sparsity level of the optimal graph
## S3 method for class 'select'
print(x, ...)
x | An object with S3 class "select". |
... | Arguments to be passed to methods. |
Xingguo Li, Tuo Zhao, Lie Wang, Xiaoming Yuan and Han Liu
Maintainer: Xingguo Li <[email protected]>
sugm.select, sugm and flare-package.
"sim"
Print the information about the sample size, the dimension, the pattern and sparsity of the true graph structure.
## S3 method for class 'sim'
print(x, ...)
x | An object with S3 class "sim". |
... | Arguments to be passed to methods. |
Xingguo Li, Tuo Zhao, Lie Wang, Xiaoming Yuan and Han Liu
Maintainer: Xingguo Li <[email protected]>
sugm and sugm.generator.
"slim"
Print a summary of the information about an object with S3 class "slim"
.
## S3 method for class 'slim'
print(x, ...)
x | An object with S3 class "slim". |
... | Arguments to be passed to methods. |
This call simply outlines the options used for computing a slim object.
Xingguo Li, Tuo Zhao, Lie Wang, Xiaoming Yuan and Han Liu
Maintainer: Xingguo Li <[email protected]>
slim and flare-package.
"sugm"
Print a summary of the information about an object with S3 class "slim"
.
## S3 method for class 'sugm'
print(x, ...)
x | An object with S3 class "sugm". |
... | Arguments to be passed to methods. |
This call simply outlines the options used for computing a sugm object.
Xingguo Li, Tuo Zhao, Lie Wang, Xiaoming Yuan and Han Liu
Maintainer: Xingguo Li <[email protected]>
sugm
and flare-package
.
The function "slim" implements a family of Lasso variants for estimating high dimensional sparse linear models including Dantzig Selector, LAD Lasso, SQRT Lasso, Lq Lasso for estimating high dimensional sparse linear model. We adopt the alternating direction method of multipliers (ADMM) and convert the original optimization problem into a sequential L1-penalized least square minimization problem, which can be efficiently solved by combining the linearization and multi-stage screening of varialbes. Missing values can be tolerated for Dantzig selector in the design matrix and response vector.
slim(X, Y, lambda = NULL, nlambda = NULL, lambda.min.value = NULL,
     lambda.min.ratio = NULL, rho = 1, method = "lq", q = 2, res.sd = FALSE,
     prec = 1e-5, max.ite = 1e5, verbose = TRUE)
Y | The n-dimensional response vector. |
X | The n by d design matrix. |
lambda | A sequence of decreasing positive numbers to control the regularization. Typical usage is to leave the input lambda = NULL and let the program compute its own sequence based on nlambda and lambda.min.ratio. Users can also specify a sequence to override this. |
nlambda | The number of values used in lambda. |
lambda.min.value | The smallest value for lambda. |
lambda.min.ratio | The smallest ratio of the value for lambda relative to the largest value in the sequence. |
rho | The penalty parameter used in the ADMM algorithm. The default value is 1. |
method | The Dantzig selector is applied if method = "dantzig", Lq Lasso is applied if method = "lq", and the standard Lasso is applied if method = "lasso". The default value is "lq". |
q | The loss function used in Lq Lasso. It is only applicable when method = "lq" and must lie in [1, 2]. The default value is 2. |
res.sd | Flag of whether the response variables are standardized. The default value is FALSE. |
prec | Stopping criterion. The default value is 1e-5. |
max.ite | The iteration limit. The default value is 1e5. |
verbose | Tracing information printing is disabled if verbose = FALSE. The default value is TRUE. |
The standard Lasso solves the following optimization problem

    \min_{\beta} \frac{1}{2n} \|Y - X\beta\|_2^2 + \lambda \|\beta\|_1.

The Dantzig selector solves the following optimization problem

    \min_{\beta} \|\beta\|_1, \quad \textrm{s.t. } \|X^T (Y - X\beta)\|_\infty \le \lambda.

The Lq loss Lasso solves the following optimization problem

    \min_{\beta} n^{-1/q} \|Y - X\beta\|_q + \lambda \|\beta\|_1,

where 1 \le q \le 2. Lq Lasso is equivalent to LAD Lasso and SQRT Lasso when q = 1 and q = 2, respectively.
An object with S3 class "slim"
is returned:
beta | A matrix of regression estimates whose columns correspond to regularization parameters. |
intercept | The value of intercepts corresponding to regularization parameters. |
Y | The value of Y used in the program. |
X | The value of X used in the program. |
lambda | The sequence of regularization parameters lambda used in the program. |
nlambda | The number of values used in lambda. |
method | The method used in the program. |
sparsity | The sparsity levels of the solution path. |
ite | A list of two vectors where ite[[1]] is the number of external iterations and ite[[2]] is the number of internal iterations, with the i-th entry corresponding to the i-th regularization parameter. |
verbose | The verbose flag used in the program. |
Xingguo Li, Tuo Zhao, Lie Wang, Xiaoming Yuan and Han Liu
Maintainer: Xingguo Li <[email protected]>
1. E. Candes and T. Tao. The Dantzig selector: Statistical estimation when p is much larger than n. Annals of Statistics, 2007.
2. A. Belloni, V. Chernozhukov and L. Wang. Pivotal recovery of sparse signals via conic programming. Biometrika, 2012.
3. L. Wang. L1 penalized LAD estimator for high dimensional linear regression. Journal of Multivariate Analysis, 2012.
4. J. Liu and J. Ye. Efficient L1/Lq Norm Regularization. Technical Report, 2010.
5. S. Boyd, N. Parikh, E. Chu, B. Peleato, and J. Eckstein, Distributed Optimization and Statistical Learning via the Alternating Direction Method of Multipliers. Foundations and Trends in Machine Learning, 2011.
6. B. He and X. Yuan. On non-ergodic convergence rate of Douglas-Rachford alternating direction method of multipliers. Technical Report, 2012.
flare-package, print.slim, plot.slim, coef.slim and predict.slim.
## load library
library(flare)
## generate data
n = 50
d = 100
X = matrix(rnorm(n*d), n, d)
beta = c(3, 2, 0, 1.5, rep(0, d-4))
eps = rnorm(n)
Y = X %*% beta + eps
nlamb = 5
ratio = 0.3
## Regression with "dantzig", general "lq" and "lasso" respectively
out1 = slim(X = X, Y = Y, nlambda = nlamb, lambda.min.ratio = ratio, method = "dantzig")
out2 = slim(X = X, Y = Y, nlambda = nlamb, lambda.min.ratio = ratio, method = "lq", q = 1)
out3 = slim(X = X, Y = Y, nlambda = nlamb, lambda.min.ratio = ratio, method = "lq", q = 1.5)
out4 = slim(X = X, Y = Y, nlambda = nlamb, lambda.min.ratio = ratio, method = "lq", q = 2)
out5 = slim(X = X, Y = Y, nlambda = nlamb, lambda.min.ratio = ratio, method = "lasso")
## Display results
print(out4)
plot(out4)
coef(out4)
The function "sugm" estimates sparse undirected graphical models, i.e. Gaussian precision matrix, in high dimensions. We adopt two estimation procedures based on column by column regression scheme: (1) Tuning-Insensitive Graph Estimation and Regression based on square root Lasso (tiger); (2) The Constrained L1 Minimization for Sparse Precision Matrix Estimation using either L1 penalty (clime). The optimization algorithm for all three methods are implemented based on the alternating direction method of multipliers (ADMM) with the linearization method and multi-stage screening of variables. Missing values can be tolerated for CLIME in the data matrix. The computation is memory-optimized using the sparse matrix output.
sugm(data, lambda = NULL, nlambda = NULL, lambda.min.ratio = NULL, rho = NULL,
     method = "tiger", sym = "or", shrink = NULL, prec = 1e-4, max.ite = 1e4,
     standardize = FALSE, perturb = TRUE, verbose = TRUE)
data | There are 2 options for "clime": (1) data is an n by d data matrix; (2) data is a d by d sample covariance matrix. The program automatically identifies the input by checking its symmetry. ("tiger" only accepts the data matrix as input.) |
lambda | A sequence of decreasing positive numbers to control the regularization. Typical usage is to leave the input lambda = NULL and let the program compute its own sequence based on nlambda and lambda.min.ratio. |
nlambda | The number of values used in lambda. |
lambda.min.ratio | The smallest value for lambda, as a fraction of the largest value in the sequence. |
rho | Penalty parameter used in the ADMM optimization algorithm. |
method | "tiger" is applied if method = "tiger"; "clime" is applied if method = "clime". The default value is "tiger". |
sym | Symmetrization of output graphs. If sym = "and", an edge between node i and node j is kept only when both corresponding entries are nonzero; if sym = "or", the edge is kept when either entry is nonzero. The default value is "or". |
shrink | Shrinkage of regularization parameter based on precision of estimation. The default value is 1.5 if method = "clime". |
prec | Stopping criterion. The default value is 1e-4. |
max.ite | The iteration limit. The default value is 1e4. |
standardize | Variables are standardized to have mean zero and unit standard deviation if standardize = TRUE. The default value is FALSE. |
perturb | The diagonal of the sample covariance matrix is perturbed by a small positive value if perturb = TRUE. The default value is TRUE. |
verbose | Tracing information printing is disabled if verbose = FALSE. The default value is TRUE. |
CLIME solves the following minimization problem

    \min \|\Omega\|_1, \quad \textrm{s.t. } \|S\Omega - I\|_\infty \le \lambda,

where \|\cdot\|_1 and \|\cdot\|_\infty are the element-wise 1-norm and \infty-norm respectively.

"tiger" solves the following minimization problem

    \min_{B} \|X - XB\|_{2,1} + \lambda \|B\|_1, \quad \textrm{s.t. } B_{jj} = 0,

where \|\cdot\|_1 and \|\cdot\|_{2,1} are the element-wise 1-norm and L_{2,1}-norm respectively.
An object with S3 class "sugm"
is returned:
data | The n by d data matrix or d by d sample covariance matrix from the input. |
cov.input | An indicator of whether the input is a sample covariance matrix. |
lambda | The sequence of regularization parameters lambda used in the program. |
nlambda | The number of values used in lambda. |
icov | A list of d by d estimated precision matrices corresponding to the regularization parameters. |
sym | The sym flag from the input. |
method | The method used in the program. |
path | A list of d by d adjacency matrices (in sparse matrix representation) of the estimated graphs along the regularization path. |
sparsity | The sparsity levels of the graph path. |
ite | If method = "clime", a list of two matrices of iteration counts (external and internal); if method = "tiger", a matrix of iteration counts, indexed by column and regularization parameter. |
df | A d by nlambda matrix; each row contains the number of nonzero coefficients along the solution path for the corresponding column. |
standardize | The standardize flag from the input. |
perturb | The perturb flag from the input. |
verbose | The verbose flag from the input. |
Xingguo Li, Tuo Zhao, Lie Wang, Xiaoming Yuan and Han Liu
Maintainer: Xingguo Li <[email protected]>
1. T. Cai, W. Liu and X. Luo. A constrained L1 minimization approach to sparse precision matrix estimation. Journal of the American Statistical Association, 2011.
2. H. Liu, L. Wang. TIGER: A tuning-insensitive approach for optimally estimating large undirected graphs. Technical Report, 2012.
3. B. He and X. Yuan. On non-ergodic convergence rate of Douglas-Rachford alternating direction method of multipliers. Technical Report, 2012.
flare-package, sugm.generator, sugm.select, sugm.plot, sugm.roc, plot.sugm, plot.select, plot.roc, plot.sim, print.sugm, print.select, print.roc and print.sim.
## load package required
library(flare)
## generating data
n = 50
d = 50
D = sugm.generator(n = n, d = d, graph = "band", g = 1)
plot(D)
## sparse precision matrix estimation with method "clime"
out1 = sugm(D$data, method = "clime")
plot(out1)
sugm.plot(out1$path[[4]])
## sparse precision matrix estimation with method "tiger"
out2 = sugm(D$data, method = "tiger")
plot(out2)
sugm.plot(out2$path[[5]])
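The data argument above states that "clime" also accepts a d by d sample covariance matrix, identified by its symmetry. A short sketch of that mode (not from the original manual; note that model selection is not available for covariance input):
library(flare)
## covariance-matrix input for CLIME
L = sugm.generator(n = 100, d = 30, graph = "band", g = 1)
S = cov(L$data)                      # d by d sample covariance matrix
out3 = sugm(S, method = "clime", nlambda = 5)
out3$cov.input                       # indicates the input was identified as a covariance matrix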
Implements the data generation from multivariate normal distributions with different graph structures, including "random", "hub", "cluster", "band", and "scale-free".
sugm.generator(n = 200, d = 50, graph = "random", v = NULL, u = NULL, g = NULL,
               prob = NULL, seed = NULL, vis = FALSE, verbose = TRUE)
n | The number of observations (sample size). The default value is 200. |
d | The number of variables (dimension). The default value is 50. |
graph | The graph structure, with 5 options: "random", "hub", "cluster", "band" and "scale-free". The default value is "random". |
v | The off-diagonal elements of the precision matrix, controlling the magnitude of partial correlations together with u. The default value is 0.3. |
u | A positive number added to the diagonal elements of the precision matrix, to control the magnitude of partial correlations. The default value is 0.1. |
g | For "hub" and "cluster" graphs, g is the number of groups; for the "band" graph, g is the bandwidth. |
prob | For "random" and "cluster" graphs, prob is the probability that a pair of nodes is connected. |
seed | Seed for the data generation. The default value is 1. |
vis | Visualize the adjacency matrix of the true graph structure, the graph pattern, the covariance matrix and the empirical covariance matrix. The default value is FALSE. |
verbose | If verbose = FALSE, tracing information printing is disabled. The default value is TRUE. |
Given the adjacency matrix theta, the graph patterns are generated as below:

(I) "random": Each pair of off-diagonal elements is randomly set to theta[i,j]=theta[j,i]=1 for i!=j with probability prob, and 0 otherwise. It results in about d*(d-1)*prob/2 edges in the graph.

(II) "hub": The rows/columns are evenly partitioned into g disjoint groups. Each group is associated with a "center" row i in that group. Each pair of off-diagonal elements is set to theta[i,j]=theta[j,i]=1 for i!=j if j also belongs to the same group as i, and 0 otherwise. It results in d - g edges in the graph.

(III) "cluster": The rows/columns are evenly partitioned into g disjoint groups. Each pair of off-diagonal elements is set to theta[i,j]=theta[j,i]=1 for i!=j with probability prob if both i and j belong to the same group, and 0 otherwise. It results in about g*(d/g)*(d/g-1)*prob/2 edges in the graph.

(IV) "band": The off-diagonal elements are set to theta[i,j]=1 if 1<=|i-j|<=g, and 0 otherwise. It results in (2d-1-g)*g/2 edges in the graph.

(V) "scale-free": The graph is generated using the B-A algorithm. The initial graph has two connected nodes, and each new node is connected to only one node in the existing graph with probability proportional to the degree of each node in the existing graph. It results in d edges in the graph.

The adjacency matrix theta has all diagonal elements equal to 0. To obtain a positive definite covariance matrix, the smallest eigenvalue of theta*v (denoted by e) is computed. Then we set the covariance matrix equal to cov2cor(solve(theta*v+(|e|+0.1+u)*I)) to generate multivariate normal data.
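The covariance construction described above can be reproduced in base R; the following is a small illustration with hypothetical values of theta, v and u (it is not flare's internal code):
d = 10
theta = matrix(0, d, d)
theta[abs(row(theta) - col(theta)) == 1] = 1                  # "band" adjacency with g = 1
v = 0.3                                                       # off-diagonal magnitude
u = 0.1                                                       # diagonal increment
e = min(eigen(theta * v, only.values = TRUE)$values)          # smallest eigenvalue of theta*v
Sigma = cov2cor(solve(theta * v + (abs(e) + 0.1 + u) * diag(d)))
X = MASS::mvrnorm(n = 200, mu = rep(0, d), Sigma = Sigma)     # multivariate normal data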
An object with S3 class "sim" is returned:
data | The n by d matrix of the generated data. |
sigma | The covariance matrix for the generated data. |
omega | The precision matrix for the generated data. |
sigmahat | The empirical covariance matrix for the generated data. |
theta | The adjacency matrix of the true graph structure (in sparse matrix representation) for the generated data. |
Xingguo Li, Tuo Zhao, Lie Wang, Xiaoming Yuan and Han Liu
Maintainer: Xingguo Li <[email protected]>
flare and flare-package.
## load package required
library(flare)
## band graph with bandwidth 3
L = sugm.generator(graph = "band", g = 3)
plot(L)
## random sparse graph
L = sugm.generator(vis = TRUE)
## hub graph with 6 hubs
L = sugm.generator(graph = "hub", g = 6, vis = TRUE)
## cluster graph with 8 clusters
L = sugm.generator(graph = "cluster", g = 8, vis = TRUE)
## scale-free graphs
L = sugm.generator(graph = "scale-free", vis = TRUE)
"sugm"
Implements the graph visualization using adjacency matrix. It can automatic organize 2D embedding layout.
sugm.plot(G, epsflag = FALSE, graph.name = "default", cur.num = 1, location)
G | The adjacency matrix corresponding to the graph. |
epsflag | If epsflag = TRUE, the plot is saved as an eps file in the target directory. The default value is FALSE. |
graph.name | The name of the output eps files. The default value is "default". |
cur.num | The number of plots saved as eps files. Only applicable when epsflag = TRUE. The default value is 1. |
location | Target directory. The default value is the current working directory. |
The user can change cur.num to plot several figures and select the best one. The implementation is based on the popular package "igraph".
Xingguo Li, Tuo Zhao, Lie Wang, Xiaoming Yuan and Han Liu
Maintainer: Xingguo Li <[email protected]>
flare and flare-package.
## load package required
library(flare)
## visualize the hub graph
L = sugm.generator(graph = "hub")
sugm.plot(L$theta)
## visualize the band graph
L = sugm.generator(graph = "band", g = 5)
sugm.plot(L$theta)
## visualize the cluster graph
L = sugm.generator(graph = "cluster")
sugm.plot(L$theta)
## Not run:
# show working directory
getwd()
# plot 5 graphs and save the plots as eps files in the working directory
sugm.plot(L$theta, epsflag = TRUE, cur.num = 5)
## End(Not run)
"sugm"
Draws ROC curve for a graph path according to the true graph structure.
sugm.roc(path, theta, verbose = TRUE)
path | A graph path. |
theta | The true graph structure. |
verbose | If verbose = FALSE, tracing information printing is disabled. The default value is TRUE. |
To avoid horizontal oscillation, the false positive rates are automatically sorted in ascending order and the true positive rates follow the same order.
An object with S3 class "roc" is returned:
F1 | The F1 scores along the graph path. |
tp | The true positive rates along the graph path. |
fp | The false positive rates along the graph path. |
AUC | The area under the ROC curve. |
For a lasso regression, the number of nonzero coefficients is at most n-1. If d >> n, even when the regularization parameter is very small, the estimated graph may still be sparse. In this case, the AUC may not be a good choice to evaluate the performance.
Xingguo Li, Tuo Zhao, Lie Wang, Xiaoming Yuan and Han Liu
Maintainer: Xingguo Li <[email protected]>
sugm and flare-package.
## load package required
library(flare)
# generate data
L = sugm.generator(d = 30, graph = "random", prob = 0.1)
out1 = sugm(L$data, lambda = 10^(seq(log10(.4), log10(0.03), length.out = 20)))
# draw ROC curve
Z1 = sugm.roc(out1$path, L$theta)
# maximum F1 score
max(Z1$F1)
Implements regularization parameter selection for high dimensional undirected graphical models. The available approaches are the stability approach to regularization selection (stars) and cross validation selection (cv).
sugm.select(est, criterion = "stars", stars.subsample.ratio = NULL,
            stars.thresh = 0.1, rep.num = 20, fold = 5, loss = "likelihood",
            verbose = TRUE)
est | An object with S3 class "sugm". |
criterion | Model selection criterion; "stars" and "cv" are available for both graph estimation methods. The default value is "stars". |
stars.subsample.ratio | The subsampling ratio used in stars. The default value is NULL, in which case it is chosen automatically based on the sample size. Only applicable when criterion = "stars". |
stars.thresh | The variability threshold in stars. The default value is 0.1. Only applicable when criterion = "stars". |
rep.num | The number of subsamplings. The default value is 20. Only applicable when criterion = "stars". |
fold | The number of folds used in cross validation. The default value is 5. Only applicable when criterion = "cv". |
loss | Loss to be used in cross validation. Two losses are available: "likelihood" and "tracel2". The default value is "likelihood". Only applicable when criterion = "cv". |
verbose | If verbose = FALSE, tracing information printing is disabled. The default value is TRUE. |
Stability approach to regularization selection (stars) is a natural way to select the optimal regularization parameter for both estimation methods. It selects the optimal graph by the variability of subsamplings and tends to over-select edges in Gaussian graphical models. Besides selecting the regularization parameter, stars can also provide an additional estimated graph by merging the corresponding subsampled graphs using the frequency counts. K-fold cross validation is also provided for selecting the parameter lambda, and two loss functions, "likelihood" and "tracel2", are adopted.
An object with S3 class "select" is returned:
refit | The optimal graph selected from the graph path. |
opt.icov | The optimal precision matrix selected. |
merge | The graph path estimated by merging the subsampling paths. Only applicable when criterion = "stars". |
variability | The variability along the subsampling paths. Only applicable when criterion = "stars". |
opt.index | The index of the selected regularization parameter. |
opt.lambda | The selected regularization/thresholding parameter. |
opt.sparsity | The sparsity level of refit. |
and anything else included in the input est.
The model selection is NOT available when the data input is the sample covariance matrix.
Xingguo Li, Tuo Zhao, Lie Wang, Xiaoming Yuan and Han Liu
Maintainer: Xingguo Li <[email protected]>
1. T. Cai, W. Liu and X. Luo. A constrained minimization approach to sparse precision matrix estimation. Journal of the American Statistical Association, 2011.
2. B. He and X. Yuan. On non-ergodic convergence rate of Douglas-Rachford alternating direction method of multipliers. Technical Report, 2012.
sugm and flare-package.
## load package required
library(flare)
# generate data
L = sugm.generator(d = 10, graph = "hub")
out1 = sugm(L$data)
# model selection using stars
# out1.select1 = sugm.select(out1, criterion = "stars", stars.thresh = 0.1)
# plot(out1.select1)
# model selection using cross validation
out1.select2 = sugm.select(out1, criterion = "cv")
plot(out1.select2)