Title: | Variable Selection via Tilted Correlation Screening Algorithm |
---|---|
Description: | Implements an algorithm for variable selection in high-dimensional linear regression using the "tilted correlation", a new way of measuring the contribution of each variable to the response which takes into account high correlations among the variables in a data-driven way. |
Authors: | Haeran Cho [aut, cre], Piotr Fryzlewicz [aut] |
Maintainer: | Haeran Cho <[email protected]> |
License: | GPL (>= 2) |
Version: | 1.1.1 |
Built: | 2025-03-01 02:48:45 UTC |
Source: | https://github.com/cran/tilting |
Implements an algorithm for variable selection in high-dimensional linear regression using the "tilted correlation", a way of measuring the contribution of each variable to the response which takes into account high correlations among the variables in a data-driven way.
Package: | tilting |
Type: | Package |
Version: | 1.1.1 |
Date: | 2016-12-22 |
License: | GPL (>= 2) |
LazyLoad: | yes |
The main function of the package is tilting.
Haeran Cho, Piotr Fryzlewicz
Maintainer: Haeran Cho <[email protected]>
H. Cho and P. Fryzlewicz (2012) High-dimensional variable selection via tilting, Journal of the Royal Statistical Society Series B, 74: 593-622.
X <- matrix(rnorm(100*100), 100, 100) # 100-by-100 design matrix
y <- apply(X[,1:5], 1, sum)+rnorm(100) # first five variables are significant
tilt <- tilting(X, y, op=2)
tilt$active.hat # returns the finally selected variables
The function returns a vector containing the L2 norm of each column for a given matrix.
col.norm(X)
X | a matrix for which the column norms are computed. |
A vector containing the L2 norm of the columns of X is returned.
Haeran Cho
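As a rough illustration of what the description above computes, the column-wise L2 norm can be written in base R as follows. The helper name `col_l2_norm` is hypothetical and this is a sketch of the idea only, not the package's source code.

```r
# Illustrative base-R equivalent; `col_l2_norm` is a made-up name,
# not a function of the tilting package.
col_l2_norm <- function(X) {
  # square every entry, sum within each column, take the square root
  sqrt(colSums(X^2))
}

# Small matrix whose columns are (3, 4, 0) and (5, 12, 0)
X <- matrix(c(3, 4, 0, 5, 12, 0), nrow = 3)
norms <- col_l2_norm(X)
norms
```

If the package is installed, `col.norm(X)` would be expected to agree with this on the same input, given the documented behaviour.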
The function selects a threshold for the sample correlation matrix.
get.thr(C, n, p, max.num = 1, alpha = NULL, step = NULL)
C | sample correlation matrix of a design matrix. |
n | the number of observations of the design matrix. |
p | the number of variables of the design matrix. |
max.num | the number of times the threshold selection procedure is repeated. Usually max.num==1 is used. |
alpha | the level at which the false discovery rate is controlled. When alpha==NULL, it is set to be 1/sqrt(p). |
step | the size of a step taken when screening the p(p-1)/2 off-diagonal elements of C. |
thr | selected threshold. |
thr.seq | when max.num>1, the sequence of thresholds selected at each iteration. |
Haeran Cho
H. Cho and P. Fryzlewicz (2012) High-dimensional variable selection via tilting, Journal of the Royal Statistical Society Series B, 74: 593-622.
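The sketch below shows one way the inputs described above might be prepared in base R. The variable names `alpha_default` and `n_offdiag` are illustrative only. The call to `get.thr` is guarded so that the snippet still runs when the package is not installed. Passing `cor(X)` as `C` is an assumption based on the argument description.

```r
set.seed(1)
n <- 50
p <- 100
X <- matrix(rnorm(n * p), n, p)

C <- cor(X)                    # p-by-p sample correlation matrix
alpha_default <- 1 / sqrt(p)   # documented value used when alpha == NULL
n_offdiag <- p * (p - 1) / 2   # number of off-diagonal elements screened

# Only attempted when the tilting package is available
if (requireNamespace("tilting", quietly = TRUE)) {
  res <- tilting::get.thr(C, n, p)
  print(res$thr)               # the selected threshold
}
```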
The function returns an estimate of the coefficient vector for a linear regression problem by setting the coefficients corresponding to a given index set to be the least squares estimate and the rest to be equal to zero.
lse.beta(X, y, active = NULL)
X | design matrix. |
y | response vector. |
active | the index set on which the least squares estimate is computed. |
An estimate of the coefficient vector is returned as above. If active==NULL, a vector of zeros is returned.
Haeran Cho
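The behaviour described above can be sketched in base R as follows. The function name `lse_on_active` is hypothetical, and the sketch assumes a regression without an intercept. Whether the package handles centring or an intercept differently is not stated here.

```r
# Illustrative sketch; `lse_on_active` is not part of the tilting package.
lse_on_active <- function(X, y, active = NULL) {
  beta <- rep(0, ncol(X))              # coefficients outside `active` stay zero
  if (length(active) > 0) {
    Xa <- X[, active, drop = FALSE]    # keep matrix shape for a single column
    beta[active] <- qr.solve(Xa, y)    # least squares fit via QR decomposition
  }
  beta
}

set.seed(1)
X <- matrix(rnorm(20 * 5), 20, 5)
y <- 2 * X[, 1] - X[, 3]               # noise-free response, for checking
b <- lse_on_active(X, y, active = c(1, 3))
b_null <- lse_on_active(X, y)          # active = NULL gives a zero vector
```

Because the response here is an exact linear combination of columns 1 and 3, the fitted vector recovers the coefficients 2 and -1 in those positions up to numerical error.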
The function computes the projection matrix onto a set of columns of a given matrix.
projection(X, active = NULL)
X | a matrix containing the columns onto which the projection matrix is computed. |
active | an index set of the columns of X. |
Returns the projection matrix onto the columns of "X" whose indices are included in "active". When active==NULL, a null matrix is returned.
Haeran Cho
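For readers who want to see the underlying linear algebra, the orthogonal projection onto the span of the selected columns can be sketched in base R as below. The name `projection_onto` is hypothetical. The sketch assumes the selected columns are linearly independent and that the "null matrix" is an n-by-n matrix of zeros.

```r
# Illustrative sketch; `projection_onto` is not part of the tilting package.
projection_onto <- function(X, active = NULL) {
  n <- nrow(X)
  if (length(active) == 0) {
    return(matrix(0, n, n))            # assumed form of the "null matrix"
  }
  Xa <- X[, active, drop = FALSE]
  # P = Xa (Xa' Xa)^{-1} Xa', computed without forming the inverse explicitly
  Xa %*% solve(crossprod(Xa), t(Xa))
}

set.seed(1)
X <- matrix(rnorm(10 * 4), 10, 4)
P <- projection_onto(X, active = c(2, 4))
```

A projection matrix built this way is symmetric and idempotent, and it leaves the selected columns unchanged, which gives a quick sanity check.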
The function returns the final model as the subset of the active set chosen by the Tilted Correlation Screening algorithm for which the extended BIC is minimised.
select.model(bic.seq, active)
bic.seq | the sequence of extended BIC values computed at each iteration. |
active | the index set of variables selected by the Tilted Correlation Screening algorithm. |
The index set of finally selected variables is returned.
Haeran Cho
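The selection rule described above might look like the following in base R. This sketch assumes that `bic.seq[k]` is the criterion for the model containing the first `k` entries of `active`. The package's own indexing (for example, whether a null model is included) may differ. The name `pick_by_bic` and the numbers are made up for illustration.

```r
# Illustrative sketch; `pick_by_bic` is not part of the tilting package.
# Assumption: bic.seq[k] scores the model made of active[1:k].
pick_by_bic <- function(bic.seq, active) {
  k <- which.min(bic.seq)      # iteration at which the criterion is smallest
  active[seq_len(k)]           # variables entered up to that iteration
}

active  <- c(7, 2, 19, 4)                  # order in which variables entered
bic.seq <- c(120.5, 98.2, 101.7, 110.3)    # made-up criterion values
chosen  <- pick_by_bic(bic.seq, active)
chosen
```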
For a given matrix and a threshold, the function performs element-wise hard-thresholding based on the absolute value of each element.
thresh(C, alph, eps = 1e-10)
C | a matrix on which the hard-thresholding is performed. |
alph | threshold. |
eps | effective zero. |
Returns the matrix C after hard-thresholding.
Haeran Cho
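Element-wise hard-thresholding, as described above, can be sketched in base R as follows. The name `hard_threshold` is hypothetical. The sketch keeps entries whose absolute value strictly exceeds the threshold and does not reproduce the role of `eps`. Both choices are assumptions.

```r
# Illustrative sketch; `hard_threshold` is not part of the tilting package.
hard_threshold <- function(C, alph) {
  # entries with |C[i, j]| <= alph are set to zero, the rest are kept as-is
  C * (abs(C) > alph)
}

C <- matrix(c( 1.0, 0.20, -0.70,
               0.2, 1.00,  0.05,
              -0.7, 0.05,  1.00), 3, 3)
C_thr <- hard_threshold(C, 0.5)
C_thr
```

With a threshold of 0.5, only the unit diagonal and the two entries equal to -0.7 survive.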
Given a design matrix and a response vector, the function selects a threshold for the sample correlation matrix, computes an adaptive measure for the contribution of each variable to the response variable based on the thus-thresholded sample correlation matrix, and chooses a variable at each iteration. Once variables are selected in the "active" set, the extended BIC is used for the final model selection.
tilting(X, y, thr.step = NULL, thr.rep = 1, max.size = NULL, max.count = NULL, op = 2, bic.gamma = 1, eps = 1e-10)
X | design matrix. |
y | response vector. |
thr.step | a step size used for threshold selection. When thr.step==NULL, it is chosen automatically. |
thr.rep | the number of times the threshold selection procedure is repeated. |
max.size | the maximum number of variables conditional on which the contribution of each variable to the response is measured. When max.size==NULL, it is set to be half the number of observations. |
max.count | the maximum number of iterations. |
op | when op==1, rescaling 1 is used to compute the tilted correlation. When op==2, rescaling 2 is used. |
bic.gamma | a parameter used to compute the extended BIC. |
eps | an effective zero. |
active | active set containing the variables selected over the iterations. |
thr.seq | a sequence of thresholds selected over the iterations. |
bic.seq | extended BIC computed over the iterations. |
active.hat | finally chosen variables using the extended BIC. |
Haeran Cho
H. Cho and P. Fryzlewicz (2012) High-dimensional variable selection via tilting, Journal of the Royal Statistical Society Series B, 74: 593-622.
X <- matrix(rnorm(100*100), 100, 100) # 100-by-100 design matrix
y <- apply(X[,1:5], 1, sum)+rnorm(100) # first five variables are significant
tilt <- tilting(X, y, op=2)
tilt$active.hat # returns the finally selected variables