Title: | Zipf Extended Distributions |
---|---|
Description: | Implementation of four extensions of the Zipf distribution: the Marshall-Olkin Extended Zipf (MOEZipf) Pérez-Casany, M., & Casellas, A. (2013) <arXiv:1304.4540>, the Zipf-Poisson Extreme (Zipf-PE), the Zipf-Poisson Stopped Sum (Zipf-PSS) and the Zipf-Polylog distributions. In log-log scale, the first two extensions allow for top-concavity and top-convexity while the third one only allows for top-concavity. All the extensions maintain the linearity associated with the Zipf model in the tail. |
Authors: | Ariel Duarte-López [aut, cre] (0000-0002-7432-0344), Marta Pérez-Casany [aut] (0000-0003-3675-6902) |
Maintainer: | Ariel Duarte-López <[email protected]> |
License: | GPL-3 |
Version: | 1.0.2 |
Built: | 2025-02-10 03:44:22 UTC |
Source: | https://github.com/ardlop/zipfextr |
Selecting appropriate initial values for the maximum likelihood estimation reduces the number of iterations, which in turn reduces the computation time. The initial values proposed by this function are computed using the first two empirical frequencies.
getInitialValues(data, model = "zipf")
data |
Matrix of count data. |
model |
Model for which the initial values are requested (default = 'zipf'). |
The argument data is a two-column matrix with the first column containing the observations and the second column containing their frequencies. The argument model refers to the selected model among those implemented in the package. The possible values are: zipf, moezipf, zipfpe, zipfpss or its zero-truncated version zt_zipfpss. By default, the selected model is the Zipf one.
For the MOEZipf, the Zipf-PE and the zero-truncated Zipf-PSS models, which contain the Zipf model as a particular case, the second parameter is set to the value that corresponds to the Zipf model (i.e. $\beta = 1$ for the MOEZipf, $\beta = 0$ for the Zipf-PE and $\lambda = 0$ for the zero-truncated Zipf-PSS model) and the initial value for $\alpha$ is set to:
$$\alpha_0 = \log_2\left(\frac{f_1}{f_2}\right),$$
where $f_1$ and $f_2$ are the empirical relative frequencies of one and two.
This value is obtained by equating the ratio of the two empirical probabilities to the ratio of their theoretical ones.
For the Zipf-PSS model, the initial value of $\lambda$ is obtained by equating the empirical probability of zero to the theoretical one, which gives:
$$\lambda_0 = -\log(f_0),$$
where $f_0$ is the empirical relative frequency of zero. The initial value of $\alpha$ is obtained by equating the ratio of the theoretical probabilities at zero and one to the empirical ones. This gives:
$$\alpha_0 = \zeta^{-1}\left(\lambda_0\, \frac{f_0}{f_1}\right),$$
where $f_0$ and $f_1$ are the empirical relative frequencies associated with the values 0 and 1 respectively.
The inverse of the Riemann zeta function is obtained using the optim routine.
Returns the initial values of the parameters for a given distribution.
Güney, Y., Tuaç, Y., & Arslan, O. (2016). Marshall–Olkin distribution: parameter estimation and application to cancer data. Journal of Applied Statistics, 1-13.
data <- rmoezipf(100, 2.5, 1.3)
data <- as.data.frame(table(data))
data[,1] <- as.numeric(levels(data[,1])[data[,1]])
initials <- getInitialValues(data, model='zipf')
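As a rough illustration of how an initial value for alpha can be derived from the first two empirical frequencies, the sketch below equates f1 / f2 to the Zipf ratio p(1) / p(2) = 2^alpha. The helper name initial_alpha_sketch and the toy table are hypothetical; this is not the package's implementation.

```r
# Hypothetical helper (not part of the package): initial alpha for the Zipf
# model from a two-column frequency table (value, frequency).
initial_alpha_sketch <- function(freq_table) {
  values <- freq_table[, 1]
  counts <- freq_table[, 2]
  rel_freq <- counts / sum(counts)
  f1 <- rel_freq[values == 1]
  f2 <- rel_freq[values == 2]
  if (length(f1) != 1 || length(f2) != 1) {
    stop("the values 1 and 2 must each appear exactly once in the table")
  }
  # Equating f1 / f2 to p(1) / p(2) = 2^alpha gives alpha = log2(f1 / f2).
  log2(f1 / f2)
}

# Toy frequency table: the value 1 is observed 64 times, the value 2 16 times.
toy <- cbind(value = 1:4, freq = c(64, 16, 12, 8))
alpha0 <- initial_alpha_sketch(toy)  # log2(64 / 16) = 2
```

A table that lacks the value 1 or the value 2 makes this estimate undefined, which is why the sketch stops with an error in that case.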
Probability mass function, cumulative distribution function, quantile function and random number generation for the MOEZipf distribution with parameters $\alpha$ and $\beta$. The support of the MOEZipf distribution is the set of strictly positive integers.
dmoezipf(x, alpha, beta, log = FALSE)
pmoezipf(q, alpha, beta, log.p = FALSE, lower.tail = TRUE)
qmoezipf(p, alpha, beta, log.p = FALSE, lower.tail = TRUE)
rmoezipf(n, alpha, beta)
x , q
|
Vector of positive integer values. |
alpha |
Value of the $\alpha$ parameter. |
beta |
Value of the $\beta$ parameter. |
log, log.p |
Logical; if TRUE, probabilities p are given as log(p). |
lower.tail |
Logical; if TRUE (default), probabilities are $P[X \leq x]$, otherwise, $P[X > x]$. |
p |
Vector of probabilities. |
n |
Number of random values to return. |
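The probability mass function at a positive integer value $x$ of the MOEZipf distribution with parameters $\alpha$ and $\beta$ is computed as follows:
$$p(x \mid \alpha, \beta) = \frac{x^{-\alpha}\, \beta\, \zeta(\alpha)}{\left[\zeta(\alpha) - \bar{\beta}\, \zeta(\alpha, x)\right]\left[\zeta(\alpha) - \bar{\beta}\, \zeta(\alpha, x + 1)\right]}, \quad x = 1, 2, \ldots,$$
where $\zeta(\alpha)$ is the Riemann zeta function at $\alpha$, $\zeta(\alpha, x)$ is the Hurwitz zeta function with arguments $\alpha$ and x, and $\bar{\beta} = 1 - \beta$.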
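The cumulative distribution function at a given positive integer value $x$, $F(x)$, is computed as $F(x) = 1 - S(x)$, where the survival function $S(x)$ is equal to:
$$S(x) = \frac{\beta\, \zeta(\alpha, x + 1)}{\zeta(\alpha) - \bar{\beta}\, \zeta(\alpha, x + 1)}, \quad x = 1, 2, \ldots,$$
with $\bar{\beta} = 1 - \beta$.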
The quantile of the MOEZipf distribution of a given probability value p is equal to the quantile of the Zipf distribution at the value:
$$p' = \frac{p\, \beta}{1 + p\,(\beta - 1)}$$
The quantiles of the Zipf distribution are computed by means of the tolerance package.
To generate random data from a MOEZipf distribution, one applies the quantile function to n values randomly generated from a Uniform distribution on the interval (0, 1).
dmoezipf gives the probability mass function, pmoezipf gives the cumulative distribution function, qmoezipf gives the quantile function, and rmoezipf generates random values from a MOEZipf distribution.
Casellas, A. (2013) La distribució Zipf Estesa segons la transformació Marshall-Olkin. Universitat Politécnica de Catalunya.
Devroye L. (1986) Non-Uniform Random Variate Generation. Springer, New York, NY.
Duarte-López, A., Prat-Pérez, A., & Pérez-Casany, M. (2015). Using the Marshall-Olkin Extended Zipf Distribution in Graph Generation. European Conference on Parallel Processing, pp. 493-502, Springer International Publishing.
Pérez-Casany, M. and Casellas, A. (2013) Marshall-Olkin Extended Zipf Distribution. arXiv preprint arXiv:1304.4540.
Young, D. S. (2010). Tolerance: an R package for estimating tolerance intervals. Journal of Statistical Software, 36(5), 1-39.
dmoezipf(1:10, 2.5, 1.3)
pmoezipf(1:10, 2.5, 1.3)
qmoezipf(0.56, 2.5, 1.3)
rmoezipf(10, 2.5, 1.3)
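The link between a MOEZipf quantile and a Zipf quantile can be checked numerically without the package. The sketch below assumes the usual Marshall-Olkin form of the transformed cdf, G = F / (beta + (1 - beta) F), and verifies that p' = p beta / (1 + p (beta - 1)) is its inverse. The function names are hypothetical.

```r
# Marshall-Olkin transform of a baseline cdf value (assumed form).
mo_cdf <- function(cdf_value, beta) {
  cdf_value / (beta + (1 - beta) * cdf_value)
}

# Probability at which the baseline (Zipf) quantile function is evaluated.
mo_p_prime <- function(p, beta) {
  p * beta / (1 + p * (beta - 1))
}

beta <- 1.3
p <- c(0.1, 0.56, 0.9)

# Mapping p to p' and then through the transformed cdf should recover p.
round_trip <- mo_cdf(mo_p_prime(p, beta), beta)
```

With beta = 1 the transformation is the identity, which matches the fact that the MOEZipf model contains the Zipf model as a particular case.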
For a given sample of strictly positive integer numbers, usually of the type of ranking data or frequencies of frequencies data, estimates the parameters of the MOEZipf distribution by means of the maximum likelihood method. The input data should be provided as a frequency matrix.
moezipfFit(data, init_alpha = NULL, init_beta = NULL, level = 0.95, ...)
## S3 method for class 'moezipfR'
residuals(object, ...)
## S3 method for class 'moezipfR'
fitted(object, ...)
## S3 method for class 'moezipfR'
coef(object, ...)
## S3 method for class 'moezipfR'
plot(x, ...)
## S3 method for class 'moezipfR'
print(x, ...)
## S3 method for class 'moezipfR'
summary(object, ...)
## S3 method for class 'moezipfR'
logLik(object, ...)
## S3 method for class 'moezipfR'
AIC(object, ...)
## S3 method for class 'moezipfR'
BIC(object, ...)
data |
Matrix of count data in form of a table of frequencies. |
init_alpha |
Initial value of the $\alpha$ parameter. |
init_beta |
Initial value of the $\beta$ parameter. |
level |
Confidence level used to calculate the confidence intervals (default 0.95). |
... |
Further arguments to the generic functions. Extra arguments are passed to the optim function. |
object |
An object from class "moezipfR" (output of moezipfFit function). |
x |
An object from class "moezipfR" (output of moezipfFit function). |
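The argument data is a two-column matrix with the first column containing the observations and the second column containing their frequencies.
The log-likelihood function is equal to:
$$l(\alpha, \beta; x) = -\alpha \sum_{i=1}^{m} f_a(x_i) \log(x_i) + N \left(\log(\beta) + \log(\zeta(\alpha))\right) - \sum_{i=1}^{m} f_a(x_i) \log\left[\left(\zeta(\alpha) - \bar{\beta}\, \zeta(\alpha, x_i)\right)\left(\zeta(\alpha) - \bar{\beta}\, \zeta(\alpha, x_i + 1)\right)\right],$$
where $f_a(x_i)$ is the absolute frequency of $x_i$, $m$ is the number of different values in the sample, $\bar{\beta} = 1 - \beta$, and $N$ is the sample size, i.e. $N = \sum_{i=1}^{m} f_a(x_i)$.
By default the initial values of the parameters are computed using the function getInitialValues.
The function optim is used to estimate the parameters.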
Returns a moezipfR object containing the maximum likelihood parameter estimates together with their standard deviations and confidence intervals. It also contains the value of the log-likelihood at the maximum likelihood estimator.
data <- rmoezipf(100, 2.5, 1.3)
data <- as.data.frame(table(data))
data[,1] <- as.numeric(as.character(data[,1]))
data[,2] <- as.numeric(as.character(data[,2]))
initValues <- getInitialValues(data, model='moezipf')
obj <- moezipfFit(data, init_alpha = initValues$init_alpha, init_beta = initValues$init_beta)
Computes the expected value of the MOEZipf distribution for given values of parameters $\alpha$ and $\beta$.
moezipfMean(alpha, beta, tolerance = 10^(-4))
alpha |
Value of the $\alpha$ parameter. |
beta |
Value of the $\beta$ parameter. |
tolerance |
Tolerance used in the calculations (default = $10^{-4}$). |
The mean of the distribution only exists for $\alpha$ strictly greater than 2.
It is computed by calculating the partial sums of the series, and stopping when two consecutive partial sums differ by less than the tolerance value.
The value of the last partial sum is returned.
A positive real value corresponding to the mean value of the distribution.
moezipfMean(2.5, 1.3)
moezipfMean(2.5, 1.3, 10^(-3))
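The stopping rule used for the mean (stop when two consecutive partial sums differ by less than the tolerance) can be sketched in base R. The helper name partial_sum_sketch and the max_terms guard are hypothetical and not part of the package; the series of x^(-2) is used only because its limit, pi^2 / 6, is known.

```r
# Hypothetical helper: accumulate term(1) + term(2) + ... and stop as soon as
# two consecutive partial sums differ by less than `tolerance`.
partial_sum_sketch <- function(term, tolerance = 1e-4, max_terms = 1e6) {
  total <- 0
  for (x in seq_len(max_terms)) {
    increment <- term(x)
    total <- total + increment
    # The difference between consecutive partial sums is the last term added.
    if (abs(increment) < tolerance) {
      return(list(value = total, terms = x))
    }
  }
  stop("no convergence within max_terms")
}

res <- partial_sum_sketch(function(x) x^(-2))
truncation_error <- pi^2 / 6 - res$value
```

In this toy case the truncation error is close to 0.01, well above the tolerance, because the rule bounds the last term added and not the remainder of the series. For heavy-tailed series, a smaller tolerance may therefore be needed to reach a given accuracy.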
General function to compute the k-th moment of the MOEZipf distribution for any integer value $k \geq 1$, when it exists. The k-th moment exists if and only if $\alpha > k + 1$.
For k = 1, this function returns the same value as the moezipfMean function.
moezipfMoments(k, alpha, beta, tolerance = 10^(-4))
k |
Order of the moment to compute. |
alpha |
Value of the $\alpha$ parameter. |
beta |
Value of the $\beta$ parameter. |
tolerance |
Tolerance used in the calculations (default = $10^{-4}$). |
The k-th moment is computed by calculating the partial sums of the series, and stopping when two consecutive partial sums differ by less than the tolerance value.
The value of the last partial sum is returned.
A positive real value corresponding to the k-th moment of the distribution.
moezipfMoments(3, 4.5, 1.3)
moezipfMoments(3, 4.5, 1.3, 1*10^(-3))
Computes the variance of the MOEZipf distribution for given values of $\alpha$ and $\beta$.
moezipfVariance(alpha, beta, tolerance = 10^(-4))
alpha |
Value of the $\alpha$ parameter. |
beta |
Value of the $\beta$ parameter. |
tolerance |
Tolerance used in the calculations (default = $10^{-4}$). |
The variance of the distribution only exists for $\alpha$ strictly greater than 3.
A positive real value corresponding to the variance of the distribution.
moezipfVariance(3.5, 1.3)
Probability mass function for the zero-inflated Zipf-PSS distribution with parameters $\alpha$, $\lambda$ and $w$.
The support of the zero-inflated Zipf-PSS distribution is the set of non-negative integers (the positive integers together with zero).
d_zi_zipfpss(x, alpha, lambda, w, log = FALSE)
x |
Vector of positive integer values. |
alpha |
Value of the $\alpha$ parameter. |
lambda |
Value of the $\lambda$ parameter. |
w |
Value of the $w$ parameter. |
log |
Logical; if TRUE, probabilities p are given as log(p). |
The support of the $\lambda$ parameter increases when the distribution is truncated at zero, becoming $\lambda \geq 0$. It has been proved that when $\lambda = 0$ one has the degenerate version of the distribution at one.
Panjer, H. H. (1981). Recursive evaluation of a family of compound distributions. ASTIN Bulletin: The Journal of the IAA, 12(1), 22-26.
Sundt, B., & Jewell, W. S. (1981). Further results on recursive evaluation of compound distributions. ASTIN Bulletin: The Journal of the IAA, 12(1), 27-39.
For a given sample of strictly positive integer numbers, usually of the type of ranking data or frequencies of frequencies data, estimates the parameters of the zero inflated Zipf-PSS distribution by means of the maximum likelihood method. The input data should be provided as a frequency matrix.
zi_zipfpssFit(data, init_alpha = 1.5, init_lambda = 1.5, init_w = 0.1, level = 0.95, ...)
## S3 method for class 'zi_zipfpssR'
residuals(object, ...)
## S3 method for class 'zi_zipfpssR'
fitted(object, ...)
## S3 method for class 'zi_zipfpssR'
coef(object, ...)
## S3 method for class 'zi_zipfpssR'
plot(x, ...)
## S3 method for class 'zi_zipfpssR'
print(x, ...)
## S3 method for class 'zi_zipfpssR'
summary(object, ...)
## S3 method for class 'zi_zipfpssR'
logLik(object, ...)
## S3 method for class 'zi_zipfpssR'
AIC(object, ...)
## S3 method for class 'zi_zipfpssR'
BIC(object, ...)
data |
Matrix of count data in form of table of frequencies. |
init_alpha |
Initial value of the $\alpha$ parameter. |
init_lambda |
Initial value of the $\lambda$ parameter. |
init_w |
Initial value of the $w$ parameter. |
level |
Confidence level used to calculate the confidence intervals (default 0.95). |
... |
Further arguments to the generic functions. Extra arguments are passed to the optim function. |
object |
An object of class "zi_zipfpssR" (output of the zi_zipfpssFit function). |
x |
An object of class "zi_zipfpssR" (output of the zi_zipfpssFit function). |
The argument data is a two-column matrix with the first column containing the observations and the second column containing their frequencies.
Panjer, H. H. (1981). Recursive evaluation of a family of compound distributions. ASTIN Bulletin: The Journal of the IAA, 12(1), 22-26.
Sundt, B., & Jewell, W. S. (1981). Further results on recursive evaluation of compound distributions. ASTIN Bulletin: The Journal of the IAA, 12(1), 27-39.
data <- rzipfpss(100, 2.5, 1.3)
data <- as.data.frame(table(data))
data[,1] <- as.numeric(as.character(data[,1]))
data[,2] <- as.numeric(as.character(data[,2]))
obj <- zipfpssFit(data, init_alpha = 1.5, init_lambda = 1.5)
Probability mass function, cumulative distribution function, quantile function and random number generation for the Zipf-PE distribution with parameters $\alpha$ and $\beta$. The support of the Zipf-PE distribution is the set of strictly positive integers.
dzipfpe(x, alpha, beta, log = FALSE)
pzipfpe(q, alpha, beta, log.p = FALSE, lower.tail = TRUE)
qzipfpe(p, alpha, beta, log.p = FALSE, lower.tail = TRUE)
rzipfpe(n, alpha, beta)
x , q
|
Vector of positive integer values. |
alpha |
Value of the $\alpha$ parameter. |
beta |
Value of the $\beta$ parameter. |
log, log.p |
Logical; if TRUE, probabilities p are given as log(p). |
lower.tail |
Logical; if TRUE (default), probabilities are $P[X \leq x]$, otherwise, $P[X > x]$. |
p |
Vector of probabilities. |
n |
Number of random values to return. |
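The probability mass function of the Zipf-PE distribution with parameters $\alpha$ and $\beta$ at a positive integer value $x$ is computed as follows:
$$p(x \mid \alpha, \beta) = \frac{e^{\beta \left(1 - \frac{\zeta(\alpha, x)}{\zeta(\alpha)}\right)} \left(e^{\beta \frac{x^{-\alpha}}{\zeta(\alpha)}} - 1\right)}{e^{\beta} - 1}, \quad x = 1, 2, \ldots,$$
where $\zeta(\alpha)$ is the Riemann zeta function at $\alpha$, and $\zeta(\alpha, x)$ is the Hurwitz zeta function with arguments $\alpha$ and x.
The cumulative distribution function at a given positive integer value $x$, $F(x)$, is equal to:
$$F(x) = \frac{e^{\beta \left(1 - \frac{\zeta(\alpha, x + 1)}{\zeta(\alpha)}\right)} - 1}{e^{\beta} - 1}$$
The quantile of the Zipf-PE distribution of a given probability value p is equal to the quantile of the Zipf distribution at the value:
$$p' = \frac{\log\left(p\,(e^{\beta} - 1) + 1\right)}{\beta}$$
The quantiles of the Zipf distribution are computed by means of the tolerance package.
To generate random data from a Zipf-PE distribution, one applies the quantile function to n values randomly generated from a Uniform distribution on the interval (0, 1).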
dzipfpe gives the probability mass function, pzipfpe gives the cumulative distribution function, qzipfpe gives the quantile function, and rzipfpe generates random values from a Zipf-PE distribution.
Young, D. S. (2010). Tolerance: an R package for estimating tolerance intervals. Journal of Statistical Software, 36(5), 1-39.
dzipfpe(1:10, 2.5, -1.5)
pzipfpe(1:10, 2.5, -1.5)
qzipfpe(0.56, 2.5, 1.3)
rzipfpe(10, 2.5, 1.3)
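The relation between a Zipf-PE quantile and a Zipf quantile can be checked in base R. The sketch assumes that the Zipf-PE cdf is obtained from the Zipf cdf F as (exp(beta F) - 1) / (exp(beta) - 1), and verifies that p' = log(p (exp(beta) - 1) + 1) / beta inverts that map for a positive and a negative beta. The function names are hypothetical.

```r
# Assumed Zipf-PE transform of a baseline (Zipf) cdf value, for beta != 0.
pe_cdf <- function(cdf_value, beta) {
  expm1(beta * cdf_value) / expm1(beta)
}

# Probability at which the baseline (Zipf) quantile function is evaluated.
pe_p_prime <- function(p, beta) {
  log1p(p * expm1(beta)) / beta
}

p <- c(0.1, 0.56, 0.9)

# Mapping p to p' and back through the transformed cdf should recover p.
round_trip_pos <- pe_cdf(pe_p_prime(p, 1.3), 1.3)
round_trip_neg <- pe_cdf(pe_p_prime(p, -1.5), -1.5)
```

expm1 and log1p are used instead of exp(x) - 1 and log(1 + x) to limit rounding error when beta is close to zero.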
For a given sample of strictly positive integer values, usually of the type of ranking data or frequencies of frequencies data, estimates the parameters of the Zipf-PE distribution by means of the maximum likelihood method. The input data should be provided as a frequency matrix.
zipfpeFit(data, init_alpha = NULL, init_beta = NULL, level = 0.95, ...)
## S3 method for class 'zipfpeR'
residuals(object, ...)
## S3 method for class 'zipfpeR'
fitted(object, ...)
## S3 method for class 'zipfpeR'
coef(object, ...)
## S3 method for class 'zipfpeR'
plot(x, ...)
## S3 method for class 'zipfpeR'
print(x, ...)
## S3 method for class 'zipfpeR'
summary(object, ...)
## S3 method for class 'zipfpeR'
logLik(object, ...)
## S3 method for class 'zipfpeR'
AIC(object, ...)
## S3 method for class 'zipfpeR'
BIC(object, ...)
data |
Matrix of count data in form of table of frequencies. |
init_alpha |
Initial value of the $\alpha$ parameter. |
init_beta |
Initial value of the $\beta$ parameter. |
level |
Confidence level used to calculate the confidence intervals (default 0.95). |
... |
Further arguments to the generic functions. Extra arguments are passed to the optim function. |
object |
An object of class "zipfpeR" (output of the zipfpeFit function). |
x |
An object of class "zipfpeR" (output of the zipfpeFit function). |
The argument data is a two-column matrix with the first column containing the observations and the second column containing their frequencies.
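The log-likelihood function is equal to:
$$l(\alpha, \beta; x) = \beta \left(N - \zeta(\alpha)^{-1} \sum_{i=1}^{m} f_a(x_i)\, \zeta(\alpha, x_i)\right) + \sum_{i=1}^{m} f_a(x_i) \log\left(\frac{e^{\beta \frac{x_i^{-\alpha}}{\zeta(\alpha)}} - 1}{e^{\beta} - 1}\right),$$
where $f_a(x_i)$ is the absolute frequency of $x_i$, $m$ is the number of different values in the sample and $N$ is the sample size, i.e. $N = \sum_{i=1}^{m} f_a(x_i)$.
By default the initial values of the parameters are computed using the function getInitialValues.
The function optim is used to estimate the parameters.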
Returns an object containing the maximum likelihood parameter estimates together with their standard deviations and confidence intervals. It also contains the value of the log-likelihood at the maximum likelihood estimator.
data <- rzipfpe(100, 2.5, 1.3)
data <- as.data.frame(table(data))
data[,1] <- as.numeric(as.character(data[,1]))
data[,2] <- as.numeric(as.character(data[,2]))
initValues <- getInitialValues(data, model='zipfpe')
obj <- zipfpeFit(data, init_alpha = initValues$init_alpha, init_beta = initValues$init_beta)
Computes the expected value of the Zipf-PE distribution for given values of parameters $\alpha$ and $\beta$.
zipfpeMean(alpha, beta, tolerance = 10^(-4))
alpha |
Value of the $\alpha$ parameter. |
beta |
Value of the $\beta$ parameter. |
tolerance |
Tolerance used in the calculations (default = $10^{-4}$). |
The mean of the distribution only exists for $\alpha$ strictly greater than 2.
It is computed by calculating the partial sums of the series, and stopping when two consecutive partial sums differ by less than the tolerance value.
The value of the last partial sum is returned.
A positive real value corresponding to the mean value of the Zipf-PE distribution.
zipfpeMean(2.5, 1.3)
zipfpeMean(2.5, 1.3, 10^(-3))
General function to compute the k-th moment of the Zipf-PE distribution for any integer value $k \geq 1$, when it exists. The k-th moment exists if and only if $\alpha > k + 1$.
For k = 1, this function returns the same value as the zipfpeMean function.
zipfpeMoments(k, alpha, beta, tolerance = 10^(-4))
k |
Order of the moment to compute. |
alpha |
Value of the $\alpha$ parameter. |
beta |
Value of the $\beta$ parameter. |
tolerance |
Tolerance used in the calculations (default = $10^{-4}$). |
The k-th moment of the Zipf-PE distribution is finite for $\alpha$ values strictly greater than $k + 1$.
It is computed by calculating the partial sums of the series, and stopping when two consecutive partial sums differ by less than the tolerance value.
The value of the last partial sum is returned.
A positive real value corresponding to the k-th moment of the distribution.
zipfpeMoments(3, 4.5, 1.3)
zipfpeMoments(3, 4.5, 1.3, 1*10^(-3))
Computes the variance of the Zipf-PE distribution for given values of $\alpha$ and $\beta$.
zipfpeVariance(alpha, beta, tolerance = 10^(-4))
alpha |
Value of the $\alpha$ parameter. |
beta |
Value of the $\beta$ parameter. |
tolerance |
Tolerance used in the calculations (default = $10^{-4}$). |
The variance of the distribution only exists for $\alpha$ strictly greater than 3.
A positive real value corresponding to the variance of the distribution.
zipfpeVariance(3.5, 1.3)
Probability mass function of the Zipf-Polylog distribution with parameters $\alpha$ and $\beta$.
The support of the Zipf-Polylog distribution is the set of strictly positive integers.
dzipfpolylog(x, alpha, beta, log = FALSE, nSum = 1000)
pzipfpolylog(x, alpha, beta, log.p = FALSE, lower.tail = TRUE, nSum = 1000)
qzipfpolylog(p, alpha, beta, log.p = FALSE, lower.tail = TRUE, nSum = 1000)
rzipfpolylog(n, alpha, beta, nSum = 1000)
x |
Vector of positive integer values. |
alpha |
Value of the $\alpha$ parameter. |
beta |
Value of the $\beta$ parameter. |
log, log.p |
Logical; if TRUE, probabilities p are given as log(p). |
nSum |
The number of terms used for computing the Polylogarithm function (default = 1000). |
lower.tail |
Logical; if TRUE (default), probabilities are $P[X \leq x]$, otherwise, $P[X > x]$. |
p |
Vector of probabilities. |
n |
Number of random values to return. |
The probability mass function at a positive integer value $x$ of the Zipf-Polylog distribution with parameters $\alpha$ and $\beta$ is computed as follows:
dzipfpolylog gives the probability mass function.
dzipfpolylog(1:10, 1.61, 0.98)
pzipfpolylog(1:10, 1.61, 0.98)
qzipfpolylog(0.8, 1.61, 0.98)
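The role of nSum can be illustrated with a base R sketch. It assumes the parametrization p(x) = beta^x x^(-alpha) / Li_alpha(beta) with 0 < beta < 1, where Li_alpha is the polylogarithm; this form is an assumption and is not stated in this text, so the package may use a different one. The function names polylog_sketch and dzipfpolylog_sketch are hypothetical.

```r
# Truncated polylogarithm: Li_alpha(beta) ~ sum_{k = 1}^{nSum} beta^k / k^alpha.
polylog_sketch <- function(alpha, beta, nSum = 1000) {
  k <- seq_len(nSum)
  sum(beta^k / k^alpha)
}

# Assumed pmf: p(x) = beta^x * x^(-alpha) / Li_alpha(beta), for 0 < beta < 1.
dzipfpolylog_sketch <- function(x, alpha, beta, nSum = 1000) {
  beta^x * x^(-alpha) / polylog_sketch(alpha, beta, nSum)
}

# Probabilities over the first 1000 support points.
probs <- dzipfpolylog_sketch(1:1000, alpha = 1.61, beta = 0.98)
total <- sum(probs)
```

Because the normalizing constant is truncated at nSum terms, the probabilities over 1, ..., nSum add up to one by construction. Values of beta very close to one make the terms decay slowly and may call for a larger nSum.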
For a given sample of strictly positive integer numbers, usually of the type of ranking data or frequencies of frequencies data, estimates the parameters of the Zipf-Polylog distribution by means of the maximum likelihood method. The input data should be provided as a frequency matrix.
zipfPolylogFit(data, init_alpha, init_beta, level = 0.95, ...)
## S3 method for class 'zipfPolyR'
residuals(object, ...)
## S3 method for class 'zipfPolyR'
fitted(object, ...)
## S3 method for class 'zipfPolyR'
coef(object, ...)
## S3 method for class 'zipfPolyR'
plot(x, ...)
## S3 method for class 'zipfPolyR'
print(x, ...)
## S3 method for class 'zipfPolyR'
summary(object, ...)
## S3 method for class 'zipfPolyR'
logLik(object, ...)
## S3 method for class 'zipfPolyR'
AIC(object, ...)
## S3 method for class 'zipfPolyR'
BIC(object, ...)
data |
Matrix of count data in form of a table of frequencies. |
init_alpha |
Initial value of the $\alpha$ parameter. |
init_beta |
Initial value of the $\beta$ parameter. |
level |
Confidence level used to calculate the confidence intervals (default 0.95). |
... |
Further arguments to the generic functions. Extra arguments are passed to the optim function. |
object |
An object from class "zipfPolyR" (output of zipfPolylogFit function). |
x |
An object from class "zipfPolyR" (output of zipfPolylogFit function). |
The argument data is a two-column matrix with the first column containing the observations and the second column containing their frequencies.
The log-likelihood function is equal to:
The function optim is used to estimate the parameters.
Returns a zipfPolyR object containing the maximum likelihood parameter estimates together with their standard deviations and confidence intervals. It also contains the value of the log-likelihood at the maximum likelihood estimator.
Computes the expected value of the Zipf-Polylog distribution for given values of parameters $\alpha$ and $\beta$.
zipfpolylogMean(alpha, beta, tolerance = 10^(-4))
alpha |
Value of the $\alpha$ parameter. |
beta |
Value of the $\beta$ parameter. |
tolerance |
Tolerance used in the calculations (default = $10^{-4}$). |
A positive real value corresponding to the mean value of the Zipf-Polylog distribution.
zipfpolylogMean(0.5, 0.8)
zipfpolylogMean(2.5, 0.8, 10^(-3))
General function to compute the k-th moment of the Zipf-Polylog distribution for any integer value $k \geq 1$, when it exists.
For k = 1, this function returns the same value as the zipfpolylogMean function.
zipfpolylogMoments(k, alpha, beta, tolerance = 10^(-4), nSum = 1000)
k |
Order of the moment to compute. |
alpha |
Value of the $\alpha$ parameter. |
beta |
Value of the $\beta$ parameter. |
tolerance |
Tolerance used in the calculations (default = $10^{-4}$). |
nSum |
The number of terms used for computing the Polylogarithm function (default = 1000). |
The k-th moment of the Zipf-Polylog distribution is always finite, except in the limiting case where the distribution reduces to the Zipf model, for which the k-th moment is finite only for $\alpha > k + 1$.
It is computed by calculating the partial sums of the series, and stopping when two consecutive partial sums differ by less than the tolerance value.
The value of the last partial sum is returned.
A positive real value corresponding to the k-th moment of the distribution.
zipfpolylogMoments(1, 0.2, 0.90)
zipfpolylogMoments(3, 4.5, 0.90, 1*10^(-3))
Computes the variance of the Zipf-Polylog distribution for given values of $\alpha$ and $\beta$.
zipfpoylogVariance(alpha, beta, tolerance = 10^(-4))
alpha |
Value of the $\alpha$ parameter. |
beta |
Value of the $\beta$ parameter. |
tolerance |
Tolerance used in the calculations (default = $10^{-4}$). |
The variance of the distribution only exists for strictly greater than 3.
A positive real value corresponding to the variance of the distribution.
See also: zipfpolylogMoments, zipfpolylogMean.
zipfpoylogVariance(0.5, 0.75)
Probability mass function, cumulative distribution function, quantile function and random number generation for the Zipf-PSS distribution with parameters $\alpha$ and $\lambda$. The support of the Zipf-PSS distribution is the set of non-negative integers (the positive integers together with zero). To work with its zero-truncated version, the parameter isTruncated should be set to TRUE.
dzipfpss(x, alpha, lambda, log = FALSE, isTruncated = FALSE)
pzipfpss(q, alpha, lambda, log.p = FALSE, lower.tail = TRUE, isTruncated = FALSE)
rzipfpss(n, alpha, lambda, log.p = FALSE, lower.tail = TRUE, isTruncated = FALSE)
qzipfpss(p, alpha, lambda, log.p = FALSE, lower.tail = TRUE, isTruncated = FALSE)
x , q
|
Vector of positive integer values. |
alpha |
Value of the $\alpha$ parameter. |
lambda |
Value of the $\lambda$ parameter. |
log, log.p |
Logical; if TRUE, probabilities p are given as log(p). |
isTruncated |
Logical; if TRUE, the zero-truncated version of the distribution is returned. |
lower.tail |
Logical; if TRUE (default), probabilities are $P[X \leq x]$, otherwise, $P[X > x]$. |
n |
Number of random values to return. |
p |
Vector of probabilities. |
The support of the $\lambda$ parameter increases when the distribution is truncated at zero, becoming $\lambda \geq 0$. It has been proved that when $\lambda = 0$ one has the degenerate version of the distribution at one.
Panjer, H. H. (1981). Recursive evaluation of a family of compound distributions. ASTIN Bulletin: The Journal of the IAA, 12(1), 22-26.
Sundt, B., & Jewell, W. S. (1981). Further results on recursive evaluation of compound distributions. ASTIN Bulletin: The Journal of the IAA, 12(1), 27-39.
For a given sample of strictly positive integer numbers, usually of the type of ranking data or frequencies of frequencies data, estimates the parameters of the Zipf-PSS distribution by means of the maximum likelihood method. The input data should be provided as a frequency matrix.
zipfpssFit(data, init_alpha = NULL, init_lambda = NULL, level = 0.95, isTruncated = FALSE, ...)
## S3 method for class 'zipfpssR'
residuals(object, isTruncated = FALSE, ...)
## S3 method for class 'zipfpssR'
fitted(object, isTruncated = FALSE, ...)
## S3 method for class 'zipfpssR'
coef(object, ...)
## S3 method for class 'zipfpssR'
plot(x, isTruncated = FALSE, ...)
## S3 method for class 'zipfpssR'
print(x, ...)
## S3 method for class 'zipfpssR'
summary(object, isTruncated = FALSE, ...)
## S3 method for class 'zipfpssR'
logLik(object, ...)
## S3 method for class 'zipfpssR'
AIC(object, ...)
## S3 method for class 'zipfpssR'
BIC(object, ...)
data |
Matrix of count data in form of table of frequencies. |
init_alpha |
Initial value of |
init_lambda |
Initial value of |
level |
Confidence level used to calculate the confidence intervals (default 0.95). |
isTruncated |
Logical; if TRUE, the zero-truncated version of the distribution is used (default = FALSE). |
... |
Further arguments to the generic functions. The extra arguments are passed to the optim function. |
object |
An object of class "zipfpssR" (output of the zipfpssFit function). |
x |
An object of class "zipfpssR" (output of the zipfpssFit function). |
The argument data is a two-column matrix with the first column containing the observations and the second column containing their frequencies.
The log-likelihood function is equal to:
$$l(\alpha, \lambda; x) = \sum_{i=1}^{m} f_a(x_i)\, \log P(Y = x_i),$$
where $m$ is the number of different values in the sample and $f_a(x_i)$ is the absolute frequency of $x_i$. The probabilities are calculated by applying the Panjer recursion.
By default, the initial values of the parameters are computed using the function getInitialValues. The function optim is used to estimate the parameters.
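The evaluation described above can be sketched in base R as follows. This is a minimal illustration under stated assumptions, not the package's implementation: `zeta_approx`, `zipfpss_probs` and `loglik_zipfpss` are hypothetical helper names, the zeta function is approximated by a truncated sum with an Euler-Maclaurin tail correction, the `toy` frequency matrix is made up, and the `optim` method and bounds are chosen here only to keep the search inside the valid parameter region.

```r
# Approximate the Riemann zeta function for s > 1 (illustrative helper:
# truncated sum plus the first Euler-Maclaurin tail-correction terms).
zeta_approx <- function(s, K = 1000) {
  k <- seq_len(K - 1)
  sum(k^(-s)) + K^(1 - s) / (s - 1) + 0.5 * K^(-s) + s * K^(-s - 1) / 12
}

# P(Y = 0), ..., P(Y = y_max) for a Poisson(lambda) stopped sum of
# Zipf(alpha) terms, computed with the Panjer recursion.
zipfpss_probs <- function(y_max, alpha, lambda) {
  j <- seq_len(y_max)
  fx <- j^(-alpha) / zeta_approx(alpha)  # Zipf probabilities on 1..y_max
  p <- numeric(y_max + 1)                # p[y + 1] stores P(Y = y)
  p[1] <- exp(-lambda)                   # Y = 0 only when N = 0
  for (y in j) {
    i <- seq_len(y)
    p[y + 1] <- (lambda / y) * sum(i * fx[i] * p[y - i + 1])
  }
  p
}

# Log-likelihood for a two-column frequency matrix (value, frequency).
loglik_zipfpss <- function(data, alpha, lambda) {
  x <- data[, 1]
  fa <- data[, 2]
  p <- zipfpss_probs(max(x), alpha, lambda)
  sum(fa * log(p[x + 1]))
}

# Made-up frequency table, used only to exercise the functions.
toy <- cbind(c(0, 1, 2, 3, 5), c(30, 35, 18, 10, 7))
loglik_zipfpss(toy, alpha = 2.5, lambda = 1.3)

# Maximise the log-likelihood with optim (method and bounds are assumptions).
fit <- optim(c(2.5, 1.3),
             function(par) -loglik_zipfpss(toy, par[1], par[2]),
             method = "L-BFGS-B", lower = c(1.01, 0.01), upper = c(20, 20))
fit$par
```

The recursion costs a number of operations quadratic in the largest observed value, which is why the probabilities are computed once per parameter pair and then indexed by the observed values.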
Returns an object of class zipfpssR containing the maximum likelihood estimates of the parameters, together with their standard deviations and confidence intervals, and the value of the log-likelihood at the maximum likelihood estimate.
Panjer, H. H. (1981). Recursive evaluation of a family of compound distributions. ASTIN Bulletin: The Journal of the IAA, 12(1), 22-26.
Sundt, B., & Jewell, W. S. (1981). Further results on recursive evaluation of compound distributions. ASTIN Bulletin: The Journal of the IAA, 12(1), 27-39.
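# Simulate a sample and build its frequency table
data <- rzipfpss(100, 2.5, 1.3)
data <- as.data.frame(table(data))
# table() stores the observed values as a factor; convert both columns to numeric
data[,1] <- as.numeric(as.character(data[,1]))
data[,2] <- as.numeric(as.character(data[,2]))
# Compute initial values and fit the model
initValues <- getInitialValues(data, model='zipfpss')
obj <- zipfpssFit(data, init_alpha = initValues$init_alpha, init_lambda = initValues$init_lambda)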
Computes the expected value of the Zipf-PSS distribution for given values of the parameters $\alpha$ and $\lambda$.
zipfpssMean(alpha, lambda, isTruncated = FALSE)
alpha |
Value of the $\alpha$ parameter. |
lambda |
Value of the $\lambda$ parameter. |
isTruncated |
Logical; if TRUE, the zero-truncated version of the distribution is used to calculate the expected value (default = FALSE). |
The expected value of the Zipf-PSS distribution only exists for $\alpha$ values strictly greater than 2. The value is obtained from the law of total expectation, which states that:
$$E[Y] = E[N]\, E[X],$$
where $E[X]$ is the mean of the Zipf distribution and $E[N]$ is the expected value of a Poisson one. It follows that:
$$E[Y] = \lambda\, \frac{\zeta(\alpha - 1)}{\zeta(\alpha)}.$$
In particular, for the zero-truncated version of the Zipf-PSS distribution, this value is computed as:
$$E[Y^{ZT}] = \frac{\lambda\, \zeta(\alpha - 1)}{\zeta(\alpha)\, (1 - e^{-\lambda})}.$$
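A minimal base R sketch of this computation is given below. It is not the package's code: `zeta_approx` and `zipfpss_mean_sketch` are hypothetical names, and the zeta function is approximated by a truncated sum with an Euler-Maclaurin tail correction because base R does not provide one. The zero-truncated case relies on the fact that the sum is zero only when the Poisson count is zero, which has probability $e^{-\lambda}$.

```r
# Approximate the Riemann zeta function for s > 1 (illustrative helper).
zeta_approx <- function(s, K = 1000) {
  k <- seq_len(K - 1)
  sum(k^(-s)) + K^(1 - s) / (s - 1) + 0.5 * K^(-s) + s * K^(-s - 1) / 12
}

# Mean of the Zipf-PSS distribution: E[N] * E[X], optionally rescaled
# by 1 / P(Y > 0) for the zero-truncated version.
zipfpss_mean_sketch <- function(alpha, lambda, isTruncated = FALSE) {
  stopifnot(alpha > 2, lambda > 0)  # the mean is finite only for alpha > 2
  m <- lambda * zeta_approx(alpha - 1) / zeta_approx(alpha)
  if (isTruncated) m / (1 - exp(-lambda)) else m
}

zipfpss_mean_sketch(2.5, 1.3)
zipfpss_mean_sketch(2.5, 1.3, TRUE)
```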
A positive real value corresponding to the mean value of the distribution.
Sarabia Alegría, J. M., Gómez Déniz, E., & Vázquez Polo, F. (2007). Estadística actuarial: teoría y aplicaciones. Pearson Prentice Hall.
zipfpssMean(2.5, 1.3)
zipfpssMean(2.5, 1.3, TRUE)
General function to compute the k-th moment of the Zipf-PSS distribution for any integer value $k \geq 1$, when it exists. The k-th moment exists if and only if $\alpha > k + 1$.
zipfpssMoments(k, alpha, lambda, isTruncated = FALSE, tolerance = 10^(-4))
k |
Order of the moment to compute. |
alpha |
Value of the $\alpha$ parameter. |
lambda |
Value of the $\lambda$ parameter. |
isTruncated |
Logical; if TRUE, the zero-truncated version of the distribution is used (default = FALSE). |
tolerance |
Tolerance used in the calculations (default = $10^{-4}$). |
The k-th moment of the Zipf-PSS distribution is finite for $\alpha$ values strictly greater than $k + 1$. It is computed by calculating the partial sums of the series and stopping when two consecutive partial sums differ by less than the tolerance value. The value of the last partial sum is returned.
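The stopping rule above can be sketched in base R as follows. This is an illustration under stated assumptions rather than the package's code: `zeta_approx` and `zipfpss_moment_sketch` are hypothetical names, the probabilities are generated on the fly with the Panjer recursion, the `max_terms` cap is an added safeguard, and the zero-truncated case is handled here by rescaling the final sum by $1 / (1 - e^{-\lambda})$.

```r
# Approximate the Riemann zeta function for s > 1 (illustrative helper).
zeta_approx <- function(s, K = 1000) {
  k <- seq_len(K - 1)
  sum(k^(-s)) + K^(1 - s) / (s - 1) + 0.5 * K^(-s) + s * K^(-s - 1) / 12
}

# k-th moment as a partial sum of y^k * P(Y = y), stopping once the last
# term added (the gap between consecutive partial sums) is below tolerance.
zipfpss_moment_sketch <- function(k, alpha, lambda, isTruncated = FALSE,
                                  tolerance = 1e-4, max_terms = 5000) {
  stopifnot(alpha > k + 1, lambda > 0)
  zeta_a <- zeta_approx(alpha)
  p <- exp(-lambda)  # p[y + 1] stores P(Y = y); start with P(Y = 0)
  total <- 0
  for (y in seq_len(max_terms)) {
    i <- seq_len(y)
    # Panjer step: i * f_X(i) = i^(1 - alpha) / zeta(alpha)
    p[y + 1] <- (lambda / y) * sum(i^(1 - alpha) / zeta_a * p[y - i + 1])
    term <- y^k * p[y + 1]
    total <- total + term
    if (term < tolerance) break
  }
  if (isTruncated) total / (1 - exp(-lambda)) else total
}

zipfpss_moment_sketch(1, 2.5, 2.3)
zipfpss_moment_sketch(1, 2.5, 2.3, TRUE)
```

Note that the rule bounds the last term added, not the remaining tail of the series, so for values of $\alpha$ close to $k + 1$ the returned partial sum can lie noticeably below the exact moment.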
A positive real value corresponding to the k-th moment of the distribution.
zipfpssMoments(1, 2.5, 2.3)
zipfpssMoments(1, 2.5, 2.3, TRUE)
Computes the variance of the Zipf-PSS distribution for given values of the parameters $\alpha$ and $\lambda$.
zipfpssVariance(alpha, lambda, isTruncated = FALSE)
alpha |
Value of the $\alpha$ parameter. |
lambda |
Value of the $\lambda$ parameter. |
isTruncated |
Logical; if TRUE, the zero-truncated version of the distribution is used to calculate the variance (default = FALSE). |
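The variance of the Zipf-PSS distribution only exists for $\alpha$ values strictly greater than 3. The value is obtained from the law of total variance, which states that:
$$Var[Y] = E[N]\, Var[X] + E[X]^2\, Var[N],$$
where $X$ follows a Zipf distribution with parameter $\alpha$, and $N$ follows a Poisson distribution with parameter $\lambda$. It follows that:
$$Var[Y] = \lambda\, \frac{\zeta(\alpha - 2)}{\zeta(\alpha)}.$$
In particular, for the zero-truncated version of the Zipf-PSS distribution, this value is computed as:
$$Var[Y^{ZT}] = \frac{\lambda\, \zeta(\alpha)\, \zeta(\alpha - 2)\, (1 - e^{-\lambda}) - \lambda^2\, \zeta(\alpha - 1)^2\, e^{-\lambda}}{\zeta(\alpha)^2\, (1 - e^{-\lambda})^2}.$$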
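The intermediate steps behind these expressions can be written out as follows. This is a sketch under the assumptions already stated ($X$ Zipf with parameter $\alpha > 3$ on the positive integers, $N$ Poisson with parameter $\lambda$), writing $Y^{ZT}$ for the zero-truncated variable; it uses only the Zipf moments $E[X^k] = \zeta(\alpha - k) / \zeta(\alpha)$ and the fact that $Y = 0$ exactly when $N = 0$.

```latex
\begin{align*}
E[N] &= Var[N] = \lambda, \qquad
E[X^k] = \frac{\zeta(\alpha - k)}{\zeta(\alpha)}, \\
Var[Y] &= E[N]\, Var[X] + E[X]^2\, Var[N]
        = \lambda \left( E[X^2] - E[X]^2 \right) + \lambda\, E[X]^2
        = \lambda\, \frac{\zeta(\alpha - 2)}{\zeta(\alpha)}, \\
P(Y = 0) &= P(N = 0) = e^{-\lambda}, \qquad
E\!\left[ (Y^{ZT})^k \right] = \frac{E[Y^k]}{1 - e^{-\lambda}}, \quad k = 1, 2, \\
Var[Y^{ZT}] &= \frac{Var[Y] + E[Y]^2}{1 - e^{-\lambda}}
             - \frac{E[Y]^2}{(1 - e^{-\lambda})^2}.
\end{align*}
```

Substituting $E[Y] = \lambda\, \zeta(\alpha - 1) / \zeta(\alpha)$ and $Var[Y] = \lambda\, \zeta(\alpha - 2) / \zeta(\alpha)$ into the last line and placing both terms over the common denominator $\zeta(\alpha)^2 (1 - e^{-\lambda})^2$ gives the variance of the zero-truncated distribution as a single fraction.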
A positive real value corresponding to the variance of the distribution.
Sarabia Alegría, J. M., Gómez Déniz, E., & Vázquez Polo, F. (2007). Estadística actuarial: teoría y aplicaciones. Pearson Prentice Hall.
zipfpssVariance(4.5, 2.3)
zipfpssVariance(4.5, 2.3, TRUE)