February 25, 2026
The authors explored new modeling for discrete paired data without normality assumptions. Skellam (1946) introduced the Skellam distribution as the difference of two Poisson random variables; Kozubowski and Inusah (2006) proposed the skew discrete Laplace (SDL) distribution based on the difference between two geometric variables. More recently, Conceição et al. (2021) proposed a zero-modified Skellam distribution and its associated regression model, also estimated via Bayesian methods. While this approach introduces flexibility in modeling the zero probability, the regression coefficients associated with this component cannot be directly interpreted in terms of the mean.
Even though it has nice properties, the SDL distribution has received little attention in a regression context. One of its major limitations is that its mode is always zero, regardless of the parameter values, restricting its flexibility. In this paper, the authors proposed a simple generalization of the SDL distribution—referred to as the modified SDL distribution—whose mode is not constrained to zero. They also introduced a new regression model for integer-valued and paired discrete data, along with a parameterization based on the mean and dispersion parameters, enhancing the interpretability of the model. The authors listed the main advantages of the proposed regression framework which include in their own words:
- 1.Flexibility: The modified SDL distribution accommodates a wide range of distributional shapes, including left- and right-skewed distributions, as well as symmetric ones.
- 2.Double regression structure: We incorporate covariates in both the mean and dispersion parameters, enabling the modeling of heteroskedasticity in a way similar to generalized linear models with dispersion covariates.
- 3.Interpretability: The parameterization in terms of mean and dispersion allows for straightforward interpretation of the regression coefficients.
- 4.Inference: Estimation is performed under the likelihood framework. Closed-form expressions for the Fisher information matrix are provided, facilitating the computation of the asymptotic covariance matrix of the estimators.
Their proposed model was implemented in the R package sdlrm, available on the Comprehensive R Archive Network at https://CRAN.R-project.org/package=sdlrm. The package includes functions for estimation, inference, and diagnostic analysis, following a structure similar to that of popular regression packages.
The authors state that when fitting models based on discrete distributions, comparing the observed and expected frequencies of the response variable is one of the first ways to check the distance between the fitted model and the observed data. Bourguignon and de Medeiros (2022) introduced a pseudo- coefficient for count models based on observed and expected frequencies as a measure of goodness-of-fit, which can be adapted for the modified SDL regression.
They discussed two types of residuals for checking the goodness-of-fit of the modified SDL regression model. They first discussed the Pearson residuals. Even though the Pearson residual distribution is not known, this residual is useful for detecting outliers, and the goodness-of-fit of the model can be verified through the construction of simulated confidence bands. The second type of residual they discussed is the randomized quantile residual proposed by Dunn and Smyth (1996).
In simulations, their method performed well in terms of bias, coefficient estimates, standard errors, root mean squared error. Also they checked the rejection rates of constant dispersion tests and they compared methods for selecting epsilon based on the sample model and the profile likelihood. Their simulations evaluated their method but didn’t compare it to another model. In their real data example, they did compare to a Poisson distribution and also compared their modified Skellam model to the Skellam model itself. The R squared they had modified for this purpose of showing goodness-of-fit showed to be considerably higher for their modified SDL model as compared to the Skellman model. Their real data analysis relied on nonparametric tests while their modified SDL regression expanded on mean and dispersion as a function of explanatory variables.
In summary their method allows modeling integer-valued data including paired samples of discrete observations. They hope their new methods will encourage broader use in integer-valued regression analysis. Further extensions are extending the model to the case where the response is defined as the difference between 2 random variables with a negative binomial distribution instead of the geometric probability model. Future studies could also show alternative estimation methods like Bayesian and EM Algorithm.
Written by,
Usha Govindarajulu
Keywords: Laplace regression, paired samples, SDL, integer-valued regression analysis
References:
Bourguignon, M., and R. M. de Medeiros. 2022. “A Simple and Useful Regression Model for Fitting Count Data.” Test 31: 790–827.
Conceição, K. S., A. K. Suzuki, and M. G. Andrade. 2021. “A Bayesian Approach for Zero-Modified Skellam Model with Hamiltonian MCMC.” Statistical Methods & Applications 32: 747–765.
De Medeiros RMR and Bourguignon M (2025) “Modified Skew Discrete Laplace Regression Models for Integer-Valued Data With Applications to Paired Samples” Biometrical Journal
https://doi.org/10.1002/bimj.70106
Dunn, P. K., and G. K. Smyth. 1996. “Randomized Quantile Residuals.” Journal of Computational and Graphical Statistics 5: 236–244.
Kozubowski, T. J., and S. Inusah. 2006. “A Skew Laplace Distribution on Integers.” Annals of the Institute of Statistical Mathematics 58: 555–571.
Skellam, J. 1946. “The Frequency Distribution of the Difference Between Two Poisson Variates Belonging to Different Populations.” Journal of the Royal Statistical Society 109: 296–296.
https://onlinelibrary.wiley.com/cms/asset/fbeaea10-dd59-4a19-a1f8-6501188c20fb/bimj70106-fig-0002-m.jpg