Research Article  Open Access
Simone Fiori, "Asymmetric Variate Generation via a Parameterless Dual Neural Learning Algorithm", Computational Intelligence and Neuroscience, vol. 2008, Article ID 426080, 8 pages, 2008. https://doi.org/10.1155/2008/426080
Asymmetric Variate Generation via a Parameterless Dual Neural Learning Algorithm
Abstract
In a previous work (S. Fiori, 2006), we proposed a random number generator based on a tunable nonlinear neural system, whose learning rule is designed on the basis of a cardinal equation from statistics and whose implementation is based on lookup tables (LUTs). The aim of the present manuscript is to improve the abovementioned random number generation method by changing the learning principle, while retaining the efficient LUTbased implementation. The new method proposed here proves easier to implement and relaxes some previous limitations.
1. Introduction
Random numbers are currently used for a variety of purposes such as: cryptographic keys generation, games, some classes of scientific experiments as well as โMonte Carloโ methods in physics and computer science [1โ6]. Standard programming environments are endowed with basic pseudorandom signal generators such as the uniform and the Gaussian ones, while usually the needed distributions are more involved than uniform/Gaussian. A simple example of application is password generation: a random password generator is a software that inputs from a random or pseudorandom number generator and automatically generates a password. An example of application where involved probability distributions are needed is in independent component analysis (ICA, [7]) testing: as the behavior of an ICA algorithm might depend on the statistical distribution of the sources, ICAalgorithm testing tools might require random sequences generators capable of producing random numbers distributed according to involved probability laws.
The principal methods known in the literature to obtain a batch of samples endowed with an arbitrary distribution from a samples batch having a simple distribution are the โtransformation methodโ and the โrejection methodโ [8]. In the present paper, we focus on the transformation method, which may be well implemented through a tunable neural system, because the availability of a random number source and of a tunable nonlinear system, along with a proper learning procedure, allows obtaining a wide class of pseudorandom signal generators.
A wellknown effect of nonlinear neural systems is to warp the statistical distribution of its input. In particular, we assume that the system under consideration has a nonlinear adaptive structure described by the transference , where denotes the system input random signal, having probability density function , and denotes the output signal, having probability density function , as shown in Figure 1. In the hypothesis that the neural system transference is strictly monotonic, namely , for all , the relationship between the input distribution, the output distribution, and the system transfer function is known to be [9] where denotes the inverse of function . Usually, (1) is interpreted as an analysis formula, which allows computing the output distribution when the input distribution and the system transference function are known. However, the cardinal equation (1) may also be interpreted as a formula that allows for designing the nonlinear system when the distribution is known and it is desired that the system responds according to a desired distribution . In fact, (1) may be rewritten as the differential equation: In general, such design operation is rather difficult, because (2) in the unknown involves the solution of a nonlinear differential equation, provided that a consistent boundary condition is specified.
In the recent contribution [10], we presented a pseudorandom samples generator based on a nonlinear monotonic neural system, whose transference function is denoted by , tuned on the basis of the differential equation (2). The cardinal design equation (2) was proposed to be solved via a (relaxationtype) fixedpoint algorithm. The key advantages of the method proposed in [10] are as follows. (a) In order to obtain a fullytunable neural transference function, a lookuptable representation was chosen. It guarantees high flexibility in the shape of the neural transference as well as easiness of representation and handling of the involved quantities. (b) The fixedpoint learning algorithm exhibits fast convergence over other possible methods such as the gradientbased one: unlike these methods, the fixedpoint learning algorithm does not require the computation of derivatives of the involved functions.
The resulting randomnumber generation method should be thus read as a twostage procedure. The first stage consists in solving the cardinal differential equation (2) in the unknown function , given the distributions and as data. The second stage consists in generating input random samples drawn from the distribution , then letting such random samples pass through the learnt nonlinear neural system by computing output values . The random samples are assured to be distributed according to the probability density function .
However, we recognized that the method presented in [10] also suffers from some drawbacks, namely the following. (a) For numerical convergence purpose, each step of the fixedpointtype tuning algorithm needed to be followed by some normalization steps. Namely, from (2), it is easily seen that when the function approaches , the computation of becomes illconditioned, therefore the quantity was replaced by , with being a small constant to be properly sized. Also, in order to refine learning, after each iteration step, the solution needed to be normalized either by affine scaling, in order to control the range of variable , or by linear scaling in order to match the true value of output distribution moment of preselected order. This, in turn, requires computing in advance the (closed form) moments of interest of the output distribution. (b) In spite of affine scaling, it was not easy to control the range of the output value , as affine scaling does not guarantee convergence in every case of interest, therefore it could not be employed in every case. (c) The developed procedure was customized to generate output distributions that are either symmetric (namely, ) or completely skewed to the right (namely, , for all ) only. Asymmetric or generalshape distributions were not considered.
In the present paper, we consider the problem of extending the previous method to the generation of asymmetric distributions by removing the constraint of symmetry or skewedness to the right. Also, we propose a way to avoid probability density function. The solution of choice implies a change in the viewpoint of cardinal equation (1): instead of converting formula (1) into the differential equation (2), we convert it into a new differential equation, hereafter referred to as dual cardinal equation, which will prove easier to solve and more flexible to use in practice, while retaining the previous numerical representation/advantages. Thus, we will retain the effective numerical representation of the involved quantity already introduced in the works [10, 11], based on the โlookup tableโ (LUT) implementation of neural activation functions as well as the efficient numerical algorithm to solve the dual cardinal equation. LUTs were proven to provide an efficient way of representing and handling the variables appearing within the devised random number generation algorithm. A prominent advantage of the procedure is the lack of hard computational requirements except for LUT handling, which consists of sorting/searching on lists of numbers and of few simple algebraic operations on numbers.
The effectiveness of the proposed approach will be evaluated through numerical experiments. In particular, the designed experiments followed a logical succession, beginning with a basic assessment of the proposed method when applied to biGaussian distribution, which is then followed by comparably more difficult distributions, namely a generalized Gaussian distribution and an asymmetric Gamma distribution.
The existing method presented in [3] is worth discussing. It concerns a neuralnetworkstype algorithm to generate random vectors with arbitrary marginal distributions and correlation matrix, based on NORTA method. The โnormaltoanythingโ (NORTA) method (see, e.g., [12]) is one of the most efficient methods for random vector generation. In [3], a technique was presented to generate the correlation matrix of normal random vectors based on an artificial neural networks approach. The NORTA algorithm works in the following way to generate random samples with prescribed probability density function. First, generate zeromean unitvariance random samples , . Then, generate the desired random samples as , where denotes the cumulative distribution function of a standard normal random variable and denotes the desired cumulative distribution function, with , . It appears, thus, as a transformation method.
Most of the methods of random vector generation known from the literature impose constraints on the size of the random vectors and many of them are applicable only for bivariate distributions whose components are equidistributed. Conversely, within the NORTA framework, marginal probability distributions for vector components as well as their correlation matrix may be specified. Obtaining the prescribed generated random vector correlation matrix requires solving an involved nonlinear system of equations, which is the most serious problem in this kind of approach. Paper [3] makes use of a multilayer perceptron neural network to estimate correlation matrices of normal random vectors, allowing thus to overcome the analytically involved equations of NORTA algorithm. While the method proposed here is more general than NORTA in the sense that it works for any kind of available generator (not only Gaussian), it is less general in the sense that it does not allow to generate multivariate random variables with prescribed joint statistics.
2. Dual Cardinal Equation AND Its Numerical Solution
The present section formalizes the learning problem at hand and illustrates a fixedpointbased numerical algorithm to solve the dual cardinal equation.
2.1. Dual cardinal equation and neural system
The key point of the new method consists in learning the inverse function instead of the function . As it will be clarified in the next sections, this choice simplifies the learning problem while adding slight computational burden to the usage of the learnt neural system as a generative model.
We denote by the inverse function of the actual neural transfer function and refer to the new neural system, having as transfer function, as the โdual neural systemโ (shown in Figure 1). The purpose here is to learn a dual neural system that warps into under the constraint , for all . We denote the interval of interest for the generated random variable as . With this hypothesis on the nonlinear dual neural transfer function, the cardinal equation (1) may be rewritten as which will be hereafter referred to as โdual cardinal equation.โ It is worth noting that the boundary condition is completely arbitrary. While there are no theoretical reasons to set the boundary condition in any specific way, the above choice is motivated by the observation that it simplifies the fixedpoint adapting algorithm with respect to the previous version proposed in [10].
In general, a closedform solution to (3) may not be realized, thus we should resort to an iterative learning algorithm to search for a solution. Formally, this means designing an algorithm that generates a succession of functions , , whose limit coincides to the solution of (3). A way to generate such a succession is to employ the fixedpoint algorithm: As a figureofconvergence of learning process, we consider the weighted difference of function between two successive iterations, namely, As initial guess, we assume , for all .
After learning an inverse function , the numerical procedure should calculate the actual nonlinear function by numerical inversion. As it will be clarified in the next section, within the framework proposed here, such operation involves a very little computational effort.
2.2. Numerical implementation of the learning procedure
From an implementation viewpoint, the algorithm (4) needs to be discretized in order to obtain a version suitable to be implemented on a computer.
We choose to represent function by a numerical vector: in practice, we suppose the interval of interest to be partitioned into discrete bins. This gives rise to the vectortype representation of the support of the output sequence probability density function, where contains values regularly spaced in with spacingwidth denoted as . Then may be represented by a numerical vector and the neural inputoutput transference is now represented by the discrete relationship , namely, a numerical lookup table. The entries of a vector may be denoted by an extra footer, that is, by , with . The interval relates to the integer and may be defined as .
In order to translate the learning rule (4) into a version suitable to numerical representation, we should consider the inherent limitations of numerical integration of differential equations. The following notes are worth taking into account. (a) Output support selection: the ultimate purpose of the random number generation method under construction is to generate random samples with desired probability distribution within a range of interest, namely, with values within an interval that is deemed suitable for the purposes that random samples generation is launched for. Therefore, the output range is to be freely selected according to the needs the random samples are to be generated for. Then, the abovementioned vector has entries computed as , with . (b) Input support selection: in order to prevent the denominator of the quantity in (4) to become too close to zero, a sensible choice is to carefully select the support . As in this paper we consider the input probability density function to be either (symmetric) Gaussian or uniform, we set , with . The value of constant is to be selected in such a way that . It is worth recalling that the support of the input distribution may be arbitrarily selected as it does not affect the support of the output distribution. (c) Iterative range scaling: after each learning step, an affine normalization operation is performed, that linearly scales the entries of the putative solution so that and .
In order to describe the numerical learning algorithm, the following operators are defined for a generic lookup table : where the subscript denotes the th entry of the vectors and . The behavior of the โcumsumโ operator is illustrated in Figure 2, which also provides a visual representation of lookup tables. In practice, the considered numerical version of the learning rule (4) writes where symbol denotes vector values assignment and denotes the vector of entries containing the values of corresponding to the values in , and its entries may be denoted as , with .
In terms of lookuptables entries, the learning relaxation index of definition (5) may be approximated as
2.3. Use of the neural system as generative model
When a suitable dual neural system described by the transference has been learnt, it may be effectively used to generate random samples drawn from the desired statistical distribution. The number of available input samples (that coincides with the number of output samples to be generated) is hereafter denoted by . The difficulty here is that the input samples are known while the output samples are supposed to be computed as . However, unlike in [10], the function is not known in the present setting as its inverse only has been learnt. Nevertheless, the inversion of function is not required in order to employ the dual neural system as a generative model, provided an appropriate usage of the lookup table representing is made. First, it is necessary to produce a realization , , drawn from the availablegenerator distribution (having, e.g., zeromean Gaussian or uniform probability density function) ranging in . About generation of input samples, as they are generated by using an available generator whose range is wider than , some generated input samples will be necessarily discarded. The amount of discarded input samples may be quantified. Let us denote by the cumulative distribution function of the input, namely, The ratio of the number of discarded samples over the total number of generated samples is given by The parameter may thus be selected in order to adjust the value of to design needs. Then, it is necessary to address the proper values in the learnt lookup table corresponding to the values of , , by finding pointers such that . This means searching, in the whole lookup table, for the closest value of to the sample . Such operation should be performed in an efficient way. Finally, the desired set of output samples, approximately distributed according to the probability density function , may be obtained by setting , where denotes the th entry of the lookup table . (Commented MATLAB code is available on request.)
3. Computerbased Numerical Experiments
In the following experiments, we consider generating random univariate samples with prescribed density function within prescribed ranges of interest, supposing that a prototype Gaussian random number generator is available. The prototype Gaussian distribution has zero mean and unitary variance. The parameter was set to 1 in all the experiments, which corresponds to a ratio that allows retaining about 70% of the generated input samples. The experiments were run on a 1.86โGHz, 512โMBRAM platform.
3.1. Experiments on a โbiGaussianโ distribution
The first case of generation of a random variable concerns a โbiGaussianโ distribution defined by that may assume fairly asymmetric shapes.
The numerical results presented below pertain to values , , and . The interval of interest for the output variable is set to . The total number of generated output samples amounts to . The number of points in which the function is computed is . The results obtained by running the learning algorithm (7) are shown in Figure 3. The values of the index shows that the fixedpoint algorithm may be stopped after 5 iterations. In Figure 3, the histogram estimates (with 50 bins) of the generated Gaussian data and of the biGaussian outputโobtained with the learnt dual systemโmay be observed as well.
(a)
(b)
(c)
(d)
Cumulative results on repeated independent trials are illustrated. The number of iterations of the algorithm (7) was set to 10, while the other data stayed the same of the previous singlerun experiment. The number of points in which the function was computed ranged from 200 to 2000 with step 200, in order to obtain some information about the sensitivity of the algorithm to the selection of the number of points in the domain and about the influence of the number in the computational complexity of the algorithm. In particular, the sensitivity of the algorithm with respect to the number was measured via a discrepancy index DSC computed as follows. (a) The histogrambased estimate of the probability density function of the generated samples is computed on a number of bins equal to 50. The discrete values of such estimate are denoted by , . (b) The true values of the probability density function are computed in correspondence of the histogram's bincenters. The discrete values of such probability density function are denoted by , . (c) The weightedsquaredifferencetype discrepancy index is computed by the expression .
The average number of generated samples varies between about 68250 and 68290. The obtained results are summarized in Tables and . The tables show the average runtime required for learning (expressed in seconds), the average runtime required to generate the samples (use of the learnt systems as a generative model) and average DSC index value. As it is readily appreciated, the computational complexity owing to the learning phase depends on the number of points used to approximate the nonlinear transference as expected, while the computational complexity owing to the generation phase depends only slightly on . The sensitivity of the method measured by the discrepancy index DSC is high for low values of the parameter , while it becomes quite low for values of larger than 1000.


3.2. Experiments on a generalized Gaussian distribution
The second example of random samples generation is about a generalized Gaussian distribution [13]: where denotes the hyperbolic sine function and denotes the reciprocal of the hyperbolic cosine function (namely, the hyperbolic secant function) The present generalized Gaussian distribution (GGD) differs by the standard GGD model encountered in literature (see, e.g., [14]). It belongs to the general exponential family of distributions of the type , with satisfying appropriate compatibility conditions. The distribution (12) as well as the GGD in [14] belong to the above exponential family.
The numerical results presented below pertain to values , , and . The interval of interest for the output variable is set to . The total number of generated output samples amounts to . The number of points in which the function is computed is . The results obtained by running the learning algorithm (7) are shown in Figure 4. The values of the index show that the fixedpoint algorithm may be safely stopped after 5 iterations again. In Figure 4, the histogram estimates (with 50 bins) of the generated Gaussian data and of the generalized Gaussian output may be observed as well.
(a)
(b)
(c)
(d)
Cumulative results are illustrated as well. The number of iterations of the algorithm (7) was set to 10, while the other data stayed the same of the previous singlerun experiment. The number of points ranged from 200 to 1000 with step 200. The average number of generated samples varies between about 68250 and 68290. The obtained results are summarized in Table .

3.3. Experiments on a Gamma distribution
The third example is repeated from [10]: we considered the generation of a (symmetric) Gamma distribution: This choice is motivated by the observation that the random number generation algorithm in [10] gives rise to the most inaccurate result when tested on the Gamma distribution .
The numerical results presented below pertain to values and . The interval of interest for the output variable is set to . The total number of generated output samples amounts to . The number of points in which the function is computed is . The results obtained by running the learning algorithm (7) are shown in Figure 5. The values of the index show that the fixedpoint algorithm may be safely stopped after 5 iterations again. Figure 5 shows the histogram estimates (with 50 bins) of the generated Gaussian data and of the Gammadistributed output.
(a)
(b)
(c)
(d)
Cumulative results were obtained by setting the number of iterations of the algorithm (7) to 20, while the other data stayed the same of the previous singlerun experiment. The number of points ranged from 1000 to 1800 with step 200. The average number of generated samples varies between about 68230 and 68280. The obtained results are summarized in Table .

4. Conclusion
The aim of the present manuscript was to present a novel random number generation technique based on dual neural system learning. We elaborated over our recent work [10] in order to obtain a new learning algorithm free of the need of choosing parameters and normalizationcriteria. The main idea is to shift the learning paradigm from the viewpoint of cardinal equation solving to dual cardinal equation solving, which appears to be more easily profitable.
The proposed numerical results confirmed the agreement between the desired and obtained distributions of the generated variate. The analysis of computational burden, in terms of running times, shows that the proposed algorithm is not computationally demanding.
References
 P. L'Ecuyer, โRandom number generation,โ in The Handbook of Simulation, J. Banks, Ed., pp. 93โ137, John Wiley & Sons, New York, NY, USA, 1998, chapter 4. View at: Google Scholar
 J. C. Lagarias, โPseudorandom number generators in cryptography and number theory,โ in Cryptology and Computational Number Theory, Proceedings of Symposia in Applied Mathematics, C. Pomerance, Ed., vol. 42, pp. 115โ143, American Mathematical Society, Providence, RI, USA, 1990. View at: Google Scholar
 S. T. A. Niaki and B. Abbasi, โNORTA and neural networks based method to generate RANDOM vectors with arbitrary marginal distributions and correlation matrix,โ in Proceedings of the 17th IASTED International Conference on Modelling and Simulation, pp. 234โ239, Montreal, Canada, May 2006. View at: Google Scholar
 G. Marsaglia, โA current view of random number generators,โ in Computer Science and Statistics: The Interface, L. Billard, Ed., pp. 3โ10, Elsevier, Amsterdam, The Netherlands, 1985. View at: Google Scholar
 H. Niederreiter, Random Number Generation and QuasiMonte Carlo Methods, SIAM, Philadelphia, Pa, USA, 1992.
 B. D. Ripley, โThoughts on pseudorandom number generators,โ Journal of Computational and Applied Mathematics, vol. 31, no. 1, pp. 153โ163, 1990. View at: Publisher Site  Google Scholar
 A. Cichocki and S.I. Amari, Adaptive Blind Signal and Image Processing: Learning Algorithms and Applications, John Wiley & Sons, New York, NY, USA, 2002.
 D. E. Knuth, The Art of Computer Programming: Seminumerical Algorithms, vol. 2, AddisonWesley, Reading, Mass, USA, 3rd edition, 1997.
 A. Papoulis, Probability and Statistics, PrenticeHall, Englewood Cliffs, NJ, USA, 1996.
 S. Fiori, โNeural systems with numericallymatched inputoutput statistic: variate generation,โ Neural Processing Letters, vol. 23, no. 2, pp. 143โ170, 2006. View at: Publisher Site  Google Scholar
 S. Fiori, โNeural systems with numerically matched inputoutput statistic: isotonic bivariate statistical modeling,โ Computational Intelligence and Neuroscience, vol. 2007, Article ID 71859, p. 23 pages, 2007. View at: Publisher Site  Google Scholar
 S. Ghosh and S. G. Henderson, โProperties of the NORTA method in higher dimensions,โ in Proceedings of the Winter Simulation Conference (WSC '02), E. Yücesan, C.H. Chen, J. L. Snowdon, and J. M. Charnes, Eds., vol. 1, pp. 263โ269, San Diego, Calif, USA, December 2002. View at: Publisher Site  Google Scholar
 A. C. Tsai, January 2006, Personal communication.
 K. Kokkinakis and A. K. Nandi, โExponent parameter estimation for generalized Gaussian probability density functions with application to speech modeling,โ Signal Processing, vol. 85, no. 9, pp. 1852โ1858, 2005. View at: Publisher Site  Google Scholar
Copyright
Copyright © 2008 Simone Fiori. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.