The number of dies used to produce a particular coin issue is of considerable interest to both numismatists and those interested in the economics of the ancient world in
general. For the benefit of interested readers I summarise here some formulae which can be used for estimating the number of dies from the data now available to us*.
Basic estimators – Good's Formula. For die estimation we would ideally have a large sample of the type in question, with full details of the make up of the sample, i.e, we would know the total number of
coins in the sample (n), the number of "singletons", i.e, dies represented by only one coin (d1), the number of dies represented by two coins (d2
), and so on, together with the total number of different dies represented in the sample (d). In practice we will often have only some of these figures, but fortunately we don't need all of them to obtain a
reasonable estimate of the total number of dies originally used, which we will denote by D. The basic factor we have to estimate is the "coverage" of the sample, meaning, in the current context, the ratio of the
number of coins originally produced by the dies in the sample to the total number of coins produced by all dies. If we assume, for the moment, that all dies produced equal numbers of coins, then the coverage will also
equal the ratio d/D, so that the coverage, C, will be given by the ratio:
where d0 is the (unknown) number of dies not present in the sample. Now you might think that the factor C depends in a complicated way on the detailed distribution of the coins in the sample, but in
fact, following on the work of Good in 1953, it turns out that C can be estimated quite generally by the simple formula**:
so that our first approximation to the total number of dies D, which we can write as D0, is just
which can also be written as D0 = nd/(n - d1)#. To get a feel for this formula we note that if most of the coins in the sample are singletons then the sample will likely include
only a small fraction of the original dies, and the coverage will be small; mathematically, d1 won't be much less than n, and hence Ce will be considerably less than one, so that D0
in formula (3) will be >> (very much greater than) d, the number of different dies in the sample. For the simple case where we have n coins with only one die match, then d1
will equal n-2, so that D0 = ½ nd (or ½ n2 close enough) - with two die matches, D0 = 1/4 nd, and so on (for not too many matches). On the other hand, if most of the dies in the sample are represented more than once, then the sample probably includes most of the original dies. In this case d1 will be << n, and hence
Ce will be not much less than 1, so that D0 will be not much greater than d, and in fact we can then write D0 = d(1 + d1/n) approximately. The above formula (3) is a
good start, but it makes a key simplifying assumption, namely that each die produced exactly the same number of coins. This is of course quite unrealistic, and probably tends to underestimate the actual number of dies
because low output dies will tend not to show up in the sample; we therefore need to improve the formula by effectively including a "spread" factor (S) to allow for the spread in the number of coins produced by the
different dies, so that our basic formula now becomes:
So how to estimate the spread factor S? Well, this is generally too difficult to do from first principles, so instead we bypass the problem by going directly to the die frequency distribution, i.e, the values of the
dr in the sample as a function of r. We assume that this distribution can be described, at least approximately, by what are known as Negative Binomial functions##. We needn't go into
these in detail at this point, but generally they produce a distribution something like a Gaussian bell curve, but skewed towards the higher outputs. The relative width of the distribution is described by a "shape
parameter" k – a lower value of k means a broad distribution, while larger values give narrower curves. Now for this type of distribution it is not too hard to show that D will be given by%:
Here we see what is probably the main justification for using the Negative Binomial model, namely the fact that Eqn 5 includes the basic Good coverage factord (1 – d1/n). But as well, this equation
includes a modifying factor recalling that in Eqn 4, namely
Now it is tempting to conclude that this modifying factor can be interpreted as the factor S in Eqn 4 due to the spread in die outputs. However, this is not strictly justified since S in Eqn 6 derives not from any
assumptions about the underlying die outputs but rather from the choice of the Negative Binary functions used to model the sample frequency distribution. Essentially Eqn 5 is really just an heuristic equation, i.e, it
calculates D by assuming that the observed die frequency distributions can be modelled by Negative Binomial functions and then simply extrapolating these functions to give the value of d0, and hence of D = d
+ d0. Hopefully though this approach will represent a reasonable approximation to reality, so that Eqn 5 will yield a worthwhile estimate of D, in which case Eqn 6 can be taken as a reasonable estimate of the
spread factor in Eqn 4%%. Note that the higher the value of k, the closer S tends to 1, and hence the closer Dk tends towards Good's basic estimate D0. Also, irrespective of the value
of k, we see that as the size of the sample increases, the ratio d1/d decreases so that S tends towards 1 anyway. In practice we usually find that k is not too different from 1, at least for reasonably large
samples, so that S will be less than 1.5 provided d1
< d/2. Hence we see that in these cases S will generally not make a great difference to the final estimate of D, which will then be determined mainly by Good's basic coverage factor, so that the accuracy (or even the theoretical validity) of the modifying factor is not so important.
In the past it has often been assumed, for reasons that we will come to later, that the value of k in Eqn 6 can be taken as 2. However recently it has become evident that in actual samples k values closer to 1 give a
better fit to the data in most cases, so putting k =1 we finally get:
The Die Output Distribution in the real world. As we have seen, Eqn 5 is really just a convenient heuristic formula. So can we provide a more solid theoretical foundation for this equation, i.e, can we show
that a Negative Binomial type die frequency distribution, or something like it, is what we would actually expect in practice, given some plausible assumptions about the underlying die output frequencies? Or, to turn the
problem around, given the observed die frequency distribution can we work backwards and infer something about the original die output distribution?
Well, in general the answer seems to be no, or at least, only with difficulty@. However, where the die output frequency distribution is exponential (i.e, geometric), then according to Esty the resulting
frequency distribution should also be exponential, so that it can be described by a Negative Binomial function with k = 1. Now, as noted above, actual frequency distributions are in fact often roughly exponential, so
it is tempting to conclude in these cases that so are the underlying die output distributions. This is important, since an exponential die output distribution implies that die life is determined solely by random
breakage at a constant average rate, which yields a die survival curve that declines exponentially with the number of coins struck, and hence also an exponentially declining die output distribution. (In practical terms
this means an exponential die output density function, i.e, the number of dies producing 0 to 1000 coins would be greater than the number producing 1000-2000 coins, and so on). However, I suspect that in fact the
shape of the frequency distribution is not very sensitive to the underlying die output distribution, i.e, in most real world cases k will turn out to be roughly 1 more or less irrespective of the original die outputs,
so we probably shouldn't press these conclusions too far. On the other hand if the value of k is going to be close to 1 anyway, then this also means that Eqn 7 will usually give a reasonable estimate of D more or less
irrespective of the actual die output distribution. (Note incidentally that in the past it was commonly assumed that the die output distribution (strictly, the die output density function) could be approximated by a
Gamma distribution (essentially a continuous version of the Negative Binomial distribution) with a shape parameter p of 2, and this value was then simply carried over to the value of k in Eqn 5). Discussion.
While the above formulae may seem to be based on some fairly specific assumptions they seem to give good results in practice. In particular, computer simulations have been run on model populations of coins with varying
compositions (i.e, numbers of dies, and spread factors) to produce test samples of various size, and from these simulations we find that estimations of D made using the formulae listed here generally match the original
die numbers quite well, or at least, within the expected margin of error, with the important proviso that we have a random sample to work from. Also, these formulae (with k values between roughly 1 and 2) seem to
give generally correct answers in real world cases where the dies were individually numbered and hence we actually have a good idea of their total numbers anyway (e.g, the Norbanus and, particularly, the Crepusius
denarius issues)$. Alternatively, if you don't want to make assumptions about die output spreads, you can simply take D0
in the basic Good Formulae 3 above (or my Formula 9 below, or in fact any equal output formula) as giving the number of "efuds", i.e, equivalent full use dies, or the number of dies that would have been used if all dies produced the same standard "full use" output. Such a figure may not be strictly realistic, but it is sufficient for purposes such as the calculation of the relative sizes of different issues. In fact Good's value of
D0 in particular can always be taken as a direct measure of the size of an issue, since it is based on coverage in terms of coins issued, irrespective of the die output distribution. In any
case it should be realised that these estimations of die numbers are only approximate, and hence the expected margin of error can be quite large. We can also produce estimates of the possible error range, but I will not
do that in detail here, except to say that for samples with R = n/d less than about 3, and n-d at least 10, a good estimate of the 95% error range in D is given by +/- 2D/sqrt(n-d) – this means that unless we have at
least a dozen or so die matches the possible error range in the die estimate can easily be plus or minus a factor of 2. This may seem a lot, but it is still quite sufficient to tell us whether the issue we are dealing
with involved only a dozen or so dies, or scores of dies, or many hundreds. My version of Good's Formula. Next I come to a formula of my own devising. This is a very simple formula which in practice gives a
good approximation to the Good's Formula 2 above for the coverage, at least for smaller samples, using only the number of different dies d in a sample of size n; i.e, it does not involve d1, the number of
singletons in the sample. Specifically, I approximate d1 by d/R, where R = n/d as before$$, so that from Formula 2 the coverage is now estimated by:
Applying this as in Equation 3, we get:
This last formula assumes equal die outputs, so add a total spread factor of your own choice. Formula 6 above becomes here:
giving ultimately:
Eqn 11 applies for any value of k, but taking k = 1, the formula reduces to the simple form:
(12) D1 = d/(1 - 1/R), (k = 1, any R).
This formula can be conveniently written as D1
= nd/(n-d), a form used in recent publications by Esty for exponential frequency distributions. (Note that Eqn 11 is generally valid only for smaller samples, with R less than, say, 1.5, but Eqn 12, where k = 1, is valid for any R value
$$). As well, if we write R as 1+x then Eqn 12 can be written as:
a form which is particularly instructive for low values of R where x << 1, and where we can also write D1 = d/x approximately. Note that where we have only one die match n-d = 1, in which case
D1 = nd (= n2 close enough), twice the value derived earlier for the equal die output case. Note also that for exponential outputs Eqn 12 can be rearranged to give R = 1+n/D, so that a
sample total equal to the number of original dies will yield an R value of around 2, and so on. Yet another formula - the "Truncated Exponential" Estimate. As we have seen, it is generally assumed that
exponential die outputs (from random die breakage) produce an exponential frequency distribution, i.e, that a die output shape parameter of p = 1 results in a die frequency shape parameter of k = 1. In practice however
this assumption may be only approximately correct, given that while the frequency curves often start out with a more or less geometric decline, they often then ultimately roll off at an increasing rate for higher r
values, more in line with an ordinary binomial distribution. Nonetheless, it seems that in these cases we can still model at least the beginning of the frequency distribution curve with an exponential function. This
is convenient, since it allows us to develop a simple estimate of the original die numbers which is particularly appropriate for small samples, i.e, where R = n/d is significantly less than 2 (although it is also valid
for large samples). For the ideal exponential frequency distribution we have d0/d1 = d1/d2 = d2/d3 etc. so that we can estimate d0 simply by
d12/d2, giving ultimately the following formula for the number of original dies:
(14) Dte = d + d
0 = d + d12/d2 We use Dte
here to signify the "truncated exponential", or perhaps "asymptotic exponential", estimate of D, as it extrapolates D from d using only the first two values of dr. For a strictly exponential distribution
this formula is equivalent to Eqns 7 and 12 above, although for real sample distributions the various equations will give somewhat different results for D. For small samples the main contribution to D
te will be from d0, and the relative error in this term comes from d1 and particularly the smallest factor, namely d2. We won't go into a detailed error calculation
here, but in most cases the 95% error range in Dte will be less than +/- 2Dte/sqrt(d2), with a reasonable estimate being given by +/- 2d0/sqrt(d2
) + 2sqrt(d). In my experience this formula seems to give better estimates of D for the uniquely numbered republican denarii than the other exponential estimators such as Eqns 7 and 12, which generally seem to
overestimate D (at least for lower values of R). This is possibly because the latter formulae, unlike Eqn 14, are susceptible to the roll-off evident in the values of the higher order dr in the
relatively small samples I have been using, since this will lower theand value of R = n/d and hence increase the modifying factor in Eqn 10, and ultimately the estimate of D&. However it should be
remembered that Eqn 14 is more susceptible to variations in the value of d2
than the other formulae, so overall the best approach is to crosscheck the results of as many different formulae as possible against each, and use what seems to be the most appropriate in all the circumstances.
Testing the Frequency Distribution.
In much of our discussion we have been assuming that the frequency distribution in real world samples is more or less exponential, but how do we check whether this is
actually the case? We can do this by comparing the sample data with the theoretical values of the dr. For the exponential distribution these can be calculated from the basic formula dr
= d x b(1-b)r-1 for r = 0 to n, where b is a scale parameter appropriate to the sample&&. The value of b can be approximated here by b = d/n = 1/R (which effectively fits the exponential
distribution to the values of n and d, as in Eqn 12), giving d1 = d/R, d2 = d1(1-1/R) and so on. The fit between the actual and theoretical distributions can then be assessed by the
usual methods (Chi squared, K-S maximum distance etc. - the degrees of freedom will be n-2 since n and d in the formula are determined by their sample values, thus putting two constraints on the possible values of the
calculated dr, as d = sum(dr) and n = sum(rdr)). Alternatively it may be more convenient in some cases to express the basic formula for the frequency distribution in strict Negative
Binomial terms, with D as the basic parameter rather than b; doing this we get dr = D x b(1-b)r
, where here b = 1-d/D (see note below for details). This form is useful as it enables us to set D to some expected value in order to test the fit with the sample. In theory, this approach can be extended by fitting
higher order Negative Binomial functions to the frequency data, so as to find the combination of shape parameter k and scale parameter b (or die number estimate D) which best fits the data (while matching the sample
values of n and d). In practice this will likely yield k values not significantly different from 1, and hence it is probably worth only doing if the frequency data are clearly statistically inconsistent with an
exponential distribution to begin with. This will normally require a sample with R = n/d at least equal to 2, as if both k and D are variable, then with smaller samples a wide range of k values can give a reasonable
match to the sample data. In fact with small samples, where only the first few dr
values are non-zero, and the distribution function is not very sensitive to the value of k, the mathematics means that a value of D given by Eqn 5, or thereabouts, will normally give a good fit to the data for just about any value of k from 1 to infinity (although some k values will give a better fit than others). In particular, a simple Binomial distribution (effectively, k = infinity) with the probabilities calculated using a D value close to Good's estimate in Eqn 3 will usually give an acceptable fit no matter what the real distribution is. Obviously this will normally underestimate the real value of D, which shows that small samples generally can't be used to put meaningful limits on both k and D simultaneously, although they can still be used to investigate, e.g, the possible values of D on the assumption that k = 1, say, or the range of k values compatible with a given value of D, where this is known.
Conclusion. Now that you have the basic formulae go to an archive site (CNG, Coin Archives or the like), search on your favourite coin, save off the results and get to work on a die number estimate. Note
that for large issues you don't need to check every single coin that comes up – just select a manageable number (say not more than 30 to begin with) of coins in reasonable condition and go to it. (To make life easier
for yourself set the formulae up in a small spreadsheet). Note also that for scyphate types you need to check both the left and right sides of the "obverse" (the convex side in this case), as these types are usually
double struck, and not infrequently different obverse dies are used for the two strikes – it seems the die holders often dropped their dies and had to grab another (cooler) one.
But be warned – die estimating can become addictive, and is rather time consuming. * The basic formulae in this note are mostly taken from an article by Warren Esty in Numismatic Chronicle 2007, p. 359-364;
they are discussed in more detail in an article by the same author in Numismatic Chronicle 1986, p. 185-216. ** The essence of the proof is to show that, while the proportion in the sample (the frequency) of
coins from all dies appearing r times in the sample is rdr/n, the proportion of coins from these dies in the population is best estimated by (r+1)dr+1/n, the difference being due to
the coins in the population from dies not in the sample. The sum of these latter frequencies from r = 1 upwards is the coverage C, and is evidently 1 - d1/n (since the sum of all the sample frequencies rd
r/n from r = 1 upwards will necessarily total 1). We therefore see that the proportion in the population of coins from dies not appearing in the sample can be estimated by d1
/n, which, assuming equal die outputs, equals d0/D, the proportion in the population of dies not present in the sample. In this case, therefore, Ce = 1 - d1/n also estimates 1 -
d0/D, the coverage of dies in the sample. Informally, we can argue like this - given that we have a sample that includes d dies, the probability that the next coin examined will be from a known die is
d/D, i.e, it equals the coverage C. Now the relative fraction of singletons in the sample is d1/n, so that we might intuitively expect that the probability that the next coin will be another singleton, and
hence from a new die, is also d1/n, so that the probability that next coin will be from a known die is 1 - d1/n. Hence Ce = 1 - d1/n. However, in the end intuition is not
formally rigorous proof. Note incidentally that the coverage in terms of coins issued, as Good originally showed, is represented quite generally by Eqn 2 irrespective of the outputs of the various dies. # For
the particular case of equal die outputs we can also derive Eqn 3 directly, without invoking Good's work. Here the statistics will be Binomial, i.e, for a sample of n coins from an issue with a total of D dies the
probability that a particular die will appear r times is nCrpr(1-p)n-r, where p = 1/D is the probability that a given coin comes from the die in question. Hence the average
number of dies that do not appear in the sample will be d0 = D x (1-p)n, while the number that appear only once will be d1 = D x np(1-p)n-1. Thus d0 = d1
x (1-p)/np = d1 x (D-1)/n = d1 x D/n close enough in most cases, and so we have D = d + d0 = d + D(d1/n), leading to D = d/(1-d1/n). (Strictly D = (d-d1
/n)/(1-d1/n)). ## Strictly, "zero-truncated" Negative Binomial distributions, as d0 is assumed to be zero to match the sample. The theoretical value of d0
in the full distribution is then taken to estimate the unknown value for the sample. % Assuming that the frequency distribution can be described by a Negative Binomial function with a shape parameter k, we
have dr = D x k+r-1Ck-1pkqr, where q = 1-p = n/(n+kD). Hence d0 = Dpk and d1 = Dkpkq, so that d0 = d1
/kq, and hence we have D = d + d0 = d + d1(n+kD)/kn, leading to D = d(1+d1/kd)/(1-d1/n) as in Eqn 5. %% In an actual sample the spread in the frequency
distribution will be determined as much, or more, by the sampling process as by the spread in the die outputs. As an extreme example, in the case of equal die outputs there is no spread at all in the original output
frequencies, but the sampling process means that the frequency curve of the sample will follow a Binomial distribution whose spread depends on the ratio of the sample size n to the number of original dies D (in fact the
spread in the distribution, as measured by its standard deviation, will be sqrt(n/D) close enough, since for a Binomial distribution the variance is npq, where p = 1/D here, and q=1-p). For our purposes we can
approximate this distribution with a similar Negative Binomial function which will have an arbitrarily large value of the shape parameter k, but much the same mean and spread as the Binomial ditribution. In this case
Eqn 6 will give S = 1, and hence Eqn 5 will give the correct result for D. @ In the general case where we have varying die outputs, the frequency distribution in the sample for each original die will be simple
Binomial, so the resulting overall frequency distribution will be the weighted sum of the separate and different Binomial distributions from each original die. Our assumption, or perhaps hope, is that this overall
distribution can be approximated by a Negative Binomial function. $ Note that Esty now thinks (Dec. 2010) that in most cases k = 1 gives the best fit to the data in most cases, i.e, that die life is mostly
determined by simple breakage at a constant rate. However, I have found that k values higher than 1 seem to give a better fit to reality with some of the uniquely numbered Republican denarii where we know the actual
number of dies. However, the samples I have been using are small and the statistical significance as yet low, and in any case what this may actually indicate is that for smaller samples the frequency distribution is not
strictly exponential for random breakage anyway. $$ This approximation is derived from the fact that if d = n - x, then for x << n, (i.e, R not much more than 1), d1
will usually equal d - x, or not much more, so that d1 = n - 2x = d2/n = d/R, approximately, just as d = n/R. For low R values this approximation is evidently generally correct for any value of k,
and in fact, as shown in more detail below, it is also correct for all values of R in the case of exponential frequency distributions (k = 1). & To put it another way, if the frequency distribution
for the random breakage case is not strictly exponential, then the appropriate value for the shape parameter p in Eqn 10 will be greater than 1, by some factor dependent on R which (hopefully) declines towards unity as
R increases. && This formula can be easily derived from first principles. We note that for an ideal exponential (geometric) distribution dr+1 = dr
x q, where q is some constant less than 1, so that dr = d1 x qr-1. Evidently d = d1+d2+d3... = d1 x (1+q+q2+...) = d1
/(1-q), giving d1 = d(1-q), with d2 = d1 x q etc. Putting b = 1-q then gives our basic formula dr = d x b(1-b)r-1. Further, we note n = d1+2d2
+3d3... = d1 x (1+2q+3q2+4q3...), so that n-d = d1 x (q+2q2+3q3...) = qn, i.e, q = (n-d)/n and hence finally b = d/n = 1/R. (This means
incidentally that the calculated frequency distributions for all samples with the same level of coverage have the same roll-off factor q = 1-b = 1-1/R). Note also that since d0 = d1
/q = d(1-q)/q we have D = d + d0
= d + d(1-q)/q = d/q = d/(1-b) = nb/(1-b). For the ideal distribution b = d/n, as we have just shown, in which case we have D = nd/(n-d), i.e, Eqn 12. For an actual sample Eqn 12 gives a value of D which best fits an exponential distribution to the sample data while matching the observed values of n and d.
The basic formula for the dr
above is not quite in conventional Negative Binomial format, but we can easily remedy this, and at the same time make D the basic parameter rather than b. From the the general formula for D above we have d = Dq = D(1-b), so putting this in the basic formula we get d
r = D(1-b) x b(1-b)r-1 = D x b(1-b)r, where b = 1-q is given by 1-d/D. Ross Glanfield. January 2010. Latest revisions: 21 May '10: My own formula added.
22 May '10: Discussion section rewritten, modifying views on value of k. 2 Sept. '10: Error range formula simplified.
25 Sept. '10: Discussion of die output distribution function revised. 20 Dec. '10: Esty's preference for k = 1 noted.
22 May '11: Separate Frequency Distribution section added. 4 July '11: Nomenclature tidied up.
11 Aug. '11: "Yet another formula" added. 24 Aug. '11: Direct derivation of Eqn 3 added (see subnote #).
11 Oct. '11: Discussion of frequency distributions and Formula 14 revised. 16 Nov. '11: Discussion of Eqn 6 revised (yet again) clarifying relation between the shape parameters p
and k. 18 Dec. '11: Error estimate for Eqn 14 revised. |