Dealing with the unknown. A proposal for a method for redistributing skeletons of unknown sex and age in an assemblage



Introduction
One of the first jobs for any bone specialist examining an assemblage is to assign an age at death and a sex to the individual skeletons. Having done so, he or she will then construct a table showing the number of skeletons in each of the age and sex categories. An example is shown in table 1, based on the published account of the assemblage from Barton-on-Humber in Lincolnshire (Waldron, 2007a). It can be seen that there are a substantial number of skeletons for which the sex or the age (or both) cannot be determined. None of the sub-adults has been given a sex, since the skeleton is not sufficiently dimorphic before the onset of puberty for there to be any wholly reliable way of sexing juveniles; there are several skeletons which have been given an age but no sex; and there is a considerable number of adults (446, or about 16% of the total) to which neither an age nor a sex could be assigned. The number of 'known unknowns' in this assemblage is high but by no means unusual and, in general, is directly related to the state of preservation: the poorer the preservation, the greater the number of skeletons that cannot be given an age or a sex.
The fact that there are so many skeletons of unknown age and/or sex may have a distorting effect on the proportional distribution of the assemblage; in figure 1, for example, it can be seen that the groups of unknown age, or unknown age and sex together, account for a large proportion of the total (in this case, no less than 38% of the total, excluding the sub-adults). If they are ignored, and the age and sex distribution is calculated only for those skeletons that receive such attributions (fig. 2), then, although the relative proportions are maintained, the actual proportions are different, all being greater than before. It would clearly be a more accurate representation of the 'true' age and sex distribution if the unknowns could be redistributed to the appropriate age and sex classes, and I suggest here a way in which this might be achieved. The method is based on a number of assumptions, the most important being that the preservation of the skeleton is independent of age and sex; that is to say, there is no bias towards one age or one sex preserving less well than any other. I also assume that the ratio of male to female deaths among sub-adults is the same as that observed in some appropriate historical population. On the basis of these assumptions, we can start to redistribute the unknowns in table 1, in the following steps.
In the reference population, the mean ratio (male:female) for those dying aged 0-4 is 1.13:1, and for those dying aged 5-14, 1.02:1. These ratios were used to distribute the sub-adults, and a total number of males and females by age class was finally determined, with the results shown in table 2; the age and sex distribution is shown in figure 3. The effect is generally to increase the proportions in each adult age and sex class and, in the case of the 25-year-olds, to reverse the proportions of males and females, so that a greater proportion of females than males is now represented.
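The arithmetic behind such a redistribution can be sketched in code. What follows is a minimal illustration and not the exact procedure: it assumes that unknowns are shared out in proportion to the known counts (which is what independence of preservation from age and sex implies), and that unsexed sub-adults are divided between the sexes using the reference ratios quoted above. The function names are hypothetical and the counts are invented for illustration; they are not those of table 1.

```python
def split_by_sex_ratio(n, male_to_female):
    """Split n unsexed individuals into (males, females) using a male:female ratio."""
    males = n * male_to_female / (male_to_female + 1.0)
    return males, n - males


def redistribute(known, n_unknown):
    """Share n_unknown individuals among classes in proportion to the known counts."""
    total_known = sum(known.values())
    return {cls: count + n_unknown * count / total_known
            for cls, count in known.items()}


# Reference ratios (male:female) quoted in the text.
RATIO_0_4 = 1.13
RATIO_5_14 = 1.02

# Hypothetical counts, for illustration only (not the Barton-on-Humber data).
males_0_4, females_0_4 = split_by_sex_ratio(100, RATIO_0_4)
males_5_14, females_5_14 = split_by_sex_ratio(60, RATIO_5_14)

adults_known = {("male", "15-"): 30, ("female", "15-"): 20,
                ("male", "25-"): 25, ("female", "25-"): 25}
adults_all = redistribute(adults_known, n_unknown=50)

for cls, count in adults_all.items():
    print(cls, round(count, 1))
```

The sketch returns fractional individuals; whether and how to round them to whole skeletons is left open here.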

An attempt at validation
In an attempt to test the validity of the method, a model population of 400 individuals was established to represent an assemblage, based on the number of deaths reported by the Registrar General in his first Annual Report of 1839. The deaths were grouped by age as shown in table 1 but truncated at age 65 in order to avoid too long a right tail. (There were many deaths recorded over this age for both men and women, including about a dozen reputedly older than 100.) The proportion of deaths in each of the age groups was calculated from the Registrar General's data, and the model population of 200 males and 200 females was then randomly distributed among the age and sex classes until the correct proportion of the total was achieved. This was done by randomising the numbers from 1 to 400 and allocating each number to the age and sex classes in order from youngest to oldest (male 15-, female 15-, male 25-, female 25- and so on). Following this, two tests of the method were made. In the first it was assumed that 20% of the total assemblage was unknown, and in the second, that 30% was unknown. For the first test, 40 numbers between 1 and 400 were selected at random and removed from the total, while for the second the same procedure was followed but with 80 randomly selected numbers. The resulting unknowns were then redistributed following steps 2-4 above. The proportions in each age and sex class were then calculated, with the results shown in table 3. From this table it can be seen that there is very close agreement between the redistributed proportions and the original, and in no case is the difference statistically significant at the 5% level. However, it remains to be seen whether this would still be the case with a different model population and different proportions of missing data.
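The validation exercise lends itself to a short simulation. The sketch below is an illustration under stated assumptions rather than a reconstruction of the test reported here: the age-class proportions are invented (they are not the Registrar General's figures), the unknowns are redistributed simply in proportion to the known counts, and all names are hypothetical. Because the individuals within a class are interchangeable, removing random positions from a labelled list is equivalent to removing randomly chosen numbers.

```python
import random
from collections import Counter

# Hypothetical age-class proportions (NOT the Registrar General's figures).
AGE_PROPORTIONS = {"15-": 0.15, "25-": 0.20, "35-": 0.20, "45-": 0.20, "55-": 0.25}


def build_population(proportions, n_per_sex):
    """Return one (sex, age_class) label per individual in the model population."""
    population = []
    for age_class, p in proportions.items():
        for sex in ("male", "female"):
            population += [(sex, age_class)] * round(p * n_per_sex)
    return population


def redistribute_after_loss(population, fraction_unknown, rng):
    """Mark a random fraction as unknown, then share it out proportionally."""
    n_unknown = round(fraction_unknown * len(population))
    unknown = set(rng.sample(range(len(population)), n_unknown))
    known = Counter(cls for i, cls in enumerate(population) if i not in unknown)
    total_known = sum(known.values())
    return {cls: n + n_unknown * n / total_known for cls, n in known.items()}


def max_proportion_difference(original, redistributed):
    """Largest absolute difference in class proportions between two tallies."""
    n_orig = sum(original.values())
    n_red = sum(redistributed.values())
    return max(abs(original[cls] / n_orig - redistributed.get(cls, 0.0) / n_red)
               for cls in original)


rng = random.Random(0)  # fixed seed so that the run is repeatable
population = build_population(AGE_PROPORTIONS, n_per_sex=200)
original = Counter(population)

for fraction in (0.20, 0.30):
    redistributed = redistribute_after_loss(population, fraction, rng)
    diff = max_proportion_difference(original, redistributed)
    print(f"{fraction:.0%} unknown: largest difference in proportion = {diff:.3f}")
```

Changing the seed, the proportions or the fraction removed gives a quick way of exploring other model populations and other proportions of missing data.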

Comment
The comparison of the structure of death assemblages is an important feature of palaeodemography but, if based on the analysis of skeletal assemblages, may be somewhat distorted by what is usually a substantial number of skeletons to which neither an age nor a sex can be assigned. The method described here has arisen out of the perceived need to use an assemblage as a whole, rather than just a part of it (albeit what is almost always a considerable part).
One of the main assumptions of the method is that the state of preservation of the skeleton is independent of age and sex. Very little is known about the factors that determine the rate at which the skeleton degrades after death, although it is generally agreed that the most significant is the acidity of the matrix in which it is deposited (Nielsen-Marsh et al. 2007). It is likely that poorly preserved skeletons will be more difficult to excavate, and there may be some bias towards recovering only those skeletons that are better preserved. This may be a particular problem with assemblages excavated in the past, when recovery techniques were less well developed than they are now. I know of no way in which this possible bias can be quantified, however. Some authors have claimed that juvenile skeletons preserve less well than those of adults (Gordon and Buikstra 1981), but my experience suggests that this is not the case; the juvenile skeletons that I have examined seem to preserve no better and no worse than their adult contemporaries, and this is the experience of other authors also (Walker et al. 1988; Saunders 2008). There may be other factors that determine the completeness with which sub-adult skeletons are recovered: they may be buried apart from the adults, and their small size may hamper recovery or cause them to go undiscovered. Again, I cannot see how this bias, if it exists, can be quantified except when the actual composition of the death assemblage is known (as, for example, it was at the City Bunhill burial ground (Connell and Miles 2010)). Unfortunately, such cases are bound to be the exception, especially when dealing with assemblages from more remote times.
The second assumption, that sub-adults die in the same proportions as in some reference population, seems a priori reasonable. For example, in those modern developing countries for which reasonable data exist, the ratio of deaths in childhood seems always to be slightly in favour (if that is the right word) of male children (WHO 2010), and in contemporary England and Wales this excess persists: in the 0-4 age group the male/female ratio is 1.31:1 and in the 5-14 age group it is 1.17:1 (nationalstatistics.gov.uk). The difficulty is which reference population to select, given that there are almost no archaeological assemblages of children of known age and sex. At Christ Church, Spitalfields, there were 389 individuals for whom sex and age at death were known from extant coffin plates (Molleson and Cox 1993). Of these, 80 were children under 15, all but 8 of whom had died before their fifth birthday. The ratio of boys to girls among the under-fives (43 boys, 29 girls) was 1.48:1 and among the 5-14 year olds (5 boys, 3 girls) it was 1.67:1, giving a ratio overall of 1.5:1. The numbers involved here are really too small to generalise from, although the trend noted above is maintained.
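The Spitalfields ratios quoted above follow directly from the counts of boys and girls, as this small check shows. The helper name is hypothetical; the counts are those given in the text.

```python
def sex_ratio(males, females):
    """Male:female ratio, expressed as males per one female."""
    return males / females


under_fives = sex_ratio(43, 29)
five_to_fourteen = sex_ratio(5, 3)
overall = sex_ratio(43 + 5, 29 + 3)

print(f"{under_fives:.2f}:1, {five_to_fourteen:.2f}:1, {overall:.2f}:1")
# prints "1.48:1, 1.67:1, 1.50:1"
```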
The reference population chosen here was the earliest historical one available: that reported in the first five Annual Reports of the Registrar General of England and Wales for the years 1837-1842. These are the oldest reliable data that we have and are also closest in time to any archaeological assemblage that might be dealt with. The mean ratio reported for these five years was used and, as expected, showed that there was a slight male excess in the two age groups shown in table 1 (male/female ratio 1.13:1 and 1.02:1, respectively). It may be that a more appropriate reference population is available (if so, I am not aware of it), but even so, there is no reason to suppose that it would show anything other than the usual male excess, even though the magnitude of the excess may be slightly different from that used here.
There does not seem to be any means by which to validate the method except by the use of a model population (as has been done here), as there is no a priori means of knowing the exact distribution of the assemblage. The present model indicates that the method redistributes the unknowns accurately enough that the resulting proportions do not differ significantly, at the 5% level, from the original ones, and it is thus satisfactory to this limited extent. It might be thought that a more stringent, and more appropriate, test would be to compare a redistributed assemblage against the data from the pertinent parish records. Few such comparisons are likely to be possible, however, and there are difficulties even here, because it is by no means certain that the structure of the assemblage recovered from a particular cemetery is actually representative of the whole, given the vagaries of recovery (see, for example, Waldron, 2007b, especially chapter 2).
The method described here may not be the best that can be achieved; it is presented to provoke discussion and any improvements upon it would be most welcome. Whatever its failings, it is hardly likely to make things any worse than they already are.