
Databse Fun: Of Databases, Statistics and Isotopes

25 Nov

I know what you are thinking – what sort of misspelled title is that for a blog post? The answer is below, dear reader (1)!

Databases are, in my humble opinion, awful, tedious and time-consuming beasts to create and are often best tackled head on, armed only with a black coffee for sustenance, as you try to accurately type a mind-numbing amount of data into an Excel spreadsheet at 2am in the university library.  (That may just be my experience though!)  The beauty of a completed database, however, cannot be overestimated.  This is where you get to test out hypotheses based on the data that you have selected and gathered for your research question, where all of the core information lies and where the data can be demonstrably tested again and again.  A completed and ordered database is a thing of beauty and, when looked at at 6am after a tiring night of inputting data, a thing of magnificence!

But let’s start at the beginning.  I recently had cause to look again at the database I had made for my MSc dissertation and, as I scrolled across and down the Excel spreadsheet, I could just about remember the hours I had spent producing it, justifying the column titles and entering the data itself.  My data set included strontium isotopic results gathered from 422 individuals across 9 different sites from the Neolithic Linearbandkeramik (LBK, roughly 5500BC to 4800BC) culture of Central Europe, with my sample ranging geographically across the modern countries of Austria, the Czech Republic and Germany.  The data set used for my study was carefully culled from a literature review and a close reading of a number of journal articles that were available at that time (mid 2012).

My aim was to investigate statistically the claim of patrilocality in the LBK culture as proposed by Bentley et al. (2012) by examining the specific sex and age differences within the profile group, using strontium isotopes as proxies.  Strontium isotope samples (specifically 87Sr/86Sr) are often taken from both human and animal skeletal remains (primarily from teeth, specifically the 1st, 2nd and 3rd molars as they reflect Sr values throughout the life of an individual) as they survive well in archaeological contexts and offer an informative approach to investigating the mobility and local/non-local status of individuals.  Strontium values reflect geochemical signatures in the dietary component of the individuals, which comes from the soils and the underlying geological landscape that the individual lived on.  There are issues with this method (2) (see also this blog’s comments section).  Strontium isotopic investigations in archaeology are often carried out in conjunction with oxygen isotopes (18O/16O), also sampled from tooth enamel (specifically the 2nd molar), which represent water drunk in life, but, frustratingly, this has not been the case in the LBK literature.

I knew that I wanted to statistically test the data set using SPSS 19, the standard statistical program widely used in the social sciences, but I first needed to tabulate and code the data so it would be useful when it came to testing.  As the study also included comparisons of the funerary grave goods and a basic demographic investigation of each site, coding the entries (1=male, 2=female or 1=present 2=absent) allowed for comparisons to be made in the SPSS program and for statistical tests to be carried out.  The strontium data itself was, as expected, not normally distributed, which meant that non-parametric tests, which do not assume that the data follow a particular distribution, were needed.
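To make the coding step concrete, here is a minimal sketch of how text labels could be turned into numeric codes before a statistics package sees them. It assumes Python rather than SPSS, and the field names (`skeleton`, `sex`, `grave_goods`) and the three records are invented purely for illustration; only the 1=male, 2=female and 1=present, 2=absent scheme comes from the paragraph above.

```python
# Coding scheme taken from the text; everything else here is hypothetical.
SEX_CODES = {"male": 1, "female": 2}
PRESENCE_CODES = {"present": 1, "absent": 2}


def code_record(record):
    """Return a copy of the record with numeric codes added next to the labels.

    Labels that are not in the scheme (e.g. an indeterminate sex estimate)
    are coded as None so they can be treated as missing values later.
    """
    coded = dict(record)
    coded["sex_code"] = SEX_CODES.get(record["sex"].strip().lower())
    coded["goods_code"] = PRESENCE_CODES.get(record["grave_goods"].strip().lower())
    return coded


# Invented example rows, not individuals from the dissertation data set.
records = [
    {"skeleton": "A1", "sex": "Male", "grave_goods": "present"},
    {"skeleton": "A2", "sex": "female", "grave_goods": "Absent"},
    {"skeleton": "A3", "sex": "indeterminate", "grave_goods": "present"},
]

coded_records = [code_record(r) for r in records]
for row in coded_records:
    print(row["skeleton"], row["sex_code"], row["goods_code"])
```

Keeping the original label next to its code, rather than overwriting it, makes it much easier to spot a coding slip when reading back through the table.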


The normality tests, Kolmogorov-Smirnov and Shapiro-Wilk, indicate that the strontium data for the 422 individuals used in this study were not normally distributed (SPSS reports the P-value, or significance value, as 0.000 for both tests, i.e. p < 0.001). This means that tests such as Spearman’s Rho correlation (the association between two ranked variables), Mann-Whitney U (comparing 2 independent groups) and Kruskal-Wallis (comparing 3 or more independent groups) are the most appropriate statistical tests to perform on this data set (Bryman & Cramer 2011: 245).
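For readers curious about what a rank-based test such as Mann-Whitney U actually computes, the following is a rough sketch in plain Python. It only calculates the U statistic, not the significance value that SPSS reports alongside it, and the strontium values are invented for illustration rather than taken from the data set.

```python
def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover every value tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions are 0-based, ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def mann_whitney_u(group_a, group_b):
    """Return the Mann-Whitney U statistic (the smaller of the two U values)."""
    ranks = average_ranks(list(group_a) + list(group_b))
    n_a, n_b = len(group_a), len(group_b)
    rank_sum_a = sum(ranks[:n_a])
    u_a = rank_sum_a - n_a * (n_a + 1) / 2
    u_b = n_a * n_b - u_a
    return min(u_a, u_b)


# Hypothetical 87Sr/86Sr values for two groups (e.g. coded 1=male, 2=female).
males = [0.7091, 0.7094, 0.7098, 0.7102]
females = [0.7099, 0.7105, 0.7110, 0.7121]

u = mann_whitney_u(males, females)
print(u)
```

Because the test works on ranks rather than the raw ratios, it makes no assumption that the values are normally distributed, which is why it suits data like these. A small U relative to the product of the two group sizes means the groups barely overlap.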

When building the database I also wanted any relevant information and references easy to hand so I included the skeletal number (as given in the articles), site name, period, sex, sex code, isotope source, body position, funerary artefacts found and reference etc for each individual used in the study (see below).


A screenshot of the database used in my MSc dissertation displaying the relevant information of the 422 individuals from the LBK sites used in the study. The data was entered in an Excel spreadsheet before being transferred to SPSS for statistical investigation.

The data was carefully added over a number of days once I had gathered all the required journal articles discussing the sites I had chosen.  The sites themselves were largely located in southern Germany, with the 9 sites nicely split into three time periods throughout the chronology of the LBK period.  Perhaps somewhat hastily, I added this to the database and assigned each individual an Early, Middle or Late ranking according to their respective site.


Towards the bottom of the database used for the study. Here we can see the references cited for each site used in the study and the specific coding for funerary items (the two columns before the reference column on the right hand side, where 1 depicts present and 0 absent).

During the construction of this database I did encounter problems, as I had not built such a large database before.  Indeed, the only time I had really used a database properly was for my undergraduate dissertation some years previously, whilst using ArcGIS.  The problems this time included working out whether I was actually coding the funerary items the right way round, reading back through the database and correcting any errors in typing (especially for the strontium values) and making sure I had correctly identified the individuals used in their respective articles.  There are some things inherent in archaeology that cannot be solved.  These include missing contextual data or written site reports (which may or may not exist hidden in regional archaeological unit headquarters, not known or available to the public or indexed on any site).
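Typing errors of this sort are the kind of thing a short script can help catch. The sketch below is only an illustration of the idea, written in Python with invented rows and field names; in particular, the plausibility window for 87Sr/86Sr is a deliberately loose assumption of mine and would need adjusting to the geology of the region being studied.

```python
# Assumed, deliberately loose window for 87Sr/86Sr values; not from the literature cited here.
SR_MIN, SR_MAX = 0.700, 0.730
VALID_SEX_CODES = {1, 2}  # 1=male, 2=female, as in the coding scheme


def find_entry_problems(rows):
    """Return a list of (skeleton, message) pairs for entries that look mistyped."""
    problems = []
    for row in rows:
        try:
            value = float(row["sr_ratio"])
        except (TypeError, ValueError):
            problems.append((row["skeleton"], "strontium value is not a number"))
        else:
            if not SR_MIN <= value <= SR_MAX:
                problems.append((row["skeleton"], "strontium value outside expected range"))
        if row["sex_code"] not in VALID_SEX_CODES:
            problems.append((row["skeleton"], "unexpected sex code"))
    return problems


# Hypothetical rows: B2 has a slipped decimal point, B3 has a letter O typed for a zero.
rows = [
    {"skeleton": "B1", "sr_ratio": "0.7093", "sex_code": 1},
    {"skeleton": "B2", "sr_ratio": "7.093", "sex_code": 2},
    {"skeleton": "B3", "sr_ratio": "0.71O2", "sex_code": 3},
]

problems = find_entry_problems(rows)
for skeleton, message in problems:
    print(skeleton, message)
```

A check like this will not tell you whether a plausible-looking value was copied from the wrong row of an article, so it supplements reading back through the database rather than replacing it.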

Of course there were problems with my approach, which I expounded on in fuller detail in the thesis itself.  These included problems interpreting the strontium results, distinguishing between local and non-local individuals at a site when there is no reference data to compare them to, and debating my own statistical approach.  Still, as frustrating as building the database was, I did enjoy carrying out my own investigation of it immensely.  On rainy days I often think that my dataset could do with a second look and further investigation; perhaps I could change this approach or that, use this statistical method instead and isolate that clump of individuals etc.

It may be a pipe dream for the moment (I lack a working SPSS program for one!) but this is as much a key part of archaeology and archaeological research as digging in the mud is.  Research is what drives archaeology and human osteology forward, from new scientific techniques to reviewing old data and finding new patterns.  The past is always present in new technology; you just have to drive it forward sometimes.

I will be introducing the Neolithic LBK culture in further detail in an upcoming post and discussing the merits of my thesis in another.  For now I hope you have enjoyed this brief delve into what was the core of that research, the database itself.

Notes:

(1.) This post was named in honour of a spelling mistake I made in the contents pages of my MSc thesis, spotted only when I proudly showed a friend a copy of the thesis a few weeks after the hand in date.  This, of course, led to gales of laughter from both of us (and to my internal cringing) as my poor editing skills came to light and it still remains a favoured joke to this day.

(2.) A few problems have become apparent with the strontium isotope technique, as with any mature and widespread application of a scientific technique, and it is worth mentioning them here (Bentley et al. 2004: 366).

First is the issue of what local and non-local signatures mean for the prehistoric individual, as technically the 87Sr/86Sr ratio reflects diet over a period of time, and said food could have come from non-local sources.  However, this could be a distinct benefit, as it may be possible to identify individuals whose subsistence activity took place over a diverse range of territories (Bentley et al. 2004: 366, Price et al. 2002: 131).  Secondly, diagenesis affects anything buried, and groundwater strontium has a tendency to penetrate the skeleton after burial (Bentley et al. 2004: 366).  In this study only enamel from the permanent dentition (1st or 2nd molars) is used, as this mitigates the effects of diagenesis: enamel is a strong biological material containing large mineral crystals, which renders it much less porous than bone and highly resistant to biochemical alteration (Killgrove 2010, Richards et al. 2008).  The third issue concerns the environmental heterogeneity of the strontium isotope signatures, which, as Bentley et al. (2004: 366) point out, ‘vary in different minerals of a single rock, in the leaves, stems and roots of a plant, or in water sources such as streams and precipitation’.  The measurement of small herbivore bones, or snail shells, at the locality of the archaeological site, preferably from the same chronological age, can provide a remarkably consistent 87Sr/86Sr ratio, which is representative of the local catchment area (Bentley et al. 2004: 366).  The use of strontium ratios is, however, just one tool among many used to shed light on our ancestors; it should always be used in combination with other techniques of investigation to elucidate the full range of potential data present at archaeological sites and in archaeological materials (Montgomery 2010, Richards et al. 2001, Van Klinken et al. 2000).

Bibliography:

Bentley, R. A., Price, T. D. & Stephan, E. 2004. Determining the ‘local’ 87Sr/86Sr Range for Archaeological Skeletons: A Case Study from Neolithic Europe. Journal of Archaeological Science. 31 (4): 365-375.

Bentley, R. A., Bickle, P., Fibiger, L., Nowell, G. M., Dale, C. W., Hedges, R. E. M., Hamilton, J., Wahl, J., Francken, M., Grupe, G., Lenneis, E., Teschler-Nicola, M., Arbogast, R-M., Hofmann, D. & Whittle, A. 2012. Community Differentiation and Kinship Among Europe’s First Farmers. Proceedings of the National Academy of Sciences Early Edition. doi:10.1073/pnas.1113710109. 1-5.

Bryman, A. & Cramer, D. 2011. Quantitative Data Analysis with IBM SPSS 17, 18 & 19: A Guide for Social Scientists. London: Psychology Press.

Killgrove, K. 2010. Migration and Mobility in Imperial Rome. PhD Thesis. University of North Carolina. (Open Access).

Montgomery, J. 2010. Passports from the Past: Investigating Human Dispersals Using Strontium Isotope Analysis of Tooth Enamel. Annals of Human Biology. 37: 325–346. (Open Access).

Price, T. D., Burton, J. H. & Bentley, R. A. 2002. The Characterisation of Biologically Available Strontium Isotope Ratios for the Study of Prehistoric Migration. Archaeometry. 44 (1): 117-135.

Richards, M. P., Fuller, B. T. & Hedges, R. E. M. 2001. Sulphur Isotopic Variation in Ancient Bone Collagen from Europe: Implications for Human Palaeodiet, Residence Mobility, and Modern Pollutant Studies. Earth and Planetary Science Letters. 191 (3-4): 185-190.

Richards, M. P., Montgomery, J., Nehlich, O. & Grimes, V. 2008. Isotopic Analysis of Humans and Animals from Vedrovice. Anthropologie. XLVI (2-3): 185-194.

Van Klinken, G., Richards, M. and Hedges, R. 2000. An Overview of Causes for Stable Isotopic Variations in Past European Human Populations: Environmental, Ecophysiological, and Cultural Effects. In S. Ambrose and M. Katzenberg (eds). Biogeochemical Approaches to Palaeodietary Analysis. New York: Kluwer Academic. pp. 39-63.