Blog Entry Number One
To get my new website finally online, I want to add something to the blog section. Here in the future mostly stuff should be published that is related to archaeology and statistics using R.
To start I want to add a bit that helps to understand the relationship of several items within a collection of data. To be more specific, how can I explore the correlation of lets say pottery types or in this case animal species in the archaeological remains of different sites. Do specific combination occure regularily, so that it might be the case that they are functional related?
As an example I will use the animal remains from sites of the so called Funnelbeaker Culture, on which I recently published an article that is still under review. The data mostly originate from the collection in an article of Jan Steffens (2005), but there are some additions from an [unpublished (?) master thesis of Peter Imperiale] [@Imperiale2001]. You may like to download the data set.
Within this data set the question was/is, how the different animal species do correlate, and how this can be explored/visualised intuitively. The most obvious candidate was a network that display the interdependence of those species. When two species correlate positively and significant, their representation in the network should be linked by a connection.
To start this enterprise, we first load the data into our R environment:
animals <- read.csv2("tbk_animals.csv") #note that it is the "continental" csv version!
library(knitr)
kable(animals)
id | site | Hausrind | Hausschwein | Schaf.Ziege | Hund | Pferd | Rothirsch | Reh | Elch | Ur | Wildschwein | Braunbär | Dachs | Marder | Iltis | Fischotter | Wolf | Fuchs | Luchs | Wildkatze | Biber | Feldhase | Robbe | sonstige | summe |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | Alsleben | 46 | 1 | 8 | 0 | 0 | 2 | 0 | 0 | 2 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 60 |
2 | Basedow | 87 | 26 | 15 | 9 | 29 | 265 | 71 | 10 | 25 | 81 | 2 | 4 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 10 | 0 | 0 | 0 | 634 |
3 | Bebensee | 43 | 17 | 2 | 11 | 6 | 99 | 20 | 1 | 47 | 27 | 0 | 0 | 0 | 0 | 3 | 0 | 0 | 0 | 2 | 8 | 0 | 0 | 0 | 286 |
4 | Bistoft | 213 | 112 | 209 | 24 | 0 | 363 | 17 | 5 | 22 | 72 | 0 | 0 | 2 | 0 | 43 | 1 | 0 | 0 | 0 | 53 | 1 | 0 | 0 | 1137 |
5 | Dölauer Heide | 105 | 35 | 44 | 2 | 22 | 2 | 1 | 0 | 0 | 3 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 216 |
6 | Fuchsberg-Südensee | 624 | 94 | 44 | 18 | 5 | 45 | 15 | 0 | 20 | 46 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 3 | 0 | 0 | 7 | 922 |
7 | Glasow | 37 | 11 | 68 | 0 | 0 | 5 | 10 | 0 | 21 | 3 | 2 | 2 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 161 |
8 | Großobringen | 3064 | 0 | 570 | 161 | 129 | 248 | 28 | 1 | 0 | 0 | 14 | 1 | 0 | 0 | 0 | 0 | 2 | 1 | 0 | 5 | 0 | 0 | 0 | 4224 |
9 | Haldensleben | 45 | 5 | 6 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 57 |
10 | Heidmoor | 891 | 440 | 139 | 145 | 35 | 1707 | 336 | 162 | 93 | 515 | 6 | 35 | 17 | 6 | 46 | 35 | 1 | 1 | 46 | 579 | 1 | 29 | 17 | 5282 |
11 | Hüde I | 75 | 46 | 17 | 35 | 393 | 301 | 237 | 307 | 989 | 793 | 110 | 6 | 29 | 6 | 64 | 12 | 6 | 2 | 23 | 781 | 0 | 0 | 0 | 4232 |
12 | Neukirchen-Bostholm | 190 | 148 | 36 | 4 | 0 | 12 | 2 | 0 | 2 | 4 | 1 | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 402 |
13 | Niedergörne | 60 | 29 | 39 | 0 | 0 | 6 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 135 |
14 | Runstedt | 67 | 7 | 0 | 0 | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 7 | 0 | 0 | 83 |
15 | Schalkenburg | 1508 | 404 | 792 | 104 | 24 | 37 | 67 | 0 | 12 | 28 | 3 | 3 | 6 | 0 | 0 | 3 | 9 | 0 | 3 | 1 | 1 | 0 | 49 | 3054 |
16 | Siggeneben-Süd | 38 | 41 | 9 | 3 | 0 | 16 | 1 | 0 | 3 | 6 | 1 | 0 | 0 | 0 | 1 | 2 | 0 | 0 | 4 | 0 | 0 | 8 | 2 | 135 |
17 | Stinthorst | 15 | 18 | 6 | 2 | 8 | 151 | 31 | 13 | 5 | 30 | 5 | 1 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 3 | 0 | 0 | 0 | 290 |
18 | Süssau | 568 | 113 | 97 | 12 | 0 | 18 | 0 | 0 | 0 | 3 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2 | 0 | 813 |
19 | Wangels FN | 164 | 8 | 57 | 11 | 1 | 65 | 23 | 0 | 13 | 11 | 1 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 5 | 9 | 370 |
20 | Wangels MN | 151 | 128 | 32 | 15 | 0 | 113 | 48 | 0 | 9 | 34 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 24 | 72 | 626 |
21 | Wolkenwehe | 2586 | 448 | 168 | 16 | 32 | 2875 | 416 | 16 | 104 | 288 | 0 | 8 | 1 | 0 | 8 | 2 | 8 | 0 | 8 | 456 | 0 | 8 | 0 | 7448 |
22 | Blandebjerg | 423 | 124 | 37 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 586 |
23 | Bundsø | 1405 | 1222 | 359 | 196 | 0 | 142 | 3 | 7 | 7 | 7 | 3 | 0 | 14 | 1 | 3 | 1 | 0 | 0 | 0 | 0 | 0 | 10 | 0 | 3380 |
24 | Fannerup | 183 | 106 | 149 | 10 | 7 | 46 | 10 | 0 | 13 | 16 | 0 | 1 | 0 | 0 | 3 | 0 | 1 | 0 | 0 | 0 | 0 | 17 | 0 | 562 |
25 | Lidsø | 672 | 66 | 137 | 49 | 0 | 0 | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 11 | 1 | 938 |
26 | Lindø | 1293 | 659 | 421 | 30 | 1 | 90 | 0 | 0 | 0 | 5 | 3 | 0 | 0 | 0 | 3 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 2508 |
27 | Lyø | 297 | 85 | 59 | 0 | 0 | 8 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 25 | 0 | 474 |
28 | Sølanger | 15 | 1 | 29 | 34 | 0 | 60 | 128 | 0 | 0 | 96 | 0 | 0 | 0 | 0 | 8 | 0 | 9 | 0 | 3 | 0 | 1 | 16 | 1 | 401 |
29 | Spodsbjerg | 1978 | 891 | 588 | 101 | 4 | 179 | 8 | 0 | 0 | 8 | 1 | 0 | 0 | 0 | 4 | 1 | 4 | 0 | 0 | 1 | 0 | 23 | 101 | 3892 |
30 | Svaleklint | 35 | 0 | 0 | 0 | 0 | 126 | 65 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 5 | 0 | 10 | 0 | 242 |
31 | Troldebjerg | 10459 | 11632 | 2721 | 25 | 75 | 25 | 25 | 0 | 1 | 5 | 0 | 0 | 0 | 0 | 5 | 0 | 5 | 0 | 1 | 1 | 0 | 5 | 0 | 24985 |
32 | Löddesborg | 7 | 0 | 0 | 3 | 0 | 65 | 30 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 2 | 0 | 33 | 22 | 164 |
33 | Nymölla III | 7 | 0 | 0 | 0 | 0 | 74 | 23 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 8 | 0 | 3 | 0 | 2 | 0 | 0 | 21 | 6 | 145 |
34 | Brachnowko | 24 | 76 | 107 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 8 | 0 | 1 | 0 | 0 | 80 | 0 | 0 | 0 | 0 | 0 | 0 | 296 |
35 | Cmielow | 1579 | 566 | 276 | 112 | 57 | 36 | 44 | 3 | 0 | 44 | 3 | 5 | 0 | 0 | 0 | 3 | 3 | 0 | 0 | 5 | 0 | 0 | 0 | 2736 |
36 | Grodek | 1265 | 453 | 252 | 41 | 15 | 19 | 11 | 6 | 0 | 60 | 2 | 1 | 0 | 0 | 1 | 0 | 2 | 0 | 0 | 0 | 6 | 0 | 0 | 2134 |
37 | Kamin Lukawski | 1675 | 581 | 402 | 66 | 9 | 26 | 28 | 3 | 11 | 11 | 3 | 3 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 23 | 3 | 0 | 0 | 2845 |
38 | Kruska Podlotowa | 237 | 0 | 1 | 7 | 25 | 8 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 279 |
39 | Ksiaznice Wielkie | 150 | 137 | 25 | 31 | 0 | 9 | 26 | 0 | 0 | 5 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 50 | 0 | 2 | 0 | 0 | 436 |
40 | Mrowino | 359 | 78 | 93 | 8 | 3 | 29 | 3 | 0 | 0 | 12 | 7 | 0 | 0 | 0 | 0 | 0 | 0 | 2 | 0 | 1 | 0 | 0 | 0 | 595 |
41 | Pikutkowo | 103 | 26 | 14 | 69 | 3 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 217 |
42 | Podgaj | 197 | 19 | 94 | 0 | 0 | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 312 |
43 | Srem | 1040 | 216 | 50 | 17 | 11 | 1 | 1 | 0 | 0 | 4 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 3 | 0 | 0 | 1343 |
44 | Strachow | 215 | 76 | 119 | 4 | 0 | 0 | 0 | 0 | 15 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 22 | 0 | 0 | 451 |
45 | Stryczowice | 327 | 36 | 48 | 71 | 0 | 0 | 0 | 0 | 8 | 4 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 3 | 0 | 5 | 502 |
46 | Szlachcin | 56 | 14 | 11 | 4 | 43 | 30 | 45 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 7 | 0 | 0 | 0 | 212 |
47 | Ustowo | 483 | 267 | 76 | 37 | 46 | 191 | 59 | 0 | 60 | 16 | 4 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 1 | 114 | 0 | 1 | 0 | 1357 |
48 | Zawichost-Podgorze | 1017 | 323 | 214 | 93 | 40 | 10 | 10 | 0 | 0 | 9 | 2 | 3 | 0 | 0 | 0 | 0 | 0 | 2 | 0 | 3 | 2 | 0 | 0 | 1728 |
49 | Makotrasy | 1636 | 373 | 173 | 50 | 14 | 31 | 1 | 0 | 5 | 10 | 2 | 0 | 0 | 0 | 0 | 2 | 0 | 0 | 0 | 2 | 7 | 0 | 2 | 2308 |
50 | Muldbjerg I | 33 | 0 | 3 | 0 | 0 | 665 | 116 | 0 | 0 | 5 | 0 | 0 | 2 | 0 | 115 | 0 | 0 | 0 | 0 | 179 | 0 | 0 | 4 | 1122 |
51 | Björnsholm | 1 | 0 | 2 | 0 | 0 | 6 | 9 | 0 | 0 | 3 | 0 | 0 | 1 | 0 | 0 | 0 | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 24 |
52 | Saxtorp | 246 | 78 | 90 | 1 | 3 | 4 | 5 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 429 |
53 | Almhov - CT1 | 157 | 170 | 51 | 0 | 0 | 71 | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 452 |
54 | Hunneberget | 85 | 50 | 36 | 16 | 0 | 19 | 27 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 1 | 3 | 0 | 7 | 2 | 248 |
55 | S.Sallerup 15H | 371 | 52 | 43 | 2 | 4 | 22 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 495 |
56 | Elinelund 2B | 314 | 59 | 14 | 2 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 393 |
57 | Hyllie station | 39 | 41 | 7 | 0 | 0 | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 89 |
58 | Hyllie vattentorn | 40 | 27 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 68 |
59 | Hotelltomten | 79 | 68 | 7 | 0 | 0 | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 156 |
60 | Skumparerget | 397 | 68 | 20 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 4 | 492 |
61 | Anneberg | 54 | 46 | 4 | 11 | 0 | 0 | 0 | 2 | 0 | 0 | 0 | 1 | 17 | 0 | 17 | 0 | 3 | 0 | 2 | 6 | 0 | 674 | 24 | 861 |
Displayed are now our sites with the NISP data of the species. For the construction of the network we now need a matrix, consisting of a row and a column for each species, filled with 1 for each situation when there is a positive significant correlation between the occurence of these both species. To archive this I used a modified snippet from the web (originally from Aleksandar Blagotić, function corstar.R), and the function corr.test
from the library psych
. I constructed this little function:
cornet <- function(x, y = NULL, use = "pairwise", method = "pearson", round = 3, row.labels, col.labels, ...) {
require(psych)
ct <- corr.test(x, y, use, method,adjust="bonferroni") # calculate correlation
r <- ct$r # get correlation coefs
p <- ct$p # get p-values
sig <- ifelse(p < .05, 1,0) # generate significance stars
poscorr<-ifelse(r > 0, 1,0)
m <- matrix(NA, nrow = nrow(r) , ncol = ncol(r)) # create empty matrix
rlab <- if(missing(row.labels)) rownames(r) else row.labels # add row labels
clab <- if(missing(col.labels)) {
if(is.null(colnames(r)))
deparse(substitute(y))
else
colnames(r)
} else {
col.labels # add column labels
}
rows <- 1:nrow(m) # row indices
cols <- 1:ncol(m) # column indices
m[rows, cols] <- sig*poscorr
colnames(m) <- c( clab) # add colnames
rownames(m) <- c( rlab) # add colnames
m # return matrix
}
With this at hand we can now prepare our data for the network plot:
adj <- cornet(na.omit(animals[,3:25]))
## Lade nötiges Paket: psych
## Warning in abbreviate(colnames(r), minlength = 5): abbreviate mit
## nicht-ASCII Zeichen genutzt
Second thing we need for a nice plot is the function ggnet2
from the library GGally
. This library has some dependencies, especially regarding ggplot
and sna
, so you might have to install them first!
library(GGally)
But then:
ggnet2(adj, mode = "fruchtermanreingold", size = 24, label = rownames(adj), label.size = 3, alpha=0.8)
## Lade nötiges Paket: network
## network: Classes for Relational Data
## Version 1.13.0 created on 2015-08-31.
## copyright (c) 2005, Carter T. Butts, University of California-Irvine
## Mark S. Handcock, University of California -- Los Angeles
## David R. Hunter, Penn State University
## Martina Morris, University of Washington
## Skye Bender-deMoll, University of Washington
## For citation information, type citation("network").
## Type help("network-package") to get started.
## Lade nötiges Paket: sna
## sna: Tools for Social Network Analysis
## Version 2.3-2 created on 2014-01-13.
## copyright (c) 2005, Carter T. Butts, University of California-Irvine
## For citation information, type citation("sna").
## Type help(package="sna") to get started.
##
## Attache Paket: 'sna'
## Das folgende Objekt ist maskiert 'package:network':
##
## %c%
## Lade nötiges Paket: scales
##
## Attache Paket: 'scales'
## Die folgenden Objekte sind maskiert von 'package:psych':
##
## alpha, rescale
Voila! domestic species form one subnet (pig, cattle, sheep/goat), while most wild species form another subnet. There seem to be two separate spheres of interaction with the animals, indicating that there are two types of sites resp. activities: one is hunting, the other husbandry.