Introduction
The Data
• Svodín - Southwest Slovakia
• Lengyel Culture settlement
• 915 features (122 graves)
• 420 stratigraphic relations (189 ambiguous)
• 34583 recorded finds (11161 diagnostic)
• Superimposed by Eneolithic, Bronze Age, Iron Age and Medieval settlements.
Introduction
Big Data with Big Issues
• Ambiguity
• Heterogeneity
• Fragmentarity
• Multimodality
• Description
Ambiguity
Unclear relations
• Ambiguity of relations between find complexes.
• We know there is a superposition, but we do not know the sequence.
• This information should not be discarded.
Ambiguity
Chronological phasing of 3 features based on stratigraphy
A disturbs B ⇒ A < B
All possible solutions (phase assigned to each feature)
Solution no.   A   B   C
1              3   2   1
2              3   2   3
3              2   1   2
4              2   1   3
5              3   1   2
6              3   1   3
B overlaps C ⇒ B ≠ C
Probability of features dating to phases
No. of possible solutions in which each feature dates to each phase:
Phase   A   B   C
1       0   4   1
2       2   2   2
3       4   0   3
⇒
Probability = Solutions with feature dating to phase / All possible solutions
Feature   Phase 1   Phase 2   Phase 3
A         0%        33%       67%
B         67%       33%       0%
C         17%       33%       50%
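The counting behind these probabilities can be sketched in a few lines of Python. This is only an illustration, not the author's implementation. It assumes that "A disturbs B" places A in a later phase than B (which is what the listed solutions imply) and that "B overlaps C" only forbids B and C from sharing a phase. The function names are hypothetical.

```python
from collections import Counter
from itertools import product

FEATURES = ("A", "B", "C")
PHASES = (1, 2, 3)

def is_valid(phase):
    # Assumption: "A disturbs B" puts A in a later phase than B;
    # "B overlaps C" only says B and C cannot share a phase.
    return phase["A"] > phase["B"] and phase["B"] != phase["C"]

def enumerate_solutions():
    """List every assignment of phases to features that obeys the constraints."""
    solutions = []
    for combo in product(PHASES, repeat=len(FEATURES)):
        phase = dict(zip(FEATURES, combo))
        if is_valid(phase):
            solutions.append(phase)
    return solutions

def phase_probabilities(solutions):
    """Share of solutions in which each feature dates to each phase."""
    total = len(solutions)
    counts = {f: Counter(s[f] for s in solutions) for f in FEATURES}
    return {f: {p: counts[f][p] / total for p in PHASES} for f in FEATURES}

solutions = enumerate_solutions()
probabilities = phase_probabilities(solutions)
for feature in FEATURES:
    print(feature, [f"{probabilities[feature][p]:.0%}" for p in PHASES])
```

Brute-force enumeration is fine for three features, but the number of candidate assignments grows exponentially with the number of features, so a data set of this size would need a constraint solver or sampling instead.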
Fragmentarity
Missing connections
• Fragmentarity resulting from depositional and postdepositional transformations.
• Absence of evidence ≠ evidence of absence.
• Use external / prior knowledge to fill in the gaps.
Fragmentarity
Determining superpositions based on prior knowledge
Post structure 1 < Feature A, B, C
Post structure 1 ? Feature D, E, F, G
[Figure: Post structure 1 and Features A to G]
Fragmentarity
Determining superpositions based on prior knowledge
Post structure 1 ⇒ House 1
Feature D, E ⇒ Pit 1
⇒
Feature D = Feature E
House 1 ≠ Feature F, G, Pit 1
Post structure 1 ≠ Feature D, E, F, G
[Figure: House 1, Pit 1, Post structure 1 and Features A to G]
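One way to read this step is that a relation stated for an interpreted structure is passed down to the recorded units it consists of. The sketch below illustrates that reading only. It assumes "≠" means the two units cannot share a phase, and the names `MEMBERS`, `expand` and `propagate` are hypothetical.

```python
from itertools import product

# Interpretation taken from the slide: recorded units behind each structure.
MEMBERS = {
    "House 1": ["Post structure 1"],
    "Pit 1": ["Feature D", "Feature E"],
}

# Structure-level statement from the slide: House 1 ≠ Feature F, G, Pit 1.
# Assumption: "≠" is read as "cannot share a phase".
EXCLUSIONS = [
    ("House 1", "Feature F"),
    ("House 1", "Feature G"),
    ("House 1", "Pit 1"),
]

def expand(unit):
    """Return the recorded units behind a structure, or the unit itself."""
    return MEMBERS.get(unit, [unit])

def propagate(relations):
    """Push structure-level relations down to the member units."""
    derived = set()
    for left, right in relations:
        for a, b in product(expand(left), expand(right)):
            derived.add((a, b))
    return derived

derived = propagate(EXCLUSIONS)
for a, b in sorted(derived):
    print(f"{a} ≠ {b}")
```

The derived pairs are the four relations the slide ends with: Post structure 1 ≠ Feature D, E, F, G.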
Heterogeneity
Diverse entities
Heterogeneity of data is caused by the different sizes and purposes of settlement features (e.g. pits, graves, ditches). Our model must reflect the complexity of the processes that resulted in the observed evidence. What we consider random depends on the context.
Heterogeneity
Statistical hypothesis testing
• Parametric tests – assume knowledge of the underlying distribution, e.g. Student's t-test
• Nonparametric tests – based on a simulated random distribution of the data, e.g. permutation and bootstrap tests
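A permutation test can be sketched as follows. It is a generic two-sample test on the difference of means, not necessarily the test used in the study, and the two samples are made-up numbers for illustration only.

```python
import random
from statistics import mean

def permutation_test(sample_a, sample_b, n_permutations=5000, seed=0):
    """Two-sided permutation test on the difference of means.

    Returns the share of random relabellings whose difference is at
    least as large as the observed one (with the usual +1 correction).
    """
    rng = random.Random(seed)
    observed = abs(mean(sample_a) - mean(sample_b))
    pooled = list(sample_a) + list(sample_b)
    n_a = len(sample_a)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # simulate the "random" assignment of labels
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            hits += 1
    return (hits + 1) / (n_permutations + 1)

# Hypothetical values, e.g. a measured attribute in two kinds of features.
pits = [12, 15, 9, 20, 14, 11]
graves = [5, 7, 6, 9, 4, 8]
p_value = permutation_test(pits, graves)
print(f"p = {p_value:.4f}")
```

No distribution is assumed here: the reference distribution is built from the data by shuffling the labels, which is why the choice of what gets shuffled has to match what is considered random in the given context.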
Multimodality
Ups and downs
• Development of styles can have multiple peaks in time.
• Models used by ordination methods must reflect this fact.
Multimodality
Unimodal distributions of ceramic attributes in time
[Figure: frequency ratio over time for Svodín and Troy; plotted attributes: Ware 662, Type G1]
Multimodality
Multimodal distributions of ceramic attributes in time
[Figure: frequency ratio over time for Svodín and Troy; plotted attributes: Type Tasse-1, Ware 616]
Multimodality
Unimodal distributions of frequencies in time
Mode A < Mode B ⇒ Type A < Type B
[Figure: frequency curves of Type A (Mode A) and Type B (Mode B)]
Multimodality
Multimodal distributions of frequencies in time
Mode A < Mode B ⇒ Type A < Type B: ERROR
Using methods assuming unimodal distributions (Seriation, Correspondence Analysis, Principal Component Analysis, etc.)
[Figure: frequency curves of Type A and Type B with Mode A and Mode B marked]
Multimodality
Multimodal distributions of frequencies in time
Mode A2 > Mode B1 ⇒ Type A ≮ Type B
Using methods assuming multimodal distributions
[Figure: frequency curves of Type A (Modes A1, A2) and Type B (Modes B1, B2)]
Multimodality
Multimodal distributions of frequencies in time
Mode A2 < Mode B1 ⇒ Type A < Type B
Using methods assuming multimodal distributions
[Figure: frequency curves of Type A (Modes A1, A2) and Type B (Modes B1, B2)]
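The rule shown on the last two slides can be written down as a small check. This is a sketch under stated assumptions: each mode is represented by a number on the ordering axis, "<" on the slides is read as an ordinary numeric comparison of those positions, and the function name and the example positions are hypothetical.

```python
def type_precedes(modes_a, modes_b):
    """Return True only if the last mode of A lies before the first mode of B.

    modes_a, modes_b: positions of the modes of each type on the ordering axis.
    """
    return max(modes_a) < min(modes_b)

# Mode A2 > Mode B1: the distributions interleave, so Type A ≮ Type B.
interleaved = type_precedes(modes_a=[1.0, 4.0], modes_b=[2.0, 5.0])

# Mode A2 < Mode B1: every mode of A precedes every mode of B, so Type A < Type B.
separated = type_precedes(modes_a=[1.0, 2.0], modes_b=[3.0, 5.0])

print(interleaved, separated)
```

With a single mode per type the check reduces to Mode A < Mode B, which is the unimodal case.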
Description system
Who is who
• Graphs can be used as a method of storage.
• Relations between entities hold the most important information about them.
• Graph databases (sets of Entity – Relation – Entity triplets) are a more natural way to describe archaeological data.
Description system
Graph Database
• Features, Finds and Interpreted Structures = Vertices
• Contextual and stratigraphic relations = Edges
e.g.:
Feature ⇒ Contains ⇒ Find
Feature ⇒ Disturbs ⇒ Feature
(Vertex ⇒ Edge ⇒ Vertex)
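A store of Entity – Relation – Entity triplets can be sketched with plain Python structures. This is not the database used in the project. The class `TripleStore` and the entity labels are hypothetical, and only the relation names "Contains" and "Disturbs" come from the slide.

```python
from collections import defaultdict

class TripleStore:
    """Minimal in-memory store of (vertex, edge, vertex) triplets."""

    def __init__(self):
        # Outgoing edges per source vertex: source -> [(relation, target), ...]
        self._out = defaultdict(list)

    def add(self, source, relation, target):
        self._out[source].append((relation, target))

    def targets(self, source, relation):
        """All vertices reached from `source` by an edge named `relation`."""
        return [t for r, t in self._out.get(source, []) if r == relation]

store = TripleStore()
# Hypothetical entities, used only to show the two edge types.
store.add("Feature 1", "Contains", "Find 1")
store.add("Feature 1", "Contains", "Find 2")
store.add("Feature 1", "Disturbs", "Feature 2")

print(store.targets("Feature 1", "Contains"))
print(store.targets("Feature 1", "Disturbs"))
```

Because every statement is a triplet, contextual and stratigraphic relations are stored in the same way and can be queried together.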
Conclusions
• Accept and work with uncertainty.
• Clearly define and apply all prior knowledge to fill in the gaps in evidence.
• Provide the simplest explanation of the evidence, without ignoring the complexity of the underlying structures.
• Use appropriate models for analysis.
http://uniba.academia.edu/demjan