A Comparative Assessment of Structure from Motion Methods for Archaeological Research

Susie Green (corresponding author) a, Andrew Bevan b, Michael Shapland c

a UCL Institute of Archaeology, 31-34 Gordon Square, London WC1H 0PY. Email: [email protected] Tel: 07738242888
b UCL Institute of Archaeology, 31-34 Gordon Square, London WC1H 0PY
c Archaeology South-East / UCL Centre for Applied Archaeology, 2 Chapel Place, Portslade, East Sussex BN41 1DR

Abstract

This paper addresses the use of open source structure from motion methods for creating 3d pointclouds from photographs, compares these workflows with alternatives in other software, and assesses their accuracy relative to other 3d modelling methods. It describes a series of case studies that use structure from motion to record standing buildings and create digital elevation models. Comparing it with other recording techniques, it finds that structure from motion can produce better results than traditional techniques such as plan drawing, topographic survey and photogrammetry, and is cheaper and more accessible than newer techniques such as laser scanning and LiDAR, although it is less accurate in some respects. It demonstrates that good accuracy can be achieved if careful measurements are made, and concludes that the approach has great potential for widespread archaeological application.

Keywords: Structure from Motion; Computer Vision; Bundler; open source; photogrammetry; multi-view stereo

1. Introduction

Structure from motion (SfM) is the name given to the extraction of three dimensional data and camera positions from a collection of photographs. It is the computer vision equivalent of a human’s ability to understand the 3d structure of a scene as they move through it (Szeliski, 2010). In practical terms this is a photogrammetric approach that involves taking a collection of photographs of an object or scene that we wish to record, and processing them through a series of computer programs to create a 3d pointcloud representing the surfaces of the objects in the photographs. The resulting pointcloud can then be processed further to create, among other things, 3d models, digital elevation models, rectified images, scaled plans, elevations and cross sections.

This paper investigates the use of the open source programs Bundler and Patch Based Multi-View Stereo (PMVS2) to create pointclouds automatically, without the need for human input. Bundler and PMVS2 are both released under the GNU General Public License. Most papers describing the use of SfM in archaeology have focussed on Agisoft’s Photoscan, the commercial software most directly comparable to Bundler and PMVS2 (De Reu, et al., 2013, Olson, et al., 2013, Verhoeven, 2011, Verhoeven, et al., 2012). Photoscan enables georeferencing, the creation of textured models from the resulting pointcloud, and the creation of digital elevation models. However, at a few thousand US dollars at current prices for the professional edition (Agisoft, 2013), it is beyond the budget of many archaeologists, and there is nothing that can be achieved using this software that cannot be reproduced using open source programs. Indeed, open source technology is important to archaeology not only because it is free to use, but more importantly because the source code is free for anyone to interrogate and further develop.
If it is embraced by archaeologists at an early stage then it can be developed in a direction that is advantageous to archaeology’s specific research and management agendas. In addition, open source solutions allow the archaeologist to retain ownership of the data throughout the process. Archaeological records are often paid for by public money, and it is therefore important that they belong to the public and can be accessed by them, or at least accessed on their behalf. Commercial software companies often store data in formats or under licensing arrangements that mean they are not available to the public, this being a well-known issue for laser-scanning archives. For various reasons the owners of the data (the public) may lose access to the software with which to read it: for example, the license may no longer be affordable to the organisation holding the data, the company producing the software may go bust, or the software may simply no longer be sold. In the case of open source software the code with which stored data is encoded is freely available to anyone, so if the software needed to read it is no longer available, it can always be rewritten.

The SfM open source toolchain needed to produce a textured model or digital elevation model consists of several different pieces of software, and this multi-package workflow can be daunting for users when compared to the end-to-end solution offered by commercial software such as Photoscan. However, this modular approach also has its advantages, as it allows the archaeologist to understand each step in the process, to see where problems lie, and to vary his/her pipeline to get the best results from each data set. The adoption of Bundler and PMVS2 by ArcheOS, the open source operating system for archaeologists (ArcheOS, 2011), is an encouraging start. Ducke and colleagues have already documented the recording of an excavation of a burial site using Bundler and PMVS2, but their discussion focused on the robustness, simplicity and affordability of the process, rather than considering the accuracy of the results (Ducke, et al., 2011). This paper aims to build on this work by directly comparing SfM with other recording techniques used by archaeologists in order to assess the accuracy (the degree to which the results match their real world values) and precision (the resolution of the results).

2. Methods

2.1 The Investigation

The possible applications of SfM in archaeological research are numerous. This paper uses four case studies to illustrate some of the areas in which it could be of benefit to archaeology, and compares the results with more traditional recording techniques. The first case study looks at the recording of standing buildings compared to traditional plan drawing. The second demonstrates how SfM can be used to record highly irregular shapes and create cross sections. The third looks at the use of SfM to create digital elevation models; in order to cover large areas of ground, a kite was used to elevate the camera into the air. A surface was created from the pointcloud that can be represented by x and y geographical coordinates, holding the z (elevation) information as an attribute. These models were compared with a topographic survey map created using total station survey data, and with a digital terrain model created from LiDAR. The final case study uses a small scale digital elevation model to examine surface deformation.

In each of the case studies a number of control points were surveyed. The accuracy of the SfM approach was established using the distance between the real world points and their equivalents in the pointcloud. This paper also examines the precision achieved by SfM, which can be determined by the amount of detail that is evident in the pointcloud, and looks at the relationship between accuracy and precision and its implications when interpreting results.

2.2 Source Photographs

In order to create a pointcloud without gaps it is usually necessary to use a large number of images; these case studies use between 250 and 650 photographs each. Each point in the pointcloud must be visible in at least three photographs taken from different positions, and the number of images necessary depends on the complexity of the shape of the subject. The angle between each pair of photographs should not be more than 15 degrees, but should also not be too small or PMVS2 will be unable to locate the points in space correctly. All the case studies described here use a low cost digital camera (a Canon G11).
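The angle constraint above can be checked numerically when planning or reviewing a shoot. The sketch below is illustrative only, using hypothetical viewing-direction vectors rather than output from any SfM package:

```python
import math

def angle_between_views(dir_a, dir_b):
    """Angle in degrees between two camera viewing directions."""
    dot = sum(a * b for a, b in zip(dir_a, dir_b))
    norm_a = math.sqrt(sum(a * a for a in dir_a))
    norm_b = math.sqrt(sum(b * b for b in dir_b))
    # Clamp to guard against floating point drift outside [-1, 1].
    return math.degrees(math.acos(max(-1.0, min(1.0, dot / (norm_a * norm_b)))))

# Two cameras aimed at the subject from positions roughly 10 degrees apart.
a = (0.0, 0.0, -1.0)
b = (math.sin(math.radians(10)), 0.0, -math.cos(math.radians(10)))
angle = angle_between_views(a, b)
assert 0.0 < angle <= 15.0  # within the recommended upper bound
```

In practice the viewing directions would come from the camera parameters that Bundler itself recovers, but the geometry of the check is the same.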

2.3 SfM Algorithms and Software Considerations

In traditional photogrammetry, either the positions of the cameras, or the positions of some points that are visible in more than one image, are known, and these are triangulated in order to find the location of other points in the photographs. In structure from motion, matches are made between many points across many images without prior knowledge of the camera positions (Lowe, 2004). The process involves two steps:



Step 1. The photographs are examined to find matching points across several images, and these are used to calculate the positions of the cameras relative to each other.

Step 2. Now that the camera positions are known, the location of many points can be plotted in space, giving a dense reconstruction of the shape of the objects that were photographed.

The first step is iterative, with the results being refined as more points are added. As each camera is added a check is made that the matches fall within the expected parameters, and if not it is rejected. As the number of images increases, the time taken to assess each camera grows rapidly, which means that for very large image sets the process is impractically slow (Snavely, et al., 2006). The second step of the process uses the same principles as traditional photogrammetry, but is capable of processing huge numbers of points. The result is a pointcloud that looks similar to those produced by a laser scanner. For the case studies in this investigation the first step was carried out using Bundler and the second using PMVS2. A major drawback of the PMVS2 dense pointcloud generation is that its processing time scales badly. This problem is addressed by CMVS, which breaks the task up into smaller chunks (Furukawa et al., 2010). Currently, the above pipeline can be made simpler still via VisualSFM, which has the further advantage of using multiple computer cores to speed up the process of locating camera positions and parameters (Wu et al., 2011; not open source, but free for non-commercial use), and includes a graphical user interface that makes implementation simpler. Another option is ArcheOS, the open source operating system for archaeologists (ArcheOS, 2011), which incorporates the Bundler pipeline and simplifies it with a Python wrapper. Other solutions that are free for non-commercial use include Photosynth from Microsoft, a web based tool which only produces a low resolution pointcloud but is very fast and therefore useful for testing sets of photographs (Microsoft, 2014), and 123D Catch from Autodesk, another fast, free, web based solution which can also produce models and surface textures, although the results are mixed and better suited to discrete objects than to large features or landscapes.
It is designed as part of a 3D printing pipeline (Autodesk, 2014). Bundler and most other structure from motion methods of this kind have no concept of the absolute world space represented in the photographs, so pointclouds are instead created in an arbitrary space and must be georeferenced in a further step. To do this we use distinctive markers or features within our photographic subjects, and survey them to find their real-world locations. These points can be picked out of the pointcloud and a comparison made between their geographical coordinates and their virtual coordinates, and the difference can be used to calculate an affine transformation to georeference the pointcloud correctly (e.g. using CloudCompare; Girardeau-Montaut, 2014). Further manipulation of pointclouds can be performed using the open source program Meshlab (Meshlab, 2013), which also allows the creation of polygon meshes and the texturing of surfaces using the original photographs. It should be noted that this entire pipeline is possible within the commercial program Photoscan. Table 1 contains an outline of each part of the process for the software described above.

Bundler & PMVS2
  Low density pointcloud: Using one of a number of available scripts (e.g. ArcheOS) to point at a set of photographs, the bundler pipeline returns a text file containing camera locations, parameters and a low density collection of points.
  High density pointcloud: A second script uses the bundler output to remove distortions from the photographs and launch the PMVS2 process. The result is one or more .ply files containing a high density pointcloud.
  Georeferenced mesh and texture: Further processing must be done in other software such as CloudCompare (to georeference points) and Meshlab (to create and texture meshes).

VisualSFM & PMVS2
  Low density pointcloud: VisualSFM has a GUI that allows a set of photographs to be loaded, and takes the user through a series of steps to match points and find camera positions. The cameras and low density points are displayed in a 3D viewer.
  High density pointcloud: PMVS2 can be launched from within the VisualSFM GUI. The result is a series of .ply pointcloud files very similar to those produced by Bundler and PMVS2. These can be viewed in the GUI or opened in an external editor such as Meshlab.
  Georeferenced mesh and texture: Further processing must be done in other software such as CloudCompare (to georeference points) and Meshlab (to create and texture meshes).

Photosynth
  Low density pointcloud: A set of photographs is uploaded to the Photosynth website. The result is an online 3D viewer which displays a low density collection of points.
  High density pointcloud: No further processing is currently possible.

123D Catch
  A series of photographs is uploaded to the 123D Catch website and the result is a fairly low resolution textured 3D model that can be downloaded. Georeferencing can be taken from points on the images.

Photoscan
  Low density pointcloud: A set of photographs is loaded into the software and when processed the camera positions and low density points are displayed in a 3D viewer.
  High density pointcloud: The next stage in the process produces a high density pointcloud which can also be displayed in the viewer.
  Georeferenced mesh and texture: Further processing within Photoscan will produce a model and a high resolution texture. It also has the facility to georeference points.

Table 1. The key steps in the SfM process for different toolchains.
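The georeferencing step described above amounts to estimating an affine transformation from pairs of control points. A minimal sketch in Python with NumPy follows; the control points here are synthetic, not real survey data, and software such as CloudCompare performs an equivalent fit internally:

```python
import numpy as np

def fit_affine(src, dst):
    """Least-squares affine transform (3x3 matrix A and translation t)
    mapping pointcloud coordinates `src` onto surveyed coordinates `dst`.
    Needs at least four control points that are not all coplanar."""
    src = np.asarray(src, float)
    dst = np.asarray(dst, float)
    # Augment with a column of ones so translation is solved jointly.
    X = np.hstack([src, np.ones((len(src), 1))])
    coeffs, *_ = np.linalg.lstsq(X, dst, rcond=None)  # shape (4, 3)
    A, t = coeffs[:3].T, coeffs[3]
    return A, t

def apply_affine(A, t, pts):
    return np.asarray(pts, float) @ A.T + t

# Synthetic check: recover a known scale and offset from four control points.
src = [[0, 0, 0], [1, 0, 0], [0, 1, 0], [0, 0, 1]]
dst = [[10, 20, 5], [12, 20, 5], [10, 22, 5], [10, 20, 7]]  # scale 2, offset (10, 20, 5)
A, t = fit_affine(src, dst)
assert np.allclose(apply_affine(A, t, src), dst)
```

With real data the fit is over-determined, and the residuals of the least-squares solution are exactly the point discrepancies reported for each case study below.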

3. Case Studies: Standing Buildings

3.1 St Andrew’s Church, Jevington

The first case study looks at the potential of structure from motion to record plans and elevations of standing buildings as a replacement for traditional scale drawing. The varied and non-reflective surfaces of brick and stone make it easy for Bundler to find distinctive points for matching, making standing masonry an ideal candidate for SfM. Our case study looks at St Andrew’s Church, Jevington, in East Sussex, for which data, in the form of georeferenced points from total station survey and some elevation drawings, had previously been produced. Jevington was visited on June 25th 2011 and 244 photographs of the church tower were processed successfully to create a pointcloud using Bundler and PMVS2. The georeferenced total station points used distinctive points on the masonry of the church tower, and these had to be estimated and matched in the pointcloud. For each point used in the transformation the distance between the surveyed point and the transformed pointcloud point was calculated. The greatest distance between the two sets of points was 4.4cm; for an additional set of 5 control points the greatest distance was 8.3cm. Given the difficulty of matching the features across both files, and the uncertainty of the exact location for which the measurements were made, the largest discrepancies may well be caused by human error. In order to be useful as a record of the building fabric, we must be able to see the features clearly. This was achieved by creating a polygon mesh from the pointcloud using a Poisson Surface Reconstruction filter (Meshlab, 2013). The colours of the points were used to colour the mesh, and the result is a realistic representation of the exterior walls in 3d (Fig.1).
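The accuracy check used here is a per-point Euclidean distance between each surveyed control point and its counterpart in the georeferenced pointcloud. A small sketch, with hypothetical coordinates rather than the real Jevington survey data:

```python
import math

def residuals(surveyed, transformed):
    """Euclidean distance between each surveyed control point and its
    counterpart picked out of the georeferenced pointcloud."""
    return [math.dist(p, q) for p, q in zip(surveyed, transformed)]

# Hypothetical coordinates (metres), for illustration only.
surveyed    = [(0.00, 0.00, 0.00), (5.00, 0.00, 1.00), (5.00, 4.00, 1.50)]
transformed = [(0.02, -0.01, 0.00), (5.03, 0.00, 1.01), (4.98, 4.02, 1.50)]
d = residuals(surveyed, transformed)
print(f"greatest: {max(d):.3f} m, mean: {sum(d) / len(d):.3f} m")
```

The greatest and mean of these residuals are the quantities tabulated for every case study in section 5.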

In fig.2 an orthographic representation of the south wall of the tower can be compared to the elevation drawing. The red points represent the georeferenced points from the total station survey. The SfM model is a closer match to the points than the drawing, as can be seen in the top section around the circular windows. The drawing was created from a photograph that was rectified using the same total station points (Shapland, 2012), and this fairly approximate polynomial transformation of the image is likely to have introduced some distortion. Structure from motion, by contrast, appears to create pointclouds that are free of perspective distortion.

3.2 St Benedict’s Basilica at Bury St Edmunds

The second test subject is a piece of masonry in the grounds of Bury St Edmunds Abbey. This section of wall is extremely irregular in shape and would be hard to capture accurately using plan drawings (see fig.3). It is possible that this piece of masonry is part of the Saxon ‘St Benedict’s Basilica’ as described by Whittingham: ‘On one of the two lumps representing the Infirmary Chapel of St. Michael, dedicated by Archbishop W. Corbeuil (1123-36), a Norman external string-course below former N. windows abuts an earlier wall. From the latter instead of a Norman cross-vault springs a barrel vault running, astonishingly, N. and S. across an aisled nave’ (Whittingham, 1951). The barrel vault arches towards the west and undercuts much of the masonry. This piece of masonry had not yet been surveyed or planned, so before photographing it a series of points were marked out around the base of the wall using coloured circles, and their locations were recorded with a total station, along with a number of points on the wall itself to use later to check the accuracy of the pointcloud transformation. The circle markers were photographed along with the masonry so they would be visible in the pointcloud (fig.3a). 453 photographs were taken of this piece of wall, with the camera mounted on a pole to take shots from above.

Having the markers visible in the pointcloud made georeferencing simple. All the markers used for the transformation were placed on the ground rather than on the wall itself so as not to obscure the masonry. To check that the pointcloud was scaled correctly in all dimensions some distinctive points on the wall were also surveyed. It was estimated that it should be possible to pinpoint these to within approximately the nearest 3cm. When the difference between the georeferenced pointcloud and the control points was calculated, the greatest distance between them was found to be 2.9cm. After creating a mesh (using the same filter as above), we can plot elevations for the front, back and sides, and in this case also the top of the wall. However, to fully understand the shape of the mesh we can also use the 3d model to examine cross sections. Fig.4 shows the mesh sliced parallel to the ground plane (4a) and perpendicular to the north/south axis (4b to f). This allows us to see the curve of the barrel vaulted roof described by Whittingham (Whittingham, 1951). The plans also suggest the presence of a blocked window or an altar-niche at A, shown in cross section in 4e, and a possible curve of a doorway at B (Shapland, 2012). The cross sections also reveal useful and hitherto under-appreciated information for the conservation of this structure: in particular, the masonry is less solid than it appears at first, being extremely thin in some sections.
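At its simplest, slicing a pointcloud for a cross section means keeping only the points near a chosen plane. The toy sketch below illustrates the idea on invented coordinates; real slicing of a mesh, as done here in Meshlab, also intersects the polygon faces:

```python
def cross_section(points, axis, value, tolerance=0.01):
    """Points lying within `tolerance` of the plane `axis = value`.
    axis: 0 = x, 1 = y, 2 = z (a z-slice is parallel to the ground plane)."""
    return [p for p in points if abs(p[axis] - value) <= tolerance]

# Toy pointcloud: a wall face sampled at two heights (coordinates in metres).
wall = [(0.0, 0.0, 0.1), (0.5, 0.0, 0.1), (0.0, 0.0, 1.2), (0.5, 0.02, 1.201)]
slice_z = cross_section(wall, axis=2, value=1.2, tolerance=0.05)
assert len(slice_z) == 2  # only the two upper points fall in the slice
```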

4. Case Studies: Digital Elevation Models

4.1 Windmill Hill

In order to investigate the accuracy of GIS data created using SfM we need a high resolution digital elevation model for comparison. The solution is to find an archaeologically rich landscape for which LiDAR data is available. Windmill Hill near Avebury in Wiltshire has existing LiDAR coverage at 0.5m resolution, the highest commercially available, and it has distinct archaeological earthworks: a Neolithic causewayed enclosure with three concentric ditches, four Bronze Age barrows within the area of the enclosure, and several more outside (Keiller and Smith, 1965).

Windmill Hill was photographed in the first week of July using a camera rig attached to a kite. High winds meant that the camera was constantly moving, so a very high shutter speed had to be used to prevent motion blur. Dark conditions meant using a large aperture and high ISO, resulting in slightly grainy images. 494 photographs of acceptable quality were taken over a period of about an hour and a half, before it became too wet to continue. The area covered was roughly a third of the causewayed enclosure, about 250m by 250m. In addition to the photography, a series of coloured markers of 15cm diameter were surveyed using a total station. The markers are visible in the photographs and show up in the pointcloud as a few pixels each of contrasting colours. The greatest discrepancy between the location of the markers and the transformed points is 0.318m.

A DEM was created from the pointcloud. This is relatively easy because a commonly used point cloud format such as .ply contains a simple list of coordinates followed by colour values, separated by spaces, which can be read into an open source GIS and interpolated to create a DEM (e.g. GRASS GIS r.in.xyz followed by r.resamp.rst). In addition to the DEM, we can create a matching raster dataset which summarises the colour information from the photographs. As the markers were surveyed to a local grid, a further transformation was needed to match the SfM DEM to the LiDAR data. This was done with an affine geographic transformation (first order polynomial) using two points; the elevation data was also matched at the highest point on both maps (but not scaled).
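The simple ASCII .ply layout described above (coordinates then colour values, space separated, after the header) can be pulled apart in a few lines before handing the coordinates to a GIS. This is a minimal sketch for that layout only; binary .ply files need a proper reader:

```python
def ply_to_xyz(ply_text):
    """Extract (x, y, z) tuples from a minimal ASCII .ply file, assuming
    each vertex line after 'end_header' starts with three coordinates."""
    lines = ply_text.splitlines()
    body = lines[lines.index("end_header") + 1:]
    rows = []
    for line in body:
        parts = line.split()
        if len(parts) >= 3:
            x, y, z = map(float, parts[:3])
            rows.append((x, y, z))
    return rows

sample = """ply
format ascii 1.0
element vertex 2
property float x
property float y
property float z
property uchar red
property uchar green
property uchar blue
end_header
1.0 2.0 3.5 200 180 160
1.1 2.0 3.4 190 181 150"""
assert ply_to_xyz(sample) == [(1.0, 2.0, 3.5), (1.1, 2.0, 3.4)]
```

The resulting x y z rows are exactly what a tool like GRASS GIS r.in.xyz expects as input.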

Figures 5a and b show the area of Windmill Hill that is covered by both maps. It is immediately apparent that there are problems with the SfM map: the elevation values are patchy, and there appears to be a fault line running unevenly from the north east to the south west. However, before looking at the problems evident in this map it is worth considering where it has succeeded. The geographical coordinates are so similar that the two maps can be overlaid without any blurring of the features. The height data is also largely correct, with the average values across the maps being very similar.

So what has caused the fault line? Most likely we are looking at a combination of two factors: the impact of wind on image capture, and the presence of vegetation. The grass on the hill was several inches long when it was photographed and there were large patches of nettles covering parts of the hill. The vegetation itself can account for the smaller variations in height, but not for the fault line. It is likely that the latter was caused by the movement of the vegetation in the strong winds, which allowed Bundler to match features that had moved between images. The LiDAR data to which we are comparing the DEM is a Digital Terrain Model, or DTM, which has been post-processed to remove the heights of vegetation above the ground surface (Crutchley, 2010). Therefore, comparing this with the SfM DEM is not strictly comparing like with like, and some discrepancy between the two is to be expected. The influence of the vegetation can be demonstrated if we compare the DTM and DEM in fig.5a and b. The features marked with an A are the ditches of the causewayed camp. They are clearly visible in the LiDAR DTM, but almost completely invisible in the SfM DEM because they are overgrown by vegetation. The feature at B is equally clear in both elevation models because a path runs along this line, so the vegetation has been trampled down to ground level. Unfortunately, there is no way to remove the influence of vegetation from SfM data unless the vegetation is very thin and the ground is clearly visible beneath.

To learn more about the accuracy of the height data, it is worth looking at a smaller area around the largest barrow, near the crest of the hill. A colour map was derived using the RGB values of the points (Fig.6a), and the height of the DEM was matched to the LiDAR DTM at the highest point of the barrow, as this was free of vegetation. If we subtract the SfM DEM from the LiDAR data we are left with a map showing the difference between the two (Fig.6b).

The white values are the areas of the DEM that are between 0 and 30cm higher than the LiDAR data. Most of the map falls within these values, which would be expected if the difference is due to vegetation. Very few areas have a lower value (pink); the dark pink patches are most likely due to a slight geographical offset between the two sets of data. If we compare the areas where the DEM is over 30cm higher with the SfM RGB map in fig.6, we can see that these correspond to the darker patches of nettles. It is also apparent from the colour map, which has not been reduced to a lower (0.5m) resolution to match the LiDAR data, that the SfM maps offer potentially much higher resolution (i.e. in terms of observations per unit area).
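The DEM differencing and banding just described is straightforward raster arithmetic. A sketch with NumPy on a toy 3x3 grid (the values are invented, not the Windmill Hill data):

```python
import numpy as np

# Toy 3x3 grids standing in for the LiDAR DTM and the SfM DEM (metres).
lidar = np.array([[10.0, 10.1, 10.2],
                  [10.0, 10.1, 10.2],
                  [10.0, 10.1, 10.2]])
sfm   = np.array([[10.1, 10.2, 10.7],   # 10.7: tall vegetation raises the surface
                  [10.1, 10.2, 10.3],
                  [ 9.9, 10.2, 10.3]])  # 9.9: below the DTM, e.g. a slight offset

diff = sfm - lidar
veg_band = (diff >= 0.0) & (diff <= 0.3)  # "white": 0-30cm above the DTM
below    = diff < 0.0                     # "pink": SfM lower than LiDAR
tall_veg = diff > 0.3                     # candidate nettle patches

print(veg_band.sum(), below.sum(), tall_veg.sum())
```

In a GIS the same subtraction is a single map-algebra expression, with the bands becoming the colour classes of the difference map.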

4.2 Fishbourne Dolphin Mosaic

Fishbourne Roman Palace is a large Roman villa near Chichester in West Sussex. Excavations in 1960 unearthed a large number of mosaics dating from the first to third centuries AD. The mosaics now form part of a museum which was constructed in situ, covering over the whole of the north wing of the villa (Cunliffe, 1971). The ground underneath the ‘Cupid on a Dolphin’ mosaic is subsiding, and the mosaic is gradually collapsing with it. Fishbourne museum were interested to see if structure from motion could be used to create a detailed DEM of an individual mosaic which could be used to monitor further subsidence in the future. The mosaic is too fragile to stand on, so in order to get detailed photographs of the centre the remote controlled camera rig used for the kite was hung from a rope stretched across the mosaic. The results were obtained from 653 photographs. The amount of detail in the pointcloud is best illustrated by the untextured 3d model created from the points (fig.7); it is possible to make out the broken edges of the tiled areas, and in some places individual tiles.

To georeference the pointcloud and assess the accuracy of the DEM, 46 points on the mosaic were surveyed using a total station. The use of a mosaic as a subject presents an opportunity to georeference points without using markers: instead, tiles in distinctive positions were chosen and noted on a plan of the mosaic. The individual tiles are approximately 1cm across, and as there were as few as 3 pointcloud points representing each tile, the accuracy cannot be much greater than the nearest 1cm. The greatest difference between the surveyed and transformed points is 0.006m, which is well within our estimated error of 1cm.

5. Comparing the Accuracy of SfM

If SfM is to become a useful tool we must have some confidence in the accuracy of the results. It is impossible to achieve 100% accuracy of measurement by any means, so the calculations below attempt to assess whether the accuracy of the SfM pointcloud is comparable to the alternative methods used in each case study. Table 2 documents the distances between the georeferenced real-world points and the transformed pointcloud points and presents these results as a percentage of the pointcloud size for each study. This table includes data from the four case studies described above, as well as from three further studies of very similar type.

Site            | Points tested | Greatest distance | Mean distance | Largest dimension of pointcloud | Greatest distance (% of size) | Mean distance (% of size)
Jevington       | 13 | 0.083m | 0.039m | 15.556m  | 0.534% | 0.251%
Bury St Edmunds | 13 | 0.029m | 0.019m | 5.937m   | 0.488% | 0.320%
Bury Abbey      | 5  | 0.082m | 0.055m | 128.841m | 0.064% | 0.043%
Edinshall       | 6  | 2.048m | 0.962m | 175.285m | 1.168% | 0.549%
Windmill Hill   | 9  | 0.318m | 0.188m | 355.246m | 0.090% | 0.053%
Fishbourne      | 42 | 0.011m | 0.005m | 12.734m  | 0.086% | 0.039%
Convoys Wharf   | 12 | 0.058m | 0.036m | 35.581m  | 0.163% | 0.101%

Table 2. Highest and mean distances between observed (surveyed) points and their associated points in the georeferenced pointclouds.
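The percentage columns of Table 2 are simply each distance normalised by the pointcloud's largest dimension, which can be verified directly:

```python
def as_percentage(distance_m, largest_dimension_m):
    """Express a control-point discrepancy as a percentage of the
    pointcloud's largest dimension, as in Table 2."""
    return 100.0 * distance_m / largest_dimension_m

# Jevington row: greatest distance 0.083m over a 15.556m pointcloud.
assert round(as_percentage(0.083, 15.556), 3) == 0.534
# Fishbourne row: 0.011m over 12.734m.
assert round(as_percentage(0.011, 12.734), 3) == 0.086
```

Normalising in this way is what allows case studies at very different scales, from a single mosaic to a hillside, to be compared in one table.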

The mean difference for the case studies is between 0.549% of the pointcloud size (Edinshall) and 0.039% (Fishbourne). The case studies have already suggested that there is a correlation between the discrepancy in the results and the degree of error introduced when measuring or matching the points. To investigate whether this correlation is real, an estimate has been made of the degree of error likely in either the measurement of the real world coordinates or in the selection of matching points in the pointcloud. These estimates are documented in table 3.

Site            | Estimated error in measurement | Reason for estimate | Estimated error (% of pointcloud size)
Jevington       | 0.05m  | estimated variation between distinctive stones and the points representing them | 0.321%
Bury St Edmunds | 0.025m | distance between pointcloud points on the most poorly represented markers | 0.421%
Bury Abbey      | 0.075m | radius of the marker, sometimes represented by a single point | 0.058%
Edinshall       | 1.00m  | approximate average height difference between 10m cells of the OS DEM against which it is georeferenced | 0.570%
Windmill Hill   | 0.075m | radius of the marker, sometimes represented by a single point | 0.021%
Fishbourne      | 0.005m | radius of a single mosaic tile, used as marker reference | 0.039%
Convoys Wharf   | 0.03m  | radius of the marker, sometimes represented by a few scattered points | 0.084%

Table 3. Estimated errors due to measurement of coordinates and selection of points.

To assess the correlation between these two sets of values, the mean distances between points were plotted against the estimated errors, and a linear regression was calculated using the statistical program R. The adjusted R² value for this relationship is 0.9591, which suggests a very strong correlation, with a highly significant p value of 0.0000737 (calculated using Student’s t). Fishbourne has the smallest estimated error of measurement (0.039%) and a large number of surveyed points to consider, so further investigation was carried out on this dataset. Table 4 shows the elevation differences between the surveyed points and the same points in the DEM created from the SfM pointcloud.
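The regression can be reproduced from the tabulated percentages. The data below are transcribed from Tables 2 and 3; the result should land close to the reported adjusted R² of 0.9591, with small differences reflecting rounding in the tabulated values:

```python
# Mean distances (Table 2) and estimated errors (Table 3), both as a
# percentage of pointcloud size, for the seven case studies in order:
# Jevington, Bury St Edmunds, Bury Abbey, Edinshall, Windmill Hill,
# Fishbourne, Convoys Wharf.
mean_dist = [0.251, 0.320, 0.043, 0.549, 0.053, 0.039, 0.101]
est_error = [0.321, 0.421, 0.058, 0.570, 0.021, 0.039, 0.084]

n = len(est_error)
mx = sum(est_error) / n
my = sum(mean_dist) / n
sxy = sum((x - mx) * (y - my) for x, y in zip(est_error, mean_dist))
sxx = sum((x - mx) ** 2 for x in est_error)
syy = sum((y - my) ** 2 for y in mean_dist)

r_squared = sxy * sxy / (sxx * syy)
adj_r_squared = 1 - (1 - r_squared) * (n - 1) / (n - 2)
print(f"adjusted R^2 = {adj_r_squared:.4f}")
```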

ID  Diff    | ID  Diff    | ID  Diff    | ID  Diff    | ID  Diff    | ID  Diff    | ID  Diff
0   -0.001  | 6   -0.003  | 12  -0.006  | 18  -0.005  | 24  -0.006  | 30  -0.006  | 36  -0.001
1   -0.004  | 7    0.002  | 13  -0.006  | 19  -0.005  | 25  -0.002  | 31  -0.005  | 37  -0.002
2    0.001  | 8    0.000  | 14  -0.006  | 20  -0.002  | 26  -0.005  | 32  -0.004  | 38  -0.004
3    0.000  | 9   -0.006  | 15  -0.005  | 21  -0.002  | 27  -0.005  | 33  -0.004  | 39  -0.003
4    0.001  | 10  -0.001  | 16  -0.006  | 22  -0.001  | 28  -0.006  | 34  -0.005  | 40  -0.002
5   -0.002  | 11  -0.005  | 17  -0.005  | 23  -0.005  | 29  -0.004  | 35  -0.004  | 41  -0.002

Table 4. Elevation values for the surveyed tiles subtracted from the DEM elevation values at the same points.
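The Table 4 values can be tallied directly in a few lines; the list below is transcribed from the table, in ID order:

```python
from collections import Counter

# The 42 elevation differences from Table 4 (metres), IDs 0-41.
diffs = [-0.001, -0.004, 0.001, 0.000, 0.001, -0.002,
         -0.003, 0.002, 0.000, -0.006, -0.001, -0.005,
         -0.006, -0.006, -0.006, -0.005, -0.006, -0.005,
         -0.005, -0.005, -0.002, -0.002, -0.001, -0.005,
         -0.006, -0.002, -0.005, -0.005, -0.006, -0.004,
         -0.006, -0.005, -0.004, -0.004, -0.005, -0.004,
         -0.001, -0.002, -0.004, -0.003, -0.002, -0.002]

counts = Counter(diffs)
print("largest difference:", min(diffs))
print("occurrences of -0.006:", counts[-0.006])
```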

The largest difference is -0.006m. This correlates well with the estimated 0.5cm measurement error. However, the value of -0.006m occurs frequently, so it seems worth investigating further. Fig.8 shows these values mapped onto the DEM.

It is evident that there is a spatial pattern in these values: those in the central part of the mosaic are comparatively lower than those at the edges. There are several possible reasons for this. The most worrying would be that the pointcloud is not uniformly scaled, but this is highly unlikely because the z value is meaningless when we consider the untransformed pointcloud, and the affine transformation would not have changed this. The low values do not exactly match the lowest part of the mosaic; they are actually a better match for aspect, with the lowest values tending to be on the west facing slope. This discrepancy is therefore most likely to be caused by a slight misalignment of the rotation or translation of the pointcloud due to the imprecision of the point locations. If the transformation is the primary source of error it is likely that the relative position of the tiles is correct. This suggestion is supported by the colour map: if we look at an enlarged section in fig.9, it is apparent that the relative position of the points (the precision) is accurate to a considerably higher resolution than 1cm.

It seems that the Fishbourne pointcloud has higher precision than accuracy. In other words, any errors are systematic across the whole pointcloud, and the point positions are correct to a higher resolution when only their relative positions are considered. This means that if we can improve the accuracy of the measurements used for the transformation, we should be able to improve the accuracy of the pointcloud. However, to improve accuracy significantly we need more points in the pointcloud to match to the markers. This can be solved by increasing the resolution of the pointcloud or by using higher resolution photographs, but either will lead to a steep increase in processing time. A better solution would be to pick out the centres of the markers from the original photographs and ensure that a point is created in the correct place. If this information were available at the start of the Bundler process, the pointcloud could be created with the correct coordinates from the outset.
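The marker-based transformation discussed above amounts to a least-squares fit of a scale, rotation and translation between the pointcloud coordinates of the markers and their surveyed coordinates. The paper's own workflow uses an affine transformation; the sketch below shows the closely related similarity (Umeyama/Kabsch) fit instead, as an illustration of the principle, on synthetic points with a known transform rather than real marker data.

```python
import numpy as np

def similarity_fit(src, dst):
    """Least-squares similarity transform (scale s, rotation R, translation t)
    mapping src points onto dst, via the Umeyama/Kabsch SVD method.
    src, dst: (N, 3) arrays of matched marker coordinates."""
    mu_s, mu_d = src.mean(0), dst.mean(0)
    X, Y = src - mu_s, dst - mu_d
    # Cross-covariance between the two centred point sets.
    U, S, Vt = np.linalg.svd(Y.T @ X / len(src))
    d = np.sign(np.linalg.det(U @ Vt))      # guard against reflections
    D = np.diag([1.0, 1.0, d])
    R = U @ D @ Vt
    s = (S * np.diag(D)).sum() / X.var(0).sum()
    t = mu_d - s * R @ mu_s
    return s, R, t

# Synthetic check: recover a known scale, rotation and translation.
rng = np.random.default_rng(0)
src = rng.random((6, 3))                    # hypothetical marker positions
theta = 0.3
R_true = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                   [np.sin(theta),  np.cos(theta), 0.0],
                   [0.0,            0.0,           1.0]])
dst = 2.5 * src @ R_true.T + np.array([10.0, -4.0, 1.2])

s, R, t = similarity_fit(src, dst)
resid = dst - (s * src @ R.T + t)
print(round(s, 3))                          # recovered scale
print(bool(np.abs(resid).max() < 1e-9))     # exact fit on noise-free data
```

With real, imprecise marker measurements the residuals after such a fit will not be zero; their root-mean-square value gives exactly the kind of case-by-case accuracy estimate argued for here.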

6. Discussion

This paper has demonstrated that structure from motion can be a very useful tool for archaeology, and it is likely to become more important in future as the technology develops and addresses its failings. At present it is most useful as a less accurate but cheaper and higher-resolution substitute for some of the cutting-edge technology used in archaeology, such as LiDAR and laser scanning, or as a supplement to photogrammetry, topographic survey and plan drawing. This places it in an important strategic position, occupying the middle ground between technology that is prohibitively expensive and traditional techniques that are slower and of lower resolution. Until structure from motion can demonstrate reliable accuracy, and until this can be calculated on a case by case basis, it is unlikely to be taken seriously as a measurement tool. However, it is a technology still in its infancy, and this problem is understood by its developers. Moreover, SfM brings with it other benefits, in particular the high speed and simplicity of acquisition, which allow it to be used in situations where slower or unwieldy technology would be inappropriate: for example, capturing models of features being excavated without slowing down the excavation. It produces data that are more complete than those from any other technology, as it can capture both the shape and the colour of uneven surfaces at high resolution. Structure from motion promises to become a highly important tool in its own right if it is developed correctly. For this reason it is very encouraging that so many of the developments in SfM have remained open source, which gives archaeologists a chance to mould this technology to suit their own archaeological needs.

Acknowledgements

This paper owes a great deal to the help of many people. Benjamin Ducke and Mark Lake provided their help and guidance with the writing of this paper. Robert Symmons arranged for the photographing of the dolphin mosaic at Fishbourne Roman Villa. Andrew Dunwell kindly made his survey data available. Giulia De Nobili, Sophie Butler and James Mason provided technical assistance during the collection of data.

References

Agisoft, 2013. Agisoft Photoscan Professional Edition. http://www.agisoft.ru/products/photoscan/professional/ (Accessed 11 January 2013).

ArcheOS, 2011. What is ArcheOS? http://www.archeos.eu/wiki/doku.php?id=start (Accessed 18 September 2011).

Autodesk, 2014. 123D Catch. http://www.123dapp.com/catch (Accessed 9 January 2014).

Crutchley, S., 2010. The Light Fantastic: Using Airborne Lidar in Archaeological Survey. English Heritage, Swindon.

Cunliffe, B.W., 1971. Excavations at Fishbourne, 1961-1969. Society of Antiquaries, London.

De Reu, J., Plets, G., Verhoeven, G., De Smedt, P., Bats, M., Cherretté, B., De Maeyer, W., Deconynck, J., Herremans, D., Laloo, P., Van Meirvenne, M., De Clercq, W., 2013. Towards a three-dimensional cost-effective registration of the archaeological heritage. Journal of Archaeological Science 40, 1108-1121.

Ducke, B., Score, D., Reeves, J., 2011. Multiview 3D reconstruction of the archaeological site at Weymouth from image series. Computers & Graphics 35, 375-382.

Dunwell, A., 1999. Edin's Hall fort, broch and settlement, Berwickshire (Scottish Borders): recent fieldwork and new perceptions. Proceedings of the Society of Antiquaries of Scotland, pp. 303-357.

Furukawa, Y., Curless, B., Seitz, S.M., Szeliski, R., 2010. Towards internet-scale multi-view stereo. Computer Vision and Pattern Recognition (CVPR), 2010, pp. 1434-1441.

Girardeau-Montaut, D., 2014. CloudCompare. http://www.danielgm.net/index.php (Accessed 9 January 2014).

Keiller, A., Smith, I.F., 1965. Windmill Hill and Avebury: Excavations by Alexander Keiller, 1925-1939. Clarendon Press, Oxford.

Lowe, D.G., 2004. Distinctive image features from scale-invariant keypoints. International Journal of Computer Vision 60, 91-110.

Meshlab, 2013. MeshLab. http://meshlab.sourceforge.net/ (Accessed 11 January 2013).

Microsoft, 2014. Photosynth. http://photosynth.net/ (Accessed 9 January 2014).

Olson, B.R., Placchetti, R.A., Quartermaine, J., Killebrew, A.E., 2013. The Tel Akko Total Archaeology Project (Akko, Israel): assessing the suitability of multi-scale 3D field recording in archaeology. Journal of Field Archaeology 38, 244-262.

Shapland, M.G., 2012. Buildings of Secular and Religious Lordship: Anglo-Saxon Tower-nave Churches. Unpublished PhD thesis, University College London.

Snavely, N., Seitz, S.M., Szeliski, R., 2006. Photo tourism: exploring photo collections in 3D. ACM Transactions on Graphics 25, 835-846.

Szeliski, R., 2010. Computer Vision: Algorithms and Applications. Springer.

Verhoeven, G., 2011. Taking computer vision aloft: archaeological three-dimensional reconstructions from aerial photographs with Photoscan. Archaeological Prospection 18, 67-73.

Verhoeven, G., Doneus, M., Briese, C., Vermeulen, F., 2012. Mapping by matching: a computer vision-based approach to fast and accurate georeferencing of archaeological aerial photographs. Journal of Archaeological Science 39, 2060-2070.

Whittingham, A., 1951. Bury St. Edmunds Abbey: the plan, design and development of the church and monastic buildings. Archaeological Journal 108, 170-171.

Wu, C., 2011. VisualSFM: a visual structure from motion system. http://ccwu.me/vsfm/ (Accessed 9 January 2014).

Wu, C., Agarwal, S., Curless, B., Seitz, S.M., 2011. Multicore bundle adjustment. Computer Vision and Pattern Recognition (CVPR), 2011, pp. 3057-3064.

Figures

Fig.1. Comparing the SfM model of Jevington Church to a photograph

Fig.2. A comparison of the 3d model and planned elevation, overlaid using georeferenced points

Fig.3a. Photograph of the wall showing survey markers, b. Bundler cameras and pointcloud, c. PMVS2 dense pointcloud including survey markers

Fig.4. Cross sections of the wall. The first image shows where the sections are made.

Fig.5a. Structure from motion DEM at 0.5m resolution, b. LiDAR DTM at 0.5m resolution © Environment Agency copyright 2013. All rights reserved.

Fig.6a. Colour map at 0.2m resolution, b. The difference between the SfM DEM and LiDAR DTM

Fig.7. Fishbourne Dolphin Mosaic untextured model

Fig.8. The location of the surveyed tiles, and their elevation distance from the DEM

Fig.9. RGB georeferenced colour map a. 1cm resolution b. 0.1cm resolution