Italian Semester of Presidency of the European Union
EUROPEAN CONFERENCE OF MINERVA
Quality for cultural Web sites
Online Cultural Heritage for Research, Education
and Cultural Tourism Communities
Parma, 20-21 November 2003, Auditorium Paganini
Alessandro Mecocci
(Meta.Com)
Advanced Fruition and Management Systems for Museums
INTRODUCTION
Today's digital multimedia techniques can be interpreted as new
means of answering very old needs, namely those of studying,
preserving, and documenting historical events, religions, cultures,
architectural structures, human organizations, and habits: in short,
the need to pass memories on to contemporary and future
generations. In the past, different techniques have been used,
ranging from images and drawings on rocks, to perspective representations
on paper, to three-dimensional analogue models. The common
goal has been to give realistic representations capable of
conveying semantic meaning in a way that is intuitive, easy to
understand, and easy to remember. Realism is not the only focus; interactivity
is another fundamental aspect that plays a central role in the
communication of information and knowledge. A lot of attention has been
devoted to improving the user interface (for example by
changing it from 2D to 3D) to enhance the possibilities for
exploration and manipulation.
Realism and interactivity are important aspects of communicating
and understanding Cultural Heritage, and multimedia
techniques offer many new opportunities in both respects.
In particular, two technologies play a dominant role in this scenario:
3D representations (Virtual Environments, VEs) and Wireless Appliances
(WAs). VEs help in building up 3D documentation of complex Cultural
Heritage structures and items. Moreover, VEs make it easier to form
and test hypotheses, by making it possible to simulate different scenarios
and reconstructions (different lighting conditions, different assemblies
of fragments belonging to cultural items, simulated temporal evolution
of Cultural assets from the past, to today, to a forecast future). In
this sense, VEs promise to be an important technology for answering
the diverse needs of different users (occasional or professional),
and for improving the retention of knowledge while deepening the understanding
of Cultural Heritage topics. Moreover, VEs are particularly suited
to exploiting new metaphors of interaction with users and to improving
remote access to worldwide Cultural Heritage. This last point also
affects preservation (e.g. ultra-high-quality digital copies
can be distributed while the originals are kept safe).
On the other hand, WAs give flexible responses to the needs that
generally arise during visits to Museums or to Archaeological
and Naturalistic areas. WAs improve comfort and the interaction
between the user and the support infrastructure, thus granting
greater freedom of movement while assuring a significant degree of safety
and satisfaction. Free walking, personalized information
and advice, dynamic messaging, real-time wireless multimedia distribution,
etc., are some of the new services that can be conceived and implemented.
In this paper we introduce another important idea: Museum Reactivity.
This interaction metaphor is strongly related to the concept of
Multisensor Social Interfaces (MSIs) that enable a new class of
multi-user man-machine interfaces [1]. Museum Reactivity is a completely
new metaphor that will be the basis for a new generation of Museum
fruition systems.
MUSEUM REACTIVITY
Computing resources in public spaces (e.g. inside Museums) represent
a paradigm that differs from the conventional desktop environment
and requires user-interface metaphors quite unlike the traditional
Mouse, Pointer, Icons, and Windows. In particular, an MSI (an interface
between multiple persons and multiple computers in relatively large
spaces) must be capable of autonomously initiating and terminating
interactions on the basis of the parallel behaviour of multiple people.
Moreover, it must be capable of handling multiple interactions at
the same time, and of allocating, dividing and deallocating resources
among multiple customers in a goal-directed and equitable way. This
approach represents a significant departure from current practices,
but it will become increasingly important as computing resources
migrate from desktops into public open spaces. It is important to
note that there is a change of perspective from the single-user structured
interactions typical of desktop systems towards unstructured
interactions with multiple customers moving freely in open spaces.
MSIs enable Museum Reactivity: visitors no longer need to
press buttons or touch screens to start multimedia interactions
or presentations. The Museum "sees" the behaviour of visitors
through Multisensor Social Interfaces (e.g. Vision based or Pressure
based sensors) and then autonomously reacts by affecting the environment
(e.g. starting a sound, activating a video or enacting visual events).
Because the Museum knows something about what is going on, it can
act and react appropriately. MSIs introduce an improved fruition
dimension and leave people free to move around with nothing to attend
to but enjoying the visit. It is like having a ghost assistant
who is constantly looking for opportunities to help and appears
only when needed. From the fruition point of view, the role
of the visitor is greatly enhanced with respect to previous approaches;
people actually build up the visit and they do this in a social
way (i.e. multiple parallel interactions by multiple visitors at
the same time determine how the Museum reacts, that is, the dynamic
evolution of what is presented and how it is presented).
This paper describes the integration of Pressure based and Vision
based Social Interfaces in the framework of MuseumNet©,
a product for the creation and management of clusters of Museums
distributed over the territory. MuseumNet© is an advanced
modular system developed by Gruppo META in co-operation with Etruria
Innovazione (a Technology Transfer Center of the Tuscany Region)
and with the University of Siena (which developed the innovative
parts) [2]. The system evolved from an earlier architecture (devoted
to multimedia Museum fruition) that was designed by the author
for the "Nuovo Museo Archeologico" of Bolzano (currently
hosting the Mummy of Similaun, claimed to be the oldest mummy in
the world). The Museum, with its distributed multimedia fruition
system, was opened to the public on 28 March 1998. It has been
reviewed by the New York Times as one of the top ten Museums recently
opened in Europe.
SYSTEM ARCHITECTURE
The integration of MSIs inside the MuseumNet© architecture
has been implemented using a layered approach comprising
three main parts: 1) Control Subsystem; 2) Sensor Subsystem; 3)
Actuation Subsystem. This approach breaks the description
of Social Interfaces (which could be a complex task) into simpler
sub-problems, and also makes it possible to separate the logical and semantic
aspects from the physical implementation.
The Control Subsystem is the principal part. It starts from an
abstract description (see below) of the whole set of MSIs that are
present in the Museum and supervises all the events that arise
in the different physical areas. It dispatches internal messages,
starts, stops or kills threads, initializes physical devices and
implements the changes in the environment through suitable output
devices.
The Sensor Subsystem takes care of suitably initializing and handling
the hardware and the software running on peripheral input devices.
Because MSIs generally require some processing to be
done at the input-signal level to extract useful information (e.g.
in the case of Vision based sensing of human behaviour), this subsystem
must take care of properly loading setup data for each physical device,
scheduling appropriate logging and data reporting, and monitoring the
proper execution of the different subtasks. Moreover, the sensors
are described through suitable abstractions that encapsulate device-specific
details behind a generic interface and enable a separation between
physical and virtual worlds. Each Abstract Sensor provides generic
methods for accessing the input values in a uniform way, independent
of the real physical device. The physical devices are interfaced
through low-level special-purpose routines. For any input device, the
low-level routines must supply methods to initialize the device and
get data (start up, prompt for a value, get a value, close, etc.).
Through these routines one or more Abstract Sensors access
the physical detectors. Such an approach enables painless integration
of new input devices within existing applications. This is very
similar to the approach taken in the Java 3D API [3].
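As a rough illustration of this layering, the following Python sketch separates the low-level device routines from an Abstract Sensor. All names (`DeviceDriver`, `FakePressureDriver`, `AbstractSensor`, `read`) are hypothetical and are not taken from MuseumNet; the simulated driver simply stands in for real hardware.

```python
from abc import ABC, abstractmethod


class DeviceDriver(ABC):
    """Low-level, device-specific routines (hypothetical interface)."""

    @abstractmethod
    def open(self) -> None: ...

    @abstractmethod
    def get_value(self) -> float: ...

    @abstractmethod
    def close(self) -> None: ...


class FakePressureDriver(DeviceDriver):
    """Simulated device that replays canned readings instead of real hardware."""

    def __init__(self, readings):
        self._readings = list(readings)
        self._index = 0
        self.is_open = False

    def open(self) -> None:
        self.is_open = True

    def get_value(self) -> float:
        if not self.is_open:
            raise RuntimeError("device not initialized")
        value = self._readings[self._index % len(self._readings)]
        self._index += 1
        return value

    def close(self) -> None:
        self.is_open = False


class AbstractSensor:
    """Uniform access to input values, independent of the physical device."""

    def __init__(self, name, driver: DeviceDriver):
        self.name = name
        self._driver = driver

    def __enter__(self):
        self._driver.open()
        return self

    def __exit__(self, exc_type, exc, tb):
        self._driver.close()
        return False

    def read(self) -> float:
        return self._driver.get_value()


driver = FakePressureDriver([0.0, 12.5, 40.0])
with AbstractSensor("tile-7-corner-0", driver) as sensor:
    samples = [sensor.read() for _ in range(4)]
```

Under this arrangement, supporting a new input device would only require another `DeviceDriver` subclass; code that reads from an `AbstractSensor` would stay unchanged.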
The Actuation Subsystem takes care of acting on the environment,
i.e. it implements the reactions of each Social Interface inside
the Museum. This subsystem is very similar to the Sensor Subsystem,
differing mainly in its efferent role.
The VirtualMuseumGraph
Each Museum generally comprises more than one MSI, and each MSI
covers a different physical area. To interact appropriately
with users, each MSI must be aware of its own environment, i.e.
it needs a model of the environment itself. The global description
of the Museum uses a metaphor based on Virtual Rooms. Each Virtual
Room contains the environment model (the physical space related
to that Room) and the model of each MSI that relates to that Room.
Each MSI is described by specifying the sensors, and the related
"Stages". A Stage is the description of a subpart of the
Room environment where an interaction takes place. A Stage contains
the information about the kind of interaction, about the conditions
that must be met to start/stop the interaction, about the actuators
that are used to implement the interaction, and about the exact
location of the physical subpart of the environment with respect
to the whole local environment (the whole local environment is described
in the Virtual Room). By specifying the relative position of each
Stage with respect to the Virtual Room environment, it is possible
to properly fuse information coming from different Sensors to obtain
higher-level descriptions. For example, it is possible to extract
the 2D position of people by means of suitable Computer Vision algorithms
applied to the images acquired by a camera. Thereafter the data
can be fused with those of other similar cameras to extract the
3D location of people with respect to the Virtual Room environment.
The transformations are contained in the respective Stages (or in
a single stage comprising multiple Sensors).
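The role of the Stage-to-Room transformation can be sketched as follows. This is a simplified illustration restricted to 2D floor coordinates and a rigid transform (rotation plus translation); the function name, the parameters and the numeric values are assumptions made for the example, not part of the described system.

```python
import math


def stage_to_room(point, stage_origin, stage_angle_deg):
    """Map a 2D point from Stage coordinates to Virtual Room coordinates.

    stage_origin: position of the Stage origin in the Room frame.
    stage_angle_deg: rotation of the Stage axes with respect to the Room axes.
    """
    x, y = point
    ox, oy = stage_origin
    a = math.radians(stage_angle_deg)
    return (ox + x * math.cos(a) - y * math.sin(a),
            oy + x * math.sin(a) + y * math.cos(a))


# Two Stages observe the same visitor, each in its own local frame.
seen_from_a = stage_to_room((1.0, 2.0), stage_origin=(0.0, 0.0), stage_angle_deg=0.0)
seen_from_b = stage_to_room((2.0, 3.0), stage_origin=(4.0, 0.0), stage_angle_deg=90.0)

# Once expressed in the common Room frame the two observations coincide,
# which is what makes it possible to fuse data coming from different Sensors.
same_visitor = math.dist(seen_from_a, seen_from_b) < 1e-9
```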
The description of the whole Museum is given by using a "forest
of trees" that has been called a VirtualMuseumGraph. The VirtualMuseumGraph
is an abstract representation that comprises one or more VirtualRooms
that in turn comprise one or more StageGraphs. Each StageGraph comprises
sub-trees of attributed nodes (see Figure 1). Each node can be a
group or a leaf and represents an entity in a StageGraph. Leaf nodes
can be Abstract Sensors or Sounds or Actuators (actually a Sound
is a special kind of Actuator). Groups are used to assemble multiple
Actuators or Sounds or Abstract Sensors in a single coordinated
unit. For example, a group of lights that must be switched on in
sequence can be described as a single logical unit by means of a
group node whose children are leaves, each representing a single
light of the set. Each node (group or leaf) can have an associated
Behaviour. A Behaviour can do anything; for example, it can perform
computations, update its internal state, modify the StageGraph,
start a thread, send a message, or activate an interaction.
Figure 1 VirtualMuseumGraph abstract description structure
Multiple Behaviours can be composed so that independent Behaviours run in parallel
to obtain special interactive effects or complex presentations (e.g.
starting multiple sounds while playing a video presentation while
opening boxes or operating mechanical analogue models). Generally
a Behaviour is characterized by two methods: the first is called
when the Behaviour is enabled and the second is called when the
Behaviour is fired (woken up). The first method can be used to initialize
the Actuators while the second performs the actions needed to implement
the environment modifications. Not all the Behaviours are enabled
at the same time. Normally a Behaviour must be enabled only when
a visitor is nearby; this condition is specified by Enabling Bounds,
i.e. a volume delimited by a sphere, a box, a generic polyhedron,
or a Boolean combination of these. Obviously a Behaviour can be permanently
enabled by making its Enabling Bounds larger than the environment
associated with the Virtual Room it belongs to. This makes it possible,
for example, to have a Room Behaviour that is always enabled, which is important
for people-tracking tasks where a multisensor system is continuously
running to estimate visitors' 3D positions inside the Room environment.
Even if enabled, a Behaviour can be fired only if some predetermined
conditions are met. In particular, FiringCriteria and FiringConditions
are connected to each Behaviour. FiringCriteria are used as prerequisites
for firing, for example: a number of frames have been acquired and
nothing has changed, a number of milliseconds have elapsed, another
Behaviour has posted an event, or one detected shape collides with other
shapes. FiringConditions are used to combine the previous criteria
according to Boolean rules (AND, OR, ANDofORs, ORsofANDs) so that
complex activation strategies can be more easily defined.
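The enable/fire life cycle described above might be sketched as follows. This is only a minimal illustration under stated assumptions: the class and method names (`SphereBound`, `Behaviour`, `on_enable`, `on_fire`, `all_of`, `any_of`) and the `state` dictionary are hypothetical, only a spherical Enabling Bound is modelled, and criteria are reduced to simple predicates.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

Point = Tuple[float, float, float]
Criterion = Callable[[], bool]  # a firing criterion is a zero-argument predicate


@dataclass
class SphereBound:
    """Enabling Bound shaped as a sphere (one of the volumes named in the text)."""
    centre: Point
    radius: float

    def contains(self, p: Point) -> bool:
        return sum((a - b) ** 2 for a, b in zip(p, self.centre)) <= self.radius ** 2


def all_of(*criteria: Criterion) -> Criterion:
    """Firing condition: AND of the given criteria."""
    return lambda: all(c() for c in criteria)


def any_of(*criteria: Criterion) -> Criterion:
    """Firing condition: OR of the given criteria."""
    return lambda: any(c() for c in criteria)


@dataclass
class Behaviour:
    bound: SphereBound
    condition: Criterion
    enabled: bool = False
    log: List[str] = field(default_factory=list)

    def on_enable(self):   # first method: e.g. initialize the Actuators
        self.log.append("enabled")

    def on_fire(self):     # second method: modify the environment
        self.log.append("fired")

    def update(self, visitors: List[Point]):
        inside = any(self.bound.contains(v) for v in visitors)
        if inside and not self.enabled:
            self.enabled = True
            self.on_enable()
        elif not inside:
            self.enabled = False
        if self.enabled and self.condition():
            self.on_fire()


state = {"elapsed_ms": 0, "event_posted": False, "still_frames": 0}
condition = all_of(lambda: state["elapsed_ms"] >= 500,
                   any_of(lambda: state["event_posted"],
                          lambda: state["still_frames"] >= 10))  # an AND of ORs
b = Behaviour(SphereBound((0.0, 0.0, 0.0), 2.0), condition)

b.update([(5.0, 0.0, 0.0)])   # nobody inside the bound: stays disabled
b.update([(1.0, 0.0, 0.0)])   # visitor inside the bound: enabled, not yet fired
state.update(elapsed_ms=600, still_frames=12)
b.update([(1.0, 0.0, 0.0)])   # enabled and condition satisfied: fired
```

A permanently enabled Room Behaviour would correspond, in this sketch, to a bound whose radius exceeds the size of the Room.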
The VirtualMuseumGraph is created by means of a special editor that
is used:
- to define the actual shape and dimensions of the physical
spaces where each MSI operates
- to describe the Sensors and the Actuators
- to specify the interactions by setting up the appropriate Behaviours
and by binding them to the corresponding Stages
- to specify the exact position of each Stage with respect to
the environment of the Virtual Room to which it belongs
The VirtualMuseumGraph structure is used by the Control Subsystem,
by the Sensor Subsystem and by the Actuation Subsystem to obtain
the information and data needed for the correct functioning of the
whole Museum.
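The containment hierarchy of the VirtualMuseumGraph (VirtualRooms, StageGraphs, groups and leaves) can be pictured with a small data-structure sketch. The Python class names mirror the terms used in the text, but the fields and the `leaves` helper are assumptions made for illustration; Behaviours and environment models are omitted.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Leaf:
    name: str
    kind: str  # "sensor", "actuator" or "sound"


@dataclass
class Group:
    name: str
    children: List[Union["Group", Leaf]] = field(default_factory=list)


@dataclass
class StageGraph:
    name: str
    roots: List[Union[Group, Leaf]] = field(default_factory=list)


@dataclass
class VirtualRoom:
    name: str
    stage_graphs: List[StageGraph] = field(default_factory=list)


@dataclass
class VirtualMuseumGraph:
    rooms: List[VirtualRoom] = field(default_factory=list)


def leaves(node):
    """Yield every leaf below a node, depth first."""
    if isinstance(node, Leaf):
        yield node
    else:
        for child in node.children:
            yield from leaves(child)


# A group of three lights handled as one logical unit, plus a sensor and a sound.
lights = Group("light-sequence", [Leaf(f"light-{i}", "actuator") for i in range(3)])
stage = StageGraph("entrance-stage",
                   [Leaf("floor-camera", "sensor"), lights, Leaf("welcome-sound", "sound")])
museum = VirtualMuseumGraph([VirtualRoom("room-1", [stage])])

# A Sound is a special kind of Actuator, so both kinds are collected here.
actuator_names = [leaf.name
                  for room in museum.rooms
                  for sg in room.stage_graphs
                  for root in sg.roots
                  for leaf in leaves(root)
                  if leaf.kind in ("actuator", "sound")]
```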
Vision based MSI
To enable reactivity in Museums, some form of perceptual intelligence
is needed so that the system becomes capable of classifying the
current situation and appraising the important variables, in order to react
in an appropriate and socially acceptable way. This can be obtained,
for example, through suitable Computer Vision algorithms capable
of detecting and tracking the position of multiple visitors inside
a specific area of the environment. In the proposed system, a Vision
based MSI has been implemented that detects the
presence of visitors and then tracks them by means of multiple
TV cameras. For each camera, an adaptive multi-class temporal median
filter is used, coupled with a statistical model of colour and shape.
According to the model, visitors are segmented from complex textured
backgrounds under variable viewing and illumination conditions.
People are modelled by means of blobs whose colour statistics are
continuously updated. The tracking algorithm also maintains a model
of the background, which is represented as a set of textured regions
mixed with more homogeneous regions. Different statistical models
are used for the two kinds of regions. The regions are described
through local mean colour values and spatial distribution descriptors.
A multi-valued Gaussian mixture is used to account for background
spatio-temporal variability [4]. Colour histogram indexing and normalized
colour descriptors are used for tracking multiple visitors in real
time (see Figure 2).
Figure 2 Example of people tracking
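To give an idea of the basic principle behind temporal median filtering, the sketch below builds a background model as the per-pixel median of recent frames and marks as foreground the pixels that deviate from it. It is a deliberately simplified stand-in: it works on tiny grey-level frames, is neither adaptive nor multi-class, and does not include the colour and shape model or the Gaussian mixture used by the actual system. The function names and the threshold value are assumptions.

```python
from statistics import median


def median_background(frames):
    """Per-pixel temporal median over a list of equally sized grey-level frames."""
    rows, cols = len(frames[0]), len(frames[0][0])
    return [[median(f[r][c] for f in frames) for c in range(cols)]
            for r in range(rows)]


def foreground_mask(frame, background, threshold=30):
    """Mark pixels that differ from the background by more than the threshold."""
    return [[abs(p - b) > threshold for p, b in zip(frow, brow)]
            for frow, brow in zip(frame, background)]


# Five tiny 2x3 frames; a bright "visitor" crosses pixel (0, 1) in one frame only,
# so the median is not affected by it.
history = [
    [[10, 12, 11], [9, 10, 10]],
    [[11, 12, 10], [9, 11, 10]],
    [[10, 200, 11], [9, 10, 10]],
    [[10, 13, 11], [10, 10, 10]],
    [[10, 12, 11], [9, 10, 11]],
]
background = median_background(history)

# In the new frame the visitor stands on pixel (1, 1).
mask = foreground_mask([[10, 12, 11], [9, 180, 10]], background)
```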
The detected 2D blobs are back-projected to obtain 3D position
estimates; when more than one camera is available, homography-based
multiple-image fusion is used to improve the position estimates. In
the current implementation the flat-ground hypothesis is assumed
to hold (which is almost always true in Museum applications). Camera
intrinsic and extrinsic calibration (needed for 3D position recovery)
is obtained by using multiple planar targets during the preliminary
set-up phase. These data are stored in the corresponding StageGraphs
and in the VirtualRoom abstract-description structures. The Sensor
Subsystem uses this information to fuse the multi-camera data and
to back-project the visitors' positions onto the Room model. It is
important to note that the detection of visitors is done in parallel,
which is what enables the Social Interface principle. A computer
can now evaluate which kind of interaction to start and how to react.
This evaluation can be done by grouping the visitors and by matching
them with the interaction resources that are available in the Room
(described by the StageGraphs that belong to the VirtualRoom). Multiple
interactions can be started in parallel, targeted at different visitor
sub-groups.
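Under the flat-ground hypothesis, back-projection reduces to applying a 3x3 homography that maps image pixels to floor coordinates. The sketch below illustrates this with made-up calibration matrices and a plain average as the fusion rule; the paper does not specify the fusion rule, so the averaging, the function names and all numeric values are assumptions.

```python
def apply_homography(h, u, v):
    """Map image pixel (u, v) to ground-plane coordinates using a 3x3 matrix."""
    x = h[0][0] * u + h[0][1] * v + h[0][2]
    y = h[1][0] * u + h[1][1] * v + h[1][2]
    w = h[2][0] * u + h[2][1] * v + h[2][2]
    if abs(w) < 1e-12:
        raise ValueError("pixel maps to a point at infinity")
    return x / w, y / w


def fuse_views(observations):
    """Average the ground positions obtained from several cameras."""
    points = [apply_homography(h, u, v) for h, (u, v) in observations]
    n = len(points)
    return sum(p[0] for p in points) / n, sum(p[1] for p in points) / n


# Made-up calibration: camera A only scales pixels to metres, camera B also shifts.
H_A = [[0.01, 0.0, 0.0], [0.0, 0.01, 0.0], [0.0, 0.0, 1.0]]
H_B = [[0.02, 0.0, -1.0], [0.0, 0.02, -2.0], [0.0, 0.0, 1.0]]

feet_a = (200, 300)   # pixel where the blob touches the floor, camera A
feet_b = (152, 248)   # the same visitor seen by camera B
position = fuse_views([(H_A, feet_a), (H_B, feet_b)])
```

In a real installation the matrices would come from the calibration with planar targets mentioned above, and a weighted combination could replace the plain average.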
Pressure based MSI
Another important sensing and tracking strategy is based on arrays
of pressure sensors. This is particularly useful in those applications
where the illumination conditions or the spatial layout are too
severe for Vision based people sensing (e.g. when darkrooms are needed
for special-purpose interactions).
In particular, it is possible to implement floating floors paved
with suitable tiles laid over a bed of pressure sensors. One typical
configuration uses four pressure sensors for each square tile used
to pave the ground. In this way, by reading the pressure values
at the four vertices of each tile and by considering the neighbouring
tiles (the size of the neighbourhood depends on the tile size), it
is possible to estimate the position and the number of persons on
the floating floor with sub-tile accuracy. Moreover, in general
no back-projection processing is needed because pressure sensors
are directly related to physical locations inside the actual environment
(see Figure 3).
Figure 3 Events detected by the VirtualRoom Behaviour
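One simple way to obtain sub-tile accuracy from four corner sensors is to compute the centre of pressure, i.e. the pressure-weighted average of the sensor positions. The sketch below handles a single tile only and ignores neighbouring tiles and person counting; the function name, the corner ordering and the sample readings are assumptions, and the paper does not state which estimator is actually used.

```python
def centre_of_pressure(corners, tile_size=1.0):
    """Estimate the load position inside one square tile.

    corners: pressures read at (0, 0), (tile_size, 0), (0, tile_size)
             and (tile_size, tile_size), in that order.
    Returns (x, y) in tile coordinates, or None if the tile is unloaded.
    """
    p00, p10, p01, p11 = corners
    total = p00 + p10 + p01 + p11
    if total <= 0:
        return None
    x = tile_size * (p10 + p11) / total
    y = tile_size * (p01 + p11) / total
    return x, y


# A visitor standing closer to the x = 0 edge of a 0.5 m tile, halfway along y.
position = centre_of_pressure((30.0, 10.0, 30.0, 10.0), tile_size=0.5)
empty = centre_of_pressure((0.0, 0.0, 0.0, 0.0))
```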
The Control Subsystem
At start-up, the Control Subsystem loads the VirtualMuseumGraph
and extracts the whole description of the various VirtualRooms inside
the Museum. Each VirtualRoom, in turn, contains the description of
the physical environment related to the MSIs and of the corresponding
Sensors, Sounds, Actuators and Behaviours. These data
are used to initialize the Social Interfaces and all the physical
devices needed to enable Museum Reactivity. After the initial
housekeeping, all the global Behaviours are started. These Behaviours
can vary from VirtualRoom to VirtualRoom, so that different tracking
and interaction strategies can run in different areas of a single
Museum. For example, in a certain area there can be a Pressure based
MSI, while in another zone a Vision based MSI. The area that uses
pressure sensing is described by the corresponding VirtualRoom abstract
structure, which also specifies the areas where interactions
can be started by the Reactive Museum. The StageGraph contained
in the VirtualRoom structure describes these areas. When the global
Behaviour of the VirtualRoom detects some visitors inside a Behaviour's
Enabling Bound, it enables the Behaviour that corresponds to that
Bound and checks its firing criteria and conditions. If both
are met, the Behaviour is fired and the corresponding Stage
is activated. The Stage contains the description of the Actuators
and Sounds that the fired Behaviour uses to interact with the visitors.
Needless to say, multiple Enabling Bounds can be activated by
the same or by different sub-groups of visitors, so multiple Behaviours
can be running at the same time. In the same way, a specific Behaviour
can affect multiple Stages so that multiple parallel modifications
of the environment can be obtained (for example it is possible to
start a presentation sound or a background sound in a specific place
while switching on lights or videos in another location). This is
how the Museum can react in parallel to multiple stimuli from
multiple simultaneous users.
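The matching of visitor sub-groups to Stages can be sketched as a single pass over the tracked floor positions. In this illustration each Stage is reduced to a circular bound and a minimum number of visitors; the function name, the Stage names and the numeric values are all hypothetical.

```python
import math


def active_stages(visitors, stages):
    """Return the Stages whose bound holds at least the required number of visitors.

    stages maps a Stage name to (centre, radius, min_visitors).
    The result maps each activated Stage to the sub-group of visitors inside it.
    """
    result = {}
    for name, (centre, radius, min_visitors) in stages.items():
        group = [v for v in visitors if math.dist(v, centre) <= radius]
        if len(group) >= min_visitors:
            result[name] = group
    return result


stages = {
    "centre-video": ((0.0, 0.0), 1.5, 2),   # needs at least two people nearby
    "wall-video": ((5.0, 0.0), 1.0, 1),     # one person is enough
}
visitors = [(0.5, 0.0), (-0.5, 0.5), (5.2, 0.1), (3.0, 3.0)]

# Both Stages are activated at once, each for a different sub-group;
# the fourth visitor is inside no bound and triggers nothing.
activated = active_stages(visitors, stages)
```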
THE MUSEUM OF MONTICCHIELLO
The previously described system is fully integrated into the MuseumNet©
architecture as a specialized module for Museum reactivity. This
architecture is currently used inside the Museum of Monticchiello
(a small village near Siena, Italy) devoted to Old Theatrical
Arts. The Museum hosts two different MSIs; one is Pressure based
and the other is Vision based. At the entrance, visitors go inside
a darkroom where some ancient objects, sounds and videos are presented.
Special illumination effects at the corners allow visitors to progressively
discover the environment and to see the ancient objects. At the same
time, projections on the walls and from the floor (some tiles are
semitransparent) at different locations are used to illustrate
various aspects of the ancient theatrical art. It is important to
note that the presentation does not follow a predetermined path:
it is the Museum that reacts to the visitors depending on their
movements, detected through the array of pressure sensors. In this
way each visit is a unique experience developing from the social
behaviour of the current visitor group (see Figure 4).
Figure 4 Pressure based MSI during the design stage
From the darkroom, visitors go into a tunnel that leads them to
a Vision based MSI. The room has four cameras at the ceiling corners
that are continuously looking for visitors. At the centre of the
room an ancient sink has been equipped with hidden interactive presentation
devices. In particular, videos can be shown at the bottom of the
well, while sounds can be played through multiple rings of speakers
hidden all around the sink border. These devices are used to show
images, descriptive narrations and videos, and to create 3D sound
effects. The cameras are used to track the visitors' positions in
real time so that the control system constantly knows their spatial
distribution on the floor. The positional information is used to
enable reactive events during the visit. For example, when a predetermined
number of persons enter the room, an appealing background sound is
enabled to attract their attention towards the sink in the
middle of the room. When a suitable number of persons are near
the sink (the MSI uses vision to verify that there are enough people),
video presentations start and are projected on the well bottom.
The video presentations depend on the number and on the distribution
of visitors around the sink (that is, the Museum reacts in different
ways depending on people's behaviour). If another group of visitors
concentrates in another position while the rest of them
are around the sink, the Museum reacts by starting a video presentation
on the nearby wall, while the presentation at the sink continues
(see Figure 5). If after some time the persons around the sink still
remain there, the Museum reacts by stopping the video on the well bottom
and starting a sound while opening a small hidden compartment on the opposite
side of the room. Again, the interactions are not predetermined:
they basically depend on people's behaviour; moreover, multiple interactive
presentations can go on in parallel, targeted at different sub-groups
of visitors.
Figure 5 The multimedia sink and a projection screen in the background
CONCLUSIONS
In this paper we have presented an innovative module that has been
integrated in the MuseumNet© architecture. The module
enables the use of Multisensor Social Interfaces inside Museums.
Social Interfaces are new interaction metaphors that allow multiple
visitors to interact in parallel with multiple computers; moreover,
the interactions can be autonomously initiated by the system, thus
enabling interaction mechanisms completely different from traditional
approaches. The system has been implemented in practice at the Museum
of Monticchiello, near Siena, devoted to Ancient Theatrical Arts.
In the near future, a support subsystem for emotional agents will
be added to the proposed architecture. The MSIs will have their own
artificial psychology. This will add new dimensions to the
interaction activities by enabling "motivational behaviours"
based on the MSIs' internal state.
Bibliography
[1] A.P. Pentland, "Smart Rooms: Machine Understanding of
Human Behaviour", in Computer Vision for Human-Machine Interaction,
R. Cipolla, A. Pentland Eds., Cambridge University Press, 1998
[2] A. Mecocci, "MuseumNet: 3D Virtual Environments and Wireless
Appliances for improved Cultural experiences", EVA 2001 Conference,
Florence, Electronic Imaging & the Visual Arts, Conference Proceedings,
pp. 137-142, Pitagora Editrice, Bologna, 2001
[3] Sowizral, Rushforth, Deering, "Java 3D API Specification,
Second Edition", Addison Wesley, May 26, 2000
[4] F. Moschetti, A. Mecocci, M. Sorci, "Motion Estimation
for Digital Video Coding Based on a Statistical Approach",
IEEE-ICICS2001 - Information, Communications & Signal Processing,
Singapore, 2001