This is a post-peer-review, pre-copyedit version of an article published in Personal and Ubiquitous Computing. The final authenticated version is available online at: http://dx.doi.org/10.1007/s00779-018-1129-2

SEQUENCE: a Remote Control Technique to Select Objects by Matching their Rhythm

Video: https://www.youtube.com/watch?v=ri7iQ-nJspg

Alessio Bellino Università di Milano-Bicocca [email protected] ORCID: 0000-0003-0285-1549

Abstract. We present SEQUENCE, a novel interaction technique for selecting objects from a distance. Objects display different rhythmic patterns by means of animated dots, and users can select one of them by matching its pattern through a sequence of taps on a smartphone. The technique works by exploiting the temporal coincidences between the patterns displayed by objects and the sequences of taps performed on a smartphone: if a sequence matches the pattern displayed by an object, the latter is selected. We propose two alternatives for displaying the rhythmic sequences associated with objects: the first uses fixed dots (FD), the second rotating dots (RD). Moreover, we performed two evaluations of these alternatives. The first evaluation, carried out with five participants, aimed at discovering the most appropriate speed for displaying animated rhythmic patterns. The second evaluation, carried out with 12 participants, aimed at measuring errors (i.e., activation of unwanted objects), missed activations (within a certain time), and activation times. Overall, the two design alternatives perform similarly (errors: 2.8% for FD and 3.7% for RD; missed: 1.3% for FD and 0.9% for RD; activation time: 3862 ms for FD and 3789 ms for RD).

Keywords. Interaction Techniques, Rhythm Matching, Touch Remote Control, Touchless Remote Control.

1 Introduction

Nowadays, we are surrounded by remote controls: appliances such as TVs and VCRs are normally equipped with such devices. The most advanced appliances (e.g., smart TVs) can moreover be controlled using wireless mice, smartphones, or even mid-air gestures. In the context of remote control, we contribute SEQUENCE, an interaction technique that enables users to trigger controls – which can be physical or displayed on screens – at a distance. The technique is simple: a control shows a circular rhythmic pattern by means of animated dots that move inside boxes, and it can be triggered by matching such a pattern through a rhythmic sequence performed by the user (e.g., on a smartphone, as shown in Figure 1). Different controls are associated with different rhythmic patterns that disambiguate the selection.

Figure 1. A SEQUENCE control displays a rhythmic pattern on a screen and can be triggered (in the example, playing a video) by matching such a pattern through a sequence of taps (i.e., of binary inputs, which define the rhythm) provided by the user. Rhythmic patterns are displayed circularly through a series of green and gray boxes. A dot moves clockwise inside the boxes every 350 ms, marking the time so that users are visually aware of the rhythm over time. To trigger the control, users should perform a tap only when the dot is inside a green box, whereas a pause – i.e., no tap – should occur when the dot is inside a gray box. For the sake of clarity, starting from the current situation, the sequence of taps to play the video is 11101010. Taps (1) correspond to green boxes whereas pauses (0) correspond to gray boxes.
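To make the matching mechanism concrete, the following minimal sketch (ours, not the authors' implementation; the class and method names are hypothetical) shows how a single control could validate binary inputs against its pattern, using the 350 ms time unit and the 11101010 pattern of Figure 1:

    import time

    class Control:
        # One SEQUENCE control: 1 = green box (tap expected), 0 = gray box (pause).
        UNIT_MS = 350  # the dot advances to the next box every 350 ms

        def __init__(self, pattern="11101010", on_trigger=None):
            self.pattern = pattern
            self.on_trigger = on_trigger
            self.matched = set()            # green boxes matched so far
            self.start = time.monotonic()   # reference clock for the animation

        def current_box(self):
            elapsed_ms = (time.monotonic() - self.start) * 1000
            return int(elapsed_ms // self.UNIT_MS) % len(self.pattern)

        def on_binary_input(self):
            # Called for every tap (or blink, or lip beat).
            box = self.current_box()
            if self.pattern[box] == "0" or box in self.matched:
                self.matched.clear()        # error: reset, as in section 3.3.5
                return
            self.matched.add(box)
            if len(self.matched) == self.pattern.count("1"):
                self.matched.clear()        # full sequence matched: trigger
                if self.on_trigger:
                    self.on_trigger()

Each on-screen control would run an instance of this logic with its own pattern, and the control whose pattern is completed first is the one selected; a complete implementation would also reset a control when one of its green boxes is skipped.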

SEQUENCE theoretically supports any touch/touchless input able to provide rhythmic sequences in the form of binary states, e.g., mechanical buttons (on/off), touch sensors (touched/untouched), head nods (head-up/head-down), eye blinks (open/closed), proximity sensors (present/absent), beats of lips (open/closed), and so on. Among this variety of inputs, we chose three to design three variants of SEQUENCE and demonstrate its flexibility: the first – touch input – uses taps on smartphones, while the second and third – touchless inputs – use cameras to detect eye blinks and beats of lips respectively. Although we show these three input variants – see section 3.3.8 – we clarify that the focus of this paper is the touch input variant using smartphones, which is the one that was evaluated with users.

2 Background and related works

SEQUENCE is an interaction technique that lets users select one target among multiple available targets. To differentiate the target to be selected, a mechanism for matching the user's input against the available targets is needed. Conventional user interfaces use spatial matching (a pointer that moves over multiple targets) or semantic matching (establishing a direct relation between input and target, e.g., pressing a button or typing a command) [17,57]. SEQUENCE uses a different matching mechanism: the user's input, i.e., a rhythmic sequence, is matched against multiple targets distinguished by different rhythmic patterns. Therefore, the selection of a target depends on the rhythmic sequence performed by the user. To contextualize our work, we provide background on smartphones used as remote controls, touchless remote-control techniques, and rhythm used as an input method in HCI, discussing how SEQUENCE is positioned as a remote-control technique.

2.1 Smartphone as remote control

Smartphones have often been used as remote controls for distant screens. Most techniques use the smartphone's touch capability to control remote pointers (spatial matching), resembling PC-desktop interaction with mice [9,39,40,47]. Other techniques use semantic matching, leveraging the accelerometer to sense tilt and throw gestures [16] or for gaming [56]. Other approaches leverage both spatial and semantic matching, combining an eye-tracker (to point at objects) and smartphones (to manipulate objects through gestures) [52,53], or use smartphone cameras to acquire objects and the smartphone screen to manipulate them through touch gestures [2,8]. Other researchers investigate the control of a diverse set of widgets for large screens through smartphones [7]. Moving away from distant screen control, there are several commercial smartphone apps for controlling smart home appliances. In most cases, these apps show conventional UIs (like remote controls) through which they control smart appliances (e.g., the Samsung Smart Home app for Android and iOS).

To some extent, we could say that SEQUENCE has little in common with the works noted above; we presented them simply because SEQUENCE uses smartphones' touch input capabilities. Nevertheless, it differs radically from such works in its design. In fact, SEQUENCE is designed to let users control remote appliances while continuing the activities they normally carry out on smartphones (e.g., browsing Facebook, as displayed in Figure 1). To the best of our knowledge, no other smartphone technique has been designed to comply with this requirement. Avoiding application switching is a key feature of SEQUENCE when compared with other applications that must be opened first. In fact, interrupting the current activity to open other applications has many disadvantages for users, related to cognitive demands and time [34]. According to a study conducted in 2012 [34], the interruption caused by app switching takes around 12 s. Nowadays, this time interval is probably shorter, since current smartphones are generally faster at keeping several applications open simultaneously than the smartphones of 2012. Moreover, the usability of multitasking management has improved over time thanks to the new versions of Android and iOS. At any rate, to switch applications, users are still required to go to the home screen, find the application to open (which could be inside a folder), open it (waiting for it to load), and use it. Then, users must go back to the previous application to resume the task they were doing. Conversely, SEQUENCE is straightforward and does not require users to carry out the needless steps noted above, allowing them to save time.

On the other hand, SEQUENCE has some drawbacks in comparison with the techniques mentioned above. In general, we could state that SEQUENCE is suitable for simple interaction tasks (turning on a light, changing TV channel), whereas the other techniques are generally more expressive (e.g., [2,7,9,16,56]) since they use touch gestures that can vary in speed, type, direction, and acceleration. Moreover, the techniques that use smartphones to control a cursor (e.g., [9,39,40,47]) have the advantage of reflecting users' previous experience, since they resemble typical PC-desktop interaction.

2.2 Touchless remote-control techniques

Rhythmic patterns can be captured by a diverse set of touchless sensors that detect binary states [20,46], e.g., eye blinking (open/closed) or proximity (near/far). Therefore, SEQUENCE can also be included in the family of touchless remote interaction techniques. Although we did not carry out user evaluations of the SEQUENCE touchless input variations, we implemented two of them to demonstrate the flexibility of SEQUENCE. The first implementation uses eye blinks as input, whereas the second uses beats of lips. Both touchless variations provide a binary state (eye open/closed, mouth open/closed) that can be used as input by SEQUENCE. To contextualize our touchless implementations, this section frames them against related previous works.

Generally, touchless input techniques work by recognizing hand and body gestures from a distance [48,49] and are usually cursor-based or gesture-based. In the former, based on spatial matching, a remote pointer is used to select a target; in the latter, based on semantic matching, different gestures are associated with different targets [12]. Cursor-based techniques need a way to confirm the selection (e.g., using a dwell time) when the cursor is over a target, to overcome the Midas Touch problem. Moreover, users can suffer from fatigue, since large movements may be required to reach targets disposed at various positions. This leads to mapping issues between the user's movements and the cursor displayed on the remote screen. SEQUENCE bypasses both the Midas Touch and mapping issues by matching the user's input against a corresponding rhythm rather than a corresponding position. PathSync [12] and TraceMatch [14,15], instead, bypass the same issues by matching the user's input against a corresponding motion. Regarding gesture-based techniques [13,27,29], since arbitrary gestures are semantically associated with arbitrary targets, users need to remember and recall such associations, and this leads to usability problems [41]. Moreover, there are no clear ways in which gestures can be revealed, discovered, and learned [1,5,58]. SEQUENCE addresses these issues by using rhythm as the matching technique: users are guided to match targets through the rhythm displayed on them. PathSync and TraceMatch also address similar issues, but they use motion instead of rhythm. Another drawback of discrete gestures is misrecognition: shapes need to be reproduced carefully to avoid errors. SEQUENCE overcomes this issue thanks to the use of a uniform gesture producing a binary input (taps, eye blinks, etc.) that can be performed in sequence, varying the rhythm to select among different targets. Moreover, the movements employed by SEQUENCE are minimal, and this mitigates the fatigue issues frequently observed when using mid-air gestures [24,26].

As emerges from the discussion in this section, motion correlation techniques [57] like PathSync and TraceMatch are closely related to SEQUENCE: they use motion instead of rhythm, but address similar issues. However, the use of rhythm may be advantageous for continuous control, e.g., changing the volume. In fact, as explained by the authors of TraceMatch, movement correlation techniques are generally not suited for continuous controls “because they require the user to continuously follow the target for prolonged periods” [14]. Using SEQUENCE, instead, once a control has been triggered through its rhythmic pattern, it can remain active for a predetermined amount of time (e.g., 2 s) waiting for new active events (e.g., a tap on the smartphone, an eye blink, or a beat of lips), and each new active event triggers the control again once (similarly to the buttons of a remote control). Then, after a predetermined amount of time from the last user input, the active control is deactivated and SEQUENCE returns to the original state, where controls can be triggered normally according to their rhythmic patterns. Continuous control is discussed in section 3.3.6.
Broadly speaking, touchless interaction techniques based on motion tracking have generic limitations. To begin with, they require users to be in the field of vision of the sensor. Moreover, some techniques require users to face the camera frontally so that body parts can be detected as expected [59]. Finally, both the area covered by these sensors and their field of view are limited. For example, Kinect – which is used by some of the techniques previously mentioned (e.g., [12,27]) – has an operating distance range between 0.8 and 3.5 m, with horizontal and vertical fields of view of 57° and 43° respectively [19]. These limitations reduce the suitability of these techniques for everyday contexts. As a matter of fact, these techniques are widely used only in limited contexts such as gaming (e.g., Xbox). Although our touchless implementations of SEQUENCE (see section 3.3.8) use cameras and have the same limitations discussed above, we point out that SEQUENCE does not necessarily rely on sensors with such limitations. In fact, as highlighted by other works that use rhythm as input (e.g., [20]), SEQUENCE is very flexible and can be used by leveraging a great variety of sensors regardless of body/hand tracking – e.g., using a proximity sensor – or even regardless of movement tracking at all – e.g., using a microphone. To conclude, we could state that any touchless input variation of SEQUENCE clearly has pros and cons that deserve to be investigated in future work; e.g., microphones could be used regardless of factors like distance or field of vision, but require external noise to be well isolated from the user's rhythm to avoid false activations.

2.3 Input method based on rhythm

The use of rhythm is common for controlling user interfaces; click and double click are two well-known examples, but there are also other techniques. In Rhythmic Menus [38], menu items are highlighted in sequence at a given rate and selected when the mouse button is released. In Motion Pointing [23], items show periodic motions and can be selected by performing the related motion. Cyclostar [36] introduced a new touch gesture for zooming on maps: the speed of the zoom depends on the speed (periodicity) of elliptical oscillatory gestures. Tap&Tap [6] is a zoom gesture for maps displayed on smartphones: it lets users zoom in by touching two different points of the map in fast sequence. Five-key [55] is a text entry technique in which letters are selected according to different rhythmic sequences performed on five keys. TapSongs [61] allows users to authenticate in a system by tapping rhythmic patterns previously recorded. In Beats [42], instead, sequential touches are leveraged to improve interaction with smartwatches. Ghomi et al. [20] investigated the efficiency and learning of rhythmic patterns as an input technique and highlighted the flexibility of rhythm as an input method, which depends on the support of a large variety of sensors and devices able to provide binary input. Finally, SynchroWatch [46] is an input technique for smartwatches that leverages the rhythmic correlation between the user's taps and a pair of targets that blink alternately; selection through rhythmic correlation is possible only between two targets (e.g., to answer/dismiss a phone call). To the best of our knowledge, no prior work investigates ways to graphically represent rhythmic patterns so as to allow users to match their rhythm, as SEQUENCE does.

2.4 Our contribution

In contrast to previous works, our contribution explores ways of representing rhythmic patterns graphically so that they can be matched through a rhythmic sequence provided by users (taps on a smartphone, eye blinks, and beats of lips, in our case). Therefore, our study aims to measure to what extent users can understand simple visual rhythmic patterns and match their rhythm. To contextualize our contribution, we first discuss the motivation and show a SEQUENCE prototype for controlling a simple smart home environment composed of a smart TV and a lamp. Then, we introduce the design of SEQUENCE, pointing out the different input variations – touch and touchless. Finally, we present the evaluation of the touch input variation (i.e., the smartphone app displayed in Figure 1).

3 Design rationale

3.1 Motivation

As highlighted by other researchers [14,32], smart home interactions should be designed to offer users instant control “right here” and “right now”, with minimal action and without going out of their way. SEQUENCE moves in this direction, providing a “lazy” input technique for activating controls at a distance, and has several advantages in certain contexts. To begin with, users can play rhythms in many different situations by performing small movements, e.g., tapping their feet or nodding their head. Moreover, rhythm is reproduced in time rather than in space. This means rhythm requires just a small tactile area to be performed, unlike typical gestural interactions, which usually require larger movements. Finally, rhythm can be detected regardless of conventional body tracking techniques, e.g., using a pressure sensor to detect the user's blowing, or even using bio-signal-based human-computer interfaces like EOG glasses to detect eye blinks [4]. At any rate, as other similar works highlight (e.g., [14,15,20]), SEQUENCE is not intended to replace existing remotes, but aims to complement them to provide users instant control “right now” and “right there”.

3.2 Proof of concept

To show how SEQUENCE could be applied in a smart home environment, we designed a prototype. In our scenario (Figure 2), we show how to control a smart TV and a light. Six SEQUENCE controls are used: two of them are physical, whereas the remaining four are displayed on the smart TV. Regarding the physical controls, the first is used for turning the light on and off, the second for turning the smart TV on and off (see Figure 2).

Figure 2. Our reference scenario: a smart TV and a light controlled through SEQUENCE.

Once the smart TV is turned on, the four controls displayed on it are used to change the channel (next and previous, Figure 3-bottom) and regulate the volume (increase and decrease, Figure 3-left).

Figure 3. Smart TV controls.


The four controls displayed on the smart TV are set to support continuous control, so that users can change the channel and regulate the volume rapidly (e.g., Figure 4).

Figure 4. Once the Volume-Up control has been triggered through the corresponding sequence, it becomes a continuous element (the other controls disappear and it is highlighted in red) that can be repeatedly triggered through further binary user inputs.

3.3 Design of SEQUENCE control widgets

3.3.1 Rhythms and number of controls

As discussed above, SEQUENCE control widgets are distinguished by different rhythms. This mechanism could be error-prone, since users may make rhythm errors when synchronizing with the desired control; consequently, controls with similar rhythmic patterns could be activated erroneously. To reduce the probability of false activations, the rhythmic patterns of different controls should be as distinct as possible. On the other hand, rhythmic patterns should be easy to match and therefore not too complex. In this section, we discuss the design choices through which we balanced the probability of false activation against rhythmic pattern length and complexity.

Pattern length. A rhythmic pattern is composed of several units of time that establish its length. SEQUENCE theoretically supports rhythms of different lengths, but length should be chosen to minimize activation time and error rate while maximizing ease of use. Ease of use – the ease of synchronizing with a rhythm – depends on the simplicity of the rhythm [22]. In music, different rhythmic structures are used: the most common is 4/4 (frequently used in pop and rock music). This rhythmic structure is defined by four time units at beat level, but different subdivisions are possible, so that the time units can become 8, 16 or more¹ at division level [60] (Figure 5).

Figure 5. A 4/4 rhythm structure with derived time units.

In this discussion, we therefore consider 4, 8, and 16 time units because they arise from the 4/4 rhythmic structure, which is our choice since it is the most common and the easiest to reproduce.

¹ Conceptually, there is no limit to the division level. At any rate, in music the highest subdivision is given by the 256th note (which is rare and splits a measure into 256 time units).


Regarding time units, the more there are, the more time is required to match the rhythm. Therefore, many time units (i.e., long rhythms) are unfavorable because the activation time would increase. Nevertheless, reducing the number of time units (making rhythms short) also reduces the number of possible rhythm combinations. Therefore, the number of time units should be chosen to balance rhythmic combination opportunities (needed to maximize the differences among control rhythmic patterns) against activation time. We chose eight time units for SEQUENCE because they are enough to make several rhythm combinations, as discussed in the next subsection.

Rhythm combinations. Theoretically, eight time units correspond to 255 usable rhythm combinations, i.e., all patterns from [00000000] to [11111111] except the empty one (0 corresponds to a pause, whereas 1 corresponds to an active event). At any rate, not all combinations can be used in SEQUENCE. If a control is associated with a rhythmic pattern composed of too many zeroes (e.g., [10000000]), it would not be robust against false activations, since a single accidental input performed by the user (i.e., a tap on the smartphone, an eye blink, or a beat of lips) could be enough to activate it. Therefore, to make the activation of controls as robust as possible, the associated rhythmic patterns should be composed of as many active events (i.e., 1s) as possible. In fact, users can only perform active events explicitly, as pauses (i.e., 0s) are implicit (when no active event occurs, it is a pause). In general, computing the Hamming distance between rhythm combinations is a convenient way to assess their robustness against false activations. For example, among rhythms composed of four active events and four pauses, there are 14 combinations with a Hamming distance of at least 4 from each other (Table 1).

Table 1. Rhythmic combinations with a pairwise Hamming distance of at least 4 bits.

 1  00001111     8  10010110
 2  00110011     9  10011001
 3  00111100    10  10100101
 4  01010101    11  10101010
 5  01011010    12  11000011
 6  01100110    13  11001100
 7  01101001    14  11110000

This means that users would have to make at least four rhythmic errors to erroneously activate one of the combinations in Table 1 instead of another. In some trials with two users (contextualized later in section 3.3.2), we observed several difficulties in activating some of the combinations in Table 1. Many combinations were easy to activate (e.g., combinations 1, 2, 3, 4, 6, 11, 12, 13, and 14 in Table 1), but synchronizing with others could be tricky (e.g., combinations 5, 7, 8, 9, and 10 in Table 1) because of their irregularity. In effect, a study conducted by Grahn and Brett [22] confirms that users have difficulty synchronizing with rhythms that have complex metrics, i.e., rhythms in which accents occur at irregular intervals. This convinced us to change the approach completely: in effect, the naturalness of a rhythm (a human aspect) has little to do with a mere combinatorial approach (a permutational aspect). Therefore, for SEQUENCE, we chose one of the simplest rhythms that can be built, [10101011]², and translated it right one time unit at a time so as to obtain eight different rhythms (see Table 2).

² Practically every song contains this rhythmic pattern in one form or another.


Table 2. The base rhythm is translated right one time unit at a time to obtain the remaining seven rhythms.

1  10101011  our base rhythm
2  11010101  1-translation
3  11101010  2-translation
4  01110101  3-translation
5  10111010  4-translation
6  01011101  5-translation
7  10101110  6-translation
8  01010111  7-translation

The minimum Hamming distance among the rhythmic patterns in Table 2 is two. Therefore, for an activation error to occur, at least two rhythmic errors must be made. In the next section, we introduce different ways to visualize rhythms, contextualizing the design rationale behind the choice of the translated rhythmic patterns displayed in Table 2.
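Both combinatorial claims – the 14 four-active-event patterns of Table 1 at mutual distance of at least 4, and the minimum distance of 2 among the eight translations of Table 2 – are easy to check computationally; for illustration (a sketch of our own):

    from itertools import combinations

    def hamming(a, b):
        # Number of time units in which two equally long patterns differ.
        return sum(x != y for x, y in zip(a, b))

    def translations(pattern):
        # All right-translations of a pattern, one time unit at a time (Table 2).
        return [pattern[-k:] + pattern[:-k] for k in range(len(pattern))]

    table1 = ["00001111", "00110011", "00111100", "01010101", "01011010",
              "01100110", "01101001", "10010110", "10011001", "10100101",
              "10101010", "11000011", "11001100", "11110000"]
    assert all(hamming(a, b) >= 4 for a, b in combinations(table1, 2))

    rhythms = translations("10101011")
    print(min(hamming(a, b) for a, b in combinations(rhythms, 2)))  # -> 2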

3.3.2 Ways to visualize rhythm

In this section, we discuss possible ways to represent rhythms visually. Musical software normally represents rhythms using the Time Unit Box System (TUBS), which consists of a row of boxes in which each box represents a fixed time unit. Usually, such software “plays” the TUBS repeatedly in a continuous, potentially never-ending cycle. A TUBS can be laid out circularly so that all boxes are connected endlessly. This approach, named circular notation, works well for repetitive and loop-oriented music [18]. We use circular notation to represent SEQUENCE rhythmic patterns since it has several advantages. In traditional TUBSs, rhythmic patterns have a clear starting and ending point, since they are laid out along a line (Figure 6-left). In contrast, circular TUBSs show continuity between the starting and ending point, since they are laid out along a circle; therefore, users cannot identify the actual start and end of the rhythmic sequence³ (Figure 6-right).

Figure 6. Perception of starting and ending points in linear TUBS and circular TUBS. In the former, starting and ending points are clearly identifiable. In the latter, starting and ending points are arbitrary since there is continuity between them.

Continuity between starting and ending points is a key advantage of circular TUBSs because users can start synchronizing with the displayed elements at any moment (e.g., Figure 7-right), following the circular direction without the “jump” that is instead needed in linear TUBSs (e.g., Figure 7-left).

³ From the system perspective, any sequence has starting and ending points, but in circular TUBSs they are not identifiable by users.


Figure 7. Starting synchronization with an element in the middle of a sequence.

The continuity offered by circular TUBSs, combined with the rhythmic translations displayed in Table 2, lets users perceive all the rhythmic patterns used in SEQUENCE as the same rhythm. In fact, all the rhythmic sequences displayed in Table 2 are, de facto, the same rhythm with different starting points (see Figure 8). To some extent, this seemed to facilitate the matching of elements, since users are free to start synchronizing whenever they prefer, i.e., when they “enter the rhythmic loop”, keeping in mind the regularity of the sequence with which they are trying to synchronize.

Figure 8. Starting from the point indicated with the red marker, all the rhythmic sequences used in SEQUENCE are identical. Watch also the video at https://youtu.be/f9IBceXlp_4

After some trials with two users⁴, in fact, we noticed that in the condition in which translated rhythms were tested, users generally made fewer errors than in the condition in which different simple rhythms were tested, i.e., combinations 1, 2, 3, 4, 6, 11, 12, 13, and 14 of Table 1. The differences were minimal but consistent across repeated trials with the same users. We cannot state that such differences are significant, because we did not conduct formal evaluations, but these trials suggested to us – also thanks to several discussions with these users – that using translated rhythms could be the best choice to minimize errors and, consequently, activation time. Finally, regardless of ease of synchronization, another advantage of circular TUBSs is that they allow a more effective design of control widgets in terms of shape. In fact, circular TUBSs can be displayed in a circular shape around the icons that identify the different controls and, according to the literature, curved visual objects are generally preferred by users [3]. Moreover, circular widgets let designers save space (see Figure 9), since they are around 26% smaller than the corresponding squared widgets despite having the same icon size.

Figure 9. Circular TUBSs allow the design of curved visual widgets, which are generally preferred by users. Moreover, circular TUBS widgets are smaller than squared widgets allowing designers to save space.

⁴ These users belong to the close circle of the author of the paper. Therefore, these tests cannot have formal validity for obvious reasons, but they were extremely useful to better reflect on the different design possibilities of SEQUENCE.


Figure 10 summarizes the entire design rationale just described.

Figure 10. The rationale from our base binary rhythm to our representation using circular notation and translated rhythms (as previously displayed in Table 2).

3.3.3 Designed widgets

We designed two variants of circular rhythmic patterns (Figure 11): the first uses a fixed widget around which a dot moves discretely; the second uses a rotating widget, so that the entire rhythmic pattern moves continuously in a circular way.

Figure 11. Fixed widget and rotating widget.

Fixed widgets. In fixed widgets, a dot moves discretely around the widget, marking the time. There are two kinds of time unit boxes: green and gray. When the dot moves to a green box, an active event should be performed by the user. When, instead, the dot moves to a gray box, no action (a pause) should be performed. Therefore, all the boxes (green and gray) are needed to mark the time regularly at a predetermined interval, but the user's active events should occur only when the dot is inside a green box. We handcrafted a corresponding physical version using a suitably tailored polystyrene-like surface that hosts a series of LEDs connected to an Arduino (see Figure 2).

Rotating widgets. In rotating widgets, the entire rhythmic pattern rotates continuously with the widget, making a complete rotation at a predetermined speed. Therefore, the rotation marks the time, and gray boxes are not needed. When, during the rotation, a green box reaches the upper position of the widget, a dot is displayed inside it, indicating the moment in which the user should perform an active event (Figure 11-right). No corresponding physical version of the rotating widgets was developed.

Discussion on the widgets. We expected the rotating widget to perform better (i.e., to be easier for users to synchronize with) than the fixed widget because it provides continuous animated rhythmic feedback (i.e., the continuous rotation) to the user and, generally, meaningful animations in user interfaces improve decision-making [21] and help users understand what is going on [25,37]. Our expectation was that rotating animations would guide users' attention to the rhythm continuously and, therefore, more effectively. By contrast, the fixed widget provides feedback at discrete intervals: the dot moves from the current box to the next one discretely, without any continuous animation, guiding users' attention in a discrete rather than continuous way. At any rate, the evaluation presented later demonstrates that there is no notable difference between these two ways of representing circular rhythmic patterns.

3.3.4 System design

The SEQUENCE system is developed in NW.js and runs on a PC. It shows the rhythmic widgets on screen and can optionally be connected to external physical widgets through Arduino. SEQUENCE is equipped with a WebSocket server that accepts inbound connections from external applications (e.g., smartphone apps, or apps detecting eye blinks or beats of lips) able to provide binary input.
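As an illustration of how an external input source could feed the server, the sketch below (ours; the address and message format are assumptions, since the paper does not specify the wire protocol) sends one binary active event over a WebSocket connection:

    # Hypothetical binary input source for the SEQUENCE WebSocket server.
    # Requires the websocket-client package (pip install websocket-client).
    import json
    import time
    from websocket import create_connection

    ws = create_connection("ws://localhost:8080")  # assumed server address

    def send_active_event():
        # Forward one binary active event (a tap, an eye blink, a beat of lips).
        ws.send(json.dumps({"type": "active_event", "timestamp": time.time()}))

    send_active_event()
    ws.close()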

3.3.5 User feedback

Feedback is a fundamental aspect of human-computer interaction because it satisfies the communication expectations of users interacting with a system [44]. This is why every SEQUENCE control provides feedback for each binary user input during the activation of a rhythmic sequence. If the dot inside a box is matched correctly, both dot and box disappear (Figure 12).

Figure 12. User feedback: both box and dot disappear when correctly matched.

If an error occurs when matching a dot inside a box (e.g., the user performs an active event on a gray box), the entire control is reset, and the entire rhythmic sequence must be repeated from the beginning (Figure 13).

Figure 13. When user errors occur, the control is reset and the activation sequence must be repeated from the beginning.

3.3.6 Continuous activation of controls

When designing applications on SEQUENCE, control widgets can be set to support (1) discrete activation or (2) continuous activation. When a control is set to support discrete activation, once it has been triggered, it can be triggered again only by performing the entire required rhythmic sequence, and this is time-consuming. Discrete activation is unfavorable when a control may be activated several times in sequence, e.g., to increase the volume. To overcome this limitation, widgets can be set to support continuous activation (see Figure 4): in this case, once a widget has been triggered, it remains active for a while, waiting for further user active events that trigger the same widget again (without the need to repeat any rhythmic sequence). As noted above, this makes SEQUENCE suitable for continuous control, unlike movement correlation techniques (in which following a target for a prolonged period is required) [14].
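A minimal sketch of the continuous activation logic (ours; the 2 s active window is the example value mentioned earlier in the paper):

    import time

    class ContinuousControl:
        # After the rhythmic sequence triggers the control once, it stays armed:
        # every further binary input re-triggers the action directly, until no
        # input arrives for ACTIVE_WINDOW seconds.
        ACTIVE_WINDOW = 2.0  # example value

        def __init__(self, action):
            self.action = action      # e.g., a volume_up callback
            self.armed_until = 0.0

        def on_sequence_matched(self):
            self.action()
            self.armed_until = time.monotonic() + self.ACTIVE_WINDOW

        def on_binary_input(self):
            # Returns True if the input was consumed as a repeated trigger.
            if time.monotonic() < self.armed_until:
                self.action()
                self.armed_until = time.monotonic() + self.ACTIVE_WINDOW
                return True
            return False              # not armed: input goes to rhythm matching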

3.3.7 Interface visibility

Displaying SEQUENCE animated controls at all times could be annoying for users who do not wish to interact with the system for prolonged periods. Therefore, we designed activation and deactivation processes that users perform to display and hide the rhythmic controls, respectively. The kind of activation and deactivation process depends on the input modality, as discussed in the next section.

3.3.8 Input modality variations

As noted above, we designed three different input types: taps on a smartphone (touch), eye blinks, and beats of lips (touchless).

Touch input: tap sequences on smartphones. SEQUENCE for smartphone (i.e., an Android app) requires just a small region of the screen to let users perform the tap sequence. The region to be tapped can float in the foreground over any application without disturbing the current activity (see the smartphone in Figure 1). Therefore, unlike other remote-control apps, users do not waste time switching applications [34], since the SEQUENCE app is always visible over any application and thus accessible at any moment while using the smartphone. When users need remote controls for occasional interactions, perhaps while carrying out other activities on their smartphones (e.g., replying to a WhatsApp message or reading e-mails), the use of SEQUENCE for smartphone is reasonable. The controls can be displayed (explicit activation process) by performing a tap on the smartphone. Then, users can rhythmically synchronize with the desired control. When the system does not receive any input for a predetermined amount of time (6 s), it is automatically deactivated (implicit deactivation process) and the controls are hidden.

Touchless input: eye blinks detected by cameras. A touchless modality able to provide binary input can be designed by leveraging eye blinks. To detect them, we designed a Python application that uses OpenCV [10] and Dlib [31] to detect face landmarks [30] through standalone webcams. We then applied the technique proposed by Soukupová and Čech [50] to detect eye blinks from face landmarks. The technique works by computing the eye aspect ratio, which drops to 0 when the eye is closed. Since eye blinks can occur involuntarily, we designed an alternative way to activate and deactivate the system: to activate the system and make the controls visible, users keep the mouth open for a while (2 s). Once the system is activated, users can perform eye blink sequences to trigger the desired control. To deactivate the system and hide the controls, users keep the mouth open for a while again. Therefore, the same gesture (mouth open for a prolonged period) is used for activating and deactivating the system.

Touchless input: beats of lips detected by cameras. Another touchless modality able to provide binary input can be designed by leveraging beats of lips. To detect them, we designed a Python application that uses OpenCV and Dlib to detect face landmarks through a standalone webcam. We noted that the shape defined by the landmark points of the eye is practically identical to the shape defined by the landmark points of the mouth (Figure 14). Therefore, the same eye blink detection technique can be applied, without any modification, to detect beats of lips (naturally, changing the landmark points).

Therefore, we detect beats of lips by computing the mouth aspect ratio – derived from the eye aspect ratio just discussed – which drops to 0 when the mouth is closed.

Figure 14. The eye aspect ratio is practically identical to the mouth aspect ratio.
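For reference, Soukupová and Čech's aspect ratio is the sum of the two vertical landmark distances divided by twice the horizontal one; a sketch (ours) using dlib's 68-point landmark model, where the threshold is an example value, not one reported in the paper:

    import numpy as np

    def aspect_ratio(p):
        # p: the six landmark points of an eye (or six analogous mouth points),
        # ordered as in dlib's model; the ratio drops toward 0 on closing.
        a = np.linalg.norm(p[1] - p[5])   # first vertical distance
        b = np.linalg.norm(p[2] - p[4])   # second vertical distance
        c = np.linalg.norm(p[0] - p[3])   # horizontal distance
        return (a + b) / (2.0 * c)

    CLOSED_THRESHOLD = 0.2  # example threshold for the open/closed binary state

    def is_closed(landmarks, idx=range(36, 42)):
        # One eye spans landmarks 36-41 in dlib's 68-point model; passing mouth
        # landmark indices instead yields the mouth aspect ratio unchanged.
        pts = np.array([[landmarks.part(i).x, landmarks.part(i).y] for i in idx])
        return aspect_ratio(pts) < CLOSED_THRESHOLD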

Beats of lips can occur for reasons unrelated to SEQUENCE control, e.g., while talking. Therefore, similarly to eye blinks, we use the opening of the mouth for a prolonged period to activate and deactivate the system, whereas the controls are triggered by beating the lips.

4 User evaluations

Since we are interested in user performance rather than system performance, we carried out our evaluations using the most reliable input modality we designed, namely the smartphone app. Smartphones, in fact, are quite effective at detecting touch, i.e., they are robust against false positives and false negatives: a modern capacitive touch smartphone rarely fails to detect a touch that the user performs. The two touchless input modalities we proposed, instead, can be affected by false positives (e.g., the system detects an eye blink, but the user never blinked) and false negatives (e.g., the user blinks, but the system does not detect it). Therefore, all the evaluations presented in this section were carried out using the smartphone app because we wanted to investigate SEQUENCE regardless of the reliability of the different input modalities. Moreover, another relevant reason to evaluate the touch modality is the immediate applicability and flexibility of the technique. In fact, as discussed later in section 5.2, many (wearable) devices already present in our lives (e.g., Figure 21) are normally equipped with touch sensors or mechanical buttons that could be leveraged to support SEQUENCE at no cost and right now.

The first evaluation was aimed at tuning the SEQUENCE rhythm speed, so we answered the following question:

• What is the most appropriate SEQUENCE rhythm speed to allow users to match SEQUENCE controls quickly and without errors?

After determining the speed, we carried out a second evaluation aimed at measuring user performance and possible differences between the two widget design alternatives: fixed and rotating. Therefore, we answered the following questions:

• How many false activations of unwanted controls (user errors) occur when synchronizing with a control?
• What is the average time required by users to select a control?
• Is there any difference in user performance in terms of false activations, missed elements, and activation time between the two ways of visualizing rhythms, i.e., fixed and rotating?

4.1 SEQUENCE rhythm speed

To assess the most appropriate SEQUENCE rhythm speed, we conducted several trials in which users were asked to select elements on screen. Several evaluation stages were conducted for each rhythm visualization way (fixed and rotating), starting from a low speed – i.e., a speed that was reasonably easy to synchronize with – and increasing the speed in the succeeding stages, until reaching a speed so fast as to make synchronization reasonably hard. According to the collected data, we identified the most appropriate speed for a rhythmic sequence as 2.8 s per cycle (86 BPM, i.e., 350 ms per time unit, counting two time units per beat). In the following, we present the details of our experiments and the rationale that led us to choose that speed.

4.1.1 Participants

We selected five participants (3 M, 2 F) aged between 27 and 30 years. All participants were right-handed and declared that they use smartphones frequently. None of the users had previous experience with the system.

4.1.2 Apparatus

The experiments were carried out using a notebook PC connected to an external display (21″, 1920 × 1080). Users were placed 0.5 m from the external display. A THL 5000 smartphone running the SEQUENCE app was used as the touch device.

4.1.3 Evaluation prototype

The evaluation prototype, developed in NW.js, can be set to show eight fixed or rotating controls (with different rhythmic patterns, as discussed in section 3.3.1) at different rhythm speeds. The prototype has users perform 34 trials. The first four trials are intended for practice, and therefore no data are saved. In the last 30 trials, instead, the prototype records user data: activation time, false activations, and missed activations. For each trial, a control is randomly highlighted to let users understand which control they must select. If the user selects the required control within 10 s, the prototype records the activation time and passes to the next trial, where another control is highlighted for selection. If the user is not able to select the control within 10 s, a missed activation is recorded and the prototype passes to the next trial anyway. Finally, if a control different from the required one is erroneously selected, a false activation is recorded; in this case, the user can still select the required control within the 10 s (so that we can record the time, even though the false activation is recorded in any case); otherwise, the prototype passes to the next trial.
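The trial logic just described can be summarized as follows (a simplified sketch of our own; highlight and wait_for_selection are hypothetical placeholders, and, unlike here, in the actual prototype a false activation did not end the trial):

    import random

    def run_stage(controls, highlight, wait_for_selection,
                  n_practice=4, n_recorded=30, timeout=10.0):
        # highlight(control): visually highlight the control to be selected.
        # wait_for_selection(timeout): block until some control is selected or
        # the timeout expires; returns (selected_control_or_None, elapsed_s).
        records = []
        for trial in range(n_practice + n_recorded):
            target = random.choice(controls)      # randomly highlighted control
            highlight(target)
            selected, elapsed = wait_for_selection(timeout)
            record = {
                "time": elapsed if selected is target else None,
                "missed": selected is None,       # nothing selected within 10 s
                "false_activation": selected is not None and selected is not target,
            }
            if trial >= n_practice:               # the first trials are practice
                records.append(record)
        return records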

4.1.4 Procedure

We gave participants a basic overview of SEQUENCE. We explained the control technique, showing how to select elements with a few examples. We explained that they had to touch the smartphone only when the dot moves inside the green boxes, trying to memorize the rhythm and thus predict when to touch. Once participants declared that they had understood the selection mechanism, the study could start. The study was composed of ten stages and, in each of them, the prototype was set in different ways, as displayed in Table 3. As the same table shows, to avoid carryover effects, we balanced the ways of rhythm visualization, i.e., fixed and rotating, among users. We preferred not to balance the order of speeds because we think it would have been counterproductive. In fact, higher-speed sequences are objectively much more difficult to synchronize with than lower-speed sequences. Therefore, users who started the experiment in the faster conditions would have had a puzzling first impression of the system because of the higher difficulty of synchronization, and this would probably have conditioned the entire experiment. We therefore preferred to increase the difficulty gradually to minimize the first-impression effect, even though this could generate, to some extent, a learning effect toward high speeds.

Table 3. Evaluation stages.

Stage   For users 1, 3 and 5   For users 2 and 4   Evaluated speed
1       Fixed                  Rotating            4.8 s (50 BPM)
2       Rotating               Fixed               4.8 s (50 BPM)
3       Fixed                  Rotating            4.0 s (60 BPM)
4       Rotating               Fixed               4.0 s (60 BPM)
5       Fixed                  Rotating            3.2 s (75 BPM)
6       Rotating               Fixed               3.2 s (75 BPM)
7       Fixed                  Rotating            2.4 s (100 BPM)
8       Rotating               Fixed               2.4 s (100 BPM)
9       Fixed                  Rotating            1.6 s (150 BPM)
10      Rotating               Fixed               1.6 s (150 BPM)

For each stage displayed in Table 3, we recorded 30 trials, as noted above. Therefore, for each participant, we recorded data on 300 trials (30 per stage), for a total of 300 × 5 = 1500 trials in the entire study. At the end of the experiment, participants completed a form requesting personal information (age, sex, and frequency of smartphone use).

4.1.5 Results

The aim of this evaluation is to find the best compromise between activation time (which should be as low as possible) and errors (which should be as few as possible). In theory, a lower activation time should correspond to a higher rhythm speed; however, as the rhythm speed increases, users make more errors and, therefore, the activation time increases as well.

Figure 15. Time of activation (y-axis), errors and missed elements (callout) for each sequence speed (x-axis).

Errors (false activations) and missed activations increase with rhythm speed. Aggregating the results of the fixed and rotating visualization ways (as displayed in Figure 15-right), the slowest speed (4.8 s) shows 4% errors and 1% missed activations. The highest speed (1.6 s), instead, shows 25% errors and 9.7% missed activations. Observing the aggregated data, moreover, there is no relevant improvement in activation time beyond the speed of 3.2 s. According to the data in Figure 15-right, the most appropriate SEQUENCE rhythm speed with an acceptable error rate (6.3%) would be 3.2 s. At any rate, observing the data in Figure 15-left, we noted a suspicious difference between the fixed and rotating visualization ways (1.34% vs 11.34% errors). We do not expect such differences to be significant, since we have just a few users and trials. At any rate, we examined the data of single users (Figure 16) to better reflect on the most appropriate setting of the SEQUENCE rhythm speed.


Figure 16. User data with time of activation (y-axis), errors and missed elements (callout) for each sequence speed (x-axis).

Users 2 and 5 are quite consistent: activation time decreases up to the speed of 2.4 s and, moreover, both errors and missed elements are 0% at that speed. This leads us to think that the most capable users reach their best performance at that speed. User 3 made a considerable number of errors (30%) in the rotating-3.2 s condition. As expected, activation time also increased (up to 5.2 s) in that condition. Nevertheless, the number of errors unexpectedly dropped to 16% in the rotating-2.4 s condition. Looking at the data, we noted that, in the rotating-3.2 s condition, the user was unable to synchronize with the highlighted control several consecutive times. Considering the small number of trials (30), this raised the error rate to 30%. Another considerable increase in error rate (33.3%) occurred for user 1 in both the fixed-2.4 s and rotating-2.4 s conditions. In this case too, we identified a notable sequence of consecutive errors/missed elements.

To conclude, observing the data of the users separately, and considering that (1) users 1 and 3 tended to behave somewhat unexpectedly under certain conditions, (2) users 2 and 5 had their best performance at the speed of 2.4 s, and (3) user 4 achieved good performance in the fixed-2.4 s condition (6% errors and 0% missed), we assumed that the most appropriate setting for the SEQUENCE rhythm speed could be the one in the middle between 3.2 s and 2.4 s, i.e., 2.8 s (86 BPM). With a rhythm speed of 2.8 s, the minimum possible activation time (i.e., the time difference between the last active event needed to trigger the element and the first active event performed to synchronize with it) is theoretically 2.1 s, as shown in Figure 17.

Figure 17. The theoretical minimum activation time of a SEQUENCE control with a rhythm speed of 2.8 s (86 BPM) is 2.1 s, i.e., 2100 ms, calculated as follows: 2800 − (350 + 350).

We must report that, after a test, a user made a comment that led us to change a minor setting of the system for the subsequent evaluations. He commented that, in many cases, he had the impression of being unable to match an element because he tended to anticipate by a few instants the moment in which the dot moves to a green box. In effect, we had the same impression while observing the participants. Therefore, to mitigate this tendency, we added a slight delay of 30 ms between the moment in which the user taps on the smartphone and the moment in which the signal arrives at the SEQUENCE system. Since the SEQUENCE system provides feedback for every user input (as discussed in section 3.3.5), we kept the delay small enough not to be perceptible by users⁵ while still possibly reducing synchronization errors. The formal evaluation of SEQUENCE with the rhythm speed of 2.8 s (86 BPM) and the slight 30 ms delay just described is presented in the next section.

4.2 User performance with different visualization ways

To evaluate user performance with the fixed and rotating visualization ways, we conducted several trials in which users were asked to select elements on screen under both conditions. According to the data, no notable differences between the two visualization conditions were identified (activation time: 3.9 s fixed vs 3.8 s rotating; errors: 2.8% fixed vs 3.7% rotating; missed elements: 1.3% fixed vs 0.9% rotating). In the following sections, we present the details of our experiments.

4.2.1 Participants

Tests were conducted with 12 participants (9 M, 3 F) aged between 22 and 32 years (M=27.3, SD=2.96). Eleven participants were right-handed, one was left-handed. All participants declared that they use smartphones frequently. None of the users had previous experience with the system. We must report that we excluded one participant since he was unable to synchronize with any element, even after a few minutes of attempts. This participant – probably affected by beat deafness [33,45,51] – declared that he could not perceive and imagine the required rhythm at all, and preferred to stop the test.

⁵ An “experiment conducted by Michotte and reported by Card, Moran and Newell (1983) [11] shows that humans perceive two events as connected by immediate causality if the delay between the events is less than 50 ms” [35].


4.2.2 Apparatus

As before, the experiments were carried out using a notebook PC connected to an external display (21″, 1920 × 1080) and a THL 5000 smartphone.

4.2.3 Evaluation prototype

The evaluation prototype is very similar to the previous one. However, instead of 34 trials, it has users perform 110 trials. The first 10 trials are intended for practice, and therefore no data are saved. In the last 100 trials, instead, the prototype records user data: activation time, false activations, and missed activations. As before, for each trial a control is randomly highlighted to let users understand which control they must select, until the expected 110 trials are reached. Every 20 recorded trials, the system paused automatically to let participants take a break. Each participant could decide the duration of the break and when to resume the experiment.

4.2.4 Procedure

As before, a basic overview of SEQUENCE was given to the participants, showing them how to select some example elements. Once the participants declared that they had understood the selection mechanism, the study could start. Unlike the previous experiment, the study was composed of two stages: participants evaluated the fixed visualization way in one stage and the rotating visualization way in the other (within-subject study). Therefore, for each participant, we recorded data on 200 trials (100 each for the fixed and rotating conditions), for a total of 2400 trials (200 × 12) in the entire study. We balanced the visualization conditions among users to avoid carryover effects: if a user started the test with the fixed visualization way, the next user started with the rotating visualization way, and vice versa, until all participants were completed. The entire study took approximately 30-40 min per participant. At the end of the experiment, participants completed a form requesting personal information (age, gender, and frequency of smartphone use). Then, we asked participants some questions to collect qualitative data about the system.

4.2.5 Results

Differences between visualization conditions. We expected users to perform better in the rotating condition, since we thought that the continuous animated feedback during the rotation could help users synchronize better with the controls. However, contrary to our expectations, users performed quite similarly in both the fixed and rotating conditions (see Figure 18). We used Wilcoxon signed-rank tests instead of paired t-tests because, according to Shapiro-Wilk tests, the data were not normally distributed. No significant differences between conditions (p>0.05) were observed in any case. The activation times were 3.86 s (SD=0.52 s) for the fixed condition and 3.79 s (SD=0.44 s) for the rotating condition (Z=-0.78, p=0.43). The errors were 2.75 (SD=1.96) for the fixed condition and 3.67 (SD=3.23) for the rotating condition (Z=-1.65, p=0.09). The missed elements were 1.25 (SD=2.56) for the fixed condition and 0.92 (SD=1.62) for the rotating condition (Z=-0.68, p=0.49). Errors and missed elements are counted over 100 trials, and the results can therefore be read as percentages.


Figure 18. Comparison between the fixed and rotating ways: activation times (in seconds), missed elements, and errors. Error bars denote 95% CIs.
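The analysis just described can be reproduced with standard tools; a sketch (ours, with placeholder values instead of the real per-participant data):

    # pip install scipy
    from scipy.stats import shapiro, wilcoxon

    # Per-participant mean activation times (s); placeholder values, not the study data.
    fixed    = [3.9, 3.5, 3.2, 4.1, 3.8, 4.4, 3.6, 4.0, 4.9, 3.7, 3.5, 3.7]
    rotating = [3.8, 3.6, 3.3, 4.0, 3.7, 4.2, 3.7, 3.9, 4.8, 3.6, 3.5, 3.4]

    diffs = [f - r for f, r in zip(fixed, rotating)]
    if shapiro(diffs).pvalue < 0.05:         # differences not normally distributed...
        stat, p = wilcoxon(fixed, rotating)  # ...so use the Wilcoxon signed-rank test
        print(stat, p)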

Even if the differences had been statistically significant, there would be no practical difference between the conditions.

Activation time distribution. Figure 19 shows the activation time distribution over 1200 trials for the fixed and rotating visualizations (missed activations are ignored). The distribution is quite similar for both visualization ways: more than 75% of the activations occur within 4 s. The minimum activation time was 2.2 s for the fixed visualization and 2.5 s for the rotating one.

Figure 19. Box plots showing activation times of participants in seconds (y-axis) for both visualization ways.

Subjects’ performance. To give details on the users who carried out this evaluation, we summarize the performance of each of them in Figure 20, which shows the activation times, errors, and missed elements of all users for both visualization ways. Users 3 and 11 made no errors in the fixed condition. Users 11 and 12 made no errors in the rotating condition. No missed elements occurred for users 3, 4, 5, 10, 11, and 12 in either visualization condition. Eight users out of 12 made less than 3% errors in both the fixed and rotating conditions.


Figure 20. Times of activation in seconds (y-axis), errors and missed elements (callout) of all users (x-axis) for both fixed and rotating visualization ways. Error bars denote 95% CI.

User 3 was the fastest, whereas user 9 was the slowest. Surprisingly, user 9, although the slowest, had a low rate of errors and missed elements, i.e., 1% errors and 2% missed elements in the fixed condition and 3% of both errors and missed elements in the rotating condition. The worst performance in terms of errors and missed elements was obtained by users 8 and 2.

Users’ comments. At the end of the tests, we arranged a discussion with the participants in which we asked how easy or difficult they found it to synchronize with the elements on screen and whether there were differences between the two visualization ways. Ten participants out of 12 declared that it was quite easy to synchronize with the elements, but no relevant preference between the two visualization ways was expressed in most cases: four users showed a slight preference for the rotating condition, whereas three users expressed a slight preference for the fixed condition. Only 2 participants out of 12 said that it was not so easy to synchronize with the elements; one of them expressed a slight preference for the fixed visualization way, whereas the other had no preference. Despite the overall ease of synchronization, 8 out of 12 participants declared that the test was physically demanding. At any rate, we expected this result, since 200 trials were condensed into a short time and each trial required 5 touches on the smartphone to be completed successfully. Moreover, we observed that 3 out of 12 participants clearly suffered from fatigue, stretching and moving their hands to relax during the experiment breaks. On the other hand, we must report that three participants pointed out that the test was not tiring at all, despite the considerable number of repetitive touches. In sum, we can state that participants generally found it easy to synchronize with the elements and perceived no noteworthy difference between the two visualization ways.

5 Discussion

In the previous sections, we described the SEQUENCE interaction technique, pointing out the design rationale and presenting the evaluations conducted with the smartphone app. In this section, we highlight the key features of SEQUENCE, framing them against the literature and discussing the advantages and limitations of the technique. Moreover, we discuss some combinatorial aspects of SEQUENCE interfaces and how they can affect the design of the interaction when the number of controls is fewer or more than eight. Finally, we discuss the differences between fixed widgets and rotating widgets with regard to production costs.


5.1 The novelty of our contribution

To the best of our knowledge, SEQUENCE is the first work that investigates ways to represent rhythmic patterns so as to allow users to match them. The evaluations showed that SEQUENCE can be easily used – at least with touch sensors. The flexibility of rhythm-based interaction techniques like SEQUENCE was previously discussed in the literature [20,46], and we presented different user input variations (see section 3.3.8) precisely to demonstrate – in practice – such flexibility and enrich the discussion.

As far as we know, the work most closely related to SEQUENCE is Ghomi et al.'s [20]. SEQUENCE deserves to be discussed against that work because both show advanced uses of rhythmic patterns for interaction. The main difference is that SEQUENCE is an actual interaction technique, unlike Ghomi et al.'s work, which instead investigates “the potential advantages of using rhythm to interact with computer system” by carrying out experiments on rhythm memorization and reproduction. Regarding reproduction, the participants of that study were asked to reproduce rhythmic patterns presented to them through combined visual/audio stimuli. The results of that study were promising, reaching a success rate of 93.9% with a vocabulary of 30 different rhythmic patterns chosen by the authors. Regarding memorization, the difficulty of memorizing rhythmic patterns turned out to be similar to that of memorizing keyboard shortcuts. The participants of that study were asked to reproduce rhythmic patterns using the space key of a keyboard as input. The aim of that study, in fact, was not to propose a novel interaction technique but to investigate rhythm reproduction and memorization; therefore, the use of a keyboard key as input was the most appropriate choice in that context.

Unlike Ghomi et al.'s work, SEQUENCE aims at providing a complete interaction technique based on simple rhythms that are presented to users through animated visual widgets – both on-screen (Figure 3) and physical (Figure 2). Therefore, the advantage of SEQUENCE is that users do not need to memorize different rhythmic patterns but can synchronize with the desired element by inferring the corresponding rhythm from its animated visual representation. In this way, users can discover and learn the different rhythmic patterns associated with different elements instead of relying on arbitrary associations as in Ghomi et al.'s work. Moreover, SEQUENCE supports continuous activation, which is useful for triggering the same elements repeatedly, as discussed in section 3.3.6. Finally, unlike Ghomi et al.'s work, we introduced different kinds of user inputs – touch and touchless – to demonstrate in practice the flexibility of the technique. In this paper we evaluated the touch input variation, but we plan to evaluate and/or envision other touchless variations in future work.

5.2 Different input modalities

Although all the input variations presented in section 3.3.8 work by providing binary inputs, they are fundamentally different, which shows the flexibility of SEQUENCE. We believe the touch modality to be very interesting because it enables, right now, the control of rhythmic widgets using a wide class of (wearable) devices equipped with touch sensors and mechanical buttons (e.g., Figure 21), such as Bluetooth headsets, pocket buttons (e.g., Flik.io, Amazon Dash button), smartwatches, etc.

Figure 21. Examples of wearable devices supported by SEQUENCE. From left to right: smartwatch, smart bracelets, wireless headset, smart glasses, pocket button.

The touchless modalities have, instead, a noteworthy advantage: they allow the control of widgets without any intrusive device. Eye control requires small movements, but during eye blinks the SEQUENCE interface is not visible to the user (even if just for a few instants). This could disturb users because they may need continuous visual feedback to remain synchronized with the desired control. At any rate, more thorough evaluations should be carried out to investigate whether this problem is relevant or not. Control through the beating of lips, instead, could seem quite unusual and unnatural, but it appears to be an effective input modality for SEQUENCE. In fact, marking rhythm with the lips is quite common: when talking or singing, people mark the rhythm by opening and closing their lips. Moreover, beatboxing [54] is the art of mimicking drum machines with the mouth, lips, tongue, and voice, and professional musicians can reproduce even complex rhythms with this technique. This leads us to think that the lips may also be used profitably with SEQUENCE, since it uses simple rhythms. Finally, given the flexibility of SEQUENCE, we point out that any other input modality – touch or touchless – able to provide binary input could be envisioned.

5.2.1 Framing SEQUENCE touch input variation using smartphones against literature

According to our evaluations on the touch input modality using smartphones, user performances on time of activation (