Eye Gaze Interaction for a Mobile Text Application. A New Interface Concept

Bachelor Thesis, 2019

40 Pages, Grade: 1,0



1 Introduction

2 Related Work
2.1 UI for Handheld Devices
2.2 General Gaze Interaction
2.3 Gaze Interaction on handheld devices
2.4 Eye Typing
2.4.1 Dwell Time Based Eye Typing
2.4.2 Dwell Time Free Eye Typing
2.4.3 Eye Typing in Security and VR

3 Gaze enhanced Text Editing
3.1 Explanation of important Components
3.2 Design Considerations
3.3 Input Interpretation
3.4 Complementary interaction modalities
3.4.1 Typing
3.4.2 Text Selection
3.5 Implementation
3.6 Feedback
3.7 Interaction Technique Examples
3.7.1 Touch typing with ’gaze shift’
3.7.2 Gaze based cursor positioning
3.7.3 Gaze based word selection
3.7.4 Free text area selection with gaze
3.7.5 Eye typing a special character
3.7.6 Selecting extension keys with gaze

4 User Study: Word Selection
4.1 Study Design
4.2 Participants
4.3 Study Procedure

5 Results
5.1 Performance
5.2 Questionnaires and Feedback

6 Discussion

7 Conclusion




A common element of a user interface for initiating an action is the button, which is traditionally touched or clicked. This thesis introduces the GazeButton, a new concept that extends the standard functionality of a button with gaze-based functions. It acts as a universal interface for gaze-based UI interactions during classic touch interaction. It is easy to combine with existing UIs and complements them unobtrusively because it leaves users free to choose between classic and gaze-based interaction. In addition, it is uncomplicated because all new functions are locally bound to it, and, despite being just a small UI element, it can be used for interactions throughout the display and even beyond because of its gaze component. The GazeButton is demonstrated using a text editing application on a multitouch tablet computer. For example, a word can be selected by looking at it and tapping the GazeButton, which avoids strenuous physical movement. For such examples, concrete systematic designs are presented that combine visual and manual user input. The gaze-based text selection is then compared with the classic touch-based one in a user study.


Typing on wide touch screens takes longer than typing on a physical keyboard attached to the device. Recently, gaze has become increasingly common as an interaction modality. It is now available on public displays and laptops and will soon be available on mobile phones. Combining touch with gaze interaction can enhance the typing experience. Accordingly, this thesis aims to enhance text entry on wide touch screens with the use of gaze.


- Review the state of the art of using gaze and touch as interaction modalities.
- Implement touch and gaze enhanced methods.
- Carry out a lab study to evaluate the usability of the method.
- Analyze and report the results.

[Figure not included in this excerpt]

Figure 1.1: The GazeButton: An augmented button that expands the standard UI interaction (top left) with three new input modalities that, identified by eye tracking, can be used for interaction.

1 Introduction

The dominant computer interface has been based on a 2D GUI ever since the introduction of the direct manipulation interface by Shneiderman [29]. Tablet users usually interact with the index finger or thumb, which often makes large parts of the screen useless by obscuring them during the interaction. This basic interaction is usually made possible by a common GUI element called the ’button’, whose functionality has stayed the same whether on a desktop PC, smartphone or tablet PC. The user selects the button by hand, either indirectly with the mouse or directly on a touch screen, and thus causes an action. This thesis examines how this basic button concept and the UI interaction can be extended and improved by using gaze recognition.

Due to advances in the technology behind eye tracking it is now possible to integrate complete eye trackers into the screen or use them as external devices on existing displays [13] [33]. Research has investigated numerous interaction methods and device types [30] [12] [8] [19] [31] [46], including hands-free interaction on smart watches through smooth pursuit eye movements [8], moving an object by looking at it followed by a button press and mouse movement where the mouse acts as a relative positioning device [12], and several types of eye typing [19]. These methods were explored in isolation as general selection methods or as alternatives to standard manual input methods. Recently, the combination of gaze and touch for interaction with large displays using various techniques on mobile devices has been attempted [36]. It is not completely clear, however, how gaze and manual input can be reconciled, nor how gaze recognition can harmonize with established UI concepts, especially when applied to tablet computers.

With the GazeButton, an innovative UI element has been built that extends the conventional button concept by adding semantic use of the user’s eye movement. The underlying concept is to use the additional gaze data and its possible combinations with touch interaction to make a single button much more expressive, simplifying multiple manual interactions by taking the user’s gaze during manual input into account [23]. This UI design has several advantages.

Because gaze and touch can easily and naturally be used at the same time, combining the two input methods allows a larger number of different interactions than a conventional button without needing additional UI space.

Also, because a single, relatively small UI element needs little space, the GazeButton can easily be integrated into existing UIs without interfering with interactions that are already there. The UI remains intact with all its previous functionality.

Furthermore, despite its small size, the GazeButton can be used for interactions on the whole screen, since gaze can easily reach across the screen without needing additional UI space or obscuring existing elements.

As figure 1.1 illustrates, the GazeButton allows for three new input states that can be used for user interaction in addition to the conventional interaction (top left). The new states are:

1. Gaze UI - Touch GazeButton (top right) The user looks at the UI while pressing the GazeButton. For example, the user looks at a text field and taps the button to place the cursor at the gaze position. If the user instead looks at another button, the GazeButton might be linked to that button and act just like it, so that the user does not have to move the whole arm to the other button, which would obscure the screen.
2. Gaze GazeButton - Touch UI (bottom left) In this case, the GazeButton is looked at to modify a classic UI touch function, for instance to enter a shifted character instead of a lower-case one when a letter is touched on the on-screen keyboard while looking at the GazeButton. Like state 1 (top right), this state can be used to add functionality at any point on the screen with just one unobtrusive button.
3. Gaze GazeButton - Touch GazeButton (bottom right) Both looking at and touching the GazeButton forms another unambiguous interaction that can be used to trigger an action.
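The four input states listed above (the conventional one plus the three new ones) can be told apart from just two facts: whether the gaze rests on the GazeButton and whether the touch lands on it. The following Python sketch illustrates this distinction. The names `InputState` and `classify_input` are hypothetical and are not taken from the prototype; hit testing of gaze and touch positions is assumed to have happened already.

```python
from enum import Enum


class InputState(Enum):
    """The four gaze/touch combinations around a GazeButton (hypothetical names)."""
    TOUCH_UI = "gaze on UI, touch on UI (top left, conventional)"
    GAZE_UI_TOUCH_BUTTON = "gaze on UI, touch on GazeButton (top right)"
    GAZE_BUTTON_TOUCH_UI = "gaze on GazeButton, touch on UI (bottom left)"
    GAZE_BUTTON_TOUCH_BUTTON = "gaze and touch on GazeButton (bottom right)"


def classify_input(gaze_on_button: bool, touch_on_button: bool) -> InputState:
    """Map the two hit-test results to one of the four input states."""
    if touch_on_button:
        if gaze_on_button:
            return InputState.GAZE_BUTTON_TOUCH_BUTTON
        return InputState.GAZE_UI_TOUCH_BUTTON
    if gaze_on_button:
        return InputState.GAZE_BUTTON_TOUCH_UI
    return InputState.TOUCH_UI


# Looking at a word in the text while tapping the GazeButton (state 1).
state = classify_input(gaze_on_button=False, touch_on_button=True)
```

An application would then dispatch on the returned state, for example placing the cursor at the gaze position for `GAZE_UI_TOUCH_BUTTON`.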

In this thesis, these possibilities are explored through examples implemented in a text editing application prototype on a touchscreen tablet computer. Subsequently, a user study that was designed and conducted to evaluate one of the GazeButton’s new interaction techniques is presented and discussed. Among other findings, the study showed that, at a specific font size, participants preferred the new gaze based text selection over the conventional touch based one.

2 Related Work

This section discusses related work that involves research in the design of UIs for handheld devices as well as research in eye-tracking based interaction.

2.1 UI for Handheld Devices

Mobile devices pose different challenges than desktop computers, so researchers have focused on improving the handheld touchscreen UI and making it more ergonomic to use. One example is the grip and reach issue: the need to hold the device in at least one hand restricts the interaction possibilities of the holding hand. With regard to thumb reachability, this issue has been investigated extensively by Bergstrom-Lehtovirta and Oulasvirta, who modeled the "functional area of the thumb" of the holding hand [2], by Trudeau et al. with a special focus on thumb typing with both hands on different on-screen keyboards on tablet computers [34], by Wolf and Henze in terms of pointing techniques in two-handed tablet interactions [43], by Odell and Chandrasekaran, who focused on tablet computer thumb interaction [22], and by Wolf et al., who tried to better understand target selection on tablets by taking the hand’s kinematic model into account [44].

In addition, various methods were suggested to avoid the grip and reach issue, in some cases allowing unimanual tablet interaction. Cheng et al., for instance, developed a prototype that uses grip sensing to adapt the on-screen keyboard to the user’s current hand positions [5], and Wagner et al. pointed out that UI elements can be moved automatically to the reachable area of the holding hand [40]. Hinckley et al. illustrated how pre-touch sensing, which detects multiple fingers above and around the screen, can be used to adapt the UI to the user’s holding position by estimating the finger and grip posture before the user interacts with the screen [9]. As for gaze-based solutions, Pfeuffer and Gellersen suggested combining gaze and touch interaction to enable users to reach the whole screen using relaxed thumb touches that refer to the user’s gaze position [25]. A difficulty for users in their approach is that there is no visible UI element bound to this interaction.

In another work without eye-tracking, Pfeuffer et al. presented "thumb buttons" that are used to manipulate and extend the functionality of a stylus in an indirect way [26]. Those thumb buttons were placed near the anticipated position of the user’s grasp.

In this thesis, the GazeButton is used similarly to these thumb buttons but with a new gaze modality and without a stylus. The GazeButton presents users with a dedicated visual element for the often unfamiliar gaze interaction, making it more obvious and easier to understand.

2.2 General Gaze Interaction

Early research on eye tracking technologies and the progress made in it can be found in the works of Bolt [3]. They document the beginning of gaze based interaction, that is, the use of eye tracking to give control to users instead of merely observing their viewing direction, as had been done before. The idea was to use gaze interaction so that users are not overwhelmed by many simultaneously moving images ("World of Windows"[3]) in a stimulus-flooded world, "making the eye an output device"[3]. Bolt saw high potential for gaze based interaction in the future with regard to interactive graphics, which have since become extremely common and diverse. The usefulness of eye movements was studied about 9 years later by Jacob [12], who pointed out that the barrier to using eye movement as a medium does not lie in eye tracking technology. Instead, he saw the need to explore interaction techniques that enable users to perform gaze based human-computer interaction in a natural, unintrusive way. For example, using an eye blink as a signal was rejected for being unnatural and requiring distracting deliberate thought. With this naturalness in mind, Jacob implemented gaze based interaction techniques and alternatives to them, such as:

Scrolling Text by looking at arrows that appear above the first and below the last line of text. The text only scrolls if there is more text in the corresponding direction that cannot be displayed within the visible area, and it keeps scrolling as long as the user is looking at one of the arrows. Unwanted scrolling is avoided based on the assumption that the user will look at the moving text once it starts scrolling, which stops the scrolling because the gaze has moved away from the arrow.

Selecting an Object by looking at it and pressing a key, or by looking at it for a predetermined amount of time (dwell time), which can be very short if a wrong selection can be undone by a subsequent selection. Jacob considered the dwell time approach to be "much more convenient"[12], leading to "excellent results"[12] with a dwell time of 150-250 ms. As a good use for this method, a display with two areas is proposed, where one area contains several objects while the other acts as a "continuous attribute display"[12] that always shows information about the last viewed object.

Moving an Object by selecting it as described above, followed by a button press and a mouse movement, where the mouse movement between the press and release of the button is translated into object movement, or by selecting and moving the object with the eyes, where a button is pressed to indicate the start of the movement and released to indicate its end. In the latter technique, the object does not move fluidly. Instead, it jumps to the new eye fixation roughly every 100 ms and rests there, which avoids unwanted motion caused by "eye jitter" [12], small movements of the eye that Jacob found to be normal in humans and that usually rotate the eye by less than one degree. Despite the hypothesis that this translation of eye movement into object movement would be hard to use, moving the object with one’s eyes turned out to be more convenient, making the mouse movement in the alternative technique feel redundant and unnatural.
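The dwell time selection described above can be illustrated with a small Python sketch. It is not Jacob's implementation: the function name `dwell_selections`, the sample format (timestamp in milliseconds plus the object currently under the gaze, or `None`) and the 200 ms default, picked from the reported 150-250 ms range, are assumptions made for illustration.

```python
from typing import Iterable, Iterator, Optional, Tuple

Sample = Tuple[int, Optional[str]]  # (timestamp in ms, object under gaze or None)


def dwell_selections(samples: Iterable[Sample],
                     dwell_ms: int = 200) -> Iterator[Tuple[int, str]]:
    """Yield (timestamp, object) once the gaze has rested on an object for dwell_ms.

    Each uninterrupted visit to an object triggers at most one selection.
    """
    current: Optional[str] = None
    start = 0
    fired = False
    for timestamp, target in samples:
        if target != current:
            # Gaze moved to a different object (or to empty space): restart timer.
            current, start, fired = target, timestamp, False
        if current is not None and not fired and timestamp - start >= dwell_ms:
            fired = True
            yield timestamp, current


gaze = [(0, "A"), (100, "A"), (200, "A"), (250, None),
        (300, "B"), (400, "B"), (450, "A"), (700, "A")]
selected = list(dwell_selections(gaze))
```

In this trace the short glance at "B" lasts only 100 ms and therefore selects nothing, which is the behaviour a dwell threshold is meant to provide. A glance that lasts longer than the threshold during a visual search would still select the object.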

A similar way of selecting objects with the eye without dwell time by button press and moving an object by linking eye movement to object movement on the screen is going to be presented in this thesis, using the GazeButton for text selection.

Later, in 1999, Zhai et al. [46] explored the use of eye gaze as an input medium as well and illustrated three reasons for using gaze interaction, especially for gaze pointing, namely:

- Making pointing possible for users who cannot use their hands, either because of a physical disability or because their hands are occupied with other tasks
- Accelerating interaction, taking advantage of the relatively high movement speed of eyes in comparison to other body parts
- Reducing physical stress and fatigue caused by typical computer interaction devices like mouse and keyboard by reducing or replacing their use with eye interaction which needs less energy.

In the same paper [46], Zhai et al. identified two problems from which they thought previous gaze pointing techniques suffer. The first is that the subconscious motions of the eye, called "eye jitter" by Jacob [12], limit precision to a target area whose diameter is bigger than a typical scrollbar and much bigger than a typical character in a typical screen setup. The second is that using dwell time to initiate actions is unnatural: the eyes are often used for tasks like searching without any intention of initiating an action, which leads to unwanted dwell time actions when an object is looked at for too long during an ordinary search. Jacob calls this problem of unwanted dwell time actions the "’Midas touch’ problem"[12].

Based on these two problems of other gaze pointing techniques, Zhai et al. stated that "it is unnatural to overload a perceptual channel such as vision with a motor control task"[46]. They therefore set out to improve traditional gaze pointing with their own approach called "MAGIC (Manual And Gaze Input Cascaded)"[46], which aims to keep pointing and selection primarily manual and to use gaze only as a secondary aid. This is achieved by combining gaze based and manual pointing: the cursor jumps to an eye gaze area around the target, which greatly reduces manual cursor movement, and the user then moves the cursor manually for fine pointing and clicks for selection.
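The cascade of coarse gaze positioning and fine manual positioning can be sketched in a few lines of Python. This is a simplified illustration, not Zhai et al.'s algorithm: the function `magic_step`, the pixel coordinates and the `warp_radius` threshold of 120 pixels are assumptions chosen for the example.

```python
import math
from typing import Tuple

Point = Tuple[float, float]


def magic_step(cursor: Point, gaze: Point, manual_delta: Point,
               warp_radius: float = 120.0) -> Point:
    """Return the new cursor position after one manual input event.

    Coarse step: if the cursor is farther than warp_radius from the gaze
    point, it jumps to the gaze point. Fine step: the manual movement is
    then applied on top, so precise pointing stays under manual control.
    """
    cx, cy = cursor
    gx, gy = gaze
    if math.hypot(gx - cx, gy - cy) > warp_radius:
        cx, cy = gx, gy  # gaze only provides the rough target area
    dx, dy = manual_delta
    return (cx + dx, cy + dy)


# Cursor far from the gaze point: it warps first, then moves by the delta.
far = magic_step(cursor=(0, 0), gaze=(500, 300), manual_delta=(3, -2))
# Cursor already near the gaze point: only the manual delta applies.
near = magic_step(cursor=(490, 300), gaze=(500, 300), manual_delta=(3, -2))
```

Keeping the warp threshold larger than the expected gaze error means that eye jitter inside the target area does not move the cursor, so the inaccuracy of the eye tracker is absorbed by the manual fine positioning.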

In the following year, Sibert and Jacob evaluated their eye gaze interaction techniques [30] and found that their gaze based object selection (already described in this thesis) was faster than common mouse selection, demonstrating the speed advantage of the eyes over the more physically demanding alternative.

In 2012, Stellmach and Dachselt [31] again pointed out some solutions for the lower precision of gaze selection compared to mouse selection, such as using bigger GUIs and combining gaze and manual input as in Zhai et al.’s "MAGIC" interactions. They tested those solutions and found that they outperformed gaze-only cursor selection and were highly adaptable to handheld devices.

2.3 Gaze Interaction on handheld devices

Gaze based interaction techniques are not confined to desktop computers but have found use with powerful handheld devices as well. In 2007, Drewes et al. [7] tested the feasibility of gaze based interaction techniques for mobile phones in a user study, even before phones had the processing power needed for eye tracking technology. They found that gaze interaction is attractive for mobile phone users. Recently, in 2018, Khamis et al. [13] presented a complete view of the past, present and future of eye tracking on handheld devices and described challenges and opportunities that have arisen in this area and that suggest further research. Khamis et al. referred to the three phases in the history of eye tracking as follows:

Past A phase beginning in the early 2000s, when eye tracking on mobile devices was new and handheld consumer devices lacked the necessary processing power for eye tracking which was assumed to change in the future.

Present The current time, after experiencing past advances in built-in front-facing cameras and processors of mobile devices, making it increasingly practical to investigate gaze interaction that involves eye tracking on unmodified handheld devices.

Future A phase where eye tracking is used widely and seamlessly as part of manifold daily interactions.

Khamis et al. described the past and present of handheld eye tracking considering "gaze behaviour analysis"[13], "implicit gaze interaction"[13], "explicit gaze interaction"[13] and the "lessons learned"[13] in that phase while discussing opportunities and challenges when it comes to the future phase.

In the past, the front-facing cameras of mobile devices were usually not sufficient for real-time eye tracking. As a result, the initial research involving eye tracking on mobile devices used external eye trackers such as head-mounted devices or remote commercial eye trackers, or even expanded handheld devices with custom-built hardware. This commitment testifies to a high motivation for researching this topic and a strong belief in the advancement of eye tracking technology in this early phase in the history of mobile gaze interaction. The phase nevertheless led to insights such as the finding that gaze behaviour when scanning through search results on large screens differs from that on small screens [15] as they are common in mobile devices. As early as 2005, Dickie et al. [6] presented a system called "eyeLook" that is able to recognize whether a user is looking at a mobile device or not, which makes it possible to automatically pause moving content on the screen, such as a running video, when the user is not looking at it. Khamis et al. [13] stated that, despite problems with ecological validity in past research on mobile gaze interaction, there is a clear message: using eye behaviour such as gestures and smooth pursuit is more promising, better perceived and less reliant on calibration than dwell time based solutions, which makes solutions without dwell time seem more suitable for handheld devices.

Khamis et al.’s [13] present phase started with gaze interaction research on unmodified handheld devices, which has increased drastically since 2010. In that year, Bulling and Gellersen [4] discussed aspects of upcoming research on eye tracking on handheld devices, stating that applications for gaze based interaction are typically limited to stationary setups but also mentioning the first video-based eye tracker that fits almost completely inside an ordinary glasses frame (Tobii Glasses).

Unmodified gaze-enabled handheld devices made it possible to do research in this field with high ecological validity. For example, Miluzzo [21] proposed "EyePhone", a system that uses the front camera of a mobile phone and machine learning algorithms to detect the position on the phone display the user is looking at. In 2012, Vaitukaitis and Bulling [39] were able to detect gaze gestures with an accuracy of 60 % with a prototype that runs entirely on an unmodified Android smartphone. In the same year, Stellmach and Dachselt [31] explored in a user study how gaze can be used as an interaction technique with handheld devices to highlight important information on a distant larger screen while searching through images. For this, they extended Zhai et al.’s MAGIC pointing [46] and adapted it to touch input, using touch interaction for fine selection. They developed two new variations called "MAGIC touch" [31] and "MAGIC tab" [31] that allowed users to fine-position the cursor with touch input and to iterate through a list of objects that are spatially close to the user’s gaze position. Stellmach and Dachselt’s particular aim was to find out how enjoyable their gaze supported selection techniques appeared to users. They found that their MAGIC variations were very robust against inaccurate gaze data, which led to low eye strain, high performance and high reported usability. They saw especially high potential for future work in "MAGIC tab", as it allows fast and accurate selection of small targets that are close together or overlapping.

From 2011, Turner et al. combined gaze and touch to extend the user’s physical reach across devices giving them the power to interact with remote displays [38] [37] [36]. This multimodal UI has been particularly explored on tablet and desktop computer touchscreens by Pfeuffer et al. [23] [24], demonstrating a high potential of gaze interaction when it comes to providing benefits by extending the usual touch input. For example, they gave users freedom in their choosing of the area on the screen they use for touch interaction by translating finger taps and multi touch gestures on the screen to the point the user is looking at instead of using the touch points as point of action [23]. This not only lets users choose freely where on the screen they touch without changing the action to be performed but also avoids occlusion of important areas on the screen by the users’ arms or hands because users would touch the screen somewhere where they do not obscure important information if they are presented this possibility.

In 2015, Mariakakis et al. [20] introduced "SwitchBack", a system that, like "EyePhone"[21], uses the front-facing camera of a smartphone, in this case to detect when the user starts or stops focusing their visual attention on the phone. It relies on Mariakakis et al.’s algorithm "Focus and Saccade Tracking" (FAST) [20], which was additionally able to identify how many lines of text a user had read in a controlled study with a mean absolute error of 3.9 %. This ability was used to highlight the line the user had last read after a distraction occurred in a smartphone text reading scenario with distractions, which improved the average reading speed by 7.7 %.

Such eye tracking techniques that only need a front-facing camera and software to compute the user’s gaze coordinates based on the iris positions in the camera’s captured image belong to the field of computer vision based gaze tracking, which Hohlfeld et al. described as "promising low-cost realization of gaze tracking on mobile devices" [10], as the built-in hardware of smartphones is sufficient so no cost-intensive extra hardware is needed and it can be improved by new algorithms.

To find out how applicable computer vision based gaze tracking on mobile devices is in practice, Hohlfeld et al. [10] conducted two user studies using an unmodified Nexus 7 (2013) tablet and Wood and Bulling’s "EyeTab" algorithm for binocular gaze estimation [45], which is open source and achieved a gaze estimation accuracy of 6.88° of visual angle at a near-realtime speed of 12 frames per second [45]. Hohlfeld et al.’s first user study on computer vision based gaze tracking [10] focussed on assessing aspects they saw as crucial to the use of mobile gaze tracking, such as the effects of varying conditions like viewing distance and lighting, including the use of glasses. They found that different lighting and viewing distances can have a large effect, while glasses worn by the person whose eyes are tracked barely affect eye tracking performance. The problems they identified were the limited accuracy of EyeTab in their use case and the fact that, when the front-facing camera is built in at the bottom, the eye tracking accuracy changes from the top to the bottom of the screen depending on the point the user is looking at.

With these limitations in mind, Hohlfeld et al. conducted a second user study [10]. They found that in their setup it is possible to recognise users’ word fixations with a recognition rate of up to 77% in the top row of the screen, but also that this recognition rate drops to 8% in the bottom row of the screen, where the angle between the user’s gaze and the camera is very low. They therefore suggested using this technique for detecting word fixations only when the tested words are located in the upper half of the screen, relatively far away from the camera.

After stating that gaze based interaction using the front-facing cameras of mobile consumer devices is expected to improve and opens great opportunities, such as novel usability testing outside labs, but that improvements are still needed, Khamis et al. [13] described those opportunities and challenges in their future phase of gaze-enabled handheld mobile devices. They pointed out that, while the challenges in the early days of eye tracking mainly related to hardware, many problems can nowadays be solved with software. They nevertheless attributed high importance to "front-facing depth cameras" [13], believing that this new hardware will catalyse mobile eye tracking and make it widely used by consumers in everyday life.

This thesis takes inspiration from the described observations, which confirm the attractiveness of gaze based interaction on handheld devices, and works to improve the experience of such interactions in the context of a text editing application on a touch screen tablet computer.

2.4 Eye Typing

Eye typing is the entering of text by means of recognizing the user’s gaze. Pure eye typing enables even people who can only move their eyes to type, so the first gaze typing system, introduced in the 1970s, was an application for disabled people [19]. Since then, much research has been done in this area to increase the speed and accuracy of eye typing.

2.4.1 Dwell Time Based Eye Typing

Despite the Midas touch problem [12] inherent in dwell time based approaches, dwell time is the only way of eye typing for some disabled people, and there are many dwell time based eye typing methods. One of them is "pEyeWrite" [11], which uses two pie menus containing characters for typing. Those menus resulted from the desire to unify interfaces for gaze control. The initial, bigger pie menu has six slices with five characters each. If the user looks at one slice for 400 ms, the slice is selected and a second, smaller pie menu appears that partly overlaps the bigger pie menu, mainly where the selected slice is located. The smaller pie menu has five slices with one character each, which together contain every character of the selected slice of the bigger pie menu. The user then selects a slice of the smaller pie menu by looking at it for 400 ms and thus enters the character of that slice. Novice users achieved a maximum typing speed of 13.45 wpm with pEyeWrite and an average speed of 7.85 wpm. Another dwell time based eye typing method is "GazeTheKey" [28], which arose from the idea of making eye typing keys more dynamic, embedding word predictions into them and using a "two-step dwell time" [28].
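The two-level selection described for pEyeWrite can be sketched as a lookup over six groups of five characters. The concrete character layout in `SLICES` below is invented for illustration, since the description above does not specify it, and the function names are hypothetical. The second function works out the throughput limit that two 400 ms dwells per character would impose, assuming the common convention of five characters per word and ignoring the time the eyes need to travel between slices.

```python
# Hypothetical layout: six outer slices with five characters each.
SLICES = ["abcde", "fghij", "klmno", "pqrst", "uvwxy", "z.,?!"]


def select_character(outer_slice: int, inner_slice: int) -> str:
    """Return the character entered by two successive slice selections.

    outer_slice picks one of the six groups in the bigger pie menu,
    inner_slice picks one of its five characters in the smaller one.
    """
    return SLICES[outer_slice][inner_slice]


def dwell_limited_wpm(dwell_ms: float, selections_per_char: int = 2,
                      chars_per_word: int = 5) -> float:
    """Upper bound on words per minute when only dwell time costs time."""
    ms_per_char = dwell_ms * selections_per_char
    return 60_000 / ms_per_char / chars_per_word


letter = select_character(outer_slice=1, inner_slice=2)
limit = dwell_limited_wpm(400)
```

Under these assumptions the dwell time alone caps typing at 15 wpm, which is consistent with the reported novice maximum of 13.45 wpm lying just below that value.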

2.4.2 Dwell Time Free Eye Typing

Without dwell time it is possible to type faster, as there is no waiting involved and the Midas touch problem is avoided. One of the fastest eye typing applications to date is "Dasher", with which 12 participants in a longitudinal user study conducted by Tuisku et al. in 2008 [35] achieved an average eye typing speed of 2.49 wpm in the first and 17.26 wpm in the tenth session, with one participant achieving an average speed of 23.11 wpm in session 9. Ward and MacKay, the inventors and developers of Dasher [42], stated that after an hour of practice with Dasher users could write up to 25 wpm, compared to 15 wpm on an on-screen keyboard. Furthermore, on-screen keyboard users were not only much slower but also had an error rate five times as high as that of Dasher users. Dasher is dwell time free, using pointing gestures in a dynamically changing graphical display of letters [41].

Another dwell time free eye typing application is "EyeSwipe" by Kurauchi et al. [17] which scans gaze paths to enter whole words at once that are selected out of a set of word suggestions after only the start and end of the swipe path are selected explicitly. To enable explicit eye selection without dwell time, Kurauchi et al. developed the selection mechanism "reverse crossing" [17] that works by 1) looking at the key that should be selected, 2) looking at the selection button that appears above the key and 3) looking back to the key that should be selected which then results in selecting it.
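The three steps of the reverse crossing mechanism can be read as a small state machine over the sequence of elements the gaze visits. The Python sketch below is an illustration of that reading, not Kurauchi et al.'s implementation. The function name and the encoding of gaze targets as strings such as `"key:h"` (a key) and `"button:h"` (the selection button shown above that key) are assumptions.

```python
from typing import Iterable, List, Optional


def reverse_crossing(gaze_targets: Iterable[str]) -> List[str]:
    """Return the keys selected by key -> selection button -> same key."""
    selected: List[str] = []
    key: Optional[str] = None  # key whose selection button is currently shown
    crossed = False            # True once the gaze has reached that button
    for target in gaze_targets:
        kind, _, name = target.partition(":")
        if kind == "button" and name == key:
            crossed = True                      # step 2: gaze on the button
        elif kind == "key":
            if crossed and name == key:
                selected.append(name)           # step 3: back on the same key
                key, crossed = None, False
            else:
                key, crossed = name, False      # step 1: a (new) key is looked at
        else:
            key, crossed = None, False          # gaze left the keyboard
    return selected


path = ["key:h", "button:h", "key:h",   # complete reverse crossing on "h"
        "key:i", "key:j", "button:j",   # button of "j" reached ...
        "key:k"]                        # ... but the gaze returns to another key
result = reverse_crossing(path)
```

Merely passing over keys, as happens constantly while swiping, never selects anything, because a selection requires the deliberate detour to the button and back.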

2.4.3 Eye Typing in Security and VR

Eye typing has also been used in the security area. For example, Kumar et al. developed "EyePassword" [16], a method for gaze based PIN and password entry that prevents shoulder-surfing, is only slightly slower than using a keyboard and is preferred over traditional methods by test subjects [16]. Khamis et al. designed a similar method called "GazeTouchPIN" [14] that combines gaze and touch to enter a PIN on the phone. Such methods make it hard for attackers to spy on the password because gaze input is invisible unless the attacker observes the eyes of the user.

Eye typing has also been explored in the context of virtual reality. Recently, Rajanna and Hansen investigated eye typing in VR through two user studies with 32 people [27], using a VR headset with an eye tracking unit. They compared dwell time based eye typing with eye typing that uses gaze for selection and a manual click for confirmation. Both methods were tested with a 2D keyboard that fits inside the user’s view and with a 3D curved keyboard that is larger than the user’s view and therefore requires head movement to see and select all keys. Furthermore, the users executed the typing tasks while sitting as well as while biking. The results include three findings: with the simple gaze typing methods that Rajanna and Hansen implemented, gaze typing in VR is practical but confined; the large curved keyboard increased physical strain and reduced typing speed in comparison with the smaller 2D keyboard; and, with a dwell time of 550 ms, participants performed better with the gaze and click solution and tended to prefer it over the dwell time based method.
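
The difference between the two selection methods compared in that study can be made concrete with a minimal Python sketch. It is an illustration under assumptions and not the code used by Rajanna and Hansen: gaze samples are assumed to arrive as time-ordered pairs of a timestamp in milliseconds and the key under the gaze, and the function names `dwell_select` and `gaze_click_select` are hypothetical. Only the dwell time of 550 ms is taken from the study described above.

```python
from typing import Iterable, List, Optional, Sequence, Tuple

# (timestamp in ms, key under the gaze or None); assumed to be sorted by time.
GazeSample = Tuple[float, Optional[str]]


def dwell_select(samples: Iterable[GazeSample], dwell_ms: float = 550.0) -> List[str]:
    """Select a key once the gaze has rested on it for at least dwell_ms."""
    selected: List[str] = []
    current: Optional[str] = None  # key the gaze currently rests on
    start = 0.0                    # time at which the gaze arrived on that key
    fired = False                  # prevents repeated selection during one fixation

    for t, key in samples:
        if key != current:
            # Gaze moved to another key (or off the keyboard): restart the timer.
            current, start, fired = key, t, False
        elif key is not None and not fired and t - start >= dwell_ms:
            selected.append(key)
            fired = True
    return selected


def gaze_click_select(samples: Sequence[GazeSample],
                      click_times_ms: Iterable[float]) -> List[str]:
    """Select the key under the gaze at the moment of each manual click."""
    selected: List[str] = []
    for click in click_times_ms:
        key: Optional[str] = None
        for t, k in samples:
            if t > click:
                break
            key = k  # most recent sample at or before the click
        if key is not None:
            selected.append(key)
    return selected
```

With dwell time, every selection costs at least the dwell duration, whereas with gaze and click the confirmation can happen as soon as the gaze has reached the key. This is consistent with the better performance reported for the gaze and click solution.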

The previously described applications of gaze interaction focussed on several input methods or specific application cases. This thesis takes a different perspective: it extends touch typing with gaze interaction while keeping the freedom to type conventionally, without losing the new functionality.

3 Gaze enhanced Text Editing

This section explains the prototype application for gaze enhanced text editing that has been developed for this thesis.

3.1 Explanation of important Components

For easier understanding, some components of the application prototype, which are also illustrated in figure 3.1, are explained first. The software has three main interaction areas. The system uses the area in which an interaction takes place to decide which action should be performed:

Text Area This is the area on the screen where all text is displayed for direct manipulation. It always contains either a blinking cursor, indicating the current position for manipulating the text, or a green marked area, indicating the current selection of the text. The text area stretches from slightly below the middle of the screen to the top of the screen and spans its full width.

Keyboard The keyboard is another 2D area on the screen that contains all the keys of the on-screen keyboard of the application and is used for all the text manipulation. It takes up the remaining room on the screen that is not covered by the text area.

GazeButton The GazeButton is actually a part of the keyboard and belongs to the same class as all the other keys, but it plays a special role in terms of interaction, both concerning touch and gaze, which justifies treating it as an area of its own. It is located both in the bottom left and in the bottom right of the screen, and both instances provide the same functions. In practice, one of those instances might suffice.

Figure not included in this excerpt

Figure 3.1: A screenshot of the running text editing application with markings indicating the text area (orange rectangle), the keyboard (green rectangle), both instances of the GazeButton (red rectangles) and the green text cursor (within the yellow ellipse).
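
How the system could decide which of the three areas an interaction falls into can be sketched with a simple hit test. The following Python code is a minimal sketch under assumptions and not the prototype's actual implementation: the screen size, the height of the text area, the size of the GazeButton and all names (`Rect`, `classify`, the constants) are hypothetical, and coordinates are assumed to start at the top left corner of the screen.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    left: float
    top: float
    width: float
    height: float

    def contains(self, x: float, y: float) -> bool:
        return (self.left <= x < self.left + self.width
                and self.top <= y < self.top + self.height)


# Hypothetical layout values; origin is the top left corner of the screen.
SCREEN_W, SCREEN_H = 1000.0, 600.0
TEXT_H = 330.0    # text area ends slightly below the middle of the screen
BUTTON = 80.0     # edge length of a GazeButton instance

TEXT_AREA = Rect(0.0, 0.0, SCREEN_W, TEXT_H)
KEYBOARD = Rect(0.0, TEXT_H, SCREEN_W, SCREEN_H - TEXT_H)
GAZE_BUTTONS = (
    Rect(0.0, SCREEN_H - BUTTON, BUTTON, BUTTON),                # bottom left
    Rect(SCREEN_W - BUTTON, SCREEN_H - BUTTON, BUTTON, BUTTON),  # bottom right
)


def classify(x: float, y: float) -> str:
    """Return the interaction area that contains the touch or gaze point."""
    # The GazeButton lies inside the keyboard, so it has to be tested first.
    if any(button.contains(x, y) for button in GAZE_BUTTONS):
        return "gaze_button"
    if KEYBOARD.contains(x, y):
        return "keyboard"
    if TEXT_AREA.contains(x, y):
        return "text_area"
    return "outside"
```

The order of the checks reflects the description above: because the GazeButton is part of the keyboard but treated as an area of its own, it is tested before the surrounding keyboard rectangle.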




Eye Gaze Interaction for a Mobile Text Application. A New Interface Concept
LMU Munich
The contents of the CD and, where applicable, further data collected in the course of the thesis can be requested in person.
gaze, interaction, mobile, text, application, interface, concept
Quote paper
Thomas Mayer (Author), 2019, Eye Gaze Interaction for a Mobile Text Application. A New Interface Concept, Munich, GRIN Verlag, https://www.grin.com/document/497717

