Chapter 4. User Input

For decades, the primary means of interacting with computers was the keyboard and mouse. A user was, quite literally, tethered to the device that they were using. The only way to get work done was to sit down at a workstation and get started. Eventually, laptops and notebooks allowed more mobility, but the input mechanisms were mostly the same.

Then came touch.

Today, Android and iOS devices are no longer kept an arm’s length away from a user. They exist in intimate, physical contact with users. When a button is pressed, it is, from the user’s perspective, tapped directly instead of through a trackpad or keyboard shortcut. This makes input one of the most critically important aspects of transforming any old app into a dynamic work of art that understands its user.

Input can take many shapes and forms: tapping links in a web view, typing a password into a login form, or swiping across the screen at faces to see if there is an emotional connection with another lonely soul that might lead to the start of a relationship—or maybe even eventually blossom into love. The stakes are high, but the platforms are there to support you with a robust set of tools to take raw input from a user and transform it into an action that has a result they can see, hear, or touch.

Tasks

In this chapter, you’ll learn to:

  1. Receive and react to a tap.

  2. Receive and react to keyboard input.

  3. Handle compound gestures.

Android

While Android gesture APIs can be a little cumbersome, they are fairly transparent, and as a developer, you’ll have all the information and access required to satisfy even the most demanding touch-heavy apps.

Receive and React to a Tap

The tap is perhaps the most common form of user input in most modern mobile applications. Whether it’s tapping a button to submit a form, tapping an input text field to set focus to it, long tapping to reveal contextual options, or double tapping to zoom in or out of a map, this event is an intuitive expression of selection and acceptance.

It’s no surprise then that the Android framework makes capturing taps both simple and highly available.

Tip

For legacy reasons, the Android framework still uses the term “click” in some cases. In most touchscreen frameworks, “click” is synonymous with “tap.”

All View instances (including ViewGroups) accept a View.OnClickListener as a settable property (via setOnClickListener). Once set, the framework handles the underlying complexity, and the listener’s onClick method will be fired when any gesture matches the framework’s qualifications. To remove an action due to a tap on a given view, simply set the listener to null: myView.setOnClickListener(null);.

Note that View.OnClickListener is a simple functional interface with a single method: onClick(View view). This is literally copied and pasted from the source code at the time of this writing:

public interface OnClickListener {
  void onClick(View v);
}

This kind of architecture means that the interface can be implemented at virtually any level—by a controller like an Activity or Fragment, by the View instance itself, or as an anonymous class, a lambda, or a method reference. Additionally, click listeners can be assigned in XML layouts. We’ll take a look at each of these approaches.

Using a controller to implement View.OnClickListener:
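A minimal sketch (the button ID, R.id.my_button, is a hypothetical placeholder):

public class MyActivity extends Activity implements View.OnClickListener {

  @Override
  protected void onCreate(Bundle savedInstanceState) {
    super.onCreate(savedInstanceState);
    setContentView(R.layout.myactivity_layout);
    // The Activity itself is the listener
    findViewById(R.id.my_button).setOnClickListener(this);
  }

  @Override
  public void onClick(View view) {
    Log.d("MyTag", "View was clicked " + view.toString());
  }

}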

Using a method reference to a controller method:
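Again a sketch, with a hypothetical handleClick method on the Activity:

public class MyActivity extends Activity {

  @Override
  protected void onCreate(Bundle savedInstanceState) {
    super.onCreate(savedInstanceState);
    setContentView(R.layout.myactivity_layout);
    // handleClick matches onClick's shape (void return, single View parameter),
    // so a method reference satisfies the functional interface
    findViewById(R.id.my_button).setOnClickListener(this::handleClick);
  }

  private void handleClick(View view) {
    Log.d("MyTag", "View was clicked " + view.toString());
  }

}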

Using a lambda:
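A sketch of the lambda form, reusing the hypothetical button ID from above:

findViewById(R.id.my_button).setOnClickListener(view ->
    Log.d("MyTag", "View was clicked " + view.toString()));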

Using an anonymous class instance:
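A sketch of the anonymous class form:

findViewById(R.id.my_button).setOnClickListener(new View.OnClickListener() {
  @Override
  public void onClick(View view) {
    Log.d("MyTag", "View was clicked " + view.toString());
  }
});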

On a View subclass that will always have the same click behavior:
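A sketch, using a hypothetical SelfClickingButton subclass:

public class SelfClickingButton extends Button implements View.OnClickListener {

  public SelfClickingButton(Context context) {
    super(context);
    // Every instance of this subclass reacts to taps the same way
    setOnClickListener(this);
  }

  @Override
  public void onClick(View view) {
    Log.d("MyTag", "View was clicked " + view.toString());
  }

}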

Finally, you can use a method name (as a String) in the XML of a layout to assign a click listener. The containing Activity must have a public method with that name, which matches the signature of View.OnClickListener.onClick:

<!-- contents of res/layout/myactivity_layout.xml -->
<Button xmlns:android="http://schemas.android.com/apk/res/android"
    android:layout_width="wrap_content"
    android:layout_height="wrap_content"
    android:text="Click me!"
    android:onClick="myClickHandler" />

Note the Activity will automatically pick up the relationship and create the binding logic, without explicit references to either the method or the View:

public class MyActivity extends Activity {

  @Override
  protected void onCreate(Bundle savedInstanceState) {
    super.onCreate(savedInstanceState);
    setContentView(R.layout.myactivity_layout);
  }

  public void myClickHandler(View view) {
    Log.d("MyTag", "View was clicked " + view.toString());
  }

}

Note that a View can have at most a single OnClickListener set at any given time. In order to have multiple click listeners, you’ll either need to update the listener to call other listeners, or create a small framework to support it. For example, you could use the following to manage a list of callbacks in a single listener:
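One possible sketch, using a hypothetical CompositeOnClickListener class:

public class CompositeOnClickListener implements View.OnClickListener {

  private final List<View.OnClickListener> listeners = new ArrayList<>();

  public void addListener(View.OnClickListener listener) {
    listeners.add(listener);
  }

  public void removeListener(View.OnClickListener listener) {
    listeners.remove(listener);
  }

  @Override
  public void onClick(View view) {
    // Fan the single framework callback out to every registered listener
    for (View.OnClickListener listener : listeners) {
      listener.onClick(view);
    }
  }

}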

This could be used as follows:
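For instance (continuing the sketch above):

// Assumes the hypothetical CompositeOnClickListener from the previous listing
CompositeOnClickListener composite = new CompositeOnClickListener();
composite.addListener(view -> Log.d("MyTag", "First listener fired"));
composite.addListener(view -> Log.d("MyTag", "Second listener fired"));
myView.setOnClickListener(composite);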

While this might seem like a broad range of options to handle tap events, this is really just the tip of the iceberg. The Android framework provides access to touch events at multiple levels, and you could implement your own tap logic if you so choose—for example, you might want to fire a tap only after some delay, or you might want a more liberal (or more conservative) “wander” area (how far the original touch event can have traveled before it is no longer considered a tap). Fortunately, it’s unlikely you’ll ever need to do that, but we’ll dig into gesture management later in this chapter.

Receive and React to Keyboard Input

The Android framework handles key events quite differently than other UI frameworks you might have dealt with. Even KeyEvent—which is probably the API you’d expect to be dealing with directly—is very rarely accessed directly by a developer. Note that even the current documentation states:

As soft input methods can use multiple and inventive ways of inputting text, there is no guarantee that any key press on a soft keyboard will generate a key event: this is left to the IME’s discretion, and in fact sending such events is discouraged. You should never rely on receiving KeyEvents for any key on a soft input method.

This simply states that key events from “soft” (on-screen) keyboards are not guaranteed. They are guaranteed for “hard” keyboards (physical keyboards, like you’d find on a small selection of modern smartphones, or a portable keyboard attached via Bluetooth or USB); however, this isn’t very helpful since the great majority of key input events you’ll want to react to will be generated from a soft keyboard. Further, even hooking into these events requires some fairly complicated setup, including binding to an “IME” (input method), registering for focus, expanding and contracting a keyboard as required, etc.

When digging deeper into the developer documentation, we find a section entitled “Handle Keyboard Actions.” Sounds promising, but again we’re immediately presented with an attention-grabbing banner:

When handling keyboard events with the KeyEvent class and related APIs, you should expect that such keyboard events come only from a hardware keyboard. You should never rely on receiving key events for any key on a soft input method (an on-screen keyboard).

So what do we do? We have a couple strategies…

First, and more commonly, we might actually be more interested in change events fired when the value of an edit text changes, rather than the actual KeyEvent. In these cases, we have access to the TextWatcher interface, which requires three method implementations:

  • onTextChanged

  • beforeTextChanged

  • afterTextChanged

TextWatchers can listen for text change events on TextView instances, including EditText, using the addTextChangedListener method.

Note

This is one of the few listener APIs that allow multiple listeners to be attached. To support that, there’s a corresponding removeTextChangedListener method as well.

Using a TextWatcher, we can detect when the value of an input text field has changed, which is often exactly what we’re looking to do when listening for key events. While the three methods of the TextWatcher interface have different signatures, each provides access to the text that was changed, either as an Editable instance or as a CharSequence instance:
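A minimal sketch, assuming an EditText field named editText:

editText.addTextChangedListener(new TextWatcher() {

  @Override
  public void beforeTextChanged(CharSequence s, int start, int count, int after) {
    // Fired with the text as it was before the pending change is applied
  }

  @Override
  public void onTextChanged(CharSequence s, int start, int before, int count) {
    // Fired as the change is applied; s is a read-only CharSequence
    Log.d("MyTag", "Text is now: " + s);
  }

  @Override
  public void afterTextChanged(Editable s) {
    // Fired last; s is an Editable, so it can be modified in place
  }

});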

Beyond text changes, and assuming our users are only rarely going to be using an external, physical keyboard, we need to concede that we’re mostly interested in soft keyboard behavior and understand the concept of “IME” a little. “IME” stands for “input method editor,” which is technically anything that can handle events from hardware components, but in reality is almost exclusively referring to soft keyboard management, usually through a TextView, and most commonly through an EditText instance, a subclass of TextView that has editing functionality built in.

Like most View configuration, an IME can usually be handled either in XML or programmatically. The most common IME API is “IME options”: either android:imeOptions or TextView.setImeOptions, either of which accepts an integer representing various IME flags, things like “go,” “next,” “previous,” “search,” “done,” and “send” (among others). While some of these options imply built-in behavior, that’s not always the case. For example, while “next” and “previous” will change the screen’s focus, “go,” “done,” and “send” may do nothing explicitly different, but will pass different values to attached listeners.

For example, you can create an EditText with android:imeOptions="actionSend". When that EditText receives focus, it will open a soft keyboard on the screen, with a button dedicated to the “Send” action (often this will appear as a button on the keyboard labeled “Send” in the device’s local language). Tapping this button will then trigger a registered TextView.OnEditorActionListener to fire its onEditorAction events (more on that in just a moment).

Similarly, you might have android:imeOptions="actionNext", which suggests the soft keyboard render a button with a “next” representation (often a right-pointing arrow). Tapping this button will generally send focus to the next available IME (probably an EditText) in the view tree.

If you want more specific control over the behavior of IME buttons, you have access to the TextView.OnEditorActionListener. You can assign an instance of this listener to an IME (like an EditText) using the setOnEditorActionListener method, just like you would any listener (and similarly, set this value to null to remove previously attached listeners).

OnEditorActionListener instances implement a single method: public boolean onEditorAction(TextView view, int actionId, KeyEvent event). Feel free to use any of the arguments passed to the listener, but generally the actionId flag will be the most interesting. In the last example, when the right-pointing button is tapped, any attached OnEditorActionListener instances will fire their onEditorAction methods with the following parameters: the View instance that opened the keyboard, an integer constant equal to EditorInfo.IME_ACTION_NEXT, and a KeyEvent describing the “next” key press event.
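A minimal sketch, assuming an EditText (editText) declared with android:imeOptions="actionSend" as in the earlier example:

editText.setOnEditorActionListener(new TextView.OnEditorActionListener() {
  @Override
  public boolean onEditorAction(TextView view, int actionId, KeyEvent event) {
    if (actionId == EditorInfo.IME_ACTION_SEND) {
      Log.d("MyTag", "Send tapped with text: " + view.getText());
      return true; // we've consumed the action
    }
    return false; // let the framework handle anything else
  }
});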

Handle Compound Gestures

If you need gesture functionality beyond what’s provided out of the box, you have a couple of mechanisms available. The most straightforward approach, in our opinion, is to simply override the onTouchEvent of a ViewGroup (or an Activity!) and manage each event in whatever fashion suits your needs. Each motion event has a type flag (e.g., a finger begins a gesture [ACTION_DOWN], moves across the screen [ACTION_MOVE], or ends a gesture [ACTION_UP], with other, similar actions for multitouch). With this information and the judicious use of timestamps, you can accomplish any custom behavior your app may require.
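For example, here’s a rough sketch of a hypothetical ViewGroup subclass that detects its own taps using a timeout and the system touch slop:

public class TapAwareLayout extends FrameLayout {

  private static final long TAP_TIMEOUT_MS = 200;
  private final int touchSlop;
  private float downX, downY;
  private long downTime;

  public TapAwareLayout(Context context) {
    super(context);
    touchSlop = ViewConfiguration.get(context).getScaledTouchSlop();
  }

  @Override
  public boolean onTouchEvent(MotionEvent event) {
    switch (event.getActionMasked()) {
      case MotionEvent.ACTION_DOWN:
        downX = event.getX();
        downY = event.getY();
        downTime = event.getEventTime();
        return true; // claim this gesture so we see the rest of it
      case MotionEvent.ACTION_UP:
        boolean quickEnough = event.getEventTime() - downTime < TAP_TIMEOUT_MS;
        boolean closeEnough = Math.abs(event.getX() - downX) < touchSlop
            && Math.abs(event.getY() - downY) < touchSlop;
        if (quickEnough && closeEnough) {
          Log.d("MyTag", "Custom tap detected");
        }
        return true;
    }
    return super.onTouchEvent(event);
  }

}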

There are additional APIs available that can make complex tasks easier when writing custom gesture functionality, like Scroller, which despite its name doesn’t actually perform any scroll movement but does have some very handy calculation methods for flings or inertial scroll decay. VelocityTracker is available to record motion events and provide information about velocity and acceleration across either axis.
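As a rough sketch of the VelocityTracker flow inside a view’s onTouchEvent (the velocityTracker field here is illustrative):

private VelocityTracker velocityTracker;

@Override
public boolean onTouchEvent(MotionEvent event) {
  if (velocityTracker == null) {
    velocityTracker = VelocityTracker.obtain();
  }
  // Feed every motion event to the tracker
  velocityTracker.addMovement(event);

  if (event.getActionMasked() == MotionEvent.ACTION_UP) {
    // Compute velocity in pixels per second and log it
    velocityTracker.computeCurrentVelocity(1000);
    Log.d("MyTag", "Velocity x=" + velocityTracker.getXVelocity()
        + " y=" + velocityTracker.getYVelocity());
    velocityTracker.recycle();
    velocityTracker = null;
  }
  return true;
}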

If these are not enough, or your needs don’t require that fine-grained control, a simple way to access gestures is to use GestureDetector (or GestureDetectorCompat from the support library). A GestureDetector instance is constructed with a GestureListener and fed touch events; in return, it fires common callbacks, including:

  • onDown

  • onFling

  • onLongPress

  • onScroll (think of this as “drag”)

  • onShowPress

  • onSingleTapUp

To accomplish that, you’ll need an instance of GestureDetector, which requires a Context instance and a GestureListener instance:
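A minimal sketch: a hypothetical View subclass that owns its own detector and listens for flings (SimpleOnGestureListener lets you override only the callbacks you care about):

public class FlingAwareView extends View {

  private final GestureDetector gestureDetector;

  public FlingAwareView(Context context) {
    super(context);
    gestureDetector = new GestureDetector(context,
        new GestureDetector.SimpleOnGestureListener() {
          @Override
          public boolean onDown(MotionEvent e) {
            return true; // we're interested in this gesture
          }

          @Override
          public boolean onFling(MotionEvent e1, MotionEvent e2,
              float velocityX, float velocityY) {
            Log.d("MyTag", "Fling at " + velocityX + ", " + velocityY);
            return true;
          }
        });
  }

  @Override
  public boolean onTouchEvent(MotionEvent event) {
    // Feed every motion event to the detector so it can recognize gestures
    return gestureDetector.onTouchEvent(event) || super.onTouchEvent(event);
  }

}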

The GestureDetector instance takes care of most of the accounting; it’ll use system-provided values for things like gravity and touch slop, so you can be assured that your app will start a fling under the same conditions that a ScrollView or RecyclerView would.

When a parent ViewGroup contains View children that can consume touch events (even by having a simple View.OnClickListener), an already complicated gesture management system can quickly become hard to manage. Generally speaking, you can use onInterceptTouchEvent in conjunction with onTouchEvent (see the developer docs on the former); between the two, you can be fairly confident of getting access to touch events happening within any container.

Other event callbacks available to View class instances include:

  • onKeyDown(int, KeyEvent): Called when a new key event occurs.

  • onKeyUp(int, KeyEvent): Called when a key up event occurs.

  • onTrackballEvent(MotionEvent): Called when a trackball motion event occurs.

  • onTouchEvent(MotionEvent): Called when a touch screen motion event occurs.

  • onFocusChanged(boolean, int, Rect): Called when the view gains or loses focus.

To learn more about gesture detection, check out Android’s great guide.

iOS

In 2007, Apple introduced the iPhone, and with it, Multi-Touch was born. Despite its ubiquity now, the ability to use more than one finger on a glass screen was revolutionary at the time and transformed user interfaces. Touch is currently the primary method of interaction with a smartphone, but certainly not the only one. This chapter covers two of the most common input methods: touches and keyboards. Let’s dig in.

Receive and React to a Tap

The touch event APIs available in iOS are, arguably, the best in the industry. They’ve evolved slightly over time but have largely remained the same since iOS 4, which introduced gesture recognizers. This is by far the simplest method of intercepting touch events. Here’s an example of how to listen for a single tap on an image view within a view controller:

class SomeViewController: UIViewController {
    var imageView: UIImageView!

    override func viewDidLoad() {
        super.viewDidLoad()
        imageView = UIImageView(image: ...)
        let gestureRecognizer =
          UITapGestureRecognizer(target: self, action: #selector(handleTap(_:)))
        gestureRecognizer.numberOfTapsRequired = 1
        imageView.addGestureRecognizer(gestureRecognizer)
    }

    @objc func handleTap(_ gestureRecognizer: UIGestureRecognizer) {
        print("Image tapped!")
    }
}

We begin by declaring our UIViewController subclass, SomeViewController. Most of the action in this class happens within viewDidLoad(). This is part of the view life cycle in iOS and is where setup for a view controller’s view can often occur. Check out Chapter 2 for more information on views.

Within this method, the class’s image view, imageView, is set up. On the next line we declare a gesture recognizer of type UITapGestureRecognizer that is targeting this class via self and providing the handleTap(_:) method as the function to call when this gesture recognizer fires.

After setting the numberOfTapsRequired property on the gesture recognizer to 1, indicating it’s a single-tap recognizer, we add the gesture recognizer to the image view defined before. Attaching a gesture recognizer to a view is required for that recognizer to fire. In our example, this means whenever the image view is touched or tapped, it’ll go through the list of recognizers associated with it and attempt to resolve which touches are valid to trigger a particular gesture recognizer.

Assuming a touch registers for our gesture recognizers, the recognizer itself will call handleTap(_:), which we defined as the action a moment ago.

Note

Note that handleTap(_:) is an @objc method. This is because UIGestureRecognizer and subclasses require a #selector(...) to be passed in as the action fired when a gesture recognizer is activated.

There’s a little bit of boilerplate for our example, but it essentially comes down to two lines:

let gestureRecognizer = UITapGestureRecognizer(target: self,
    action: #selector(handleTap(_:)))
imageView.addGestureRecognizer(gestureRecognizer)

We declare the gesture recognizer and attach it to a view.

Gesture recognizers are incredibly powerful. We’ll talk about them later in the chapter. For now, let’s turn our attention to another primary input source on iOS: the keyboard.

Receive and React to Keyboard Input

Unlike on Android, there have never been iPhones or iPads with physical keyboards built into them. It’s theoretically possible this might change in the future but is highly unlikely given Apple’s stance in the past. There are external keyboards (including a case made by Apple) for iPads, and certainly a Bluetooth keyboard can be connected to a device to serve as a replacement for the on-screen keyboard. That said, for an ecosystem so dependent on “soft keyboards,” the keyboard and text field libraries in UIKit are frustratingly—and shockingly—complex given how easy to use some of the other areas of UIKit are.

For example, the primary way to edit text on iOS is via UITextFields or UITextViews. There are separate delegate protocols for each of these user interface controls, and they differ slightly in functionality, but mostly in name. Notably, UITextFieldDelegate, although robust, has no purpose-built method that fires whenever the field’s text changes (UITextViewDelegate offers textViewDidChange(_:), but there is no UITextField equivalent).

There are other approaches to consider. For example, it’s possible to wire up a text field to call a handler for edit events like so:

class SomeViewController: UIViewController {
    var textField: UITextField!

    override func viewDidLoad() {
        super.viewDidLoad()
        textField = UITextField(frame: ...)
        textField.addTarget(self, action: #selector(textFieldDidChange(_:)),
            for: .editingChanged)
    }

    @objc func textFieldDidChange(_ textField: UITextField) {
        print(textField.text)
    }
}

In the example, within the SomeViewController view controller, we define a UITextField named textField and add a target action for textFieldDidChange(_:) on the .editingChanged event. Whenever a user edits text in the text field, the textFieldDidChange(_:) method will get called for each character that is added or updated; in our example, we print() out the text field’s text via print(textField.text).

This works most of the time until the text field is edited programmatically. Then, our textFieldDidChange(_:) method falls silent, and our text changes surreptitiously without notification.

A more foolproof method to capture text field edits is by adding a notification observer like so:

class SomeViewController: UIViewController {
    var textField: UITextField!

    override func viewDidLoad() {
        super.viewDidLoad()
        textField = UITextField(frame: ...)
        NotificationCenter.default
            .addObserver(self, selector: #selector(textFieldDidChange(_:)),
            name: UITextField.textDidChangeNotification, object: textField)
    }

    @objc func textFieldDidChange(_ notification: Notification) {
        let textField = notification.object as! UITextField
        print(textField.text)
    }
}

This example is similar to the previous example, but has a few differences. First of all, after defining our UITextField, we are no longer listening for the .editingChanged event; we are now listening for the UITextField.textDidChangeNotification. Our same method from before, textFieldDidChange(_:), is called whenever the notification observer fires; however, in order to target the text field, we cast the notification.object to a UITextField in order to read out the text value in the subsequent print(textField.text) line.

Up to now, we’ve been operating only on UITextField. What happens when you need to observe multiple text inputs and a mix of UITextField and UITextView objects? Your code quickly might spiral into something such as this:

class SomeViewController: UIViewController {
    var textField1: UITextField!
    var textField2: UITextField!
    var textField3: UITextField!
    var textView1: UITextView!
    var textView2: UITextView!
    var textView3: UITextView!

    override func viewDidLoad() {
        super.viewDidLoad()
        NotificationCenter.default
            .addObserver(self, selector: #selector(textFieldDidChange(_:)),
            name: UITextField.textDidChangeNotification, object: nil)
        NotificationCenter.default
            .addObserver(self, selector: #selector(textViewDidChange(_:)),
            name: UITextView.textDidChangeNotification, object: nil)
    }

    @objc func textFieldDidChange(_ notification: Notification) {
        let textField = notification.object as! UITextField
        doSomething(for: textField.text)
    }

    @objc func textViewDidChange(_ notification: Notification) {
        let textView = notification.object as! UITextView
        doSomething(for: textView.text)
    }

    private func doSomething(for text: String?) {
        print(text ?? "")
    }
}

Sad.

But let’s take these melancholy thoughts and incomplete frameworks and focus on something different. Let’s bring it back around to touch input again and discuss more complex gesture recognizers. This is an area of UIKit that scales successfully from straightforward logic to elaborate experiences without too much weight placed upon the developer.

Handle Compound Gestures

Gesture recognizers are great for simple tap gestures with one finger. But they are also handy for complex interaction chains. Let’s take a look at the following example:

let doubleTapRecognizer = UITapGestureRecognizer(target: self,
    action: #selector(handleTap(_:)))
doubleTapRecognizer.numberOfTapsRequired = 2

This code is similar to our code from before for our single-tap gesture recognizer. However, by simply changing one property value we can transform it into a double-tap gesture recognizer.

There are other gesture recognizers prebuilt into UIKit. If you’re looking to recognize three-finger pan gestures, you can create one with the following code:

let panGestureRecognizer = UIPanGestureRecognizer(target: self,
    action: #selector(handlePan(_:)))
panGestureRecognizer.minimumNumberOfTouches = 3

Or, if you’d rather listen for something that requires a physicality beyond our reach, we present the five-fingered, triple-tap gesture:

let fiveFingerTapRecognizer = UITapGestureRecognizer(target: self,
    action: #selector(handleTap(_:)))
fiveFingerTapRecognizer.numberOfTapsRequired = 3
fiveFingerTapRecognizer.numberOfTouchesRequired = 5

You’re probably not likely to see that in many shipping apps. However, a common problem in touch interfaces is that you’re often listening for multiple touch events on a single view. How do you listen for a single-tap gesture and a double-tap gesture without accidentally firing the single-tap gesture first? Here’s how it might look:

// Create a double-tap recognizer
let doubleTapRecognizer = UITapGestureRecognizer(target: self,
    action: #selector(handleTap(_:)))
doubleTapRecognizer.numberOfTapsRequired = 2

// Create a single-tap recognizer
let singleTapRecognizer = UITapGestureRecognizer(target: self,
    action: #selector(handleTap(_:)))
singleTapRecognizer.numberOfTapsRequired = 1
singleTapRecognizer.require(toFail: doubleTapRecognizer)

First, we create a double-tap gesture recognizer named doubleTapRecognizer. We set its numberOfTapsRequired to 2. Next, we create a single-tap gesture recognizer named singleTapRecognizer. We set the number of taps to 1, but then call a separate method, require(toFail:), and pass in the double-tap gesture recognizer from before.

The require(toFail:) method is available on all gesture recognizers and allows a recognizer to fire only if another, specified gesture recognizer fails first. Wiring the recognizers up this way allows the single-tap recognizer to wait until the double-tap gesture recognizer fails before it’ll call its handler. Not linking the two gesture recognizers would mean that the single-tap recognizer would fire on both the first and second taps of a double tap.

Hopefully, this makes it easy to see how it’s possible to wire up multiple compound gestures with defined execution priority. The number of gesture recognizer combinations you can create is essentially infinite; it’s beyond the scope of this book to catalog them all, but if you’re interested in finding out about more gesture types, check out the Apple developer documentation on UIGestureRecognizer.

Touch Events API

One of the features of the responder chain on iOS is the fine-grained touch events API available to all responders (e.g., views and view controllers). It’s an incredibly powerful set of methods that are fired whenever a touch begins, moves, ends, or is cancelled. However, given the simplicity and powerful functionality of gesture recognizers, they are almost always the preferred method except in specific circumstances where a custom user interface requires a bit more fine-grained touch interaction. For these cases, check out touch events available to UIResponder objects.
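For reference, a minimal sketch of those UIResponder overrides in a hypothetical UIView subclass:

class DrawingView: UIView {
    override func touchesBegan(_ touches: Set<UITouch>, with event: UIEvent?) {
        // Called when one or more fingers first touch this view
        if let point = touches.first?.location(in: self) {
            print("Touch began at \(point)")
        }
    }

    override func touchesMoved(_ touches: Set<UITouch>, with event: UIEvent?) {
        // Called continuously as fingers move across the screen
    }

    override func touchesEnded(_ touches: Set<UITouch>, with event: UIEvent?) {
        print("Touch ended")
    }

    override func touchesCancelled(_ touches: Set<UITouch>, with event: UIEvent?) {
        // Called when the system interrupts the gesture (e.g., an incoming call)
    }
}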

What We’ve Learned

In this chapter we’ve seen the similarities and differences in listening for and receiving user input in Android and iOS. User input can be in the form of simple touches, complex gestures, or on-screen and external keyboards.

  • Both platforms have similar mechanisms for listening to and responding to simple touch events.

  • Both Android and iOS can receive text input from a variety of sources, but iOS requires a bit of hand-holding due to a slightly unwieldy pattern for receiving said input.

  • There are ways to detect and respond to complex gestures built into the operating systems, but each platform has gestures that aren’t typically used on the other.

Touch input is what makes Android and iOS devices so intuitive and intimate. Understanding how to receive and handle that input is essential to building an app that feels usable.

In the next chapter, we’ll dive a bit more into the objects and patterns that aren’t as directly user facing as what’s been covered so far. Let’s go!
