6 Ways you can use deep learning to improve mobile devices’ usability

Forget about the frustrating latency issues that crop up with mobile sensing and cloud computing. Near-zero latency is around the corner, with real-time data processing speeds to deliver optimum results.

Google Pixel 4 XL Android smartphone

This article was originally published on Miquido.com on Jan. 23, 2020 and was written by Radosław Holewa.

With growing global demand for enhanced, more personalized mobile experiences, widespread adoption of AI and deep learning in the mobile app development industry is inevitable.

Apple’s advanced Bionic smartphone chips with built-in neural processing units already help neural networks run directly on-device at incredible speeds. Using Apple’s Core ML and Google’s ML Kit platforms and deep learning libraries like TensorFlow Lite and Keras, mobile developers can create applications with lower latency, fewer errors, and faster data processing.

The main advantage of on-device machine learning is a seamless, accurate user experience. Because no data has to be sent to external servers for processing, data protection, security, and privacy all improve. Also, with neural networks running on the device, you don’t need an internet connection to access every feature of your applications. You will, of course, still need the internet for most standard features.

Making use of mobile device computing capabilities to implement deep learning algorithms has undoubtedly improved the usability of mobile devices. Here’s how:

1. On-Device Speech Recognition

Speech recognition involves transforming or transducing input sequences into output sequences using recurrent neural networks (RNN), convolutional neural networks (CNN), deep neural networks (DNN), and other architectures. Developers struggled with the issue of latency — which creates delays between your request and the automated assistant’s response — but we can now get around it by using the compact recurrent neural network transducer (RNN-T) technology in mobile devices.

RNN-Ts are sequence-to-sequence models. Rather than following the usual method of processing an entire input sequence before producing an output, however, they maintain a steady continuity in their input processing and output streaming. This facilitates real-time speech recognition and processing. You see this with Google Assistant, which can process consecutive voice commands without faltering and without requiring you to invoke ‘Hey, Google’ after each request. 
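The difference between waiting for the whole input and streaming output as input arrives can be sketched in a few lines of Python. This is only a toy illustration, not an RNN-T: the "decoding" step simply upper-cases each frame, and the names batch_transcribe, streaming_transcribe, and audio_frames are hypothetical.

```python
from typing import Iterable, Iterator, List


def batch_transcribe(frames: List[str]) -> str:
    """Conventional approach: consume the whole input, then emit one result."""
    return " ".join(frame.upper() for frame in frames)


def streaming_transcribe(frames: Iterable[str]) -> Iterator[str]:
    """Streaming approach: emit a partial result as each frame arrives.

    `state` stands in for the recurrent state a real model would carry
    between frames; here it is just the list of words decoded so far.
    """
    state: List[str] = []
    for frame in frames:
        state.append(frame.upper())  # toy "decoding" of a single frame
        yield " ".join(state)        # partial transcript is available at once


if __name__ == "__main__":
    audio_frames = ["set", "an", "alarm"]  # hypothetical pre-segmented input
    for partial in streaming_transcribe(audio_frames):
        print(partial)
```

The streaming version ends with the same transcript as the batch version, but the caller can act on each partial result without waiting for the speaker to finish.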

It makes for a more natural, two-way conversation, and the Assistant will follow your instructions to a T. Want it to set an email subject, find a photo in one of your folders, and guide you to your sister’s place? It’s done.

Google’s new Pixel 4 takes this further: its Live Caption feature can provide subtitles for audio notes, podcasts, and videos in real time and, because the processing is on-device, in airplane mode as well. So, for example, if a video shows up in your Twitter feed, you can find out what it’s about from its captions, without needing to unmute the audio. Live Caption doesn’t work with music or with phone and video calls yet.

2. Increased Efficiency With Gesture Recognition

With on-device machine learning pipelines, you can train your mobile device to detect, track, and recognize hand and body gestures. Your device camera records and stores your gestures and movements as 3D image data. The neural networks’ deep learning algorithms then use this gesture library to identify and decipher specific static and dynamic gestures. They then match them in real time to your intent and execute your desired commands.
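The match-then-execute step can be pictured with a small Python sketch. It swaps the neural network for plain nearest-template matching on a made-up two-number motion feature, so it shows only the shape of the pipeline (gesture library, matching, command lookup). GESTURE_LIBRARY, COMMANDS, and the threshold value are all hypothetical.

```python
import math
from typing import Dict, Optional, Sequence

# Hypothetical gesture library: one reference feature vector per gesture.
GESTURE_LIBRARY: Dict[str, Sequence[float]] = {
    "swipe_left": (-1.0, 0.0),
    "swipe_right": (1.0, 0.0),
    "swipe_up": (0.0, 1.0),
}

# Hypothetical mapping from a recognized gesture to the command it triggers.
COMMANDS: Dict[str, str] = {
    "swipe_left": "previous_track",
    "swipe_right": "next_track",
    "swipe_up": "snooze_alarm",
}


def recognize(sample: Sequence[float], max_distance: float = 0.5) -> Optional[str]:
    """Return the closest gesture label, or None if nothing is close enough."""
    best_label: Optional[str] = None
    best_distance = math.inf
    for label, template in GESTURE_LIBRARY.items():
        d = math.dist(sample, template)  # Euclidean distance (Python 3.8+)
        if d < best_distance:
            best_label, best_distance = label, d
    return best_label if best_distance <= max_distance else None


def command_for(sample: Sequence[float]) -> Optional[str]:
    """Map an observed motion to a command, ignoring unrecognized motions."""
    label = recognize(sample)
    return COMMANDS[label] if label is not None else None


if __name__ == "__main__":
    print(command_for((0.9, 0.1)))
```

Rejecting samples that are far from every template matters in practice: without the threshold, any stray movement would trigger whichever command happened to be nearest.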

The Google Pixel 4 smartphones come with the Soli chip that facilitates complex and nonverbal interaction with your phone. This miniature radar sensor at the top of the phone powers the Motion Sense technology that can detect your presence and hand and body gestures to enable your phone interactions. With a wave of your hand, without even touching the phone, you can tell it to snooze, silence an alarm, or navigate to the next song on your playlist. 

3. Immersive Capabilities of Augmented Reality

Using Google’s ARCore and Apple’s ARKit platforms, developers can build augmented reality apps that can juxtapose digital objects and environments with real-life settings. Phone-based augmented reality’s immersive capabilities are having a significant impact on retail, entertainment, travel, and other industries. Brands like Lacoste and Sephora now allow their customers to try on or preview products with augmented reality apps, and a growing number of shoppers prefer to check out products on their phones before making the decision to buy them.

Interactive augmented reality games such as Pokémon Go, Ingress, and Ghostbusters World have received extensive press and a dedicated following. If you want to find your way around town, Google Maps Live View will provide you with real-time navigation.

The Leica Quad Camera on the Huawei P30 Pro.

4. Higher-Quality Photographs

High photo quality is an important criterion for buyers when selecting smartphones, and many of the latest models deliver it. These come equipped with the hardware and software (central processing units (CPUs), image signal processors, neural processing units, and deep learning image algorithms) that have catapulted smartphones into an entirely different realm from traditional cameras when it comes to taking photographs. With these, smartphones can classify what they are seeing down to the level of individual pixels and use that awareness to shoot high-definition photographs.

Google Pixel phones and Apple iPhones use multiple cameras and complex machine learning algorithms to recognize people and objects, create depth maps, seamlessly join long exposures, and calculate accurate color balance.

By training neural networks on a dataset of images, the algorithms learn how to respond to individual image requirements and retouch photographs in real-time. Developed by researchers from MIT and Google, the automatic retouching system allows photographers to apply different styles to an image before they even take the shot.

After a convolutional network carries out the image processing at low resolution, a mapping method known as affine color transformation modifies the image pixel colors. The network stores these transformational formulae in a 3D grid that then enables a high-resolution image output. It all occurs within milliseconds.
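A rough Python sketch shows how a coarse grid of affine color transforms can drive a full-resolution output. It is a simplification under stated assumptions: the grid here is 2D and hand-written rather than predicted by a network, each pixel uses its nearest grid cell with no interpolation, and the third grid dimension mentioned above is omitted. The names apply_affine, enhance, IDENTITY, and BRIGHTEN are hypothetical.

```python
from typing import List, Tuple

Pixel = Tuple[float, float, float]  # RGB values in the range 0.0 to 1.0
# One affine transform per grid cell: a 3x3 matrix plus a 3-element offset.
Affine = Tuple[List[List[float]], List[float]]

IDENTITY: Affine = ([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]],
                    [0.0, 0.0, 0.0])
BRIGHTEN: Affine = ([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]],
                    [0.25, 0.25, 0.25])


def apply_affine(pixel: Pixel, affine: Affine) -> Pixel:
    """Compute matrix * pixel + offset, clamped to the valid color range."""
    matrix, offset = affine
    r, g, b = (
        min(1.0, max(0.0, sum(matrix[row][col] * pixel[col] for col in range(3))
                     + offset[row]))
        for row in range(3)
    )
    return (r, g, b)


def enhance(image: List[List[Pixel]], grid: List[List[Affine]]) -> List[List[Pixel]]:
    """Apply to each full-resolution pixel the transform of its coarse grid cell."""
    height, width = len(image), len(image[0])
    grid_h, grid_w = len(grid), len(grid[0])
    output = []
    for y, row in enumerate(image):
        gy = y * grid_h // height          # nearest coarse row for this pixel
        output.append([
            apply_affine(pixel, grid[gy][x * grid_w // width])
            for x, pixel in enumerate(row)
        ])
    return output


if __name__ == "__main__":
    image = [[(0.5, 0.25, 0.0)] * 4]       # a tiny 4x1 image
    grid = [[IDENTITY, BRIGHTEN]]          # left half untouched, right half brighter
    print(enhance(image, grid))
```

The point of the design is cost: the expensive model only has to produce a small grid of coefficients, while the per-pixel work at full resolution is a cheap multiply-and-add.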

Smartphones are now also outpacing DSLRs in low light and night photography. By incorporating deep neural networks and sensors, smartphone cameras can capture sharper images with more colors than the human eye can perceive. 

Huawei, which introduced workable low light shots with its P20 Pro, uses RYYB filters, large sensors, and AI image processing in its Mate 30 series to offer high-quality, low-light photography as well as low-light videography. The Google Pixel 4 comes with Night Sight mode that can take photographs in the 0.3-3 lux range, and its astrophotography can capture a dark, starry sky. Along with a night mode that activates automatically in the dark, Apple’s new Deep Fusion system will adjust to the light levels and take iPhone photography to a more impressive level.

Even if you have no understanding of photography, you will be able to take great photos with these smartphones.

5. Increased Security and Privacy

Complying with the General Data Protection Regulation (GDPR) and the California Consumer Privacy Act (CCPA) has become easier with on-device machine learning. It strengthens data security, because you don’t need to upload data for biometrics, encryption, or Live Caption to a server or the cloud for processing.

On-device automatic encryption is another useful smartphone feature that protects your content with a PIN, password, or pattern and allows access to your data only when you unlock your phone. So, if you lose your device or it is stolen, the chance of anyone getting your data is negligible.

The iPhone’s Face ID feature is one example of a more secure smartphone experience. The on-device neural networks in the Apple smartphone chips process and safely store user facial data. The identification happens on your device, so your privacy and security remain intact.

Google Pixel 4’s Face Unlock technology, facilitated by the Soli chip, uses 3D IR depth mapping to create your face models for face recognition and stores them on an on-device Titan M security chip. Face Unlock works well with the 1Password app to offer users biometric security that reduces the chance of identity fraud. To set up the 1Password app on Pixel 4, you only need to enter your details in the Autofill and use Face Unlock to sign in instead of the Fingerprint Unlock function.

6. More Accuracy in Image Recognition

Pairing on-device machine learning with image classification technology, you can identify and get detailed information in real-time about almost anything you encounter. Want to read a foreign language text? Scan it with your phone to get an instant and accurate translation. Did an outfit or a piece of furniture catch your fancy? Scan it to get information about the price and where you can buy it. Is there a tempting new dish on a restaurant menu? You can use your phone to find out its ingredients and nutritional information. 

By facilitating image recognition in real-time, apps like Google Lens, Calorie Mama, and Leafsnap are increasing mobile devices’ usability and learnability and enhancing user experience.

The possibilities of on-device machine learning are immense. With increasingly efficient intelligent algorithms, deeper neural networks, and more powerful AI chips, deep learning mobile applications will be standard in banking, retail, health care, data analytics, information technology, telecommunications, aerospace, and various other industries. According to Verified Market Research, the global deep learning market is likely to reach $26.64 billion by 2026, with the deep learning chipset technology market reaching $2.9 billion. As deep learning capabilities continue to improve, mobile devices’ usability features will evolve alongside them and fuel further innovations.