Not surprisingly, this year’s smartphones feature faster processors than those from last year—that happens every year. But what is new this year is the predominance of machine learning features that just about every processor vendor is touting as a way of differentiating their devices. This is true for the phone vendors who design their own chips, the independent or merchant chip vendors who sell processors to phone vendors, and even the IP makers who design the cores that go into the processors themselves.
First a little background: all modern application processors include designs (often referred to as intellectual property, or IP) from other companies, notably firms like ARM, Imagination Technologies, MIPS, and Ceva. Such IP can appear in various forms—for example, ARM sells everything from a basic license for its 32-bit and 64-bit architecture, to specific cores for CPUs, graphics, image processing, etc., that chip designers can then use to create processors. Typically, chip designers mix and match these cores with designs of their own, and make various choices regarding memory, interconnects, and other features, in an effort to balance performance with power requirements, size, and cost.
On the CPU front, most chips combine larger cores that are more powerful but run faster and hotter with smaller cores that are more efficient. Typically, phones use the smaller cores most of the time, switching to the higher-performance cores for demanding tasks, and draw on a mix of both core types, the GPU, and other specialized cores to balance performance needs against thermal constraints. (You can’t run the high-performance cores for very long, because they would overheat, and usually you don’t need to.) The best-known examples of the big cores are ARM’s Cortex-A75 and A73; the matching smaller cores are the A55 and A53. In today’s high-end phones, you’ll often see four of each, in what is known as an octa-core layout, though some vendors have taken other approaches.
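As a rough illustration of the tradeoff, a big.LITTLE-style scheduler keeps work on the efficiency cluster until both demand and thermal headroom justify waking the big cores. The thresholds and function below are entirely hypothetical, a toy sketch rather than anything a real OS scheduler does:

```python
# Toy sketch of big.LITTLE-style task placement. Real schedulers
# (e.g. Linux Energy Aware Scheduling) are far more sophisticated;
# the numbers here are made up for illustration.

def pick_cluster(load, temp_c, big_limit_c=70):
    """Choose a core cluster for a task.

    load: estimated CPU demand, 0.0-1.0
    temp_c: current SoC temperature in Celsius
    big_limit_c: assumed thermal ceiling for the big cores
    """
    # Light tasks stay on the efficiency cores to save power.
    if load < 0.5:
        return "little"
    # Heavy tasks get a big core only while there is thermal headroom.
    if temp_c < big_limit_c:
        return "big"
    # Otherwise throttle back to the efficient cluster.
    return "little"

print(pick_cluster(0.2, 40))  # background task -> little
print(pick_cluster(0.9, 40))  # demanding task  -> big
print(pick_cluster(0.9, 80))  # hot device      -> little
```

The point of the sketch is the asymmetry: the efficient cluster is the default, and the big cores are a burst resource gated by temperature.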
For graphics, there’s more diversity, with some vendors choosing ARM’s Mali line, others picking Imagination Technologies’ PowerVR, and still others opting to design their own graphics cores. And there’s even more diversity when it comes to things such as image processing, digital signal processing, and as of late, AI functions.
Apple started pushing its AI capabilities in its fall phone announcements, including notably the “A11 Bionic” chip used in the iPhone 8 and 8 Plus, as well as the iPhone X.
The A11 Bionic is a six-core design, with two high-performance cores and four efficiency cores. Apple designs its own cores (under an ARM architecture license) and has traditionally pushed single-threaded performance. This is a step up from the four-core A10 Fusion: Apple said the performance cores in the A11 are up to 25 percent faster than in the A10, the four efficiency cores up to 70 percent faster, and the graphics processor up to 30 percent faster.
Apple talks about the chip having a dual-core “Neural Engine,” which can help with scene recognition in the camera app, and Face ID and Animoji on the iPhone X. The company also released an API called CoreML, to help third-party developers create applications that take advantage of this.
Apple typically doesn’t give a lot of information about its processors, but says that the A11 Bionic neural engine is a dual-core design that can perform up to 600 billion operations per second for real-time processing.
Unlike most of the other processor makers, Apple doesn’t integrate the modem into its application processors, and instead uses stand-alone Qualcomm or Intel modems. There has been some controversy as to whether Apple only supports the features in its Qualcomm modems that are also supported by Intel; in practice, this means iPhones support 3-way carrier aggregation but not some of the more advanced features.
Huawei was also early to the AI push, and called its Kirin 970, which it announced at the IFA show last fall, “the world’s first mobile AI processing unit.” The Kirin 970 is used right now in the Huawei Mate 10. It includes four Cortex-A73 CPU cores running at up to 2.4 GHz and four A53s running at up to 1.8 GHz, along with ARM’s Mali G72 MP12 GPU.
What’s particularly new in the 970 is what Huawei calls its NPU, or Neural Processing Unit. The company says tasks offloaded to this unit can see 25 times the performance and 50 times the efficiency of running them on the CPU cluster, which is aimed in particular at faster image recognition and better photography. At the show, Huawei said the NPU can deliver 1.92 teraflops of 16-bit floating-point performance.
The Kirin 970 has a dual image signal processor, along with a Category 18 LTE modem with 5-carrier aggregation and 4-by-4 MIMO, which should enable a maximum download speed of 1.2 Gbps.
At Mobile World Congress, Huawei announced its first 5G modem, the Balong 5G01, which it said would be the first 5G modem to ship. It seems likely that some future applications processor will adopt this modem as well, but that hasn’t been announced yet. Technically, all these products are created by the firm’s HiSilicon subsidiary.
The chip likely to be at the heart of most of the flagship Android phones in the US this year is Qualcomm’s Snapdragon 845. This is an upgrade of the Snapdragon 835, which was used in most of 2017’s premium Android phones, and is already used in the North American versions of the Galaxy S9.
As with most of the other vendors, Qualcomm is pushing neural networks and AI as one of the biggest areas of improvements in this year’s chip, along with an increased focus on “immersion”—which essentially means better imaging.
In the AI area, Qualcomm likes to talk about having a multi-core Neural Processing Engine (NPE), which uses a new version of its Hexagon DSP as well as the CPU and GPU for inferencing.
The chip has the Hexagon 685 DSP, which Qualcomm says can more than double AI processing performance; a Kryo 385 CPU, which it says provides a 25 to 30 percent performance increase for its performance cores (four ARM Cortex-A75 cores running at up to 2.85 GHz) and up to a 15 percent increase for its “efficiency” cores (four Cortex-A55 cores running at up to 1.8 GHz), with all eight sharing a 2MB L3 cache; and an Adreno 630 GPU, which Qualcomm says will support a 30 percent performance improvement or a 30 percent power reduction, as well as up to 2.5 times faster displays.
In the AI area, the chip supports a large number of machine learning frameworks, and the company says this works for things such as object classification, face detection, scene segmentation, and speaker recognition. Two highlighted applications are live bokeh effects (for producing portraits with a blurred background) and active depth sensing with structured light, which should allow improved face recognition. By moving inferencing from the cloud to the device, Qualcomm says, you get the benefits of low latency, privacy, and improved reliability.
In the imaging area, the chip has a new version of Qualcomm’s Spectra ISP, improved Ultra HD video capture with multi-frame noise reduction, the ability to capture 16-megapixel video at 60 frames per second, and 720p slow-mo video at 480 frames per second. For VR, the 845 supports displays with a 2K-by-2K resolution at 120 frames per second, a big step up from the 1.5K-by-1.5K at 60 frames per second supported by the 835.
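To put the VR numbers in perspective, assuming “2K” and “1.5K” mean 2048-by-2048 and 1536-by-1536 pixels (my reading of the marketing shorthand, not figures Qualcomm has spelled out), the jump in raw pixel throughput works out to roughly 3.6x:

```python
# Rough pixel-rate comparison of the VR display claims, assuming
# "2K" = 2048 x 2048 and "1.5K" = 1536 x 1536 (my interpretation
# of the marketing terms, not official resolutions).

px_845 = 2048 * 2048 * 120   # Snapdragon 845: 2K x 2K at 120 fps
px_835 = 1536 * 1536 * 60    # Snapdragon 835: 1.5K x 1.5K at 60 fps

print(f"{px_845 / px_835:.1f}x the pixel throughput")  # about 3.6x
```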
Other features include a secure processing unit, which uses its own core to store security information outside of the kernel, and works with the CPU and Qualcomm’s TrustZone capability.
The 845 integrates the X20 modem that Qualcomm introduced last year, which is capable of supporting LTE Category 18 (with speeds up to 1.2 Gbps), up to 5 carrier aggregation and 4X4 MIMO, and uses techniques such as Licensed-Assisted Access to make faster speeds possible in more areas.
The chip is manufactured on Samsung’s 10nm low-power process.
Qualcomm also makes the Snapdragon 600 family of application processors, led by the 660, which is used by many Chinese vendors, including Oppo and Vivo. In the run-up to Mobile World Congress, it introduced the Snapdragon 700 family, which has many of the same features as the 800 family, including the Hexagon DSP, Spectra ISP, Adreno graphics, and Kryo CPU. Compared with the 660, Qualcomm says it will offer a 2x improvement in on-device AI applications, and a 30 percent improvement in power efficiency.
While it uses Qualcomm processors in most of its North American phones, in many other markets, Samsung uses its own Exynos processors, and is starting to make such processors available to other phone makers.
Its new top-of-the-line is the Exynos 9810, which Samsung will use in international versions of the Galaxy S9 and S9+.
Again, Samsung is pushing new features for “deep learning-based software,” which it says help the processor accurately identify items or people in photos, and support depth sensing for face recognition.
The 9810 is also an octa-core chip, with four A55 cores for power efficiency and four custom CPU designs for performance. Samsung says these new cores, which can run at up to 2.9GHz, have a wider pipeline and optimized cache memory, giving them twice the single-core performance and 40 percent more multi-core performance compared with its predecessor, last year’s 8895. (Published benchmarks show improvements in the real world, but not as much as claimed; I remain skeptical of all the mobile benchmarks at this point.)
Other features include Mali-G72 MP18 graphics, support for displays at up to 3840-by-2400 or 4096-by-2160 resolution, a dual image signal processor (ISP), and support for 4K capture at 120 frames per second. The 9810 also has a Category 18 modem with 6-carrier aggregation and 4-by-4 MIMO on the downlink (2-carrier aggregation on the uplink), for a maximum 1.2 Gbps download and 200 Mbps upload speed. On paper, this matches the Category 18 modems that both Qualcomm and Huawei have in their current top chips. Like the Snapdragon 845, it is manufactured on Samsung’s second-generation 10nm FinFET process.
MediaTek has been more of a player in mid-range phones and below, and last month introduced a new chip called the Helio P60 aimed at the “New Premium” market—mid-market phones in the $200-$400 range that offer all of the basic features of higher-end phones. The first phone announced with this chip is the Oppo R15.
The company’s top processor, announced last year, is the Helio X30, a deca-core processor aimed at premium phones. This includes two ARM Cortex-A73 CPU cores running at up to 2.5 GHz, four Cortex-A53 cores running at up to 2.2 GHz, and four A35 cores that can run at up to 1.9 GHz, along with Imagination’s PowerVR Series 7XT Plus graphics at 800 MHz and an LTE Category 10 modem capable of 3-carrier aggregation on the downlink. It’s an interesting chip, produced on TSMC’s 10nm process, and pushes the idea that more cores can be more flexible. Among the phones announced that use this are the Meizu Pro 7 Plus with dual screens, and the Vernee Apollo 2 (8MP front camera, 16MP + 13MP rear cameras).
Last year, MediaTek announced two mid-market processors, the Helio P23 and P30, aimed at global markets and China specifically, each with eight Cortex-A53 cores running at 2.53 GHz and Mali G71 MP2 graphics. These are the chips the P60 is designed to supersede, with more power and a series of new features.
The P60 offers more performance, and is a return to the big.LITTLE configuration ARM and MediaTek pushed in previous years, combining four of the more-powerful ARM Cortex-A73 cores at up to 2.0 GHz with four of the more-efficient Cortex-A53 cores, also at 2.0 GHz. These are joined by an ARM Mali G72 MP3 GPU at up to 800 MHz, and are all controlled by the fourth version of MediaTek’s CorePilot technology for scheduling where tasks run. Compared with the P23 and P30, MediaTek says the P60 offers a 70 percent performance enhancement in both CPU and GPU operations.
MediaTek too is getting on the AI bandwagon, with the P60 including its NeuroPilot platform for neural network hardware acceleration. This supports Google Android Neural Network (NN) and the common AI frameworks, including TensorFlow, TensorFlow Lite, Caffe, and Caffe 2. This is effectively a specialized digital signal processor capable of 280 GMACs (billions of multiply-accumulate operations per second). It is designed to be used for things like facial recognition for unlocking a phone (something we’ve seen in high-end phones but not mid-range phones until now), and object recognition, even in videos, at 60 frames per second.
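For a sense of scale, 280 GMACs goes a long way at inference time. The sketch below counts the multiply-accumulates in one hypothetical convolution layer (the layer dimensions are my own example, not from MediaTek) and shows how little of a 60 fps frame budget it consumes:

```python
# Back-of-the-envelope check of what a 280-GMAC/s budget buys.
# The convolution dimensions below are hypothetical, chosen only
# to illustrate the arithmetic.

def conv_macs(out_h, out_w, out_ch, in_ch, k):
    """Multiply-accumulates for one standard convolution layer."""
    return out_h * out_w * out_ch * in_ch * k * k

GMACS = 280e9  # MediaTek's stated NeuroPilot throughput

# Example: an early layer of a small image classifier.
macs = conv_macs(out_h=112, out_w=112, out_ch=64, in_ch=3, k=3)
time_us = macs / GMACS * 1e6  # roughly 77 microseconds

print(f"{macs:,} MACs -> {time_us:.0f} us per frame")
```

Even stacking many such layers, the accelerator stays well inside the ~16.7 ms per-frame budget that 60 fps video allows, which is why on-device object recognition in video becomes plausible at this tier.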
In addition, the P60 has a number of new image features, including three image signal processors that can support a dual-camera configuration of 16 and 20 MP sensors, or a single camera at up to 32 MP. (I haven’t yet seen a production phone with a camera sensor of that many megapixels, but they are supposedly coming.) The new processors add noise reduction features, along with real-time bokeh (the blurring of the background used in portrait modes).
The chip includes a modem that supports Category 7 downloads (at up to 300 Mbps) and Category 13 uploads (up to 150 Mbps with 2-carrier aggregation). It is manufactured on TSMC’s 12nm FinFET process, which the company says helps it deliver 25 percent power savings for power-intensive applications such as games, and 12 percent power savings overall.
Spreadtrum, which makes chips mostly sold in the Chinese market, announced a partnership with Intel under which it will pair Intel’s 5G modem with ARM-compatible CPUs. This is still a couple of years away, so details aren’t yet available.
Note that while Spreadtrum isn’t very visible in the US, it trails only Qualcomm and MediaTek in the merchant market for application processors. It mostly sells products with ARM CPUs and its own 4G modem, but has a deal with, and is minority-owned by, Intel. This has resulted in a chip with Intel CPUs and Spreadtrum’s modem (the opposite of the new announcement).
Of course, it’s not only the chipmakers who see AI as the next big wave, and the companies that make the IP have also been making a big push in this area.
ARM, the most successful of the IP makers, announced a suite of IP for machine learning last month, including both hardware and software, and pushed this at Mobile World Congress.
Dubbed Project Trillium, this includes processor designs (IP) for both Machine Learning (ML) and Object Detection (OD), along with a new software library.
The ML processor is designed to sit within an application processor, alongside the CPU, GPU, and display core. The software library, known as ARM NN (neural network), supports frameworks like TensorFlow, Caffe, and Android NN. This lets these applications run in software alone on existing processors that have ARM CPUs and graphics, though of course they will be sped up considerably on processors that include the ML cores. Third-party software will also work on the processor core. ARM says the ML core was designed from the ground up specifically to run neural networks. It can run both 8- and 16-bit applications, though the trend is to focus on 8-bit for simplicity.
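The 8-bit focus works because neural-network weights tolerate coarse precision. Here is a minimal, generic sketch of symmetric 8-bit quantization; this is not ARM NN’s actual scheme (real toolchains use per-channel scales, zero points, and calibration), just the core idea:

```python
# Generic sketch of symmetric 8-bit weight quantization, the idea
# behind running neural networks in int8 instead of float.
# Real frameworks (ARM NN, TFLite, etc.) use richer schemes.

def quantize(weights):
    """Map float weights to int8 values plus a scale factor."""
    scale = max(abs(w) for w in weights) / 127  # fit the largest weight
    q = [round(w / scale) for w in weights]     # each value in [-127, 127]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from int8 plus scale."""
    return [v * scale for v in q]

w = [0.51, -1.27, 0.02, 0.89]
q, scale = quantize(w)
approx = dequantize(q, scale)
print(q)                                # e.g. [51, -127, 2, 89]
print([round(a, 2) for a in approx])    # close to the original weights
```

Storing one byte per weight instead of four cuts memory traffic by 4x, which on a phone matters as much for battery life as the compute savings.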
The OD processor is designed to sit alongside an image signaling processor (ISP), in order to provide low power object detection, specifically for applications like face detection and tracking movement. This is a dedicated hardware block designed to be used with new sensor technologies such as stereoscopic cameras.
ARM said the new IP would be available for developer preview in April and would be generally available later this year, but given a typical time cycle it’s unlikely the new processor cores would appear in chips until 2019 or later. Of course, the software, which works on existing cores, could be deployed much sooner.
ARM also pushed some new solutions for the Internet of Things, including a new SIM solution called Kigen, designed to be built inside SoCs for low-power devices to replace today’s physical SIM cards.
Imagination, known for its PowerVR graphics, announced its neural networking IP last fall, the PowerVR 2NX Neural Network Accelerator (NNA). This is a flexible architecture with one to eight cores, each of which can have 256 8-bit multiply-accumulate units (MACs). Imagination has said it can perform over 3.2 trillion operations per second.
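Those numbers are roughly consistent: with eight cores of 256 MACs each, and counting each MAC as two operations (a multiply plus an add), a clock around 800 MHz (my assumption for illustration; Imagination hasn’t stated a clock here) yields just over 3.2 trillion operations per second:

```python
# Sanity check on the throughput claim. The clock speed is an
# assumption chosen to make the arithmetic concrete, not a figure
# Imagination has published.

cores = 8
macs_per_core = 256
ops_per_mac = 2          # one multiply + one accumulate
clock_hz = 800e6         # assumed clock, for illustration only

ops_per_second = cores * macs_per_core * ops_per_mac * clock_hz
print(f"{ops_per_second / 1e12:.2f} trillion ops/s")  # 3.28 trillion
```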
Other IP vendors are getting into the market as well. Ceva, which is known for its DSP cores, just announced NeuPro, a family of AI processor cores designed for edge devices. These build on processors the firm has sold in the computer vision area, and use the CDNN framework for a variety of “AI processes.” This will work with the common machine learning frameworks, and convert these to run on mobile processors for inferencing. The company plans processors ranging from 2 to 12.5 teraops per second (TOPS) designed for consumer, surveillance, and ADAS products (for autonomous vehicles). Ceva has said that one major automotive customer plans to enable 100 TOPS of performance using less than 10 watts of power. Licensing will start in the second half of this year.
Ceva also announced its PentaG platform of DSPs for 5G baseband modems. The company says that its current DSPs are in 40 percent of the world’s handsets, covering about 900 million phones a year, and in modems from Intel, Samsung, and Spreadtrum. The new platform has more AI, used particularly for “link adaptation.” In the 5G world, handsets can have multiple links to a base station, and Ceva says its hardware and software help determine the best link every few milliseconds. This can save a lot of power compared with using software alone. This isn’t a general-purpose DSP or neural network chip, but rather one designed specifically for communications. It was just announced and should be available in the third quarter.
Ceva is also making a big push for DSPs in the 5G base station market, and has said that as much as 50 percent of the 5G new radio infrastructure will use the company’s DSP IP, including systems from Nokia and ZTE.