Apple Intelligence

Two months ago, Apple announced a significant enhancement to its ecosystem: Apple Intelligence. This update integrates OpenAI's ChatGPT (powered by GPT-4o), promising a wealth of new smart features and a smarter Siri.

“With ChatGPT from OpenAI integrated into Siri and Writing Tools, you get even more expertise when it might be helpful for you — no need to jump between tools. Siri can tap into ChatGPT for certain requests, including questions about photos or documents. And with Compose in Writing Tools, you can create and illustrate original content from scratch.

You control when ChatGPT is used and will be asked before any of your information is shared. Anyone can access ChatGPT for free, without creating an account. ChatGPT subscribers can connect accounts to access paid features within these experiences.”

- Apple

We will also get tools for developers that let us expand our apps to take advantage of the system by integrating Writing Tools, the Image Playground API, and the new Assistant APIs for App Intents. Let's take a quick look at what these tools are and what they can do:

Language Tools and Focus

Apple has announced the new Writing Tools, which will allow users to rewrite, summarise, or proofread text inline anywhere they write, even in third-party apps.

There will be similar features in the Mail app, where we can summarise a long e-mail inline, straight from the inbox. We will also have a smart reply panel that understands the context of the e-mail and provides a list of response options to choose from. When an option is selected, it generates a reply and asks for approval before sending.

Apple Intelligence will also be able to prioritise messages in Mail and notifications, grouping them into a separate list at the top of Notification Centre. In addition, the new Reduce Interruptions Focus will be able to understand the context of notifications and surface only those that need your immediate attention.

For developers, Writing Tools will automatically be available in native text views such as UITextView, NSTextView, and WKWebView. In non-editable text views like WKWebView, text can still be summarised, with the result displayed in an overlay panel from which it can be copied or shared. We will also get new delegate methods and properties on UITextView, NSTextView, and WKWebView that let us customise the behaviour of Writing Tools.

func textViewWritingToolsWillBegin(_ textView: UITextView) {
    // Prepare for Writing Tools session
}

func textViewWritingToolsDidEnd(_ textView: UITextView) {
    // Clean up after Writing Tools session
}

While Writing Tools interacts with the text view, we can check isWritingToolsActive to see whether a session is in progress and avoid certain operations in methods like textViewDidChange. We can also return ranges from the following delegate method to make Writing Tools ignore the text in those ranges.

func textView(_ textView: UITextView, writingToolsIgnoredRangesIn enclosingRange: NSRange) -> [NSRange] {
    let text = textView.textStorage.attributedSubstring(from: enclosingRange)
    // rangesInappropriateForWritingTools(in:) is an app-specific helper that
    // returns the ranges (for example, quoted text or code) that Writing Tools
    // should leave untouched.
    return rangesInappropriateForWritingTools(in: text)
}
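
For the isWritingToolsActive check mentioned above, a minimal sketch of a guard in textViewDidChange could look like this (saveDraft(_:) is a hypothetical app-specific helper):

func textViewDidChange(_ textView: UITextView) {
    // Skip app-side work while Writing Tools is actively rewriting the text;
    // the final text arrives once the session ends.
    guard !textView.isWritingToolsActive else { return }
    saveDraft(textView.text) // hypothetical app-specific helper
}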

The writingToolsBehavior property lets us specify the behaviour we want by setting one of the four enum values below. For WKWebView, the default value is .limited.

case none
// An option to prevent the writing tools from modifying the text in the view.
case `default`
// An option to let the system determine the best way to enable writing tools for the view.
case complete
// An option to provide the complete writing tools experience for the text view.
case limited
// An option to provide a limited, overlay-panel experience for the text view.
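
For example, here is a minimal sketch of opting a text view into the full experience, or out of it entirely:

let textView = UITextView()
// Opt in to the full inline Writing Tools experience.
textView.writingToolsBehavior = .complete
// Or, for views where system rewriting is not appropriate:
// textView.writingToolsBehavior = .none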

Writing Tools supports plain text, rich text (AttributedString), and tables, which can be specified through the writingToolsAllowedInputOptions property of the text view. By default, writingToolsAllowedInputOptions in UITextView is:

[.plainText, .richText]

“If you don't set it, we assume your text view can render plain text and rich text, but not tables. If your text view only accepts plain text or if it can handle tables, you can specify the options explicitly.”

- Get started with Writing Tools (WWDC 24)

So unless explicitly set otherwise, a UITextView is assumed to handle plain text and rich text for Writing Tools, but not tables.
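
Following that guidance, a text view that can also handle tables might declare its options explicitly. This is a sketch that relies on the property and option names discussed above:

// Sketch: this text view can render plain text, rich text, and tables.
textView.writingToolsAllowedInputOptions = [.plainText, .richText, .table]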

In addition to these advanced Writing Tools, Apple is introducing powerful image-generation capabilities with Image Playground.

Image Playground, Genmoji and Redesigned Photo App

According to Apple, our apps will be able to generate images from a description or a photo in the library using Image Playground, and there will also be a dedicated Image Playground app. Images can also be generated based on the context of a Messages conversation or the content of a slide.

Furthermore, we can turn rough sketches into pictures in the Notes app. By drawing a circle around a sketch or in an empty space, Image Wand will generate a picture based on the sketch's context.

In iOS 18, we will see a completely redesigned Photos app. There's no tab bar; instead, everything is organised on one screen that shows the library, recent days, and collections as you scroll. When you scroll down, it snaps to the Library view, where photos can be filtered by year and month, with more filter options available. We will also be able to search photos and videos by describing what we're looking for, and even find a particular moment in a video clip that fits the search description. The new Clean Up tool in the Photos app will allow us to remove background objects from photos.

What is Genmoji?

iOS 18 will have an updated emoji keyboard that brings all the personalised content (stickers, Memoji, and Animoji) into one place. It will also let you describe an emoji and generate one that closely matches the description. Normal emoji are standardised Unicode characters: they are defined by a universal set of codes that represent specific images, and devices interpret these codes to display the corresponding emoji image. Genmoji and other personalised emoji-like images, on the other hand, are not tied to the Unicode standard. Instead, they use a newly introduced API, NSAdaptiveImageGlyph.

NSAdaptiveImageGlyph is introduced to support using AI-generated images, or any other personalised image, within text in the same way as standard emoji. This API enables these images to be embedded in text, allowing them to be formatted, copied, pasted, and displayed alongside regular text in rich text views.

“An NSAdaptiveImageGlyph is a data object for an emoji-like image that can appear in attributed text.”

- Apple

These adaptive image glyphs are based on a standard image format with multiple resolutions, ensuring they look good in any context. They also come with additional metadata, such as a unique identifier and a content description for accessibility, which helps maintain consistency and usability across different devices and platforms. For apps that already support rich text, enabling this functionality might require just a simple configuration change. For example, setting the newly introduced supportsAdaptiveImageGlyph property to true will enable the support for Genmoji.

let textView = UITextView()
textView.supportsAdaptiveImageGlyph = true
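
As a sketch of what handling these glyphs might look like, the adaptive image glyphs in a text view's contents can be enumerated via the adaptiveImageGlyph attribute, for example to fall back to their content descriptions when exporting plain text:

let contents: NSAttributedString = textView.textStorage
contents.enumerateAttribute(.adaptiveImageGlyph,
                            in: NSRange(location: 0, length: contents.length)) { value, range, _ in
    if let glyph = value as? NSAdaptiveImageGlyph {
        // contentDescription is the accessibility description mentioned above.
        print("Genmoji '\(glyph.contentDescription)' at \(range)")
    }
}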

Siri

Siri draws on Apple Intelligence’s capabilities to deliver more natural, contextually relevant, and personal assistance, using the new features of the App Intents framework.

Introduced in iOS 16, the App Intents framework provides a way to define custom intents that enable Siri to perform specific app actions. These actions are made available to Siri from the first launch of your app, unlike the older Intents extensions, which required us to donate intents so Siri could learn user behaviour over time. This framework simplifies connecting app features to Siri and other system services.

How do AppIntents currently work?

To expose our app's functionality to system experiences, such as Siri, Shortcuts, Widgets, and so on, we have to implement AppShortcutsProvider. This lets us provide a list of AppShortcuts and a ShortcutTileColor.

The appShortcuts property is a result builder that lets us define the shortcuts our app provides one by one, just as we compose SwiftUI views.

We can provide shortcuts using the struct AppShortcut, which takes in an intent, invocation phrases, title, and image.

Next, to provide the intents, we need to implement the AppIntent protocol. It lets us provide phrases that can trigger the functionality, describe the data the functionality needs using primitive types (or custom ones defined with AppEntity and AppEnum), and implement the method that performs the work.

For intents with parameters, the system attempts to resolve them in the order of their declaration in the AppIntent body. After it resolves all parameters, the system calls perform() to run the app intent with its configured parameters.
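
To make this concrete, here is a minimal sketch of an intent and a shortcuts provider; OpenNotebookIntent, its name parameter, and the invocation phrases are purely hypothetical examples:

import AppIntents

// Hypothetical intent: opens a notebook by name.
struct OpenNotebookIntent: AppIntent {
    static var title: LocalizedStringResource = "Open Notebook"

    @Parameter(title: "Notebook Name")
    var name: String

    func perform() async throws -> some IntentResult {
        // App-specific work goes here, e.g. navigating to the notebook.
        return .result()
    }
}

// Exposes the intent to Siri, Shortcuts, Spotlight, and so on.
struct MyAppShortcuts: AppShortcutsProvider {
    static var shortcutTileColor: ShortcutTileColor = .navy

    static var appShortcuts: [AppShortcut] {
        AppShortcut(
            intent: OpenNotebookIntent(),
            phrases: ["Open a notebook in \(.applicationName)"],
            shortTitle: "Open Notebook",
            systemImageName: "book"
        )
    }
}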

Apple Intelligence and AppIntents

To make an AppIntent work with Apple Intelligence's pre-trained models, we need to use the newly introduced Swift macros for intents, which generate additional properties and add protocol conformances for our app intent, entity, and enum implementations.

To create implementations that work well with Siri and Apple Intelligence:

  • For AppIntent implementation, use the AssistantIntent(schema:) macro.

  • For AppEntity implementation, use the AssistantEntity(schema:) macro.

  • For AppEnum implementation, use the AssistantEnum(schema:) macro.

Each macro requires us to provide a schema value so that it can generate app intent, app entity, or app enum code that Apple Intelligence can understand. The value we provide to the macros, the assistant schema, has two parts:

  • The app intent domain, which describes a collection of APIs for specific functionality; for example, .photos if an app has photo or video functionality.

  • The schema, an action or a content type within the domain; that is, the specific API for the app intent, app entity, or app enum you create.

What is a Schema?

In the context of Siri and Apple Intelligence, a "schema" refers to a predefined shape or structure that Siri's foundational models are trained to recognise.

These schemas help us create App Intents by conforming to a specific shape, simplifying the integration process with Siri. The schema defines the expected inputs and outputs, ensuring consistency and making it easier for Siri to understand and perform actions within apps.

A list of schema domains can be found at https://developer.apple.com/documentation/AppIntents/app-intent-domains
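
Putting it together, here is a hedged sketch of an intent annotated with an assistant schema. The createAlbum schema from the .photos domain is used purely as an example; each schema dictates exactly which parameters and return values the macro expects, and this is validated at build time, so treat the body as illustrative:

import AppIntents

// Sketch: an app intent conforming to the photos domain's createAlbum schema.
@AssistantIntent(schema: .photos.createAlbum)
struct CreateAlbumIntent: AppIntent {
    // The schema defines the intent's shape, so no title or parameter
    // summary needs to be written by hand.
    var name: String

    func perform() async throws -> some IntentResult {
        // Create the album in the app's own data store here.
        return .result()
    }
}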

Summary

In summary, Apple Intelligence in iOS 18 brings a new level of sophistication to the Apple ecosystem. By integrating powerful AI capabilities like ChatGPT and providing developers with advanced text and image manipulation tools, Apple continues to push the boundaries of what its devices can achieve. Whether you're an everyday user or a developer, these innovations promise to enhance your experience, making your interactions with technology more intuitive and powerful than ever.

