Forget Gemini: the iPhone really needs an LLM store

2024-03-30 13:30:00

Reflecting on last week’s column, in which I wrote about the hypothesis that Google Gemini having a presence on the iPhone could do more good than harm, I came to a new conclusion: is having Gemini on the iPhone better than having nothing? Undoubtedly. But do you know what’s better than having Gemini on your iPhone? Having access to all the other LLMs (large language models) as well.

What I’m about to propose is not likely to happen, and it goes against everything Apple has been doing on iOS (to the point of generating countless lawsuits over anti-competitive practices). But just imagine: what if iOS allowed us to natively access one or more LLMs of our choice, rather than relying solely on a fixed Gemini?

The bad news is that, beyond the improbability, history shows us that even when Apple does decide to do something like this, it does so in a way that is neither practical nor useful.

If you use the essential TextExpander, you know what I’m talking about. On macOS, just install TextExpander and it lets you use text shortcuts in any typing context, anywhere in the system (yes, macOS and iOS have offered native text replacements for years, but the functionality of the native feature is only a fraction of what TextExpander makes possible).

On the iPhone, the system does not allow this kind of integration. It was therefore up to TextExpander to ship an app that behaves like a third-party keyboard and provide the text shortcuts inside it. This flow is, as TextExpander users know, terrible.

TextExpander custom keyboard on iOS.

And the fault, of course, does not lie with TextExpander, but with the straitjacket Apple puts on iOS under the (important, but sometimes exaggerated) veil of security. While it protects people who would otherwise fall for scams exploiting a more open system, it frustrates those looking for something beyond the basics at the intersection of native functionality and complementary productivity tools.

OK, but what about LLMs?

Well then. The idea would be as follows: in the same way that you can register different email providers in the Mail app, if iOS 18 allowed users to log in to ChatGPT, Perplexity, Google Gemini, You, Microsoft Copilot, Mistral, Claude, and others directly in the system settings as complementary Siri sources — perhaps via an app? 👀 —, iOS could enable native integration between users and their favorite LLMs, instead of forcing them to look up the LLM in Safari, or in an app that is isolated from the rest of the system.
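To make the idea concrete, here is a minimal sketch, in Python for brevity, of what such a pluggable-provider registry could look like. Every name in it (`LLMProvider`, `ProviderRegistry`, and so on) is hypothetical; nothing like this exists in iOS today.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical sketch: each LLM vendor registers a provider, much like
# adding an email account in Mail. All names here are invented.

@dataclass
class LLMProvider:
    name: str                   # e.g. "ChatGPT", "Claude", "Mistral"
    ask: Callable[[str], str]   # sends a prompt, returns the reply

class ProviderRegistry:
    def __init__(self) -> None:
        self._providers: dict[str, LLMProvider] = {}
        self._default: str | None = None

    def register(self, provider: LLMProvider) -> None:
        self._providers[provider.name] = provider
        if self._default is None:        # first provider becomes the default
            self._default = provider.name

    def set_default(self, name: str) -> None:
        if name not in self._providers:
            raise KeyError(f"unknown provider: {name}")
        self._default = name

    def ask(self, prompt: str) -> str:
        # Siri-like entry point: route the query to the user's chosen LLM
        if self._default is None:
            raise RuntimeError("no provider registered")
        return self._providers[self._default].ask(prompt)

# Usage: register two stub providers and switch between them
registry = ProviderRegistry()
registry.register(LLMProvider("ChatGPT", lambda p: f"[ChatGPT] {p}"))
registry.register(LLMProvider("Claude", lambda p: f"[Claude] {p}"))
registry.set_default("Claude")
print(registry.ask("Summarize my unread emails"))
```

The point of the sketch is the indirection: the system talks to one interface, and which LLM sits behind it is the user’s choice, not Apple’s.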

This would be especially useful for those who have already familiarized themselves with a specific LLM, or tuned it to their preferences and needs, which for frequent use translates into an immense gain in productivity and model accuracy.

Take ChatGPT as an example. Anyone who subscribes to the platform’s paid plan gets a personalization feature that lets them provide important information about their context of use, as well as adjust how ChatGPT should behave in every conversation.

ChatGPT Custom Instructions Interface.

In practice, this feature opens up possibilities such as a doctor being able to say: “Most of the time, I will send you scientific articles in other languages. When translating or generating responses, always keep the context of medical and scientific terms in mind and offer a more accurate translation. And whenever you make statements, list the references in Vancouver style.”

A user who wants to practice English can instruct ChatGPT to always formulate answers using the most frequently used vocabulary, or ask the model to correct them whenever they make a spelling or grammar mistake during the conversation.
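Under the hood, custom instructions like these typically travel as a “system” message prepended to every conversation, which is the message format common to chat-style LLM APIs. A sketch of how the doctor’s instructions from above could be packaged (the payload shape follows that general convention; `build_request` and the model name are illustrative, and no request is actually sent here):

```python
# Custom instructions become a "system" message that precedes every
# conversation, following the message format common to chat-style LLM APIs.

custom_instructions = (
    "Most of the time, I will send you scientific articles in other "
    "languages. When translating or generating responses, keep the context "
    "of medical and scientific terms in mind, and list references in "
    "Vancouver style."
)

def build_request(user_prompt: str, model: str = "some-model") -> dict:
    """Assemble a chat request with the custom instructions baked in."""
    return {
        "model": model,
        "messages": [
            {"role": "system", "content": custom_instructions},
            {"role": "user", "content": user_prompt},
        ],
    }

request = build_request("Translate this abstract into English: ...")
```

Because the system message rides along with every exchange, the user gets the tuned behavior without repeating the instructions each time, which is exactly the productivity gain described above.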

Could an Apple-made LLM offer something similar? Undoubtedly. But first it needs to deliver something on the level of ChatGPT to begin with, and given the possibility of the partnership with Google, that seems as unlikely as the LLM store.

What about privacy?

Here, as always, things get murkier, since an LLM would sometimes be used in contexts involving the privacy not only of the user, but also of one or more interlocutors. Think, for example, of a group chat. The person using the LLM may be comfortable with copying and pasting the message history to formulate a reply, but what about the other participants? Or what about use in corporate environments, with email threads containing sensitive information?


It is true that, in all these cases, nothing prevents the user from copying and pasting those texts into ChatGPT anyway. On the other hand, this is usually where Apple says: “Yes, and that’s how I’m different from my competitor. If he wants to allow this, that’s his problem. I won’t.” Which brings us back to the straitjacket.

That said, Apple already has a solution to this problem, and it’s called notarization. If there were an approval process controlling which LLMs could have access to iOS (hello, antitrust!), Apple would be able to revoke the access of a bad actor that turned out to be unreliable, or that changed its terms of use to the point of making them unsafe.
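A notarization-style gate for LLMs could be as simple as an allowlist with revocation. The sketch below is hypothetical, not anything Apple ships:

```python
# Hypothetical sketch of a notarization-style gate: the platform approves
# LLM providers and can revoke one that turns out to be a bad actor.

class NotarizationGate:
    def __init__(self) -> None:
        self._approved: set[str] = set()

    def approve(self, provider: str) -> None:
        self._approved.add(provider)

    def revoke(self, provider: str) -> None:
        # e.g. the provider changed its terms of use in an unsafe way
        self._approved.discard(provider)

    def is_allowed(self, provider: str) -> bool:
        return provider in self._approved

gate = NotarizationGate()
gate.approve("TrustyLLM")
gate.approve("ShadyLLM")
gate.revoke("ShadyLLM")   # bad actor loses system access
```

The mechanism mirrors how macOS notarization already works for apps: approval is granted centrally and can be withdrawn after the fact, without the user having to do anything.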

What’s more, by essentially outsourcing this (very important, of course) aspect of iOS while it is unable to offer a good solution of its own, Apple would buy time to work on the delicate issues of model bias, hallucinations, and the always controversial question of the data used in training.

And to be clear: offering an efficient, private multimodal LLM is the least we expect from Apple, and it’s unthinkable that the future of iOS and macOS won’t include something of the sort. But if, at this point, the alternatives are to outsource and offer a choice among providers who already know what they are doing, or to go who knows how long without a useful native LLM, I would definitely prefer the first option.

The bottom line

As I said last week, if Apple allows Google Gemini onto the iPhone, I imagine it will be only the Nano model, designed to ship preinstalled and to work without an internet connection. That makes the feature less useful than access to a model 10 or 20 times larger that lives in the cloud, as is the case with Gemini Pro, but it could make the idea of having Google more present on our iPhones more palatable.

From the standpoint of what makes LLMs practical and useful, there is a big difference between occasionally showing Google Gemini half a dozen emails so that it momentarily remembers the way you write, versus having, say, a native LLM that (with privacy in mind) uses messages, photos, location, files, browsing history, and more as part of permanent model personalization.

Something like this second situation would make the iPhone the most useful personalized AI tool in the world overnight, and I think that’s what we expect from Apple in the future.

However, even the most optimistic must recognize that the chances of this happening soon are very low, especially considering that, if Apple is talking to Google, it is probably to fill a need it already knows it won’t be able to meet on its own anytime soon.

If Apple’s idea really is to give in and adopt Gemini Pro to offer AI features, then I cannot see that solution as any more efficient than letting users integrate the LLM of their choice or trust, especially considering that there are more efficient LLMs than Gemini on the market.

Just last week, for example, Claude 3 Opus, from Anthropic, surpassed GPT-4 and became the new leader of the HuggingFace ranking. In that same ranking, Gemini Pro sits in fourth place. Nano, of course, doesn’t even appear on the list.

