Marc Ruef
The strange way Siri can be used to attack your iPhone
It happened while a YouTube video was playing on an iPhone XS running iOS 12.3.1: suddenly, Siri piped up. It was as if she had heard the command “Hey Siri” and responded. But there was no such command in the video. At first, we thought it might be a coincidence.
A few weeks later, the same thing happened again with another video. This time, the device was being held in the hand, and the effect could also be observed with the device lying on a surface. There was no command for Siri in the second video, either. It looked like we had a self-reference attack on our hands.
The first time the effect manifested, the phone was being held horizontally in the crook of the neck. The video that triggered it was “10 Game Company Decisions That BACKFIRED BADLY”, at the 7:30 mark. When we went to reproduce the vulnerability, the position of the device proved to be a key part of the puzzle: only when the device was correctly positioned was the command for Siri successfully triggered.
So we initially suspected that the position and/or the distance could be significant. We figured that the echo of the sound within and outside the housing was responsible for the command.
When the problem recurred a few weeks later, it was under slightly different circumstances. This time, the trigger video was “Terminator T-900 Explained”, at the 3:38 mark. The effect did not occur in the same lying position. Instead, it could (also) be observed when the device was held horizontally in someone’s hands. It later became apparent that the effect also occurred with the device placed on a table.
The interesting thing here was that it only occurred when the volume level was at 1 or 2. As soon as the volume was turned up, Siri was no longer triggered. One possible reason is that voice recognition works differently with lower-volume sources: it may amplify certain frequencies or fill in gaps on its own, which could give rise to the Siri command.
Self-referencing inputs are nothing new, as we have shown with gesture recognition in Samsung smart TVs. There, it is possible to make the device start filming through a reflection of an input that acts on the device itself.
Siri was interesting in this respect because a suitably crafted audio construct might be able to control the virtual assistant under certain circumstances. This would open up numerous actions that could be executed through Siri. These range from displaying information about the storage of content to changing settings.
In accordance with the responsible disclosure process, we first contacted Apple by email on July 10, 2019 and told them about our discovery. We included both test cases (including links to the videos) and wrote:
During some experimental testing, we were able to let YouTube videos call Siri on the same device. This should usually not be possible. (…) It looks like some resonance disturbance (within the case) is responsible for this effect. An attacker might be able to create a video which controls a device.
The next day, the Apple Security Team replied. They indicated that the facts were correct, but they did not consider it a risk:
After examining your report, we do not see any actual security implications. “Hey Siri” is meant to make your voice more recognizable for Siri, not limit access to Siri for only your voice.
Here it became apparent that they had not understood the attack vector and the associated possibilities.
If the manufacturer does not want to acknowledge or fix a vulnerability, we aim to disclose it. The resulting public pressure can usually force a correction. And so, in this case, we requested a CVE through MITRE. But MITRE pointed out that Apple itself is a CNA (CVE Numbering Authority), that only Apple is authorized to issue CVEs for Apple products, and that consequently MITRE was not able to assign one.
This indirectly showed that allowing manufacturers to act as CNAs for their own products means relinquishing a key element of vulnerability disclosure. It left us with no choice but to make our own disclosure without a CVE.
And then iOS 13 appeared. On the same device, the effects of the two videos can no longer be reproduced. The security advisories published since the discovery contain no mention of the effect, nor of any measures that might have been introduced. We must therefore assume that Apple solved the problem unknowingly or even secretly. So we recommend updating to iOS 13. Unfortunately, this is already the second time that cooperating with Apple on the disclosure of a vulnerability has involved unnecessary complications.
The vulnerability we found is curious and exotic. And the story of its discovery shows once again that some vulnerabilities can be found by accident. We are still unable to conclusively account for the effect. Whether it was due to acoustic reflections or the resonance in the housing, we don’t know.
In any case, Apple’s lack of cooperation proved disappointing. And the fact that the problem suddenly no longer exists in newer versions of iOS, but with no commentary to be found anywhere, leaves a bad taste in the mouth. This is precisely the type of behavior that the No More Free Bugs movement was highlighting ten years ago, and which even now can lead some people to decide they’re better off forgoing the thankless conflict of responsible disclosure.