yes, artificial intelligence could be hacked, but it wouldn’t be easy
When it comes to cybersecurity, there are two types of systems in this world: ones that have already been hacked, and ones that will be hacked. The sheer number of abstractions most modern applications need just to work makes it impossible to secure every possible attack vector, and "securing" something is a relative term: it means you've safeguarded it against the vulnerabilities you know about or were able to discover. A persistent hacker could still find a way in with an approach you never even considered. This is why the best you can really do is harden your code as much as possible to deter anyone but the most determined attackers, sending them off in search of easier prey instead of wasting time and resources on a hardened target.
This is an important thing to keep in mind with every type of system, including those which incorporate artificial intelligence. But hold on, you may be wondering, AI is just the result of a set of formulas processing data in a certain context. How exactly does one hack that, and what would they gain? Well, their goal would be to tease out the information at the heart of most AI applications today: the personal data they mine. How would they get it? According to computer scientist Dawn Song, AI hackers would try to game the system to understand how it prioritizes inputs, or send data that forces it to retrain itself in malicious or compromising ways and spit out enough hints for them to reconstruct the underlying data it uses to make its decisions.
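If you want a feel for what that kind of probing might look like, here's a minimal sketch: nudge one input at a time against a black-box scoring function and watch how much the output moves, which hints at which inputs the model prioritizes. The `score_endpoint` function is just a stand-in for a real prediction API, and the feature names and hidden weights are made up for illustration.

```python
import random

def score_endpoint(features):
    # Stand-in for the target model we cannot see inside.
    # In reality this would be a network call to the victim's API.
    hidden_weights = {"age": 0.1, "income": 0.7, "zip_code": 0.2}
    return sum(hidden_weights[k] * v for k, v in features.items())

def probe_feature_importance(baseline, delta=1.0, trials=50):
    """Estimate which inputs the model cares about most by measuring
    how much the score shifts when each feature is nudged."""
    base_score = score_endpoint(baseline)
    sensitivity = {}
    for name in baseline:
        shifts = []
        for _ in range(trials):
            probe = dict(baseline)
            probe[name] += random.uniform(-delta, delta)
            shifts.append(abs(score_endpoint(probe) - base_score))
        sensitivity[name] = sum(shifts) / trials
    return sensitivity

if __name__ == "__main__":
    baseline = {"age": 0.5, "income": 0.5, "zip_code": 0.5}
    ranked = sorted(probe_feature_importance(baseline).items(),
                    key=lambda kv: -kv[1])
    for feature, impact in ranked:
        print(f"{feature}: average score shift {impact:.3f}")
```

Run against this toy target, the probe quickly flags "income" as the feature the model leans on hardest, and an attacker doing the same thing against a real service would learn which levers to pull without ever seeing the code.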
In a way, these would be advanced forms of fuzzing, or sending intentionally malformed inputs to make a system crash or misbehave and reveal its internal workings. Normally, this would be used to identify something about the underlying code, operating systems, or platforms, allowing hackers to deploy a toolkit that tests every relevant vulnerability they know. But in AI, that's not the goal, because we already know the formulas by which it works. No, the aim would be to force it to reveal how it makes decisions so you can trick it, or better yet, to feed it new training data you've gamed so it learns to do your bidding. Because these systems can become extremely large and complex when tackling sophisticated tasks, it could take a while to detect your interference and restore the models to a clean state.
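As a rough sketch of that fuzzing idea, the snippet below mutates otherwise valid inputs and logs which ones crash the target or flip its output, since both reactions leak hints about what's going on inside. The `fragile_model` function is again a toy stand-in for a deployed service, not any real system's behavior.

```python
import random
import string

def fragile_model(text):
    # Stand-in for the target; real fuzzing would hit a deployed service.
    if not text.isascii():
        raise ValueError("unexpected encoding")
    return len(text) % 7  # arbitrary "decision"

def mutate(seed):
    # Randomly corrupt a few characters, including non-ASCII and null bytes.
    chars = list(seed)
    for _ in range(random.randint(1, 5)):
        pos = random.randrange(len(chars))
        chars[pos] = random.choice(string.printable + "é漢\x00")
    return "".join(chars)

def fuzz(seed="hello world", rounds=1000):
    crashes, oddities = [], []
    baseline = fragile_model(seed)
    for _ in range(rounds):
        candidate = mutate(seed)
        try:
            result = fragile_model(candidate)
        except Exception as exc:   # a crash reveals something about the internals
            crashes.append((candidate, repr(exc)))
            continue
        if result != baseline:     # a behavior shift hints at a decision boundary
            oddities.append((candidate, result))
    return crashes, oddities

if __name__ == "__main__":
    crashes, oddities = fuzz()
    print(f"{len(crashes)} crashing inputs, {len(oddities)} behavior changes")
```

The interesting output isn't the crash count itself but the inputs that caused it, because each one tells the attacker a little more about how the system parses and weighs what it's given.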
Now, all that said, this AI fuzzing isn't going to be a technique a casual hacker employs, and it will be very difficult to get a lot of useful data out of it compared to just trying to breach a client database. It would most likely be deployed by state actors: spies trying to guess their way into the information they seek by replicating a target's behavior, or engineers trying to find a way to trick autonomous surveillance or attack drones. Traditional deterministic code is still going to be the weakest part of any commercial system, and it's far more likely that hackers would concentrate their efforts there because it takes less work, uses simpler methods, and delivers a faster, bigger payoff. But at the same time, AI engineers should be mindful of how their creations could be abused or tricked, and make sure they maintain strict control over their training data and training cycles.
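One small example of what that control could look like in practice: quarantine any incoming training examples that stray too far from a trusted baseline before they're allowed into the next training cycle. The threshold and the statistics below are purely illustrative, a sketch of the habit rather than a recipe.

```python
import statistics

def build_baseline(trusted_values):
    # Summarize a vetted, trusted dataset once, before accepting new data.
    return {
        "mean": statistics.mean(trusted_values),
        "stdev": statistics.stdev(trusted_values),
    }

def vet_examples(candidates, baseline, max_sigma=3.0):
    """Split incoming examples into accepted and quarantined buckets,
    flagging anything more than max_sigma deviations from the trusted data."""
    accepted, quarantined = [], []
    for value in candidates:
        distance = abs(value - baseline["mean"]) / baseline["stdev"]
        (accepted if distance <= max_sigma else quarantined).append(value)
    return accepted, quarantined

if __name__ == "__main__":
    baseline = build_baseline([9.8, 10.1, 10.0, 9.9, 10.2, 10.0])
    accepted, quarantined = vet_examples([10.1, 9.7, 42.0, 10.3], baseline)
    print("accepted:", accepted)
    print("quarantined for review:", quarantined)
```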