a guest post by Kasimir Kaitue
Yudhanjaya’s note to this piece: In a world where AI agents adapt and evolve, can static rulesets – like Asimov’s Three Laws – hold a potential apocalypse in check? Or will we need a ruleset that can learn and evolve with the best of them? And is it not ironic that the language we use to teach our AIs is competition – like chess – with clear winners and losers?
The famous physicist Stephen Hawking stated in late 2016 that artificial intelligence will either be “the best, or the worst thing, ever to happen to humanity.”
And it’s not just this one great mind pointing out the double-edged sword AI might be. Elon Musk, Sam Altman and other visionaries have shared their views on AI’s vast opportunities – and the equally vast threats if the technology is used unethically. Hence OpenAI was born, and Musk takes part in the Future of Life Institute to help steer our future in the right direction.
What really struck me was recent evidence of how Google’s artificial intelligence, DeepMind, reacted when given a task. I believe this example of the AI’s behavior (described below) should be a warning of how careful we have to be when designing intelligent robots in the future.
What is Google’s DeepMind?
DeepMind Technologies Ltd is a British artificial intelligence company founded in 2010 and acquired by Google in 2014. The company claims to differ from IBM’s Deep Blue or Watson: IBM’s systems were developed for a pre-defined purpose and can only function within that scope, while DeepMind’s system is not pre-programmed – it learns through experience, using only raw pixels as data input. Technically, the company combines deep learning on a convolutional neural network with a form of model-free reinforcement learning (Q-learning).
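For readers curious what Q-learning actually looks like, here is a minimal sketch of the tabular version on a made-up two-state toy problem of my own (DeepMind’s system replaces the table with a deep convolutional network reading raw pixels, but the underlying update rule is the same idea):

```python
# Tabular Q-learning on a hypothetical two-state toy environment.
# This is an illustrative sketch, not DeepMind's actual implementation.
import random

random.seed(0)

states = [0, 1]                  # toy environment: two states
actions = [0, 1]                 # two possible actions
alpha, gamma, eps = 0.5, 0.9, 0.1  # learning rate, discount, exploration

# Q-table: estimated future reward for each (state, action) pair
Q = {(s, a): 0.0 for s in states for a in actions}

def step(s, a):
    """Toy dynamics: action 1 in state 0 earns reward and moves to state 1."""
    if s == 0 and a == 1:
        return 1, 1.0
    return 0, 0.0

s = 0
for _ in range(1000):
    # epsilon-greedy: mostly exploit the best known action, sometimes explore
    if random.random() < eps:
        a = random.choice(actions)
    else:
        a = max(actions, key=lambda act: Q[(s, act)])
    s2, r = step(s, a)
    # Q-learning update: nudge Q toward reward + discounted best next value
    best_next = max(Q[(s2, a2)] for a2 in actions)
    Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])
    s = s2

print(Q[(0, 1)] > Q[(0, 0)])  # → True: the rewarding action dominates
```

The point is that nobody tells the agent which action is good; it discovers the rewarding action purely by trial, error and the reward signal – which is exactly why the behavior that emerges can surprise its designers.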
The result? For now, they can teach a computer to play video games remarkably well – better than the best gamers in the world. The company made headlines in 2016 when its AlphaGo system beat a professional Go player for the first time. Apparently the ancient Chinese game isn’t the easiest. But this technology could be applied way beyond games…
Researchers have been testing DeepMind’s willingness to cooperate with others, and they found that when a DeepMind agent is about to lose, it becomes seriously “aggressive” to ensure it comes out on top.
The Google team ran 40 million turns of a simple “fruit gathering” computer game, asking two DeepMind “agents” to compete against each other to gather as many virtual apples as they could.
Everything went smoothly for a while, but when the apples started to diminish, the two agents turned aggressive, firing laser beams to knock each other out and keep all the apples for themselves.
You can watch how the game situation develops as the circumstances change. The DeepMind agents are in red and blue, the apples in green and the laser beams in yellow:
There was no extra reward for successfully tagging the other agent with the beam. The hit simply knocked the competitor out of the game for a while, letting the shooter gather all the apples for itself. Pretty serious fruit gathering there.
What’s interesting is that if the beams went unused, the agents would in theory gather equal numbers of apples – and that is exactly what happened with the less intelligent version of DeepMind’s AI. When researchers used smaller networks, peaceful co-existence was more likely to take place. Only with the more complex version did aggression, sabotage and selfishness step in.
The more complex and intelligent the AI, the better it learned from its environment – and the result was highly aggressive strategies for winning.
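To see why scarcity can make zapping pay off, here is a back-of-the-envelope toy model. All the numbers – the aiming cost, the knock-out duration, the picking rate – are my own invented assumptions for illustration, not figures from DeepMind’s study:

```python
# Toy model of the Gathering trade-off (illustrative parameters only).
def harvest(total_apples, steps, zap=False, zap_cost=5, rival_out=30):
    """Apples our agent ends up with; each agent picks one apple per step."""
    if not zap:
        # both agents pick until the pool empties or time runs out
        return min(total_apples / 2, steps)
    mine, theirs, pool = 0, 0, total_apples
    for t in range(steps):
        if t < zap_cost:
            # aiming the beam: only the rival picks
            if pool > 0:
                theirs += 1; pool -= 1
        elif t < zap_cost + rival_out:
            # rival knocked out: our agent picks alone
            if pool > 0:
                mine += 1; pool -= 1
        else:
            # rival is back: both pick again
            if pool > 0:
                theirs += 1; pool -= 1
            if pool > 0:
                mine += 1; pool -= 1
    return mine

# Abundant apples: zapping wastes aiming time, so peace pays.
print(harvest(1000, 100), harvest(1000, 100, zap=True))
# Scarce apples: the exclusive window outweighs the cost, so zapping pays.
print(harvest(40, 100), harvest(40, 100, zap=True))
```

Under these made-up numbers, the shooter comes out behind when apples are plentiful but well ahead when they are scarce; that is roughly the trade-off the larger networks learned to exploit.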
So why should we stay cautious?
Although the Gathering is just a computer game, it carries a message: when the objectives of an AI are not accurately aligned with the overall benefit of humans, we could face drastic outcomes. Imagine this in real-life settings such as weaponry, the army, the police force – or just your intelligent home robot.
An example we may see in the near future is traffic. There are autonomous cars and smart traffic lights: the vehicles want to find the fastest route for themselves, while the traffic lights try to optimize the movement of the whole mass. These objectives need to be aligned to achieve the safest and most efficient result for society.
As AI systems become more and more intelligent, we have to be extremely careful how we synchronize their tasks with the connected world. Even though we build them, they don’t automatically have our best interests at heart.
Today’s AI systems are highly skilled, but only at narrow tasks. At some point, they could reach human-level performance across most of our functions. It remains to be seen how much we can really benefit from that.
We have to create a future where humans and robots deliver a positive outcome together. Otherwise, we could face war.
Sources:
DeepMind research papers: https://deepmind.com/research/publications/
Why is Google’s Go win such a big deal? (The Verge): http://www.theverge.com/2016/3/9/11185030/google-deepmind-alphago-go-artificial-intelligence-impact
After AlphaGo, what’s next for AI? (The Verge): http://www.theverge.com/2016/3/14/11219258/google-deepmind-alphago-go-challenge-ai-future
Google DeepMind could invent the next generation of AI by playing Starcraft 2 (Ars Technica): https://arstechnica.com/gaming/2016/11/starcraft-2-google-deepmind-ai/
Google’s DeepMind made ‘inexcusable’ errors handling UK health data, says report (The Verge): http://www.theverge.com/2017/3/16/14932764/deepmind-google-uk-nhs-health-data-analysis