AI Alignment Is Not an AI-Specific Problem - and Even if It Were, Is It Really Solvable?
AI and Security: How Can We Ensure Our Safety?
The Alignment Problem
The alignment problem has become a highly discussed topic in AI due to recent rapid advancements. It refers to the challenge of ensuring that AI behaves in ways consistent with human goals and values. While I recognize the potential dangers and harm to society, in this article I am going to argue against the basic assumptions of the alignment problem, because I believe that if we identify it as an AI-specific problem to be addressed within the industry, our solutions may be flawed or inappropriate. It is important to note that my focus is solely on the ethical and abstract dimensions of the issue, not on its technical aspects.
The challenge arises from the fact that AI systems are increasingly being used to make decisions and take actions that can have significant consequences for people's lives, and yet these systems may not always align with the values and priorities of humanity.
One of the most serious concerns surrounding the alignment problem is the potential for AI systems to behave in ways that are harmful or even catastrophic. This could happen if an AI optimized for its objective in ways that are inconsistent with human values, or if it pursued its goals without considering the unintended consequences of its actions.
Objective and Subjective Values
“If we are to have peace, we must learn loyalty to a larger group. And before we can learn loyalty, the thing to which we are to be loyal must be created.”
Kenneth Waltz
Without delving too deeply into philosophy, it is important to ask what counts as a human value, a question that has been debated for centuries. One of the prominent philosophers who addressed this subject was Immanuel Kant. According to Kant, there is a rational procedure for determining what we ought to value: he called it the categorical imperative. The categorical imperative yields principles that should guide human behavior regardless of the circumstances. These principles are absolute and apply universally to all human beings; in other words, they are not subjective and cannot be altered based on individual preferences.
It is worth mentioning that even if these ideas are correct, they represent a religious and idealistic approach to values. It is almost impossible to achieve complete agreement on every question within a society, so even if Kant was right, it is impossible to align all individuals with his values.
However, other thinkers hold that our world is subjective and that there is no objective way to determine what constitutes human values. Friedrich Nietzsche believed that there is no objective truth or morality, and that human values are created by individuals and societies based on their subjective experiences and perspectives.
He argued that what is valuable to one person may not be valuable to another: one person may value material possessions, while another values family relationships, and it is impossible to come up with a set of principles that applies to everyone equally. If he was right, there is no way to form a universal value system that could be applied to all members of society.
Practical Approach to Power
“Power positions do not yield to arguments, however rationally and morally valid, but only to superior power.”
Hans Morgenthau
If we align an ASI (Artificial Superintelligence) to specific values, we are likely poised for an even bigger catastrophe. Imagine a world where the most powerful AI system is aligned only with its creator's values, and the creator lacks emotional sensibility, has sadistic tendencies, or is a sociopath. In this theoretical scenario, a dystopian future is guaranteed. Now, one could argue that this is the very point of aligning AI, but we must assume that different parties, including dictatorships, fundamentalists, and military groups, will develop AI systems. It is almost certain that these actors will align their systems differently from what most people favor in terms of ethics. How can we fight against them if this happens?
In this scenario, the only solution would be a system that represents our values and can fight against these harmful AIs. At the end of the day, we would come to the realization that AI is only a weapon, and its dangers can only be eliminated by other weapons or by achieving a balance of power.
AI as a Weapon
“The implication of game theory, which is also the implication of the third image, is, however, that the freedom of choice of any one state is limited by the actions of the others.”
Kenneth Waltz
AI has become an integral part of modern society, with many industries and businesses adopting it to enhance their operations. By leveraging the power of AI, humans have been able to achieve their goals more efficiently and effectively than ever before. However, the possibility of AI becoming self-aware and exceeding human-level intelligence is a topic of concern. If AI were to create its own mind through self-improvement, it could surpass human intellect in all fields, and we would have no chance to fight against it.
This possibility is not entirely surprising, as we have already created tools capable of destroying humankind, such as the nuclear bomb. If only one nation possessed such a weapon, its alignment with certain values would be crucial; if that nation chose to rule over others, it would have unparalleled power.
One way to prevent the concentration of power in the hands of a few is to democratize access to technology. By making AI available to a broader audience, we can ensure that no single entity or individual can dominate others.
Balance of Power
The balance of power theory in international relations holds that states have a vested interest in preventing any single state from becoming too militarily dominant, as it may pose a threat to their survival. According to the theory, if one state gains significant military power, it is likely to exploit its weaker neighbors, prompting them to come together in a defensive coalition. This creates a sense of equilibrium in the international system, where rival coalitions balance against each other to prevent aggression and maintain stability.
Realists like Kenneth Waltz, who subscribe to balance of power theory, argue that a system with a balance of power is more stable than one with a dominant state. In a balanced system, aggression is perceived as unprofitable because it is met with resistance from coalitions of states. This creates a disincentive against aggressive actions and reduces the likelihood of one state dominating others.
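To make the "aggression is unprofitable" claim concrete, here is a minimal sketch in Python. It is a toy expected-payoff model of my own construction, not anything from the balance-of-power literature: the win-probability rule and the gain, cost, and power figures are all hypothetical, chosen only to show how pooling power can flip the aggressor's incentive.

```python
# Toy expected-payoff model of balancing. All numbers are made up
# for illustration; nothing here is calibrated to real-world data.

def win_probability(aggressor_power: float, defender_power: float) -> float:
    """Crude contest model: the chance of winning is the aggressor's
    share of the total power involved in the conflict."""
    return aggressor_power / (aggressor_power + defender_power)

def expected_payoff(gain: float, cost: float, p_win: float) -> float:
    """Expected payoff of aggression: win `gain` with probability
    `p_win`, but pay the `cost` of fighting either way."""
    return p_win * gain - cost

GAIN = 100.0       # hypothetical value of conquering the target
COST = 50.0        # hypothetical cost of waging the conflict
AGGRESSOR = 10.0   # the dominant actor's power
WEAK_STATE = 4.0   # each weaker actor's power

# Case 1: the aggressor attacks an isolated weaker state.
p_lone = win_probability(AGGRESSOR, WEAK_STATE)
print(f"vs. lone state: p_win={p_lone:.2f}, "
      f"payoff={expected_payoff(GAIN, COST, p_lone):+.1f}")

# Case 2: three weaker states balance by pooling their power.
p_coalition = win_probability(AGGRESSOR, 3 * WEAK_STATE)
print(f"vs. coalition:  p_win={p_coalition:.2f}, "
      f"payoff={expected_payoff(GAIN, COST, p_coalition):+.1f}")
```

Under these assumed numbers, attacking an isolated actor pays off (expected payoff +21.4), while attacking a balancing coalition yields a negative expected payoff (-4.5). The point is not the specific values but the structure: once weaker actors pool their power, aggression stops being profitable, which is exactly the disincentive Waltz describes.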
When faced with a perceived threat, states have several strategies to ensure their safety; the most important is balancing, which involves forming alliances with other states to counter the threat. This can involve military cooperation, diplomatic partnerships, or economic ties to collectively resist the dominant state.
In international relations, the balance of power has been a key factor in maintaining stability and preventing wars. If we substitute nations with competing AI systems, we can see that a balance of power could be achieved within the field by having powerful competing actors. In this way, a higher standard of security could be attained. One big AI in the hands of a few is always more concerning than a competitive space with many.
Efficiently Securing AI
We can assume that the danger posed by AI could be mitigated by distributing its power. As we have seen in international relations, the key to stability is keeping the power of actors in balance. If we decentralize access to AI through open-source systems and align different competing AIs with different values, we may achieve greater security.
By making AI accessible to a broader audience and encouraging the development of diverse AI systems with varying values, we can avoid the concentration of power in the hands of a few entities or individuals. The distribution of power across a wider network of actors would make it more difficult for any one AI system to dominate others.
The most efficient way to democratize access to AI is likely through open-source systems. Open-source software allows for greater transparency and accountability, as anyone can view the code and modify it. This not only increases innovation but also makes the development of AI more democratic, as it is not monopolized by a few large corporations or organizations. Open-source AI systems allow a diverse range of values and priorities to be incorporated into AI development, as different actors can create and modify AI to align with their own values.
Unfortunately, open-source software alone may not be sufficient to ensure the safe, equitable, and smooth integration of AI into our lives. Democratizing access to AI hardware is also important, as hardware is a necessary component of AI development and use. Ensuring that more people and organizations have access to the necessary hardware would reduce the risk of one entity or individual dominating the field, and it would allow a wider range of values and priorities to be incorporated into the development of advanced AI. Hardware access is likely to broaden automatically as costs fall over time, but opening up the software side still requires deliberate human action.
Instead of trying to align general values into one big system, decentralizing power would be crucial to ensuring AI's safe and equitable integration into society. Overall, the question arises: in a scenario where an ASI breaks free, what can save us if not another superintelligence with a different alignment?