Newsportal - Ruhr-Universität Bochum
The number of devices connected to the Internet is continuously increasing, and that includes household appliances. © Roberto Schirdewahn
Thorsten Holz intends to make the Internet of Things more secure. © Roberto Schirdewahn
In order to analyse software, Thorsten Holz does not require the original source code. The binary code, which he can read directly from a device, is sufficient. © Roberto Schirdewahn
Together with other security experts, Thorsten Holz analysed the engine control software of Volkswagen diesel cars. © Roberto Schirdewahn
Closing security gaps in Internet-connected households
Due to manipulated emission values of diesel cars, Volkswagen made the headlines for months. The scandal was exposed following emission tests in the United States. However, it is not necessary to measure pollutants to notice that something is amiss at the automotive giant. Having analysed the engine control software, an IT expert from Hamburg and IT security experts from Bochum were able to precisely reconstruct in what way the enterprise committed the fraud.
During an exhaust emission test, a vehicle undergoes a specific test cycle. The length of time the vehicle has to accelerate and the moment when it has to start to brake are exactly specified. The Volkswagen software for engine control in diesel cars checks several times per second whether a car is undergoing such a test cycle. If so, the vehicle remains in a low-emission mode; if not, engine control illicitly switches into a mode with higher emissions.
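The check described above can be pictured with a minimal sketch. Everything in it is an assumption made for illustration: the speed profile, the tolerance and the function names are invented and are not taken from the analysed engine control software.

```python
# Hypothetical illustration only: profile, tolerance and names are invented.

# A test cycle prescribes the speed (km/h) at each second of the test.
TEST_CYCLE = [0, 0, 10, 20, 30, 30, 30, 20, 10, 0]
TOLERANCE_KMH = 2.0


def follows_test_cycle(observed_speeds):
    """Return True if every sample seen so far lies within tolerance of the cycle."""
    if len(observed_speeds) > len(TEST_CYCLE):
        return False
    return all(
        abs(seen - expected) <= TOLERANCE_KMH
        for seen, expected in zip(observed_speeds, TEST_CYCLE)
    )


def select_mode(observed_speeds):
    """Choose the engine mode from the outcome of the test-cycle check."""
    if follows_test_cycle(observed_speeds):
        return "low-emission"
    return "higher-emission"


print(select_mode([0, 0, 10, 21, 29]))  # matches the prescribed profile
print(select_mode([0, 5, 40]))          # ordinary driving deviates from it
```

In this toy version the check is repeated whenever a new speed sample arrives, which mirrors the "several times per second" behaviour described in the article.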
Using their analysis methods, the team headed by Thorsten Holz precisely traced this process in the programme code. They thus assisted Hamburg-based IT expert Felix Domke, who had used this approach to test the software and achieved the same results.
For the purpose of their experiments, they did not even have the original source code at their disposal, as that remains the manufacturer’s corporate secret. Rather, the software was available as binary code of zeros and ones – a format that poses no challenge to a processor whatsoever, but which humans cannot read.
Prof Dr Thorsten Holz from the RUB Chair for System Security knows how to detect abnormalities in binary code. Together with his team, he develops methods for automated software analysis. However, exposing fraudulent behaviour of Volkswagen and other corporations is not the researchers’ primary objective. The IT experts detect vulnerabilities in a wide range of applications in order to render the Internet of Things more secure.
An increasing number of objects are connected to the Internet; soon, it will not only be considered perfectly normal that printers, computers, and telephones are online, but also cars, refrigerators and many more. Nine times out of ten, the software running on the connected devices has security gaps.
“On average, there are one or two safety-critical vulnerabilities per 20,000 lines of code in well-maintained software,” says Thorsten Holz. Made up of 40 to 50 million lines, the Windows operating system probably contains thousands of security gaps. Even a printer runs several hundred thousand lines of code.
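The estimate for Windows follows directly from the figures in the quote. A short back-of-the-envelope calculation, assuming the rate of one to two vulnerabilities per 20,000 lines applies uniformly across the whole code base:

```python
LINES_PER_BLOCK = 20_000  # the quote gives the rate per 20,000 lines of code


def estimate_vulnerabilities(lines_of_code, per_block=(1, 2)):
    """Return the (low, high) estimate for a code base of the given size."""
    blocks = lines_of_code / LINES_PER_BLOCK
    return blocks * per_block[0], blocks * per_block[1]


# Windows: 40 to 50 million lines, according to the article.
low, _ = estimate_vulnerabilities(40_000_000)
_, high = estimate_vulnerabilities(50_000_000)
print(f"roughly {low:.0f} to {high:.0f} safety-critical vulnerabilities")
```

The result, roughly 2,000 to 5,000, is consistent with the "thousands of security gaps" mentioned above.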
With the Internet of Things, the connected world spreads into all areas of everyday life – this is why protection from attacks has become more and more important. But where are the security solutions for so many different devices supposed to come from?
IT experts are confronted with the challenge that devices in the Internet of Things contain different processors. Security solutions available to date work, for the most part, only for one specific type. The ideal solution, however, would be a single tool capable of detecting vulnerabilities in different objects, regardless of the processor and the underlying system. The tool should not depend on the source code of the original software, because that source code is often the manufacturer’s corporate secret.
Detecting security gaps automatically
The RUB researchers are striving to create the basis for precisely such a tool. The European Research Council supports them in the project “Leveraging Binary Analysis to Secure the Internet of Things”, or Bastion for short. The key feature of the Bastion method: it does not require the source code, but only the binary code, which can be read from each device. The tool is supposed to automatically detect vulnerabilities in that code, regardless of the processor for which the software has been written.
In order to fulfil their function, devices have to be equipped with processors that differ in terms of complexity. Electronic door keys, for example, contain microcontrollers that are small and cheap and don’t consume much electricity. They can only execute approx. 20 commands, including arithmetic operations such as addition and subtraction, or commands such as “jump to a specific segment of the code”.
Processors use different languages
Intel processors in computers, on the other hand, have to be fast. They are much more complex and understand more than 500 commands, including arithmetic operations and jump instructions. But they can do much more: for example, a single command can perform an encryption that is made up of hundreds of individual steps.
Consequently, a simple microcontroller could never understand a programme running on an Intel processor. Moreover, the processors use different languages. They all operate with binary code, meaning that they process commands in the form of zeros and ones. But an identical command – such as “add two numbers” – might be represented as different sequences of zeros and ones on a microcontroller and on an Intel processor, even though both mean the same.
In order to ensure that their security analyses do not depend on the processors, the researchers from Bochum translate the binary code into a so-called intermediate language. In the intermediate language, an addition command of the microcontroller looks the same as an addition command of an Intel processor.
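The idea can be illustrated with a toy translator. The byte sequences below are the standard encodings of a register addition on x86, ARM (A32) and MIPS32; the tuple format `("ADD", destination, source1, source2)` and the function names are assumptions made for this sketch and do not represent the Bochum intermediate language.

```python
# The same operation, "add two registers", as machine code on three processors.
X86_ADD = bytes.fromhex("01d8")                # add  eax, ebx
ARM_ADD = (0xE0800001).to_bytes(4, "little")   # add  r0, r0, r1
MIPS_ADD = (0x00851021).to_bytes(4, "big")     # addu $2, $4, $5

X86_REGS = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]


def lift_x86(code):
    """Handle only opcode 0x01 (ADD r/m32, r32) in register-to-register form."""
    opcode, modrm = code[0], code[1]
    if opcode != 0x01 or (modrm >> 6) != 0b11:
        raise ValueError("unsupported x86 instruction")
    dst, src = X86_REGS[modrm & 7], X86_REGS[(modrm >> 3) & 7]
    return ("ADD", dst, dst, src)


def lift_arm(code):
    """Handle only the data-processing ADD with an unshifted register operand."""
    word = int.from_bytes(code, "little")
    if ((word >> 21) & 0x7F) != 0b0000100 or ((word >> 4) & 0xFF) != 0:
        raise ValueError("unsupported ARM instruction")
    rn, rd, rm = (word >> 16) & 0xF, (word >> 12) & 0xF, word & 0xF
    return ("ADD", f"r{rd}", f"r{rn}", f"r{rm}")


def lift_mips(code):
    """Handle only the R-type ADDU instruction (opcode 0, function 0x21)."""
    word = int.from_bytes(code, "big")
    if (word >> 26) != 0 or (word & 0x3F) != 0x21:
        raise ValueError("unsupported MIPS instruction")
    rs, rt, rd = (word >> 21) & 31, (word >> 16) & 31, (word >> 11) & 31
    return ("ADD", f"${rd}", f"${rs}", f"${rt}")


for lifted in (lift_x86(X86_ADD), lift_arm(ARM_ADD), lift_mips(MIPS_ADD)):
    print(lifted)
```

All three calls yield an `ADD` tuple that differs only in the register names, although the input bytes have nothing in common. An analysis written against the tuple form no longer needs to know which processor the code came from.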
“This is routine work that requires a great deal of diligence,” points out Holz, because it is necessary to translate many different instructions. “Our intermediate language contains fewer than two dozen commands. We have to carry out operations in many small steps that are performed in one single step by more complex processors.” When dealing with an encryption command of an Intel processor, for example, the researchers would break it down into a long sequence of arithmetic and logical operations as well as jump instructions.
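A much smaller stand-in for this kind of decomposition is a 32-bit rotate-left, which many processors offer as one instruction. The sketch below expands it into four simpler steps; the operation names (`SHL`, `SHR`, `OR`, `AND`), the temporary names and the tiny evaluator are hypothetical and only serve to show the principle.

```python
MASK32 = 0xFFFFFFFF


def lower_rotl32(dst, src, amount):
    """Expand a hypothetical ROTL32 instruction into simple steps.

    Assumes the value held in `src` already fits into 32 bits.
    """
    amount %= 32
    return [
        ("SHL", "t0", src, amount),              # bits moving to the left
        ("SHR", "t1", src, (32 - amount) % 32),  # bits wrapping around
        ("OR", "t2", "t0", "t1"),                # combine both parts
        ("AND", dst, "t2", MASK32),              # cut back to 32 bits
    ]


def run(ops, env):
    """Evaluate the simple steps; string operands name a stored value."""
    env = dict(env)
    for op, dst, a, b in ops:
        x = env[a] if isinstance(a, str) else a
        y = env[b] if isinstance(b, str) else b
        if op == "SHL":
            env[dst] = x << y
        elif op == "SHR":
            env[dst] = x >> y
        elif op == "OR":
            env[dst] = x | y
        elif op == "AND":
            env[dst] = x & y
        else:
            raise ValueError(f"unknown operation {op}")
    return env


steps = lower_rotl32("result", "x", 8)
print(hex(run(steps, {"x": 0x12345678})["result"]))  # 0x34567812
```

One instruction becomes four here; an encryption instruction would expand into a far longer sequence, which is why the translation work is laborious.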
Once a programme has been translated into the intermediate language, Thorsten Holz’s team can perform an automated analysis in order to detect vulnerabilities. The researchers look for programming errors that attackers can exploit to take control of the software.
To search for vulnerabilities, the RUB researchers translate the software into the intermediate language and subsequently represent it as a diagram: each programme has a starting point. From there, the diagram divides like the branches of a tree. Each branch illustrates a possible path that the programme may take.
Let’s assume the researchers depict a calculator app in the form of a diagram: if they entered “two plus four” in the app, they would run through the diagram along a specific branch and perform the addition; another branch represents the case of two numbers being subtracted from each other; yet another one stands for a multiplication. A diagram representing the entire app has many branches and forks. The graph analysis enables Holz’s team to depict the complex network in a compact manner and to analyse it.
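Such a diagram is usually stored as a graph. The following sketch models the calculator example with invented node names and lists every path from the starting point to the end; it assumes a graph without loops, which real programmes rarely are.

```python
# Each node is a piece of the programme; the list holds its possible successors.
CALCULATOR_GRAPH = {
    "start": ["read_input"],
    "read_input": ["add", "subtract", "multiply"],
    "add": ["show_result"],
    "subtract": ["show_result"],
    "multiply": ["show_result"],
    "show_result": [],
}


def enumerate_paths(graph, node="start", prefix=()):
    """Yield every path from `node` to a node without successors."""
    path = prefix + (node,)
    successors = graph[node]
    if not successors:
        yield path
    for successor in successors:
        yield from enumerate_paths(graph, successor, path)


for path in enumerate_paths(CALCULATOR_GRAPH):
    print(" -> ".join(path))
```

Entering “two plus four” corresponds to the path that runs through the `add` node; the other two paths cover subtraction and multiplication.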
The critical segments in the code are those where a variable is only permitted a certain length, but the attacker is able to write to it and exceed that length. Logical mistakes are another source of problems: they may occur when the programme verifies whether a variable meets a specific condition, for example less than, greater than or equal to zero. If the programmer forgets to check one of these conditions, an attacker might be able to use that gap to gain access.
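The first kind of error can be imitated in a few lines. Python itself prevents writes beyond the end of a buffer, so the sketch simulates memory with a single `bytearray` in which a flag sits directly behind a fixed-size name field; the layout and all names are assumptions made for illustration.

```python
NAME_SIZE = 8  # the name field is only permitted this many bytes


def make_memory():
    """Bytes 0..7 hold the name; byte 8 holds an 'is admin' flag (0 or 1)."""
    return bytearray(NAME_SIZE + 1)


def store_name_unchecked(memory, name):
    """Bug: the length of `name` is never compared with NAME_SIZE."""
    for offset, byte in enumerate(name):
        memory[offset] = byte


def store_name_checked(memory, name):
    """Fixed version: inputs that exceed the field are rejected."""
    if len(name) > NAME_SIZE:
        raise ValueError("name exceeds the permitted length")
    memory[:len(name)] = name


memory = make_memory()
store_name_unchecked(memory, b"AAAAAAAA\x01")  # nine bytes for an eight-byte field
print("admin flag after unchecked write:", memory[NAME_SIZE])
```

The ninth byte of the input lands on the flag, so an attacker who controls the name also controls the flag. The checked variant refuses the same input.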
Once the researchers have detected programming errors, they subsequently test whether those errors are security-critical, because not every gap has ramifications in practice. “Sometimes the software contains programming errors,” explains Thorsten Holz, “but not all errors can be exploited by attackers.”
In their analysis, the researchers identify the conditions under which a specific segment of the code is activated. To this end, they utilise standard procedures such as symbolic execution. They feed the evaluated programme, for example a calculator app, with variables rather than specific numbers. An example: rather than five and eight, the app is given the placeholders alpha and beta as inputs. Subsequently, an algorithm calculates the values the variables have to assume in order to reach a specific point in the programme code. “The result may well be that alpha has to lie somewhere between 100 and 500, in order to arrive at the critical vulnerability in the code,” illustrates Holz.
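The principle can be shown with a heavily simplified stand-in. Real symbolic execution engines hand the collected conditions to a constraint solver; the sketch below merely intersects lower and upper bounds for a single integer placeholder, and the programme fragment in the comment is invented to match the numbers in the quote.

```python
import math


def solve_path(conditions):
    """Intersect simple bounds on one integer placeholder.

    Returns the (lowest, highest) admissible value, or None if the
    conditions contradict each other and the path cannot be reached.
    """
    low, high = -math.inf, math.inf
    for operator, constant in conditions:
        if operator == ">=":
            low = max(low, constant)
        elif operator == ">":
            low = max(low, constant + 1)
        elif operator == "<=":
            high = min(high, constant)
        elif operator == "<":
            high = min(high, constant - 1)
        else:
            raise ValueError(f"unsupported operator {operator}")
    return (low, high) if low <= high else None


# Conditions collected along the branch that leads to the critical code:
#     if alpha >= 100:
#         if alpha <= 500:
#             ...critical code...
path_to_critical_code = [(">=", 100), ("<=", 500)]

print(solve_path(path_to_critical_code))      # (100, 500)
print(solve_path([(">", 500), ("<", 100)]))   # None: this path is unreachable
```

The first result corresponds to the statement that alpha has to lie between 100 and 500; the second shows how a contradictory set of conditions marks a branch that no input can ever reach.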
That means the software analysis is carried out in three steps: translating into the intermediate language, detecting programming errors, and testing under which conditions the vulnerabilities become relevant.
Closing security gaps automatically
However, Thorsten Holz would like to not only automatically detect security gaps, but also to protect users from them. Together with his team, he therefore develops methods for closing security-relevant vulnerabilities automatically. To this end, the code of the original software has to be altered.
As the analyses are performed at the level of the intermediate language, the researchers also implement the new security solutions in the intermediate language. For the processor to be able to execute those instructions, the modified code has to be translated back into the processor’s binary language.
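What a fix at the level of the intermediate language might look like can be sketched as follows. The instruction names `STORE` and `ASSERT_LT` and the tuple layout are hypothetical, and the sketch deliberately stops before the translation back into machine code.

```python
def add_bounds_checks(instructions, buffer_sizes):
    """Insert a length check in front of every write into a known buffer."""
    patched = []
    for instruction in instructions:
        if instruction[0] == "STORE":
            _, buffer, index, _value = instruction
            # Hypothetical check: abort unless index < size of the buffer.
            patched.append(("ASSERT_LT", index, buffer_sizes[buffer]))
        patched.append(instruction)
    return patched


original = [
    ("READ_INPUT", "index"),
    ("READ_INPUT", "value"),
    ("STORE", "name", "index", "value"),  # unchecked write into `name`
]

for instruction in add_bounds_checks(original, {"name": 8}):
    print(instruction)
```

The patched sequence contains one additional instruction directly before the write, while everything else stays untouched.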
“It is as if one translated a German text into English, added another passage and then translated it back into German,” elaborates Holz. “We are currently hitting a snag in the final translation step. But I am confident that our attempts will one day be crowned with success.”
Using Internet Explorer as an example, he and his colleagues have demonstrated that the method is sound in principle. In 2015, the IT experts detected a security gap in the programme that they were able to close automatically. “We have naturally contacted the manufacturer and informed them about the vulnerabilities,” says Holz, describing the standard procedure. “Microsoft has now closed those gaps with an update.”
Security gaps not always fixed immediately
However, sometimes it takes a while until security gaps are noticed and fixed by the manufacturers. This is where the methods developed by Thorsten Holz and his team are expected to help. They are intended to protect users from attacks even if security gaps have not yet been officially closed, regardless of whether the object in question is an Internet browser, a phone or a refrigerator.
Currently, the Bochum method is not yet fully processor-independent. But there is a lot of time to realise this goal before the project is wrapped up in 2020. In a feasibility study, the researchers have already demonstrated how vulnerabilities in the binary code can theoretically be identified independently of the processor architecture. Moreover, they have successfully translated the binary code for three processor types, namely Intel, ARM and MIPS, into the intermediate language. Other types are to follow.
9 June 2016