
What is x86 assembly?

February 10, 2021 by Srinivas

This is the first installment of a series of articles on x86 assembly. In this series of articles, readers will learn x86 assembly language fundamentals, which can help them in a variety of security domains such as malware analysis, reverse engineering and exploit development.

This article covers some background information such as what x86 assembly is, its history and where it is used. In the subsequent articles, we will discuss various x86 assembly programming concepts.

What is x86 assembly?

Before looking at x86 assembly, let us first define assembly language. Assembly is a low-level programming language whose instructions map almost one-to-one to the machine instructions a processor executes. Processors understand only 0s and 1s (binary), and writing instructions directly in binary is very difficult for a human.
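To make the gap between binary and assembly concrete, here is a minimal sketch in Python that turns the x86 instruction MOV EAX, ECX into the bytes the processor actually sees. It assumes 32-bit operand size and handles only register-to-register moves. The helper names (encode_mov_r32_r32, as_bits) are hypothetical and chosen for illustration; this is not a real assembler.

```python
# Register numbers used by the x86 encoding for 32-bit general-purpose registers.
REG32 = {"eax": 0, "ecx": 1, "edx": 2, "ebx": 3,
         "esp": 4, "ebp": 5, "esi": 6, "edi": 7}


def encode_mov_r32_r32(dst, src):
    """Encode `MOV dst, src` for two 32-bit registers (illustrative helper)."""
    opcode = 0x89  # MOV r/m32, r32
    # ModRM byte: mod=11 (register-direct), reg field=source, r/m field=destination.
    modrm = 0xC0 | (REG32[src] << 3) | REG32[dst]
    return bytes([opcode, modrm])


def as_hex(code):
    """Format machine code as space-separated hex bytes."""
    return " ".join(f"{b:02x}" for b in code)


def as_bits(code):
    """Format machine code as space-separated 8-bit binary strings."""
    return " ".join(f"{b:08b}" for b in code)


if __name__ == "__main__":
    code = encode_mov_r32_r32("eax", "ecx")
    print(as_hex(code))   # 89 c8
    print(as_bits(code))  # 10001001 11001000
```

The mnemonic MOV EAX, ECX is far easier to read and write than 10001001 11001000, yet both describe the same operation. An assembler performs this translation for every instruction in a program.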

Assembly language acts as a bridge between the human and the processor. However, assembly language is machine-dependent. Because it is specific to a processor family, a program written for one computer might not run on another computer with a different processor. For example, a program written for a machine with an Intel processor generally will not work on a machine with an ARM or MIPS processor. Hence, different processors need different assembly language instructions. Note also that compilers sometimes produce assembly code as an intermediate step when translating a high-level program into machine code.

The following are some examples of processor families:

  • Intel (x86 and x86_64)
  • ARM
  • MIPS

Assembly programming targeted specifically at Intel 32-bit processors is known as x86 assembly. Note that processors from other vendors, such as AMD, implement the same instruction set.

What are computer instruction sets (ISA)?

We mentioned some example processor families earlier in this article. Not every processor can understand the same instructions.

For example, the instruction MOV R0, R2 works on an ARM-based processor but not on an Intel processor. Similarly, the instruction MOV EAX, ECX can be understood by an Intel processor but not by an ARM processor. Because this instruction is part of the x86 ISA, however, both Intel and AMD processors understand it. Another difference is that ARM-based processors cannot perform computations directly on values in memory: data must first be loaded into a register, operated on there, and then stored back. x86, by contrast, allows many instructions to operate directly on memory.
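The difference in how the two ISAs treat memory can be sketched with a toy model in Python. This is only an illustration: registers and memory are plain dictionaries, the function names are hypothetical, and the returned strings are pseudo-assembly rather than exact syntax (real ARM code, for instance, would hold the address in a register).

```python
MASK32 = 0xFFFFFFFF  # keep results within 32 bits, like a 32-bit register


def add_memory_operand(memory, regs, addr, src):
    """x86 style: a single instruction reads memory, adds, and writes back."""
    memory[addr] = (memory[addr] + regs[src]) & MASK32
    return [f"ADD [{addr:#x}], {src}"]


def add_load_store(memory, regs, addr, src, scratch="R3"):
    """ARM style: arithmetic works on registers only, so memory is
    reached through a separate load before and a store after."""
    regs[scratch] = memory[addr]                          # load
    regs[scratch] = (regs[scratch] + regs[src]) & MASK32  # add in registers
    memory[addr] = regs[scratch]                          # store
    return [
        f"LDR {scratch}, [{addr:#x}]",
        f"ADD {scratch}, {scratch}, {src}",
        f"STR {scratch}, [{addr:#x}]",
    ]


if __name__ == "__main__":
    x86_mem = {0x1000: 40}
    arm_mem = {0x1000: 40}

    x86_trace = add_memory_operand(x86_mem, {"EAX": 2}, 0x1000, "EAX")
    arm_trace = add_load_store(arm_mem, {"R0": 2}, 0x1000, "R0")

    # Both end with the same value in memory, via different instruction sequences.
    print(x86_mem[0x1000], x86_trace)
    print(arm_mem[0x1000], arm_trace)
```

Both functions leave the same value in memory. The x86-style version needs one instruction and the ARM-style version needs three, which is the trade-off the paragraph above describes.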

The combination of instructions a CPU understands and the registers it knows about is called the instruction set architecture (ISA). The ISA is the interface between hardware and software: software can interact with the hardware only through the processor's instruction set. In general, an ISA defines the supported data types, the registers, the hardware support for managing main memory, fundamental features such as memory consistency, addressing modes and virtual memory, and the input/output model of a family of implementations of the ISA.

An ISA may be classified in a number of different ways. A common classification is by architectural complexity. A complex instruction set computer (CISC) has many specialized instructions, some of which may only be rarely used in practical programs. A reduced instruction set computer (RISC) simplifies the processor by efficiently implementing only the instructions that are frequently used in programs.

x86 history: History and origin of the x86 instruction set

Before we get into the technical details of the x86-32 platform and assembly, a brief history will help in understanding how the architecture has evolved over the years. Today, the term x86 commonly refers to the instruction set for Intel 32-bit processors.

x86 is a family of instruction set architectures initially developed by Intel, based on the Intel 8086 microprocessor and its 8088 variant. Intel introduced the 8-bit 8080 processor in 1974 and followed it in 1978 with the 16-bit 8086, designed as an extension of the 8080. Several successors to the 8086 were released under the names 80186, 80286, 80386 and 80486.

As you may notice, these processor names end with 86, which is where the term x86 comes from. The 80386, released in 1985, was Intel's first 32-bit processor. The 32-bit line continued with processors such as the Intel Pentium, Intel Pentium 4, Intel Core Duo and the Advanced Micro Devices (AMD) Athlon.

x86 usage: How and where x86 is used

In the early days of programming, most applications were written in assembly language because they had to fit in a small amount of memory and run as efficiently as possible on slow processors.

x86 processors are ubiquitous in both stationary and portable personal computers and are also used in midrange computers, workstations, servers and most new supercomputer clusters. A large amount of software runs on x86-based hardware, including many operating systems such as Microsoft Windows, Android-x86, Firefox OS and Chrome OS.

See the next article in the series, Introduction to x86 assembly and syntax.

Srinivas

Srinivas is an information security professional with 4 years of industry experience in web, mobile and infrastructure penetration testing. He is currently a security researcher at Infosec Institute Inc. He holds the Offensive Security Certified Professional (OSCP) certification. He blogs at www.androidpentesting.com. Email: srini0x00@gmail.com