=======================================================
Hardware-assisted AddressSanitizer Design Documentation
=======================================================

This page is a design document for
**hardware-assisted AddressSanitizer** (or **HWASAN**),
a tool similar to :doc:`AddressSanitizer`,
but based on partial hardware assistance.


Introduction
============

:doc:`AddressSanitizer`
tags every 8 bytes of the application memory with a 1-byte tag (using *shadow memory*),
uses *redzones* to find buffer overflows and
*quarantine* to find use-after-free.
The redzones, the quarantine, and, to a lesser extent, the shadow, are the
sources of AddressSanitizer's memory overhead.
See the `AddressSanitizer paper`_ for details.

AArch64 has `Address Tagging`_ (or top-byte-ignore, TBI), a hardware feature that allows
software to use the 8 most significant bits of a 64-bit pointer as
a tag. HWASAN uses `Address Tagging`_
to implement a memory safety tool, similar to :doc:`AddressSanitizer`,
but with smaller memory overhead and slightly different (mostly better)
accuracy guarantees.

Algorithm
=========
* Every heap/stack/global memory object is forcibly aligned to `TG` bytes
  (`TG` is e.g. 16 or 64). We call `TG` the **tagging granularity**.
* For every such object a random `TS`-bit tag `T` is chosen (`TS`, or tag size, is e.g. 4 or 8).
* The pointer to the object is tagged with `T`.
* The memory for the object is also tagged with `T` (using a `TG=>1` shadow memory).
* Every load and store is instrumented to read the memory tag and compare it
  with the pointer tag; an exception is raised on tag mismatch.

For a more detailed discussion of this approach see https://arxiv.org/pdf/1802.09517.pdf
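
The steps above can be illustrated with a small, portable simulation. This is
only a sketch under stated assumptions: it uses `TG = 16` and `TS = 8`, models
addresses as plain offsets into a toy arena rather than real pointers (so no
hardware address tagging is needed to run it), and the names `shadow`,
`tag_pointer`, `tag_memory` and `check_access` are hypothetical, not part of
the HWASAN runtime.

.. code-block:: c

  #include <stddef.h>
  #include <stdint.h>

  enum { TG = 16, ARENA_SIZE = 1024, TAG_SHIFT = 56 };

  /* One shadow byte per TG bytes of the toy arena (a TG=>1 mapping). */
  static uint8_t shadow[ARENA_SIZE / TG];

  /* Place a tag in the top byte of a 64-bit "pointer" (here: an arena offset). */
  static uint64_t tag_pointer(uint64_t offset, uint8_t tag) {
    return (offset & ~(0xffULL << TAG_SHIFT)) | ((uint64_t)tag << TAG_SHIFT);
  }

  /* Tag the memory of an object; the size is rounded up to whole granules. */
  static void tag_memory(uint64_t offset, size_t size, uint8_t tag) {
    size_t first = (size_t)(offset / TG);
    size_t count = (size + TG - 1) / TG;
    for (size_t i = 0; i < count; i++)
      shadow[first + i] = tag;
  }

  /* Returns 1 if the pointer tag matches the memory tag, 0 on mismatch. */
  static int check_access(uint64_t tagged) {
    uint8_t ptr_tag = (uint8_t)(tagged >> TAG_SHIFT);
    uint64_t offset = tagged & ~(0xffULL << TAG_SHIFT);
    return shadow[offset / TG] == ptr_tag;
  }

For example, after `tag_memory(32, 40, 0x2a)`, a pointer built with
`tag_pointer(32, 0x2a)` passes the check anywhere inside the three tagged
granules (offsets 32 to 79), while an access 48 bytes past the start of the
object lands in an untagged granule and fails it. Note that an overflow into
the padding of the last granule (bytes 40 to 47 of the object) is not caught
in this model.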

Instrumentation
===============

Memory Accesses
---------------
All memory accesses are prefixed with an inline instruction sequence that
verifies the tags. Currently, the following sequence is used:


.. code-block:: none

  // int foo(int *a) { return *a; }
  // clang -O2 --target=aarch64-linux -fsanitize=hwaddress -c load.c
  foo:
       0:	08 00 00 90 	adrp	x8, 0 <__hwasan_shadow>
       4:	08 01 40 f9 	ldr	x8, [x8]          // shadow base (to be resolved by the loader)
       8:	09 dc 44 d3 	ubfx	x9, x0, #4, #52 // shadow offset
       c:	28 69 68 38 	ldrb	w8, [x9, x8]    // load shadow tag
      10:	09 fc 78 d3 	lsr	x9, x0, #56       // extract address tag
      14:	3f 01 08 6b 	cmp	w9, w8            // compare tags
      18:	61 00 00 54 	b.ne	24              // jump on mismatch
      1c:	00 00 40 b9 	ldr	w0, [x0]          // original load
      20:	c0 03 5f d6 	ret
      24:	40 20 21 d4 	brk	#0x902            // trap

Alternatively, memory accesses are prefixed with a function call.
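
For readers less familiar with AArch64 assembly, the arithmetic performed by
the inline sequence shown earlier in this section can be written in C roughly
as follows. This is an illustrative sketch, not the code the compiler emits:
the helper names are hypothetical, and the shadow base is passed in as a
parameter instead of being loaded from `__hwasan_shadow`.

.. code-block:: c

  #include <stdint.h>

  /* ubfx x9, x0, #4, #52: take 52 bits starting at bit 4. This drops the
     8-bit tag in bits 56..63 and divides the address by 16. */
  static uint64_t shadow_offset(uint64_t ptr) {
    return (ptr >> 4) & ((1ULL << 52) - 1);
  }

  /* lsr x9, x0, #56: the pointer tag is the top byte. */
  static uint8_t pointer_tag(uint64_t ptr) {
    return (uint8_t)(ptr >> 56);
  }

  /* ldrb + cmp + b.ne: returns 1 when the access may proceed and 0 where
     the instrumented code would branch to the trap. */
  static int tags_match(const uint8_t *shadow_base, uint64_t ptr) {
    return shadow_base[shadow_offset(ptr)] == pointer_tag(ptr);
  }

The shift by 4 corresponds to a tagging granularity of 16 bytes, and the
52-bit field width is what remains of a 64-bit pointer after removing the
8-bit tag and the 4 low bits.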

Heap
----

Tagging the heap memory/pointers is done by `malloc`.
This can be based on any malloc that forces all objects to be `TG`-aligned.
`free` tags the memory with a different tag.
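
The effect of retagging on `free` can be sketched with a toy bump allocator.
Everything here is an assumption made for illustration: the arena, the
xorshift generator standing in for a real source of random tags, and the
names `toy_malloc`, `toy_free` and `access_ok` are hypothetical, and
addresses are modelled as arena offsets with the tag in bits 56..63.

.. code-block:: c

  #include <stddef.h>
  #include <stdint.h>

  enum { TG = 16, HEAP_SIZE = 4096 };

  static uint8_t heap_shadow[HEAP_SIZE / TG]; /* one tag per TG-byte granule */
  static size_t heap_top;                     /* bump-allocator cursor */
  static uint32_t rng_state = 0x12345678u;

  /* Small xorshift PRNG standing in for a real source of random tags. */
  static uint8_t next_tag(void) {
    rng_state ^= rng_state << 13;
    rng_state ^= rng_state >> 17;
    rng_state ^= rng_state << 5;
    return (uint8_t)(rng_state >> 24);
  }

  /* Number of TG-byte granules an object of this size occupies (at least 1). */
  static size_t granules(size_t size) {
    return size == 0 ? 1 : (size + TG - 1) / TG;
  }

  static void set_tags(size_t offset, size_t size, uint8_t tag) {
    size_t first = offset / TG;
    for (size_t i = 0; i < granules(size); i++)
      heap_shadow[first + i] = tag;
  }

  /* Returns a tagged arena offset, or UINT64_MAX if the arena is exhausted. */
  static uint64_t toy_malloc(size_t size) {
    if (size > HEAP_SIZE)
      return UINT64_MAX;
    size_t rounded = granules(size) * TG; /* keeps every object TG-aligned */
    if (rounded > HEAP_SIZE - heap_top)
      return UINT64_MAX;
    size_t offset = heap_top;
    heap_top += rounded;
    uint8_t tag = next_tag();
    set_tags(offset, size, tag);
    return ((uint64_t)tag << 56) | offset;
  }

  /* Retag the object's memory with a tag that differs from the pointer's tag. */
  static void toy_free(uint64_t ptr, size_t size) {
    uint8_t old_tag = (uint8_t)(ptr >> 56);
    uint8_t new_tag = next_tag();
    if (new_tag == old_tag)
      new_tag ^= 1;
    set_tags((size_t)(ptr & 0x00ffffffffffffffULL), size, new_tag);
  }

  static int access_ok(uint64_t ptr) {
    return heap_shadow[(ptr & 0x00ffffffffffffffULL) / TG] == (uint8_t)(ptr >> 56);
  }

A pointer returned by `toy_malloc` passes `access_ok` until `toy_free` retags
its memory; after that the same (now dangling) pointer fails the check, which
is how a use-after-free would be caught without a quarantine.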

Stack
-----

Stack frames are instrumented by aligning all non-promotable allocas
to `TG` and tagging stack memory in the function prologue and epilogue.

Tags for different allocas in one function are **not** generated
independently; doing that in a function with `M` allocas would require
maintaining `M` live stack pointers, significantly increasing register
pressure. Instead we generate a single base tag value in the prologue,
and build the tag for alloca number `N` as `ReTag(BaseTag, N)`, where
ReTag can be as simple as exclusive-or with constant `N`.
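
A minimal sketch of the exclusive-or variant, assuming 8-bit tags; the
function name `retag` and its signature are hypothetical and only illustrate
the idea.

.. code-block:: c

  #include <stdint.h>

  /* Derive the tag of one alloca from the per-frame base tag by XOR-ing in
     the alloca's index within the function. */
  static uint8_t retag(uint8_t base_tag, unsigned alloca_index) {
    return (uint8_t)(base_tag ^ alloca_index);
  }

Because exclusive-or with a fixed base tag is a bijection, distinct alloca
indices below 256 always yield distinct tags, and only the base tag has to be
kept live across the function.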

Stack instrumentation is expected to be a major source of overhead,
but could be optional.

Globals
-------

TODO: details.

Error reporting
---------------

Errors are generated by a trap instruction (`BRK` in the instruction sequence
shown under Memory Accesses) and are handled by a signal handler.

Attribute
---------

HWASAN uses its own LLVM IR Attribute `sanitize_hwaddress` and a matching
C function attribute. An alternative would be to re-use ASAN's attribute
`sanitize_address`. The reasons to use a separate attribute are:

  * Users may need to disable ASAN but not HWASAN, or vice versa,
    because the tools have different trade-offs and compatibility issues.
  * LLVM (ideally) does not use flags to decide which pass is being used;
    ASAN or HWASAN is applied based on the function attributes.

This does mean that users of HWASAN may need to add the new attribute
to the code that already uses the old attribute.
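
As an illustration, code that already opts a function out of ASAN might be
extended along these lines. This sketch assumes Clang's `no_sanitize`
function attribute with the sanitizer names `"address"` and `"hwaddress"`;
the macro `NO_SANITIZE_ADDRESS_ANY` and the function `read_raw` are
hypothetical names, and other compilers simply get an empty macro here.

.. code-block:: c

  /* Assumption: Clang accepts both sanitizer names in one no_sanitize
     attribute. Compilers other than Clang fall back to no attribute. */
  #if defined(__clang__)
  #define NO_SANITIZE_ADDRESS_ANY \
    __attribute__((no_sanitize("address", "hwaddress")))
  #else
  #define NO_SANITIZE_ADDRESS_ANY
  #endif

  /* Hypothetical helper whose load should stay uninstrumented by both tools. */
  NO_SANITIZE_ADDRESS_ANY
  static int read_raw(const int *p) {
    return *p;
  }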


Comparison with AddressSanitizer
================================

HWASAN:
  * Is less portable than :doc:`AddressSanitizer`
    as it relies on hardware `Address Tagging`_ (AArch64).
    Address Tagging can be emulated with compiler instrumentation,
    but it will require the instrumentation to remove the tags before
    any load or store, which is infeasible in any realistic environment
    that contains non-instrumented code.
  * May have compatibility problems if the target code uses higher
    pointer bits for other purposes.
  * May require changes in OS kernels (e.g. Linux seems to dislike
    tagged pointers passed from user space:
    https://www.kernel.org/doc/Documentation/arm64/tagged-pointers.txt).
  * **Does not require redzones to detect buffer overflows**,
    but the buffer overflow detection is probabilistic, with a roughly
    `1/(2**TS)` chance of missing a bug (6.25% or 0.39% with 4-bit and 8-bit `TS`
    respectively).
  * **Does not require quarantine to detect heap-use-after-free,
    or stack-use-after-return**.
    The detection is similarly probabilistic.

The memory overhead of HWASAN is expected to be much smaller
than that of AddressSanitizer:
`1/TG` extra memory for the shadow
and some overhead due to `TG`-aligning all objects.
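
The figures quoted in this section follow from simple arithmetic, sketched
below. The helper names are hypothetical, and the miss probability assumes
that tags are uniformly random and independent.

.. code-block:: c

  #include <stddef.h>

  /* Chance that a bad access happens to see a matching tag: 1 / 2^TS.
     ts_bits must be smaller than 64. */
  static double miss_probability(unsigned ts_bits) {
    return 1.0 / (double)(1ULL << ts_bits);
  }

  /* Shadow size for a TG=>1 mapping with one-byte tags: 1/TG of the
     application memory. */
  static size_t shadow_bytes(size_t app_bytes, size_t tg) {
    return app_bytes / tg;
  }

  /* Object size after rounding up to the tagging granularity; the
     difference from the requested size is the alignment overhead. */
  static size_t aligned_size(size_t size, size_t tg) {
    return (size + tg - 1) / tg * tg;
  }

With `TG = 16`, for instance, 1 GiB of application memory needs 64 MiB of
shadow, and a 24-byte object is padded to 32 bytes.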

Supported architectures
=======================
HWASAN relies on `Address Tagging`_ which is only available on AArch64.
For other 64-bit architectures it is possible to remove the address tags
before every load and store by compiler instrumentation, but this variant
will have limited deployability since not all of the code is
typically instrumented.

HWASAN's approach is not applicable to 32-bit architectures.


Related Work
============
* `SPARC ADI`_ implements a similar tool mostly in hardware.
* `Effective and Efficient Memory Protection Using Dynamic Tainting`_ discusses
  similar approaches ("lock & key").
* `Watchdog`_ discusses a heavier, but still somewhat similar
  "lock & key" approach.
* *TODO: add more "related work" links. Suggestions are welcome.*


.. _Watchdog: https://www.cis.upenn.edu/acg/papers/isca12_watchdog.pdf
.. _Effective and Efficient Memory Protection Using Dynamic Tainting: https://www.cc.gatech.edu/~orso/papers/clause.doudalis.orso.prvulovic.pdf
.. _SPARC ADI: https://lazytyped.blogspot.com/2017/09/getting-started-with-adi.html
.. _AddressSanitizer paper: https://www.usenix.org/system/files/conference/atc12/atc12-final39.pdf
.. _Address Tagging: http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.den0024a/ch12s05s01.html