summaryrefslogtreecommitdiffstats
path: root/Documentation/config-accounts.txt
blob: 13e663e0bb06508efb55afb2d610a6fa3b74ae6f (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
= Gerrit Code Review - Accounts

== Overview

Starting from 2.15 Gerrit accounts are fully stored in
link:note-db.html[NoteDb].

The account data consists of a sequence number (account ID), account
properties (full name, preferred email, registration date, status,
inactive flag), preferences (general, diff and edit preferences),
project watches, SSH keys, external IDs, starred changes and reviewed
flags.

Most account data is stored in a special link:#all-users[All-Users]
repository, which has one branch per user. Within the user branch there
are Git config files for the link:#account-properties[
account properties], the link:#preferences[account preferences] and the
link:#project-watches[project watches]. In addition there is an
`authorized_keys` file for the link:#ssh-keys[SSH keys] that follows
the standard OpenSSH file format.

The account data in the user branch is versioned and the Git history of
this branch serves as an audit log.

The link:#external-ids[external IDs] are stored as Git Notes inside the
`All-Users` repository in the `refs/meta/external-ids` notes branch.
Storing all external IDs in a notes branch ensures that each external
ID is only used once.

The link:#starred-changes[starred changes] are represented as
independent refs in the `All-Users` repository. They are not stored in
the user branch, since this data doesn't need versioning.

The link:#reviewed-flags[reviewed flags] are not stored in Git, but are
persisted in a database table. This is because there is a high volume
of reviewed flags and storing them in Git would be inefficient.

Since accessing the account data in Git is not fast enough for account
queries, e.g. when suggesting reviewers, Gerrit has a
link:#account-index[secondary index for accounts].

[[all-users]]
== `All-Users` repository

The `All-Users` repository is a special repository that only contains
user-specific information. It contains one branch per user. The user
branch is formatted as `refs/users/CD/ABCD`, where `CD/ABCD` is the
link:access-control.html#sharded-user-id[sharded account ID], e.g. the
user branch for account `1000856` is `refs/users/56/1000856`. The
account IDs in the user refs are sharded so that there is a good
distribution of the Git data in the storage system.

A user branch must exist for each account, as it represents the
account. The files in the user branch are all optional. This means
having a user branch with a tree that is completely empty is also a
valid account definition.

Updates to the user branch are done through the
link:rest-api-accounts.html[Gerrit REST API], but users can also
manually fetch their user branch and push changes back to Gerrit. On
push the user data is evaluated and invalid user data is rejected.

To hide the implementation detail of the sharded account ID in the ref
name Gerrit offers a magic `refs/users/self` ref that is automatically
resolved to the user branch of the calling user. The user can then use
this ref to fetch from and push to the own user branch. E.g. if user
`1000856` pushes to `refs/users/self`, the branch
`refs/users/56/1000856` is updated. In Gerrit `self` is an established
term to refer to the calling user (e.g. in change queries). This is why
the magic ref for the own user branch is called `refs/users/self`.

A user branch should only be readable and writeable by the user to whom
the account belongs. To assign permissions on the user branches the
normal branch permission system is used. In the permission system the
user branches are specified as `refs/users/${shardeduserid}`. The
`${shardeduserid}` variable is resolved to the sharded account ID. This
variable is used to assign default access rights on all user branches
that apply only to the owning user. The following permissions are set
by default when a Gerrit site is newly installed or upgraded to a
version which supports user branches:

.All-Users project.config
----
[access "refs/users/${shardeduserid}"]
  exclusiveGroupPermissions = read push submit
  read = group Registered Users
  push = group Registered Users
  label-Code-Review = -2..+2 group Registered Users
  submit = group Registered Users
----

The user branch contains several files with account data which are
described link:#account-data-in-user-branch[below].

In addition to the user branches the `All-Users` repository also
contains a branch for the link:#external-ids[external IDs] and special
refs for the link:#starred-changes[starred changes].

Also the next available value of the link:#account-sequence[account
sequence] is stored in the `All-Users` repository.

[[account-index]]
== Account Index

There are several situations in which Gerrit needs to query accounts,
e.g.:

* For sending email notifications to project watchers.
* For reviewer suggestions.

Accessing the account data in Git is not fast enough for account
queries, since it requires accessing all user branches and parsing
all files in each of them. To overcome this Gerrit has a secondary
index for accounts. The account index is either based on
link:config-gerrit.html#index.type[Lucene or Elasticsearch].

Via the link:rest-api-accounts.html#query-account[Query Account] REST
endpoint link:user-search-accounts.html[generic account queries] are
supported.

Accounts are automatically reindexed on any update. The
link:rest-api-accounts.html#index-account[Index Account] REST endpoint
allows to reindex an account manually. In addition the
link:pgm-reindex.html[reindex] program can be used to reindex all
accounts offline.

[[account-data-in-user-branch]]
== Account Data in User Branch

A user branch contains several Git config files with the account data:

* `account.config`:
+
Stores the link:#account-properties[account properties].

* `preferences.config`:
+
Stores the link:#preferences[user preferences] of the account.

* `watch.config`:
+
Stores the link:#project-watches[project watches] of the account.

In addition it contains an
link:https://en.wikibooks.org/wiki/OpenSSH/Client_Configuration_Files#.7E.2F.ssh.2Fauthorized_keys[
authorized_keys] file with the link:#ssh-keys[SSH keys] of the account.

[[account-properties]]
=== Account Properties

The account properties are stored in the user branch in the
`account.config` file:

----
[account]
  fullName = John Doe
  preferredEmail = john.doe@example.com
  status = OOO
  active = false
----

For active accounts the `active` parameter can be omitted.

The registration date is not contained in the `account.config` file but
is derived from the timestamp of the first commit on the user branch.

When users update their account properties by pushing to the user
branch, it is verified that the preferred email exists in the external
IDs.

Users are not allowed to flip the active value themselves; only
administrators and users with the
link:access-control.html#capability_modifyAccount[Modify Account]
global capability are allowed to change it.

Since all data in the `account.config` file is optional the
`account.config` file may be absent from some user branches.

[[preferences]]
=== Preferences

The account properties are stored in the user branch in the
`preferences.config` file. There are separate sections for
link:intro-user.html#preferences[general],
link:user-review-ui.html#diff-preferences[diff] and edit preferences:

----
[general]
  showSiteHeader = false
[diff]
  hideTopMenu = true
[edit]
  lineLength = 80
----

The parameter names match the names that are used in the preferences REST API:

* link:rest-api-accounts.html#preferences-info[General Preferences]
* link:rest-api-accounts.html#diff-preferences-info[Diff Preferences]
* link:rest-api-accounts.html#edit-preferences-info[Edit Preferences]

If the value for a preference is the same as the default value for this
preference, it can be omitted in the `preferences.config` file.

Defaults for preferences that apply for all accounts can be configured
in the `refs/users/default` branch in the `All-Users` repository.

[[project-watches]]
=== Project Watches

Users can configure watches on projects to receive email notifications
for changes of that project.

A watch configuration consists of the project name and an optional
filter query. If a filter query is specified, email notifications will
be sent only for changes of that project that match this query.

In addition, each watch configuration can contain a list of
notification types that determine for which events email notifications
should be sent. E.g. a user can configure that email notifications
should only be sent if a new patch set is uploaded and when the change
gets submitted, but not on other events.

Project watches are stored in a `watch.config` file in the user branch:

----
[project "foo"]
  notify = * [ALL_COMMENTS]
  notify = branch:master [ALL_COMMENTS, NEW_PATCHSETS]
  notify = branch:master owner:self [SUBMITTED_CHANGES]
----

The `watch.config` file has one project section for all project watches
of a project. The project name is used as subsection name and the
filters with the notification types, that decide for which events email
notifications should be sent, are represented as `notify` values in the
subsection. A `notify` value is formatted as
"<filter> [<comma-separated-list-of-notification-types>]". The
supported notification types are described in the
link:user-notify.html#notify.name.type[Email Notifications documentation].

For a change event, a notification will be sent if any `notify` value
of the corresponding project has both a filter that matches the change
and a notification type that matches the event.

In order to send email notifications on change events, Gerrit needs to
find all accounts that watch the corresponding project. To make this
lookup fast the secondary account index is used. The account index
contains a repeated field that stores the projects that are being
watched by an account. After the accounts that watch the project have
been retrieved from the index, the complete watch configuration is
available from the account cache and Gerrit can check if any watch
matches the change and the event.

[[ssh-keys]]
=== SSH Keys

SSH keys are stored in the user branch in an `authorized_keys` file,
which is the
link:https://en.wikibooks.org/wiki/OpenSSH/Client_Configuration_Files#.7E.2F.ssh.2Fauthorized_keys[
standard OpenSSH file format] for storing SSH keys:

----
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAAAgQCgug5VyMXQGnem2H1KVC4/HcRcD4zzBqSuJBRWVonSSoz3RoAZ7bWXCVVGwchtXwUURD689wFYdiPecOrWOUgeeyRq754YWRhU+W28vf8IZixgjCmiBhaL2gt3wff6pP+NXJpTSA4aeWE5DfNK5tZlxlSxqkKOS8JRSUeNQov5Tw== john.doe@example.com
# DELETED
# INVALID ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAAAgQDm5yP7FmEoqzQRDyskX+9+N0q9GrvZeh5RG52EUpE4ms/Ujm3ewV1LoGzc/lYKJAIbdcZQNJ9+06EfWZaIRA3oOwAPe1eCnX+aLr8E6Tw2gDMQOGc5e9HfyXpC2pDvzauoZNYqLALOG3y/1xjo7IH8GYRS2B7zO/Mf9DdCcCKSfw== john.doe@example.com
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAAAgQCaS7RHEcZ/zjl9hkWkqnm29RNr2OQ/TZ5jk2qBVMH3BgzPsTsEs+7ag9tfD8OCj+vOcwm626mQBZoR2e3niHa/9gnHBHFtOrGfzKbpRjTWtiOZbB9HF+rqMVD+Dawo/oicX/dDg7VAgOFSPothe6RMhbgWf84UcK5aQd5eP5y+tQ== john.doe@example.com
----

When the SSH API is used, Gerrit needs an efficient way to lookup SSH
keys by username. Since the username can be easily resolved to an
account ID (via the account cache), accessing the SSH keys in the user
branch is fast.

To identify SSH keys in the REST API Gerrit uses
link:rest-api-accounts.html#ssh-key-id[sequence numbers per account].
This is why the order of the keys in the `authorized_keys` file is
used to determines the sequence numbers of the keys (the sequence
numbers start at 1).

To keep the sequence numbers intact when a key is deleted, a
'# DELETED' line is inserted at the position where the key was deleted.

Invalid keys are marked with the prefix '# INVALID'.

[[external-ids]]
== External IDs

External IDs are used to link external identities, such as an LDAP
account or an OAUTH identity, to an account in Gerrit.

External IDs are stored as Git Notes in the `All-Users` repository. The
name of the notes branch is `refs/meta/external-ids`.

As note key the SHA1 of the external ID key is used. This ensures that
an external ID is used only once (e.g. an external ID can never be
assigned to multiple accounts at a point in time).

[IMPORTANT]
If the external ID key is changed manually you must adapt the note key
to the new SHA1. Otherwise the external ID becomes inconsistent and is
ignored by Gerrit.

The note content is a Git config file:

----
[externalId "username:jdoe"]
  accountId = 1003407
  email = jdoe@example.com
  password = bcrypt:4:LCbmSBDivK/hhGVQMfkDpA==:XcWn0pKYSVU/UJgOvhidkEtmqCp6oKB7
----

The config file has one `externalId` section. The external ID key which
consists of scheme and ID in the format '<scheme>:<id>' is used as
subsection name.

The `accountId` field is mandatory, the `email` and `password` fields
are optional.

The external IDs are maintained by Gerrit, this means users are not
allowed to manually edit their external IDs. Only users with the
link:access-control.html#capability_accessDatabase[Access Database]
global capability can push updates to the `refs/meta/external-ids`
branch. However Gerrit rejects pushes if:

* any external ID config file cannot be parsed
* if a note key does not match the SHA of the external ID key in the
  note content
* external IDs for non-existing accounts are contained
* invalid emails are contained
* any email is not unique (the same email is assigned to multiple
  accounts)
* hashed passwords of external IDs with scheme `username` cannot be
  decoded

[[starred-changes]]
== Starred Changes

link:dev-stars.html[Starred changes] allow users to mark changes as
favorites and receive email notifications for them.

Each starred change is a tuple of an account ID, a change ID and a
label.

To keep track of a change that is starred by an account, Gerrit creates
a `refs/starred-changes/YY/XXXX/ZZZZZZZ` ref in the `All-Users`
repository, where `YY/XXXX` is the sharded numeric change ID and
`ZZZZZZZ` is the account ID.

A starred-changes ref points to a blob that contains the list of labels
that the account set on the change. The label list is stored as UTF-8
text with one label per line.

Since JGit has explicit optimizations for looking up refs by prefix
when the prefix ends with '/', this ref format is optimized to find
starred changes by change ID. Finding starred changes by change ID is
e.g. needed when a change is updated so that all users that have
the link:dev-stars.html#default-star[default star] on the change can be
notified by email.

Gerrit also needs an efficient way to find all changes that were
starred by an account, e.g. to provide results for the
link:user-search.html#is-starred[is:starred] query operator. With the
ref format as described above the lookup of starred changes by account
ID is expensive, as this requires a scan of the full
`refs/starred-changes/*` namespace. To overcome this the users that
have starred a change are stored in the change index together with the
star labels.

[[reviewed-flags]]
== Reviewed Flags

When reviewing a patch set in the Gerrit UI, the reviewer can mark
files in the patch set as reviewed. These markers are called ‘Reviewed
Flags’ and are private to the user. A reviewed flag is a tuple of patch
set ID, file and account ID.

Each user can have many thousands of reviewed flags and over time the
number can grow without bounds.

The high amount of reviewed flags makes a storage in Git unsuitable
because each update requires opening the repository and committing a
change, which is a high overhead for flipping a bit. Therefore the
reviewed flags are stored in a database table. By default they are
stored in a local H2 database, but there is an extension point that
allows to plug in alternate implementations for storing the reviewed
flags. To replace the storage for reviewed flags a plugin needs to
implement the link:dev-plugins.html#account-patch-review-store[
AccountPatchReviewStore] interface. E.g. to support a multi-master
setup where reviewed flags should be replicated between the master
nodes one could implement a store for the reviewed flags that is
based on MySQL with replication.

[[account-sequence]]
== Account Sequence

The next available account sequence number is stored as UTF-8 text in a
blob pointed to by the `refs/sequences/accounts` ref in the `All-Users`
repository.

Multiple processes share the same sequence by incrementing the counter
using normal git ref updates. To amortize the cost of these ref
updates, processes increment the counter by a larger number and hand
out numbers from that range in memory until they run out. The size of
the account ID batch that each process retrieves at once is controlled
by the link:config-gerrit.html#notedb.accounts.sequenceBatchSize[
notedb.accounts.sequenceBatchSize] parameter in the `gerrit.config`
file.

[[replication]]
== Replication

To replicate account data the following branches from the `All-Users`
repository must be replicated:

* `refs/users/*` (user branches)
* `refs/meta/external-ids` (external IDs)
* `refs/starred-changes/*` (star labels)
* `refs/sequences/accounts` (account sequence numbers, not needed for Gerrit
  slaves)

GERRIT
------
Part of link:index.html[Gerrit Code Review]

SEARCHBOX
---------