Diffstat (limited to 'Documentation/config-accounts.txt')
-rw-r--r-- Documentation/config-accounts.txt 423
1 files changed, 423 insertions, 0 deletions
diff --git a/Documentation/config-accounts.txt b/Documentation/config-accounts.txt
new file mode 100644
index 0000000000..35e6800a85
--- /dev/null
+++ b/Documentation/config-accounts.txt
@@ -0,0 +1,423 @@
+= Gerrit Code Review - Accounts
+
+== Overview
+
+Starting with version 2.15, Gerrit accounts are fully stored in
+link:note-db.html[NoteDb].
+
+The account data consists of a sequence number (account ID), account
+properties (full name, preferred email, registration date, status,
+inactive flag), preferences (general, diff and edit preferences),
+project watches, SSH keys, external IDs, starred changes and reviewed
+flags.
+
+Most account data is stored in a special link:#all-users[All-Users]
+repository, which has one branch per user. Within the user branch there
+are Git config files for the link:#account-properties[
+account properties], the link:#preferences[account preferences] and the
+link:#project-watches[project watches]. In addition there is an
+`authorized_keys` file for the link:#ssh-keys[SSH keys] that follows
+the standard OpenSSH file format.
+
+The account data in the user branch is versioned and the Git history of
+this branch serves as an audit log.
+
+The link:#external-ids[external IDs] are stored as Git Notes inside the
+`All-Users` repository in the `refs/meta/external-ids` notes branch.
+Storing all external IDs in a notes branch ensures that each external
+ID is only used once.
+
+The link:#starred-changes[starred changes] are represented as
+independent refs in the `All-Users` repository. They are not stored in
+the user branch, since this data doesn't need versioning.
+
+The link:#reviewed-flags[reviewed flags] are not stored in Git, but are
+persisted in a database table. This is because there is a high volume
+of reviewed flags and storing them in Git would be inefficient.
+
+Since accessing the account data in Git is not fast enough for account
+queries, e.g. when suggesting reviewers, Gerrit has a
+link:#account-index[secondary index for accounts].
+
+[[all-users]]
+== `All-Users` repository
+
+The `All-Users` repository is a special repository that only contains
+user-specific information. It contains one branch per user. The user
+branch is formatted as `refs/users/CD/ABCD`, where `CD/ABCD` is the
+link:access-control.html#sharded-user-id[sharded account ID], e.g. the
+user branch for account `1000856` is `refs/users/56/1000856`. The
+account IDs in the user refs are sharded so that there is a good
+distribution of the Git data in the storage system.
+
+A user branch must exist for each account, as it represents the
+account. The files in the user branch are all optional. This means
+having a user branch with a tree that is completely empty is also a
+valid account definition.
+
+Updates to the user branch are done through the
+link:rest-api-accounts.html[Gerrit REST API], but users can also
+manually fetch their user branch and push changes back to Gerrit. On
+push the user data is evaluated and invalid user data is rejected.
+
+To hide the implementation detail of the sharded account ID in the ref
+name, Gerrit offers a magic `refs/users/self` ref that is automatically
+resolved to the user branch of the calling user. Users can use this ref
+to fetch from and push to their own user branch. E.g. if user `1000856`
+pushes to `refs/users/self`, the branch `refs/users/56/1000856` is
+updated. In Gerrit `self` is an established term to refer to the
+calling user (e.g. in change queries). This is why the magic ref for
+a user's own branch is called `refs/users/self`.
+
+A user branch should only be readable and writeable by the user to whom
+the account belongs. To assign permissions on the user branches the
+normal branch permission system is used. In the permission system the
+user branches are specified as `refs/users/${shardeduserid}`. The
+`${shardeduserid}` variable is resolved to the sharded account ID. This
+variable is used to assign default access rights on all user branches
+that apply only to the owning user. The following permissions are set
+by default when a Gerrit site is newly installed or upgraded to a
+version which supports user branches:
+
+.All-Users project.config
+----
+[access "refs/users/${shardeduserid}"]
+ exclusiveGroupPermissions = read push submit
+ read = group Registered Users
+ push = group Registered Users
+ label-Code-Review = -2..+2 group Registered Users
+ submit = group Registered Users
+----
+
+The user branch contains several files with account data which are
+described link:#account-data-in-user-branch[below].
+
+In addition to the user branches the `All-Users` repository also
+contains a branch for the link:#external-ids[external IDs] and special
+refs for the link:#starred-changes[starred changes].
+
+Also the next available value of the link:#account-sequence[account
+sequence] is stored in the `All-Users` repository.
+
+[[account-index]]
+== Account Index
+
+There are several situations in which Gerrit needs to query accounts,
+e.g.:
+
+* For sending email notifications to project watchers.
+* For reviewer suggestions.
+
+Accessing the account data in Git is not fast enough for account
+queries, since it requires accessing all user branches and parsing
+all files in each of them. To overcome this Gerrit has a secondary
+index for accounts. The account index is based on either
+link:config-gerrit.html#index.type[Lucene or Elasticsearch].
+
+The link:rest-api-accounts.html#query-account[Query Account] REST
+endpoint supports link:user-search-accounts.html[generic account
+queries].
+
+Accounts are automatically reindexed on any update. The
+link:rest-api-accounts.html#index-account[Index Account] REST endpoint
+allows an account to be reindexed manually. In addition, the
+link:pgm-reindex.html[reindex] program can be used to reindex all
+accounts offline.
+
+[[account-data-in-user-branch]]
+== Account Data in User Branch
+
+A user branch contains several Git config files with the account data:
+
+* `account.config`:
++
+Stores the link:#account-properties[account properties].
+
+* `preferences.config`:
++
+Stores the link:#preferences[user preferences] of the account.
+
+* `watch.config`:
++
+Stores the link:#project-watches[project watches] of the account.
+
+In addition it contains an
+link:https://en.wikibooks.org/wiki/OpenSSH/Client_Configuration_Files#.7E.2F.ssh.2Fauthorized_keys[
+authorized_keys] file with the link:#ssh-keys[SSH keys] of the account.
+
+[[account-properties]]
+=== Account Properties
+
+The account properties are stored in the user branch in the
+`account.config` file:
+
+----
+[account]
+ fullName = John Doe
+ preferredEmail = john.doe@example.com
+ status = OOO
+ active = false
+----
+
+For active accounts the `active` parameter can be omitted.
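+
+As a minimal sketch of that case, the `account.config` of an active
+account could look as follows (the values are illustrative and reuse
+the sample data from the example above):
+
+----
+[account]
+  fullName = John Doe
+  preferredEmail = john.doe@example.com
+----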
+
+The registration date is not contained in the `account.config` file but
+is derived from the timestamp of the first commit on the user branch.
+
+When users update their account properties by pushing to the user
+branch, it is verified that the preferred email exists in the external
+IDs.
+
+Users are not allowed to flip the active value themselves; only
+administrators and users with the
+link:access-control.html#capability_modifyAccount[Modify Account]
+global capability are allowed to change it.
+
+Since all data in the `account.config` file is optional the
+`account.config` file may be absent from some user branches.
+
+[[preferences]]
+=== Preferences
+
+The account preferences are stored in the user branch in the
+`preferences.config` file. There are separate sections for
+link:intro-user.html#preferences[general],
+link:user-review-ui.html#diff-preferences[diff] and edit preferences:
+
+----
+[general]
+ showSiteHeader = false
+[diff]
+ hideTopMenu = true
+[edit]
+ lineLength = 80
+----
+
+The parameter names match the names that are used in the preferences REST API:
+
+* link:rest-api-accounts.html#preferences-info[General Preferences]
+* link:rest-api-accounts.html#diff-preferences-info[Diff Preferences]
+* link:rest-api-accounts.html#edit-preferences-info[Edit Preferences]
+
+If the value for a preference is the same as the default value for this
+preference, it can be omitted in the `preferences.config` file.
+
+Defaults for general and diff preferences that apply to all accounts
+can be configured in the `refs/users/default` branch in the `All-Users`
+repository.
+
+[[project-watches]]
+=== Project Watches
+
+Users can configure watches on projects to receive email notifications
+for changes of that project.
+
+A watch configuration consists of the project name and an optional
+filter query. If a filter query is specified, email notifications will
+be sent only for changes of that project that match this query.
+
+In addition, each watch configuration can contain a list of
+notification types that determine for which events email notifications
+should be sent. E.g. a user can configure that email notifications
+should only be sent if a new patch set is uploaded and when the change
+gets submitted, but not on other events.
+
+Project watches are stored in a `watch.config` file in the user branch:
+
+----
+[project "foo"]
+ notify = * [ALL_COMMENTS]
+ notify = branch:master [ALL_COMMENTS, NEW_PATCHSETS]
+ notify = branch:master owner:self [SUBMITTED_CHANGES]
+----
+
+The `watch.config` file has one `project` section per watched project,
+which holds all watches of that project. The project name is used as
+the subsection name. Each `notify` value in the subsection combines a
+filter with the notification types that decide for which events email
+notifications are sent. A `notify` value is formatted as
+"<filter> [<comma-separated-list-of-notification-types>]". The
+supported notification types are described in the
+link:user-notify.html#notify.name.type[Email Notifications documentation].
+
+For a change event, a notification will be sent if any `notify` value
+of the corresponding project has both a filter that matches the change
+and a notification type that matches the event.
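+
+As an illustration, the case mentioned earlier, in which a user wants
+notifications only when a new patch set is uploaded and when a change
+is submitted, could be expressed as follows. This is a sketch: the
+project name `foo` is a placeholder, and the `*` filter, taken from the
+example above, is assumed to match every change of the project:
+
+----
+[project "foo"]
+  notify = * [NEW_PATCHSETS, SUBMITTED_CHANGES]
+----
+
+With this configuration a new comment on a change would not trigger a
+notification, because no `notify` value lists a matching notification
+type.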
+
+In order to send email notifications on change events, Gerrit needs to
+find all accounts that watch the corresponding project. To make this
+lookup fast the secondary account index is used. The account index
+contains a repeated field that stores the projects that are being
+watched by an account. After the accounts that watch the project have
+been retrieved from the index, the complete watch configuration is
+available from the account cache and Gerrit can check if any watch
+matches the change and the event.
+
+[[ssh-keys]]
+=== SSH Keys
+
+SSH keys are stored in the user branch in an `authorized_keys` file,
+which is the
+link:https://en.wikibooks.org/wiki/OpenSSH/Client_Configuration_Files#.7E.2F.ssh.2Fauthorized_keys[
+standard OpenSSH file format] for storing SSH keys:
+
+----
+ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAAAgQCgug5VyMXQGnem2H1KVC4/HcRcD4zzBqSuJBRWVonSSoz3RoAZ7bWXCVVGwchtXwUURD689wFYdiPecOrWOUgeeyRq754YWRhU+W28vf8IZixgjCmiBhaL2gt3wff6pP+NXJpTSA4aeWE5DfNK5tZlxlSxqkKOS8JRSUeNQov5Tw== john.doe@example.com
+# DELETED
+# INVALID ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAAAgQDm5yP7FmEoqzQRDyskX+9+N0q9GrvZeh5RG52EUpE4ms/Ujm3ewV1LoGzc/lYKJAIbdcZQNJ9+06EfWZaIRA3oOwAPe1eCnX+aLr8E6Tw2gDMQOGc5e9HfyXpC2pDvzauoZNYqLALOG3y/1xjo7IH8GYRS2B7zO/Mf9DdCcCKSfw== john.doe@example.com
+ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAAAgQCaS7RHEcZ/zjl9hkWkqnm29RNr2OQ/TZ5jk2qBVMH3BgzPsTsEs+7ag9tfD8OCj+vOcwm626mQBZoR2e3niHa/9gnHBHFtOrGfzKbpRjTWtiOZbB9HF+rqMVD+Dawo/oicX/dDg7VAgOFSPothe6RMhbgWf84UcK5aQd5eP5y+tQ== john.doe@example.com
+----
+
+When the SSH API is used, Gerrit needs an efficient way to look up SSH
+keys by username. Since the username can be easily resolved to an
+account ID (via the account cache), accessing the SSH keys in the user
+branch is fast.
+
+To identify SSH keys in the REST API Gerrit uses
+link:rest-api-accounts.html#ssh-key-id[sequence numbers per account].
+This is why the order of the keys in the `authorized_keys` file
+determines the sequence numbers of the keys (the sequence numbers
+start at 1).
+
+To keep the sequence numbers intact when a key is deleted, a
+'# DELETED' line is inserted at the position where the key was deleted.
+
+Invalid keys are marked with the prefix '# INVALID'.
+
+[[external-ids]]
+== External IDs
+
+External IDs are used to link external identities, such as an LDAP
+account or an OAuth identity, to an account in Gerrit.
+
+External IDs are stored as Git Notes in the `All-Users` repository. The
+name of the notes branch is `refs/meta/external-ids`.
+
+The SHA-1 of the external ID key is used as the note key. This ensures
+that an external ID is used only once (i.e. an external ID can never
+be assigned to multiple accounts at the same time).
+
+The note content is a Git config file:
+
+----
+[externalId "username:jdoe"]
+ accountId = 1003407
+ email = jdoe@example.com
+ password = bcrypt:4:LCbmSBDivK/hhGVQMfkDpA==:XcWn0pKYSVU/UJgOvhidkEtmqCp6oKB7
+----
+
+The config file has one `externalId` section. The external ID key,
+which consists of scheme and ID in the format '<scheme>:<id>', is used
+as the subsection name.
+
+The `accountId` field is mandatory, the `email` and `password` fields
+are optional.
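+
+Since only `accountId` is mandatory, a minimal note could look like
+this (a sketch that reuses the illustrative key and account ID from
+the example above):
+
+----
+[externalId "username:jdoe"]
+  accountId = 1003407
+----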
+
+The external IDs are maintained by Gerrit; users are not allowed to
+edit their external IDs manually. Only users with the
+link:access-control.html#capability_accessDatabase[Access Database]
+global capability can push updates to the `refs/meta/external-ids`
+branch. Even then, Gerrit rejects a push if:
+
+* any external ID config file cannot be parsed
+* a note key does not match the SHA-1 of the external ID key in the
+ note content
+* external IDs for non-existing accounts are contained
+* invalid emails are contained
+* any email is not unique (the same email is assigned to multiple
+ accounts)
+* hashed passwords of external IDs with scheme `username` cannot be
+ decoded
+
+[[starred-changes]]
+== Starred Changes
+
+link:dev-stars.html[Starred changes] allow users to mark changes as
+favorites and receive email notifications for them.
+
+Each starred change is a tuple of an account ID, a change ID and a
+label.
+
+To keep track of a change that is starred by an account, Gerrit creates
+a `refs/starred-changes/YY/XXXX/ZZZZZZZ` ref in the `All-Users`
+repository, where `YY/XXXX` is the sharded numeric change ID and
+`ZZZZZZZ` is the account ID.
+
+A starred-changes ref points to a blob that contains the list of labels
+that the account set on the change. The label list is stored as UTF-8
+text with one label per line.
+
+Since JGit has explicit optimizations for looking up refs by prefix
+when the prefix ends with '/', this ref format is optimized to find
+starred changes by change ID. Finding starred changes by change ID is
+needed, for example, when a change is updated so that all users that
+have the link:dev-stars.html#default-star[default star] on the change
+can be notified by email.
+
+Gerrit also needs an efficient way to find all changes that were
+starred by an account, e.g. to provide results for the
+link:user-search.html#is-starred[is:starred] query operator. With the
+ref format as described above the lookup of starred changes by account
+ID is expensive, as this requires a scan of the full
+`refs/starred-changes/*` namespace. To overcome this the users that
+have starred a change are stored in the change index together with the
+star labels.
+
+[[reviewed-flags]]
+== Reviewed Flags
+
+When reviewing a patch set in the Gerrit UI, the reviewer can mark
+files in the patch set as reviewed. These markers are called ‘Reviewed
+Flags’ and are private to the user. A reviewed flag is a tuple of patch
+set ID, file and account ID.
+
+Each user can have many thousands of reviewed flags and over time the
+number can grow without bounds.
+
+The large number of reviewed flags makes storage in Git unsuitable,
+because each update requires opening the repository and committing a
+change, which is a high overhead for flipping a bit. Therefore the
+reviewed flags are stored in a database table. By default they are
+stored in a local H2 database, but there is an extension point that
+allows plugins to provide alternative implementations for storing the
+reviewed flags. To replace the storage for reviewed flags a plugin
+needs to implement the link:dev-plugins.html#account-patch-review-store[
+AccountPatchReviewStore] interface. E.g. to support a multi-master
+setup where reviewed flags should be replicated between the master
+nodes, one could implement a store for the reviewed flags that is
+based on MySQL with replication.
+
+[[account-sequence]]
+== Account Sequence
+
+The next available account sequence number is stored as UTF-8 text in a
+blob pointed to by the `refs/sequences/accounts` ref in the `All-Users`
+repository.
+
+Multiple processes share the same sequence by incrementing the counter
+using normal git ref updates. To amortize the cost of these ref
+updates, processes increment the counter by a larger number and hand
+out numbers from that range in memory until they run out. The size of
+the account ID batch that each process retrieves at once is controlled
+by the link:config-gerrit.html#notedb.accounts.sequenceBatchSize[
+notedb.accounts.sequenceBatchSize] parameter in the `gerrit.config`
+file.
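+
+A sketch of the corresponding `gerrit.config` entry, assuming the
+usual mapping of the parameter name to a `notedb` section with an
+`accounts` subsection; the value `10` is purely illustrative and not a
+recommendation:
+
+----
+[notedb "accounts"]
+  sequenceBatchSize = 10
+----
+
+With such a setting a process would reserve ten account IDs with one
+ref update and hand them out from memory before it updates the ref
+again.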
+
+[[replication]]
+== Replication
+
+To replicate account data the following branches from the `All-Users`
+repository must be replicated:
+
+* `refs/users/*` (user branches)
+* `refs/meta/external-ids` (external IDs)
+* `refs/starred-changes/*` (star labels)
+* `refs/sequences/accounts` (account sequence numbers, not needed for Gerrit
+ slaves)
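+
+If the replication plugin is used, these refs could be selected with
+push refspecs along the following lines. This is only a sketch: the
+remote name `mirror` and the URL are placeholders, and the option
+names should be checked against the documentation of the plugin
+version in use:
+
+----
+[remote "mirror"]
+  url = gerrit@mirror.example.com:/srv/git/${name}.git
+  projects = All-Users
+  push = +refs/users/*:refs/users/*
+  push = +refs/meta/external-ids:refs/meta/external-ids
+  push = +refs/starred-changes/*:refs/starred-changes/*
+  push = +refs/sequences/accounts:refs/sequences/accounts
+----
+
+The last refspec can be dropped when the target is a Gerrit slave,
+since the account sequence is not needed there.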
+
+GERRIT
+------
+Part of link:index.html[Gerrit Code Review]
+
+SEARCHBOX
+---------