| Commit message | Author | Age | Files | Lines |
* stable-2.16:
Call retryDone() when giving up after lock failures
Fix issue with task cleanup after retry
Change-Id: Id987043c8a26bd3f69fb4bd5b84591ae20cb83ba
Previously, when giving up after retrying due to too many lock failures, a 'replication start --wait' command would wait indefinitely if it was waiting on the push that gave up. Fix this by calling retryDone() after giving up, which triggers the ReplicationStatus to reflect a failure and allows the wait to complete.
Change-Id: I0debade83612eb7ce51bab0191ab99464a6e7cd3
The Destination.notifyFinished method calls finish on ReplicationTasksStorage.Task objects that are not scheduled for retry. The issue is that for rescheduled tasks, PushOne.isRetrying always returns true even if the task has already been replicated. This creates a situation where tasks scheduled for retry are never cleaned up.
Bug: Issue 12754
Change-Id: I4b10c2752da6aa7444f57c3ce4ab70eb00c3f14e
* stable-2.16:
Use volatile and AtomicIntegers to be thread safe
Change-Id: I90a3e17e2f49d07707409ba390c0a6dd0501b512
Modify the fields in the ReplicationState class to be volatile or AtomicIntegers so that changes to them are visible to other threads. Without this, modifications made by one thread may not be visible to other threads immediately, depending on CPU caching, resulting in incorrect state.
Change-Id: I76512b17c19cc68e4f1e6a5223899f9a184bb549
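The fix above can be sketched in miniature (the field and method names here are illustrative, not the plugin's actual ones):

```java
// Sketch: AtomicInteger makes cross-thread increments safe, and volatile
// guarantees that a flag written by one thread is seen by others.
import java.util.concurrent.atomic.AtomicInteger;

class StateSketch {
  private final AtomicInteger totalPushTasks = new AtomicInteger(0);
  private volatile boolean finished; // visibility without atomic updates

  void taskScheduled() { totalPushTasks.incrementAndGet(); }
  void markFinished() { finished = true; }
  int pushTasks() { return totalPushTasks.get(); }
  boolean isFinished() { return finished; }
}
```

A plain `int` counter incremented from multiple threads could both lose updates and leave stale values in per-CPU caches; AtomicInteger fixes both.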
* stable-2.16:
Fix replication to retry on lock errors
Change-Id: I6e262d2c22d2dcd49b341b3c752d6d8b6c93b32c
Versions of Git released since 2014 have introduced a new status, "failed to update ref", which replaces the two statuses "failed to lock" and "failed to write". So we now see the newer status when the remote is unable to lock a ref.
See the Git commit:
https://github.com/git/git/commit/6629ea2d4a5faa0a84367f6d4aedba53cb0f26b4
The 'lockErrorMaxRetries' config option is not removed as part of this change, so that folks who currently have it configured don't run into unexpected retry behavior when they upgrade to a newer version of the plugin. The "failed to lock" check is also kept for folks still using a version of Git older than 2014.
Change-Id: I9b3b15bebd55df30cbee50a0e0c2190d04f2f443
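A hedged sketch of the resulting check (the method and its string handling are assumptions; only the two status strings come from the message above):

```java
// Sketch: treat both the pre-2014 and post-2014 Git statuses as lock errors
// eligible for retry.
class LockErrorCheck {
  static boolean isLockError(String status) {
    return status.startsWith("failed to lock")        // Git before 2014
        || status.startsWith("failed to update ref"); // Git since 2014
  }
}
```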
* stable-2.16:
ReplicationStorageIT: Wait for all pushes without order
ReplicationTasksStorage: Add multi-primary unit tests
Change-Id: I1d749621c189ee2e49f092ddc7558f83e508411f
Some tests don't have a predefined order for which events will be
replicated first. Using a timeout based on a single replication event is
flawed when we don't know the expected order. Instead, use a timeout for
the group of events and ignore the order.
For two events replicating to a single remote with a single thread, we
expect the complete replication to take twice as long. Two events
replicating to two remotes will use one thread each and therefore not
take any longer than the single remote case.
Change-Id: Ieb21b7eee32105eab5b5a15a35159bb4a837e363
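The timing model described above could be sketched like this (the helper and perEventMs are hypothetical, not plugin code):

```java
// Sketch: pushes to the same remote serialize on its single thread, while
// distinct remotes run in parallel, so the group timeout scales with the
// number of events per remote, not the total event count.
class GroupTimeout {
  static long groupTimeoutMs(int events, int remotes, long perEventMs) {
    int perRemote = (events + remotes - 1) / remotes; // ceiling division
    return perRemote * perEventMs;
  }
}
```

Two events to one remote need twice the per-event budget; two events to two remotes need only one budget each.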
These tests examine replication scenarios in a multi-primary setup, making use of the API calls in the ReplicationTasksStorage class in the same way as the single-primary tests. They ensure that replication compatibility in a multi-primary setup is not broken.
Change-Id: I375b731829f3c0640d3a7a98635e1e5c526908ca
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
All other ITs split e2e and storage tests on stable-2.16, so this change only updates the new replicateBranchDeletion tests that were added in stable-3.0.
The e2e check for whether the destination branch is removed stays in ReplicationIT, and the check that a task is created in storage when the branch delete API is invoked moves to ReplicationStorageIT.
This split allows the best practices for verifying e2e and storage to be applied independently.
Change-Id: Iec7ee090bd614e3442b1f9cb454437c9e05290be
|
|\| |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
* stable-2.16:
Refactor Replication*IT tests to share a base class
ReplicationIT: Add shouldMatch* e2e tests
ReplicationStorageIT: Move shouldMatch* tests from ReplicationIT
ReplicationStorageIT: Add shouldFire*ChangeRefs tests
Move storage-based ITs into ReplicationStorageIT
ReplicationQueue: Remove unused method
This change does not try to reimpose the breakdown of tests that was done in 2.16. That will be done in follow-up change(s) to improve the reviewability of this change.
Change-Id: I83202997610c5ad0d8849cb477ca36db8df760f5
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
These classes have very similar setups and duplicate helper methods.
Improve maintainability by reducing the duplication.
ReplicationQueueIT is not modified because it is merged into
ReplicationIT on stable-3.0.
Change-Id: Ibc22ae4d0db2d09009f65c0e745f1095c67827ba
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
These new tests utilize creating a branch in a way that does not trigger
replication so that scheduleFullSync() is responsible for replicating
the update. In this way, the tests verify the destination receives the
update because scheduleFullSync() matched the given URI.
Change-Id: I4ae15d0301a308a12cbca3684915e89ca421e02f
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
These tests are focused on verifying storage, so they belong in
ReplicationStorageIT. Improve these tests to better verify storage
correctness by switching the 'now' parameter to false such that
replicationDelay is honored and follow the ReplicationStorageIT
pattern using a very long delay. These improvements make these tests
much more stable.
The tests also improve the ref matching slightly by comparing against the PushOne.ALL_REFS constant.
Also remove the disableDeleteForTesting flag, as it no longer has any users.
A later change can add ReplicationIT e2e tests for these use cases.
Change-Id: Iaa14a7429a40fb62325259efa1c7d7637deef95a
|
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Copy the shouldFire*IncompleteUri tests as shouldFire*ChangeRefs to
fill a gap in test coverage.
Change-Id: Ia8df64a8574b776e6a9f7201c0862f1e6794687e
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Tests in ReplicationStorageIT utilize very long replication delays such
that tasks are never expected to complete during the test. This allows
test writers to assume the task files are still there.
Refactor tests from ReplicationIT into ReplicationStorageIT and focus
them on verifying storage correctness. This is mostly a direct copy
except that shouldFirePendingOnlyToStoredUri gets renamed and split into
two tests: one that validates tasks are fired, and another that validates
replication completes to the expected destinations. This split is
necessary because of the very-long-delay methodology mentioned above.
Code sharing between ReplicationIT and ReplicationStorageIT will be
improved in a later commit.
Change-Id: I41179c20a10354953cff3628368dfd5f910cc940
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Also drop the misleading @VisibleForTesting annotation from the method
that the removed method was wrapping. scheduleFullSync() is public so
that PushAll can call it.
Change-Id: I0139e653654fcaf20de68dddfb5ea85560a323d0
* stable-2.16:
ReplicationIT: Remove unnecessary storage inspection
ReplicationIT: Fix invalid replicationDelay setting
Split replication plugins tests in two groups
Change-Id: I2d27b715a2bfc9832ee559556d1c8acfe671d893
Integration tests shouldn't need to rely on inspecting the underlying
ReplicationTasksStorage layer(s). All of these tests already verify the
expected end result.
This leaves 4 tests that currently completely rely on inspecting the
task storage to verify the expected result. Those tests need further
improvement to decouple from the storage layer.
Change-Id: I029d63ce7d07414d9bf5d9290d556378beedcabf
Setting config values for a remote in replication.config instead of the remote's own config file results in the replication.config values being ignored. Fix this by setting the values in each remote's config file.
This test had delays added to avoid flakiness, but the delays weren't working because of this issue. While the test generally passes even without them, the delays make it safer against races.
Change-Id: Idcdf5f07b3fc91724068ec6216527665c4a48bb3
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Run unit tests and integration tests in parallel by splitting them into two separate tasks.
This also makes it possible to identify which group of tests is flaky, because Bazel would flag one or the other in case of instability.
Change-Id: I21f969a17e3653dfc5ab93d71cc6955024fc2d8f
* stable-2.16:
Make the shouldReplicateNewProject test more reliable
Change-Id: I447043d502987070bc395936484a1cb23a5ddabc
The ReplicationIT shouldReplicateNewProject was failing regularly on my
machine. Improve the timeout for this test so that it explicitly
includes the time needed to wait for the project to be created, not just
the scheduling and retry times.
Change-Id: Ibf3cc3506991b222ded3ee4ddfbd7e2d60341d60
* stable-2.16:
Fix synopsis in replication start cmd documentation
Don't wait for pending events to process on startup
Change-Id: If4bc69761a19a0137301535759dc8a317ea04186
--url is usable with --all, with a list of projects, or on its own. Update the usage to reflect this.
Change-Id: Id3637f7bf61b7f65348b19ec0616808ef3f44ccf
|
| |/
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Previously, on large Gerrit installations with many projects and/or many replication destinations, the replication plugin could take a very long time to start up. This was particularly a problem if the pending (persisted) event count was large, as all of those events were rescheduled before the plugin finished initializing. Change this behavior so that startup merely begins the process of scheduling the pending events, but does not wait for them to complete.
Bug: Issue 12769
Change-Id: I224c2ce2a35f987af2343089b9bb00a7fcb7e3be
* stable-2.16:
ReplicationTasksStorage: Add unit tests
Change-Id: I8095d012b5cfa497267b6ef027f697c7e8369533
Change-Id: I164426e70937bc3c4ac426be3056a01e9229746b
* stable-2.16:
Fix naming for delay for draining the replication event queue
Change-Id: I3cba1756a10a1c12db96d04ca55d3feb7bc8784e
Thread.sleep() takes milliseconds as an argument, not seconds; otherwise, multiplying by 1000 would be a bug.
Also switch to returning a long, which fixes a potential overflow when multiplying by 1000.
Change-Id: I3fc5c939e8c09c134e24fa9381e96e6529b5be4d
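A small sketch of why returning a long matters here (the method name is illustrative):

```java
// Sketch: multiplying an int second count by 1000 can overflow before
// Thread.sleep() ever sees the value. Widening to long BEFORE the
// multiplication avoids this.
class DelayMath {
  static long delayMillis(int delaySeconds) {
    return delaySeconds * 1000L; // long literal forces 64-bit arithmetic
  }
}
```

With a 30-day delay (2,592,000 seconds), `delaySeconds * 1000` in int arithmetic wraps to a negative number, while `delayMillis` returns the correct 2,592,000,000.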
* stable-2.16:
Improve readability of shouldFirePendingOnlyToStoredUri test
Fix flakiness in ReplicationIT for pending events firing
Change-Id: Id40baca92acc9fba8656630f725d55e5fbb6662b
Make ReplicationIT.shouldFirePendingOnlyToStoredUri easier to read and simplify the extraction of the replication tasks associated with a change ref, as regex matching isn't required and could be misleading when reading the test.
Change-Id: Ib493275872b56bc04cdcfb541b7cfa7ecfb1e058
Fix the shouldFirePendingOnlyToStoredUri test by making sure that events are NOT executed by the replication engine until the test has completed the preparation phase.
The Gerrit build on stable-2.16 became flaky right after the merge of the new shouldFirePendingOnlyToStoredUri test, which highlighted the flakiness.
The test simulates a situation where a ref update needs to be propagated to two remotes: remote1 and remote2. To do so, it configures the two remotes and creates a change to generate the two replication task files on the filesystem. Then it looks for the events associated with remote1 and removes them, so that the next replication queue startup won't find them and won't replicate the change to remote1.
During the interval between the creation of the change and the removal of the underlying replication task on the filesystem, the replication task could already have been executed, causing the test to fail.
Make sure that replication does not kick in by setting the replication timeout to Integer.MAX_VALUE at the beginning. Then, once the replication task file is removed from the filesystem, set it back to the default and reload the configuration to trigger the firing of the events.
Also remove the explicit start/stop of the replication queue, as the config reload is already a stop/start process and automatically triggers an event replay.
Change-Id: Ifd591da37e94b6ce8f281cb0404f3f3c737489f3
* stable-2.16:
Only fire the specified pending event URI
Change-Id: Ib800603d830c9b4ba688b0222ac5642ad50f17a0
Previously, the startup firing of pending events would fire every URI for a project/ref combination. To avoid duplicates, it only ever fired one round of every URI per project/ref combination. This had the side effect that if only a single URI was stored, presumably because the other URIs were completed before shutdown, startup would create far more replication events than necessary, many of them likely duplicates of already completed pushes. Fix this behavior by firing only to the specific stored URI, and remove the duplicate project/ref filtering, since it would now prevent firing to more than one URI for the same project/ref combination when there actually are stored events for multiple URIs. Add a test to confirm the new, more limited behavior.
Bug: Issue 12779
Change-Id: I56d314af2ecbf84362dda099fa28f1b8f82cefa7
When a branch is deleted, the deletion should be replicated to
the remote when "remote.name.mirror" is enabled.
Change-Id: If59a9bb15958f4559d62452a309afcf1ca6c3789
|
|\|
| |
| |
| |
| |
| |
| | |
* stable-2.16:
Make SecureCredentialsFactory public
Change-Id: I757ba1004ce2a851c7857762b178de9294deae21
Access to secure.config is useful to more than just the replication plugin. Allow instantiating this class from packages other than the replication plugin; specifically, this is useful because the class can be used from pull-replication too.
(cherry picked from commit c09a7c08fb44094c7475313ac52154adac39a54c)
This allows it to be used in implementations that extend the
replication plugin.
Change-Id: Id81f5986f24720b9575c1987c21b2ae9672ddd37
|
|\|
| |
| |
| |
| |
| |
| |
| | |
* stable-2.16:
Fix replication of project deletion
Improve project creation replication integration test
Change-Id: I1818511118ed1738cf76d48cb49b66a52f1d83c8
Fix a regression where a project deletion was not propagated to the remote nodes because its associated project state was missing from the project cache.
When replicating deletions, the project state is not present: the replication plugin receives the deletions after the fact and therefore no longer has access to the project. The associated checks for project visibility and read-only state are not valid for project deletions.
Bug: Issue 12806
Change-Id: I7a9ac01b01d5dd40b8bf0c4d3347256f430329ac
Make sure project creation is correctly managed and tested, asserting that the repository contains at least one replicated ref and not just that the bare repo shows up.
Change-Id: I0fdc1e73390c2abd3e40d2a02fd7e4ce7f20bb67
|
|\|
| |
| |
| |
| |
| |
| | |
* stable-2.16:
Make persistent task keys stable
Change-Id: Iefda465c739f4669b5394d3c57f7abe0d7513b5f
GSON was used to create JSON as the input for the SHA-1 task keys; however, GSON can order the JSON keys differently at any time. Instead, use the values in a specific order to create stable keys.
Bug: Issue 11760
Change-Id: I6900b5ddb3ba8ab7b5cf7803ae9dd551b5980a59
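A sketch of the stable-key idea under assumed task fields (the field set and helper are hypothetical; the plugin's actual key derivation may differ):

```java
// Sketch: derive the SHA-1 key from the task's values concatenated in a
// fixed order, rather than from serialized JSON whose key order is
// unspecified. Same inputs always yield the same key.
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

class StableTaskKey {
  static String keyFor(String project, String ref, String uri) {
    try {
      String canonical = project + "\n" + ref + "\n" + uri; // fixed order
      byte[] digest =
          MessageDigest.getInstance("SHA-1")
              .digest(canonical.getBytes(StandardCharsets.UTF_8));
      StringBuilder hex = new StringBuilder();
      for (byte b : digest) hex.append(String.format("%02x", b));
      return hex.toString();
    } catch (NoSuchAlgorithmException e) {
      throw new IllegalStateException(e); // SHA-1 is always available
    }
  }
}
```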
* stable-2.16:
Prevent persistent task listing interruptions on IOExceptions
Change-Id: Ib8bd758a3dd9c24968ec58be921c4475a7bde030
When iterating over the list of persisted tasks, it is possible for an IOException to occur while reading a specific task. Prevent this exception from breaking out of the iteration by catching and logging it inside the loop instead of outside of it.
Also improve the logging by differentiating between failures that are severe and those potentially related to other nodes' actions: in a multi-master scenario with shared storage, it is common for operations on one node to "interfere" with task listing operations on another node without causing a malfunction. Specifically, improve the exception handling so that the logging in these latter cases offers a likely explanation of the listing error, and do not treat these specific filesystem errors as operational errors.
Change-Id: Ia2ad431c20142ff0ce23dbace34aec837e3d8540
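The catch-inside-the-loop pattern can be sketched with a stand-in reader (readTask and the name list are hypothetical, not the plugin's API):

```java
// Sketch: catching the IOException inside the loop lets listing continue
// past one unreadable task instead of aborting the whole iteration.
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

class TaskListing {
  // Stand-in for reading a persisted task file; fails for "bad" entries.
  static String readTask(String name) throws IOException {
    if (name.startsWith("bad")) throw new IOException("unreadable: " + name);
    return name;
  }

  static List<String> listTasks(List<String> names) {
    List<String> tasks = new ArrayList<>();
    for (String name : names) {
      try {
        tasks.add(readTask(name));
      } catch (IOException e) {
        // Log and continue: on shared storage another node may have removed
        // this file concurrently, which is not an operational error.
        System.err.println("skipping task: " + e.getMessage());
      }
    }
    return tasks;
  }
}
```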
* stable-2.16:
Fix firing pending "..all.." events on startup
Change-Id: I04f042199fd8935bee987b8363956115a40e0872
The Destination.wouldPushRef() method is called on startup to see whether a Destination is configured to be pushed to for a specific ref; if it is not, firing the pending update is skipped. Since the magic "..all.." ref will never match the configuration in replication.config, always match it: if replication is configured at all, it should be matched.
Bug: Issue 11745
Change-Id: I53bd527932e6aea9ddd465772925d601aa034bd3
(cherry picked from commit 3ddf835c203565dbd415f468e0d40eac1b815c63)