Last modified: 2014-03-10 10:14:34 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and for historical purposes. It is not possible to log in and except for displaying bug reports and their history, links might be broken. See T48723, the corresponding Phabricator task for complete and up-to-date bug report information.
Bug 46723 - Jenkins merged a faulty change
Status: RESOLVED FIXED
Product: Wikimedia
Classification: Unclassified
Component: Continuous integration
Version: unspecified
Hardware: All All
Importance: Highest normal
Target Milestone: ---
Assigned To: Antoine "hashar" Musso (WMF)
Duplicates: 47208
Depends on:
Blocks:
Reported: 2013-03-30 14:27 UTC by Antoine "hashar" Musso (WMF)
Modified: 2014-03-10 10:14 UTC (History)
CC List: 8 users


Attachments
python script parsing build logs to find Zuul commit vs Git plugin checkout (868 bytes, text/x-python-script)
2013-04-09 21:05 UTC, Antoine "hashar" Musso (WMF)
output of checkbug46723.py (10.82 KB, text/plain)
2013-04-09 21:10 UTC, Antoine "hashar" Musso (WMF)
Console output for https://integration.wikimedia.org/ci/job/mediawiki-core-phpunit-parser/5386/console (2.90 KB, text/plain)
2013-04-11 21:58 UTC, Antoine "hashar" Musso (WMF)

Description Antoine "hashar" Musso (WMF) 2013-03-30 14:27:48 UTC
Aaron Schulz wrote:

I noticed that https://gerrit.wikimedia.org/r/#/c/33971/ passed the tests, but after it was merged, the new tests started failing for everything. The commit to revert it also failed, so I overrode Jenkins and merged it anyway, and the failures went away for new commits. This indicates that something is broken, possibly Jenkins running tests just against master rather than master + the patch, which would explain this problem.
Comment 1 Gerrit Notification Bot 2013-04-09 08:20:17 UTC
Related URL: https://gerrit.wikimedia.org/r/58283 (Gerrit Change I4b3fadccaae9c35964a0c47d63b22c4f35148a24)
Comment 2 Antoine "hashar" Musso (WMF) 2013-04-09 08:46:55 UTC
From bug 47031 : https://gerrit.wikimedia.org/r/#/c/57436/ has been merged although it is faulty.

The unit tests run on patchset upload did catch the issue:

https://integration.wikimedia.org/ci/job/mediawiki-core-phpunit-misc/5222/console : FAILURE

But the gating run after CR+2 did not catch it:

https://integration.wikimedia.org/ci/job/mediawiki-core-phpunit-misc/5223/console : SUCCESS


The root cause is that although ZUUL_REF points to the proper merge commit, the Jenkins Git plugin seems to use the current origin/master to build.
Comment 3 Antoine "hashar" Musso (WMF) 2013-04-09 13:30:00 UTC
build #5223 


Workspace did get wiped:
 02:46:53 Wiping out workspace first.

It checked out the revision:
 02:46:56 Checking out Revision 4c69569db71d149feff6c4b10ea7a493425d67fd (origin/master)

That is the master revision NOT the change. The commit should have been
7dd3356a51951f8cdfe463552b5e5aae272e8e60

----

The related merge job
https://integration.wikimedia.org/ci/job/mediawiki-core-merge/11333/console

02:44:17 Commencing build of Revision 7dd3356a51951f8cdfe463552b5e5aae272e8e60 (origin/master)
02:44:17 Checking out Revision 7dd3356a51951f8cdfe463552b5e5aae272e8e60 (origin/master)

----

The ZUUL_REF has probably not been resolved properly and the git plugin fell back to master.

There is also the possibility that the mediawiki-core-phpunit-misc job was using ZUUL_COMMIT as a refspec instead of ZUUL_REF. That might prevent the plugin from fetching the revision. The job history is no longer accessible due to an unexpected upgrade (see bug 47040).
Comment 4 Antoine "hashar" Musso (WMF) 2013-04-09 21:05:32 UTC
Created attachment 12065 [details]
python script parsing build logs to find Zuul commit vs Git plugin checkout
Comment 5 Antoine "hashar" Musso (WMF) 2013-04-09 21:10:24 UTC
Created attachment 12066 [details]
output of checkbug46723.py

The script output highlights that some builds are not testing what they should be testing because they check out a parent commit. Looking at the Jenkins Git plugin source code, it seems that whenever the reference cannot be parsed (i.e. git rev-parse $ZUUL_REF fails), the plugin falls back to master or some parent commit.

I need to improve the script to find out if that happens in a specific pipeline or for some specific refs.
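
The attached script is not reproduced on this page. As a rough sketch of the kind of check it performs, assuming a console log contains a `ZUUL_COMMIT=<sha1>` line and a `Checking out Revision <sha1>` line like the excerpts quoted in this report, the comparison might look like the following. The function name `find_mismatch`, both regular expressions, and the composite `sample_log` are illustrative assumptions, not the contents of checkbug46723.py.

```python
import re

# Assumed log line formats; the real attachment may match different lines.
ZUUL_RE = re.compile(r"ZUUL_COMMIT=([0-9a-f]{40})")
CHECKOUT_RE = re.compile(r"Checking out Revision ([0-9a-f]{40})")


def find_mismatch(log_text):
    """Return (zuul_commit, checked_out, is_mismatch), or None if either line is missing."""
    zuul = ZUUL_RE.search(log_text)
    checkout = CHECKOUT_RE.search(log_text)
    if zuul is None or checkout is None:
        return None
    return zuul.group(1), checkout.group(1), zuul.group(1) != checkout.group(1)


# Hypothetical log assembled from the sha1s quoted in this report for build 5223.
sample_log = """\
ZUUL_COMMIT=7dd3356a51951f8cdfe463552b5e5aae272e8e60
02:46:53 Wiping out workspace first.
02:46:56 Checking out Revision 4c69569db71d149feff6c4b10ea7a493425d67fd (origin/master)
"""

result = find_mismatch(sample_log)
if result is not None:
    zuul_commit, checked_out, is_mismatch = result
    print("Zuulcommit:", zuul_commit)
    print("Checkedout:", checked_out, "(MISMATCH)" if is_mismatch else "")
```

A build is flagged only when both lines are present and the two sha1s differ; logs missing either line are skipped rather than reported.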
Comment 6 Antoine "hashar" Musso (WMF) 2013-04-09 21:18:59 UTC
Extract for the two builds referenced in comment 2:

Verifying /var/lib/jenkins/jobs/mediawiki-core-phpunit-misc/builds/5222/log
Zuulcommit: 8cc0b601aa2db6db09ac0e4d70847293d75875aa
Checkedout: 8cc0b601aa2db6db09ac0e4d70847293d75875aa
Verifying /var/lib/jenkins/jobs/mediawiki-core-phpunit-misc/builds/5223/log
Zuulcommit: 7dd3356a51951f8cdfe463552b5e5aae272e8e60
Checkedout: 4c69569db71d149feff6c4b10ea7a493425d67fd (MISMATCH)


We can see that build 5223 did not use the proper commit :-]


I suspect the git plugin does not fetch the proper references or can't find them. That results internally in an unknown sha1, and the git plugin then falls back to master or something else.

I will try to reproduce the issue in labs with the git plugin set to verbose. That requires starting Jenkins with -Dhudson.plugins.git.GitSCM.verbose=true
Comment 7 Antoine "hashar" Musso (WMF) 2013-04-10 10:44:52 UTC
I have traced the issue as far back as mediawiki-core-lint build #19 (made on November 22nd 2012).


MISMATCH in /var/lib/jenkins/jobs/mediawiki-core-lint/builds/19/log
Pipeline: gate
Zuulcommit: 76606b66b006ac0e62087e6d00b1e4bdd56fff09
Checkedout: 232e34733fc68739ba96cccc31d3ff88f9484a23
Comment 8 Antoine "hashar" Musso (WMF) 2013-04-10 11:59:21 UTC
We are lacking the git plugin verbose mode in production due to a bug. It is corrected with https://gerrit.wikimedia.org/r/58489. That will help find out what the plugin is doing internally.
Comment 9 Antoine "hashar" Musso (WMF) 2013-04-11 21:58:20 UTC
Created attachment 12084 [details]
Console output for https://integration.wikimedia.org/ci/job/mediawiki-core-phpunit-parser/5386/console
Comment 10 Antoine "hashar" Musso (WMF) 2013-04-11 21:59:04 UTC
ZUUL_COMMIT=76cb37f0c69dcd69884fc6e66681e77c8045a08e

but it fetched origin/master instead :-(
Comment 11 Antoine "hashar" Musso (WMF) 2013-04-12 09:29:58 UTC
The branch specifier in the git plugin is set to ZUUL_BRANCH which is 'master'. 

In the git plugin (at git-plugin/src/main/java/hudson/plugins/git/util/DefaultBuildChooser.java), getCandidateRevisions() recognizes whether the branch specifier looks like a sha1 (if it matches /[0-9a-f]{6,40}/) and in that case creates a detached branch using that commit.

Seems the Jenkins job macro should then use ZUUL_COMMIT as a branch specifier.
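
To illustrate the distinction described above, here is a minimal Python sketch. It is not the plugin's Java code; it assumes the check amounts to matching the whole branch specifier against the pattern quoted in this comment. The function name `looks_like_sha1` is hypothetical.

```python
import re

# Pattern as quoted in this comment; applying it to the whole string is an assumption.
SHA1_RE = re.compile(r"[0-9a-f]{6,40}")


def looks_like_sha1(branch_spec):
    """Return True if the branch specifier consists only of 6 to 40 hex digits."""
    return SHA1_RE.fullmatch(branch_spec) is not None


# 'master' is the ZUUL_BRANCH value; the sha1 is the ZUUL_COMMIT from comment 10.
for spec in ("master", "76cb37f0c69dcd69884fc6e66681e77c8045a08e"):
    kind = "sha1" if looks_like_sha1(spec) else "branch name"
    print(spec, "->", kind)
```

Under this assumption, a specifier of 'master' is treated as a branch name, which is consistent with the builds checking out origin/master, while a ZUUL_COMMIT value is treated as a commit.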
Comment 12 Gerrit Notification Bot 2013-04-12 09:38:32 UTC
Related URL: https://gerrit.wikimedia.org/r/58865 (Gerrit Change Iafebfffe480886fc8956e56517291b1b3b1fc0cc)
Comment 13 Gerrit Notification Bot 2013-04-12 09:38:35 UTC
Related URL: https://gerrit.wikimedia.org/r/58865 (Gerrit Change Iafebfffe480886fc8956e56517291b1b3b1fc0cc)
Comment 14 Antoine "hashar" Musso (WMF) 2013-04-12 09:39:46 UTC
I have updated the mediawiki-core-whitespaces job to use ZUUL_COMMIT as a branch specifier. The job is non-voting, so that is not going to do any harm.

The experimental change is https://gerrit.wikimedia.org/r/58865
Comment 15 Liangent 2013-04-12 09:40:43 UTC
(In reply to comment #13)
> Related URL: https://gerrit.wikimedia.org/r/58865 (Gerrit Change
> Iafebfffe480886fc8956e56517291b1b3b1fc0cc)

Why is this comment duplicated?
Comment 16 Antoine "hashar" Musso (WMF) 2013-04-14 11:37:22 UTC
*** Bug 47208 has been marked as a duplicate of this bug. ***
Comment 17 Gerrit Notification Bot 2013-04-15 10:40:47 UTC
https://gerrit.wikimedia.org/r/58865 (Gerrit Change Iafebfffe480886fc8956e56517291b1b3b1fc0cc) | change APPROVED and MERGED [by Hashar]
Comment 18 Antoine "hashar" Musso (WMF) 2013-04-15 10:41:21 UTC
https://gerrit.wikimedia.org/r/#/c/58865/ has been deployed.

I am now manually updating the jobs which are not under JJB:

analytics-libanon
analytics-udp-filters
analytics-webstatscollector
analytics-wikistats
mwext-PoolCounter-pep8
mwext-VisualEditor-docgen
operations-debs-python-voluptuous-debbuild
parsoid-parse-tool-check
parsoid-roundtrip-test-check
parsoid-runTests
test-mediawiki-merge
Comment 19 Antoine "hashar" Musso (WMF) 2013-04-15 10:53:18 UTC
Will monitor over the next few days. Lowering priority for now.
Comment 20 Antoine "hashar" Musso (WMF) 2013-04-16 13:00:26 UTC
hashar@gallium:~$ ./checkbug46723.py mediawiki-core-phpunit-api --filter 2013-04-16*
Found 0 mismatches in 29 log files.
hashar@gallium:~$ ./checkbug46723.py mediawiki-core-phpunit-misc --filter 2013-04-16*
Found 0 mismatches in 29 log files.
$

Seems it got fixed :-]  Will verify again during the week, but so far that looks good.
Comment 21 Antoine "hashar" Musso (WMF) 2013-04-20 13:11:46 UTC
I have verified the jobs triggered over the past few days. Seems to work fine now :-)  The root cause was using ZUUL_BRANCH as a branch specifier instead of ZUUL_COMMIT.
Comment 22 Gerrit Notification Bot 2014-03-05 21:04:20 UTC
Change 117045 had a related patch set uploaded by Hashar:
Parsoid: uses ZUUL_COMMIT as a git refspec to build

https://gerrit.wikimedia.org/r/117045
Comment 23 Gerrit Notification Bot 2014-03-10 10:14:28 UTC
Change 117045 merged by jenkins-bot:
Parsoid: uses ZUUL_COMMIT as a git refspec to build

https://gerrit.wikimedia.org/r/117045
