Browse code

yum_install_package: fix errexit and retry

Since I93e9f312a94aeb086925e069a83ec1d3d3419423 yum_install isn't safe
under errexit. This means it really only works when called by
tools/install_prereqs.sh because for some reason, we don't set that
there.

However, there is a problem with the retry logic when detecting failed
installs. A failed package install should stop further progress, but
with the current retry logic it just goes ahead and retries the
installation, which then incorrectly passes. You can see this
happening in a test like [1].

In our detection scripts, make a failed package or missing packages
exit with error-code 2, and "die" when we see this to correctly stop.

[1] http://logs.openstack.org/81/285881/1/check/gate-tempest-dsvm-platform-fedora23-nv/a83be30/logs/devstacklog.txt.gz

Change-Id: I4ea5515fa8e82a66aefa3ec3a48b823b645274f7

Ian Wienand authored on 2016/03/01 11:14:39
Showing 1 changed files
... ...
@@ -1302,13 +1302,14 @@ function yum_install {
1302 1302
 
1303 1303
     time_start "yum_install"
1304 1304
 
1305
-    # Warning: this would not work if yum output message
1306
-    # have been translated to another language
1305
+    # - We run with LC_ALL=C so string matching *should* be OK
1306
+    # - Exit 1 if the failure might get better with a retry.
1307
+    # - Exit 2 if it is fatal.
1307 1308
     parse_yum_result='             \
1308 1309
         BEGIN { result=0 }         \
1309 1310
         /^YUM_FAILED/ { exit $2 }  \
1310
-        /^No package/ { result=1 } \
1311
-        /^Failed:/    { result=1 } \
1311
+        /^No package/ { result=2 } \
1312
+        /^Failed:/    { result=2 } \
1312 1313
         //{ print }                \
1313 1314
         END { exit result }'
1314 1315
 
... ...
@@ -1316,15 +1317,21 @@ function yum_install {
1316 1316
     # missing or failed packages are OK.
1317 1317
     # See https://bugzilla.redhat.com/show_bug.cgi?id=965567
1318 1318
     (sudo_with_proxies "${YUM:-yum}" install -y "$@" 2>&1 || echo YUM_FAILED $?) \
1319
-        | awk "$parse_yum_result"
1320
-    result=$?
1321
-
1322
-    if [ "$result" != 0 ]; then
1323
-        echo $LINENO "${YUM:-yum}" install failure: $result
1324
-    fi
1319
+        | awk "$parse_yum_result" && result=$? || result=$?
1325 1320
 
1326 1321
     time_stop "yum_install"
1327 1322
 
1323
+    # if we return 1, then the wrapper functions will run an update
1324
+    # and try installing the package again as a defense against bad
1325
+    # mirrors.  This can hide failures, especially when we have
1326
+    # packages that are in the "Failed:" section because their rpm
1327
+    # install scripts failed to run correctly (in this case, the
1328
+    # package looks installed, so when the retry happens we just think
1329
+    # the package is OK, and incorrectly continue on).
1330
+    if [ "$result" == 2 ]; then
1331
+        die "Detected fatal package install failure"
1332
+    fi
1333
+
1328 1334
     return "$result"
1329 1335
 }
1330 1336