This is the mail archive of the
systemtap@sourceware.org
mailing list for the systemtap project.
RFC: improve build-id mismatch error reporting
- From: Timo Juhani Lindfors <timo dot lindfors at iki dot fi>
- To: systemtap at sourceware dot org
- Date: Tue, 13 Sep 2011 10:54:38 +0300
- Subject: RFC: improve build-id mismatch error reporting
Hi,
Currently, if Debian users upgrade their kernel but forget to reboot, they
get a pretty cryptic error message from systemtap:
> ERROR: Build-id mismatch: "kernel" vs. "vmlinux-2.6.32-5-amd64" byte 0 (0x5e vs 0xff)
> Pass 5: run failed. Try again with another '--vp 00001' option.
I want to improve that. I think ideally it should say
> ERROR: Debug symbols don't match running kernel. You are running
> "Linux version 2.6.32-5-amd64 (Debian 2.6.32-35) (dannf@debian.org) (gcc version 4.3.5 (Debian 4.3.5-4) ) #1 SMP Tue Jun 14 09:42:28 UTC 2011"
> but the symbols are for
> "Linux version 2.6.32-5-amd64 (Debian 2.6.32-35squeeze2) (dannf@debian.org) (gcc version 4.3.5 (Debian 4.3.5-4) ) #1 SMP Fri Sep 9 20:23:16 UTC 2011"
but the problem is that I don't know how to get this information from
kernel space. So far I can only get it from user space with
$ strings /usr/lib/debug/boot/vmlinux-2.6.32-5-amd64 | grep "^Linux version"
Linux version 2.6.32-5-amd64 (Debian 2.6.32-35squeeze2) (dannf@debian.org) (gcc version 4.3.5 (Debian 4.3.5-4) ) #1 SMP Fri Sep 9 20:23:16 UTC 2011
$ cat /proc/version
Linux version 2.6.32-5-amd64 (Debian 2.6.32-35) (dannf@debian.org) (gcc version 4.3.5 (Debian 4.3.5-4) ) #1 SMP Tue Jun 14 09:42:28 UTC 2011
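which is not very clean. The Debian package version (the "2.6.32-35"
part) is available as LINUX_COMPILE_DISTRIBUTION_VERSION in compile.h.
For example, on a 3.0.0 kernel,
/usr/src/linux-headers-3.0.0-1-amd64/include/generated/compile.h contains: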
/* This file is auto generated, version 1 */
/* SMP */
#define UTS_MACHINE "x86_64"
#define UTS_VERSION "#1 SMP Sat Aug 27 16:21:11 UTC 2012"
#define LINUX_COMPILE_DISTRIBUTION "Debian"
#define LINUX_COMPILE_DISTRIBUTION_OFFICIAL_BUILD
#define LINUX_COMPILE_DISTRIBUTION_UPLOADER "ben@decadent.org.uk"
#define LINUX_COMPILE_DISTRIBUTION_VERSION "3.0.0-3"
#define LINUX_COMPILE_BY "unknown"
#define LINUX_COMPILE_HOST "Debian"
#define LINUX_COMPILER "gcc version 4.5.3 (Debian 4.5.3-8) "
and gets embedded in the kernel as part of the linux_proc_banner
variable due to a Debian-specific patch to version.c. Unfortunately that
variable is not exported.
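For comparison, the strings/grep lookup shown earlier could be sketched in
C roughly as follows. This is only an illustration: find_linux_banner and
banners_match are hypothetical helper names (not part of systemtap), and
the sketch assumes the file contents have already been read into a buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy the first "Linux version ..." banner found in
 * buf into out (NUL-terminated). Returns 1 on success, 0 if none found. */
int find_linux_banner(const unsigned char *buf, size_t len,
                      char *out, size_t outlen)
{
    static const char prefix[] = "Linux version ";
    const size_t plen = sizeof(prefix) - 1;
    size_t i, n;

    assert(buf != NULL && out != NULL);
    if (outlen == 0 || len < plen)
        return 0;
    for (i = 0; i + plen <= len; i++) {
        if (memcmp(buf + i, prefix, plen) != 0)
            continue;
        /* Copy printable ASCII until a terminator or the buffer ends. */
        for (n = 0; i + n < len && n + 1 < outlen; n++) {
            unsigned char c = buf[i + n];
            if (c < 0x20 || c > 0x7e)
                break;
            out[n] = (char)c;
        }
        out[n] = '\0';
        return 1;
    }
    return 0;
}

/* Returns 1 when the two banners are identical strings. */
int banners_match(const char *running, const char *debuginfo)
{
    return strcmp(running, debuginfo) == 0;
}
```

Feeding it the contents of the debug vmlinux and comparing the result with
/proc/version (minus its trailing newline) would reproduce the check done
by hand above.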
Fortunately utsname()->version is available to us. It contains the build
date rather than the version number, but now that I think of it this
might be even better: people might recompile their custom kernels and
forget to update the version number. I came up with
--- a/translate.cxx
+++ b/translate.cxx
@@ -1210,6 +1210,7 @@
// just in case modversions didn't.
o->newline() << "{";
o->newline(1) << "const char* release = UTS_RELEASE;";
+ o->newline() << "const char* version = UTS_VERSION;";
// NB: This UTS_RELEASE compile-time macro directly checks only that
// the compile-time kbuild tree matches the compile-time debuginfo/etc.
@@ -1230,6 +1231,12 @@
o->newline() << "rc = -EINVAL;";
o->newline(-1) << "}";
+ o->newline() << "if (strcmp (utsname()->version, version)) {";
+ o->newline(1) << "_stp_error (\"module version mismatch (%s vs %s)\", "
+ << "version, utsname()->version);";
+ o->newline() << "rc = -EINVAL;";
+ o->newline(-1) << "}";
+
// perform buildid-based checking if able
o->newline() << "if (_stp_module_check()) rc = -EINVAL;";
@@ -5878,6 +5885,7 @@
s.op->newline() << "#include <linux/utsname.h>";
s.op->newline() << "#include <linux/version.h>";
// s.op->newline() << "#include <linux/compile.h>";
+ s.op->newline() << "#include <generated/compile.h>";
s.op->newline() << "#include \"loc2c-runtime.h\" ";
s.op->newline() << "#include \"access_process_vm.h\" ";
which gives the output line
ERROR: module version mismatch (#1 SMP Sat Aug 27 16:21:11 UTC 2012 vs #1 SMP Sat Aug 27 16:21:11 UTC 2011)
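The same comparison can be tried from user space, where uname(2) exposes
the kernel's utsname()->version as the version field of struct utsname.
A minimal sketch, assuming the expected build string is supplied by the
caller (check_running_version is a hypothetical name, not part of
systemtap):

```c
#include <assert.h>
#include <string.h>
#include <sys/utsname.h>

/* Hypothetical helper: returns 0 if the running kernel's build string
 * (the user-space view of utsname()->version) equals expected,
 * and -1 on mismatch or if uname() fails. */
int check_running_version(const char *expected)
{
    struct utsname u;

    assert(expected != NULL);
    if (uname(&u) != 0)
        return -1;
    return strcmp(u.version, expected) == 0 ? 0 : -1;
}
```

Passing the UTS_VERSION string from the compile.h the module was built
against would mirror the strcmp() in the patch, with -1 standing in for
rc = -EINVAL.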
What do you think?
-Timo