This is the mail archive of the binutils@sourceware.org mailing list for the binutils project.



Re: [PATCH] Two level comdat priorities in gold


I'd prefer to avoid something like two-level comdats, which forces the
linker into the two-pass approach used for --gc-sections and --icf.
However, if you do need something like that, I think using a group
flag in the GRP_MASKOS bit range to identify "weak" comdat groups
would be preferable to the .gnu.comdat.low section, and would address
the issue with ld -r. It also ought to be possible to implement such a
concept without going to two passes; perhaps by maintaining a list of
weak comdat groups.
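
To make the single-pass idea concrete, here is a rough sketch, not
gold's actual code. GRP_COMDAT and GRP_MASKOS are the real gABI values;
the weak flag, the ComdatTable class, and its member names are all
hypothetical and only serve the illustration. The table remembers
whether the copy it kept was weak, so a later strong group with the
same signature can still replace it without a second pass.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_map>

// Real ELF group flag values from the gABI.
constexpr std::uint32_t GRP_COMDAT = 0x1;
constexpr std::uint32_t GRP_MASKOS = 0x0ff00000;

// HYPOTHETICAL: no such flag is defined anywhere; any bit inside
// GRP_MASKOS could be assigned to mean "weak comdat group".
constexpr std::uint32_t GRP_WEAK_COMDAT_HYPOTHETICAL = 0x00100000;
static_assert((GRP_WEAK_COMDAT_HYPOTHETICAL & GRP_MASKOS) != 0,
              "the weak flag must live in the OS-specific range");

// Which object currently provides a group, and whether that copy is weak.
struct GroupChoice {
  int object_index;
  bool weak;
};

// Hypothetical single-pass bookkeeping keyed by group signature.
class ComdatTable {
 public:
  // Returns true if the group from OBJECT_INDEX should be kept (for now).
  // A real linker would also have to discard the sections of a weak
  // group that gets displaced; this sketch only tracks the decision.
  bool add_group(const std::string& signature, std::uint32_t flags,
                 int object_index) {
    assert((flags & GRP_COMDAT) != 0);
    const bool weak = (flags & GRP_WEAK_COMDAT_HYPOTHETICAL) != 0;
    auto result =
        groups_.emplace(signature, GroupChoice{object_index, weak});
    if (result.second)
      return true;  // First copy seen: keep it, weak or not.
    GroupChoice& kept = result.first->second;
    if (kept.weak && !weak) {
      // A strong copy displaces the weak copy kept earlier.
      kept = GroupChoice{object_index, false};
      return true;
    }
    return false;  // Ordinary comdat rule: the first copy wins.
  }

  // Index of the object whose copy of the group is currently kept.
  int kept_object(const std::string& signature) const {
    return groups_.at(signature).object_index;
  }

 private:
  std::unordered_map<std::string, GroupChoice> groups_;
};
```

Because the flag travels in the group's flag word rather than in a
separate section, it would presumably survive ld -r as long as the
linker copies the flag word through unchanged.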

>>> While it is possible to construct test cases for this problem using C
>>> inline functions, in practice the problem is going to arise in C++.
>>> In C++ it's similar to the problem solved by using ABI tags.  This
>>> suggests to me that we should have a compiler option allowing an ABI
>>> tag to be specified for all weak definitions.  As far as I can see
>>> that would address the entire problem, with no confusion about -r, and
>>> permitting optimized functions to call optimized versions of the vague
>>> linkage definitions.

This still might have a problem with virtual functions, if the
compiler needs to emit a vtable. In that case, the vtable will point
to the "accidentally optimized" -mxxx function, and if that's the copy
of the vtable we end up with, everyone is going to call it. You could
consider extending the ABI tag to the vtable itself, essentially
creating a new class, but that would still be a problem if we
construct an instance of the class in the optimized code and return
that instance to non-optimized code.

It seems to me that the only safe way to do this is to make sure that
template instantiations and out-of-line inlines are compiled with
optimization settings suitable for the entire program. The downside of
not calling avx-optimized template functions from avx-optimized code
doesn't seem that bad -- if a call is performance sensitive enough, it
should be inlined, in which case the optimizations could apply.

If you could simply disallow virtual functions, either the
localization approach or the ABI tag approach should work -- they have
essentially the same effect, except that with ABI tags you could avoid
the code bloat.
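
For reference, this is roughly what an ABI tag looks like today when
applied by hand to a single vague-linkage function. The compiler option
discussed above, which would tag all weak definitions at once, does not
exist as far as I know; the function name and the "avx" tag string
below are made up for illustration. The abi_tag attribute itself is a
GNU extension supported by GCC and recent Clang.

```cpp
// Hypothetical inline function compiled in an object built with -mxxx.
// The tag becomes part of the mangled name (with the Itanium ABI
// mangling it should show up as a "B3avx" component), so this copy
// gets its own symbol and its own comdat group instead of competing
// with the untagged copy emitted by normally-optimized objects.
inline __attribute__((abi_tag("avx"))) int clamp_to_byte(int x) {
  if (x < 0) return 0;
  if (x > 255) return 255;
  return x;
}

// Callers in the optimized object bind to the tagged symbol.
inline int brighten(int pixel, int delta) {
  return clamp_to_byte(pixel + delta);
}
```

All objects built with the same tag share one tagged copy, which is
where the saving over localization comes from.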

You could also just add a compiler option to suppress generated
template functions and out-of-line inlines completely, then cross your
fingers and hope that the needed functions will be available in some
other object. If you're in control of the libraries that contain these
avx-optimized functions, that may not be as dangerous as it sounds --
just add a normally-optimized .o that instantiates all the routines
needed by the avx-optimized objects.
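
Short of a new compiler option, standard C++ already offers something
close for individual instantiations: an explicit instantiation
declaration (extern template) in the optimized objects, paired with an
explicit instantiation definition in the normally-optimized .o. The
sketch below squeezes the three pieces into one file so it compiles on
its own; the file names in the comments and the function names are
hypothetical. Note that this only covers instantiations you list by
hand, and it does nothing for out-of-line copies of inline functions.

```cpp
#include <cstddef>
#include <vector>

// ---- shared header (hypothetical name: sum.h) ----
template <typename T>
T sum(const std::vector<T>& v) {
  T total = T();
  for (std::size_t i = 0; i < v.size(); ++i)
    total += v[i];
  return total;
}

// Explicit instantiation declaration: tells every includer not to
// emit its own out-of-line copy of sum<int>, and to expect one from
// some other object at link time.
extern template int sum<int>(const std::vector<int>&);

// ---- avx-optimized object (hypothetical name: hot.cc, built with -mxxx) ----
// Uses sum<int> but, because of the declaration above, does not
// instantiate an out-of-line -mxxx copy of it.
int hot_path(const std::vector<int>& v) {
  return sum(v);
}

// ---- normally-optimized object (hypothetical name: instantiate.cc) ----
// Explicit instantiation definition: the single out-of-line copy,
// compiled with options that are safe for the whole program.
template int sum<int>(const std::vector<int>&);
```

If instantiate.cc is left out of the link, the reference to sum<int>
stays undefined, which is the "cross your fingers" failure mode, but
at least it fails at link time rather than at run time.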

You mentioned pointer equality, but if that's really an issue, the
only way I can see to solve that is to make sure you don't apply the
-mxxx options to the template functions.

-cary

