This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



Re: On glibc's resolver


On Tue, Dec 25, 2012 at 10:14 PM, Dimitrios Apostolou <jimis@gmx.net> wrote:
> I was trying to write a patch for glibc so hopefully this is the appropriate
> list, please let me know otherwise.

Excellent question, and this is the right list.

> I have been tracing weird behaviour of my mail client (alpine) and ended up
> in getaddrinfo() calls, which are handled by glibc's resolver. In
> particular, when I connect my laptop to different networks and the previous
> DNS server is unreachable, resolver never re-reads its cache and all queries
> time out after several retries.

What we need is a test case with expected and observed behaviour.
Given a test case we can justify or refute the expected or observed
behaviour against relevant standards or prior art.

> Apparently this is a known issue, and a web search reveals discussions from
> as early as 2003. I'd appreciate your opinions, I was thinking of writing a
> patch but I can't figure out where it should go, alpine or glibc, code or
> documentation! Here are the replies I gathered from a web search:

Could you please provide references to the prior discussions so we can
review them also?

> 1) Use a caching daemon (nscd maybe, some argue that it does not provide a
> solution) which should be restarted/reloaded when changing networks.
>
> 2) Call res_init() if getaddrinfo() fails.

These two solutions are interesting in that the
distribution/application is in control of when and why the resolver
should carry out the costly operation of reloading whatever data is
required to resolve a name.

The distribution can restart or reload nscd when the network is
reconfigured. The application can call res_init() as required (perhaps
as documented by new documentation).
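
To make option 2 concrete, here is a minimal sketch of what the
application side might look like. The helper name lookup_with_reinit()
is hypothetical, and the sketch assumes that calling res_init() again
re-reads the resolver configuration and is safe to do more than once,
which has not been verified in this thread.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/nameser.h>
#include <netdb.h>
#include <resolv.h>
#include <stddef.h>

/* Hypothetical helper: try getaddrinfo() once; if it fails, ask the
 * resolver to re-initialise and try exactly one more time.
 * Returns 0 on success or a getaddrinfo() EAI_* error code. */
int lookup_with_reinit(const char *host, const char *service,
                       const struct addrinfo *hints,
                       struct addrinfo **result)
{
    int rc = getaddrinfo(host, service, hints, result);
    if (rc == 0)
        return 0;

    /* Assumption: a repeated res_init() call picks up a changed
     * /etc/resolv.conf. If it reports failure, keep the first error. */
    if (res_init() != 0)
        return rc;

    /* Single retry against the (possibly) refreshed configuration. */
    return getaddrinfo(host, service, hints, result);
}
```

The retry is deliberately limited to one attempt, so an application
that is genuinely offline pays for at most one extra lookup. On
success the caller releases the result with freeaddrinfo() as usual.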

> 3) Patch glibc to stat() /etc/resolv.conf, checking for changes. Debian,
> Ubuntu are patched.

This sounds like the worst possible solution, imposing a penalty on
all applications for an event that is well defined at a higher level.
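
For comparison, the same check can live in an application that opts
in, so that only that application pays for the stat(). The following
is a rough sketch, not the Debian/Ubuntu patch (which I have not
looked at); struct conf_stamp and conf_changed() are hypothetical
names invented for illustration.

```c
#include <sys/types.h>
#include <sys/stat.h>
#include <time.h>

/* Hypothetical record of the file state the application last saw. */
struct conf_stamp {
    time_t mtime;
    off_t  size;
    ino_t  inode;
    int    valid;   /* 0 until the first successful stat() */
};

/* Returns 1 if the file differs from the recorded state (the record
 * is then updated), 0 if it is unchanged, -1 if stat() fails. */
int conf_changed(const char *path, struct conf_stamp *stamp)
{
    struct stat st;

    if (stat(path, &st) != 0)
        return -1;

    /* The first successful call always counts as a change. */
    int changed = !stamp->valid
               || st.st_mtime != stamp->mtime
               || st.st_size  != stamp->size
               || st.st_ino   != stamp->inode;

    stamp->mtime = st.st_mtime;
    stamp->size  = st.st_size;
    stamp->inode = st.st_ino;
    stamp->valid = 1;
    return changed;
}
```

An application could call conf_changed("/etc/resolv.conf", &stamp)
at a point of its own choosing, for example after a failed lookup,
and call res_init() only when it returns 1.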

> 4) Use a custom DNS library, glibc is unsuitable for this purpose.

Certainly an option. You are allowed to do as you wish with your system.

However, I do not think that glibc is unsuitable for these purposes;
with some effort we can put together a solution.

> Here is my take. About nscd, I'm having the problem on a major distro
> (Fedora) so I can only guess there are good reasons for not using it by
> default.

The complexity of caching name server requests is not something that
should be enabled by default unless there is a specific need.

> On (2), res_init() is a BSD non-standard function, and its man page doesn't
> mention such a purpose. In fact I can't be sure if it's safe to call it
> multiple times and I see no guarantee that it will re-initialise the
> resolver more than once. If it's the proposed way shouldn't it be mentioned
> in both res_init() and getaddrinfo()'s man pages, or otherwise a big warning
> that resolv.conf is never reparsed?

This seems like a sensible solution, i.e. an API call that guarantees
that the resolver operates correctly after a network configuration
change.

I haven't reviewed the code in question so I don't actually know if
res_init() is safe to be used this way. Part of your work would be to
look into this and propose the documentation patch and provide
sufficient background to justify the changes.

> On (3) I don't have a Debian system to check it, but the overhead of
> stat'ing on every request is probably unacceptable. I was thinking of
> writing a patch that would stat() and reparse after a single request
> timeout, so that following retries (unless RES_DFLRETRY is reached) will
> automatically connect to the new servers. Would that be acceptable?

No.

> Finally using a custom library sounded logical, until I started reading
> glibc's resolver. Really, with such size and complexity and even
> asynchronous interface provided, shouldn't we also provide the simplest
> facilities?

We should.

> And a related question, is there a way to setup resolver behaviour (timeout,
> retries) for a process programmatically, instead of changing the system-wide
> resolv.conf?

There is no interface for this. This is another place where
enhancements would be greatly appreciated.

Please feel free to email libc-help@sourceware.org if you have any
more general questions about how to do X or Y.

Cheers,
Carlos.

