This is the mail archive of the systemtap@sourceware.org mailing list for the systemtap project.



Re: tracking memory map changes


David Smith <dsmith@redhat.com> writes:

> [...]
> Let me start by describing what a memory map is composed of.

Thanks for getting started with this.

> vm_start-vm_end  flags vm_pgoff MJ:MN inode      path
> ----------------- ---- -------- ----- -------    -----------------
> 00110000-00111000 r-xp 00110000 00:00 0          [vdso]
> [...]
> 00674000-007c7000 r-xp 00000000 fd:00 3083040    /lib/libc-2.7.so
> 007c7000-007c9000 r-xp 00153000 fd:00 3083040    /lib/libc-2.7.so
> 007c9000-007ca000 rwxp 00155000 fd:00 3083040    /lib/libc-2.7.so
> [...]
> 08048000-0804d000 r-xp 00000000 fd:00 2621473    /bin/cat
> 0804d000-0804e000 rw-p 00004000 fd:00 2621473    /bin/cat
> [...]
> At first I was confused by multiple vm_area_structs for /lib/ld-2.7.so,
> /lib/libc-2.7.so and /bin/cat, until I realized they were for the .text,
> .data, and .bss sections of those files.  [...]

Not necessarily: there can exist rw- mappings of the same areas that
are later (or even concurrently) mapped r-x.  BSS regions in
particular aren't even really mapped in from a given binary because
they're not present in there in the first place.

  eu-readelf -S FILE: Type == NOBITS
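
To make the BSS point concrete, here is a minimal sketch of how the
zero-filled part of a loadable segment could be computed from its
program header.  The function name and the sample numbers are
hypothetical; only the p_vaddr/p_filesz/p_memsz relationship comes from
the ELF format itself.

```python
from typing import Optional, Tuple


def zero_fill_range(p_vaddr: int, p_filesz: int, p_memsz: int) -> Optional[Tuple[int, int]]:
    """Return the [start, end) address range that exists in memory only.

    Bytes past p_filesz (up to p_memsz) are not stored in the ELF file;
    the loader zero-fills them.  That is where NOBITS sections such as
    .bss end up.  Returns None when the segment has no such tail.
    """
    if p_memsz <= p_filesz:
        return None
    return (p_vaddr + p_filesz, p_vaddr + p_memsz)


# Hypothetical data segment: 0x400 bytes come from the file,
# 0x1000 bytes are occupied in memory.
EXAMPLE = zero_fill_range(0x0804D000, 0x400, 0x1000)
```

A segment whose p_memsz equals its p_filesz has no zero-filled tail, so
the function returns None for it.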

> Note that there are no explicit flags set on a vm_area_struct for the
> differences between sections - in other words, there is nothing that
> definitively says that this particular vm_area_struct maps a .text
> section vs. a .data section vs. a .bss section.  [...]

That's OK - the kernel doesn't care.  The vm_pgoff value tells us
which part of the underlying ELF file is being mapped.  The translator
will need to pass enough data to the runtime to figure out that, e.g.,
file offset 0x153000 of libc-2.7.so falls within its text segment.

  eu-readelf -l FILE: Offset
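
A minimal sketch of that lookup, assuming the translator hands the
runtime a small table of loadable segments taken from the program
headers.  The LoadSegment type, the function name, and the libc numbers
below are all hypothetical, and the segment offsets are assumed to be
page-aligned to keep the comparison simple.

```python
from typing import NamedTuple, Optional, Sequence


class LoadSegment(NamedTuple):
    offset: int  # file offset where the segment starts (p_offset)
    filesz: int  # number of bytes the segment occupies in the file
    flags: str   # permission flags as eu-readelf prints them, e.g. "R E"


def segment_for_offset(segments: Sequence[LoadSegment],
                       file_offset: int) -> Optional[LoadSegment]:
    """Return the segment whose file range contains file_offset, if any."""
    for seg in segments:
        if seg.offset <= file_offset < seg.offset + seg.filesz:
            return seg
    return None


# Hypothetical segment table for libc-2.7.so (made-up values, not
# copied from a real binary).
LIBC_SEGMENTS = [
    LoadSegment(offset=0x000000, filesz=0x154000, flags="R E"),
    LoadSegment(offset=0x155000, filesz=0x002800, flags="RW"),
]
```

With this made-up table, segment_for_offset(LIBC_SEGMENTS, 0x153000)
selects the "R E" entry, and an offset outside every segment yields
None.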

> Frank, here are some initial questions.
>
> Q1: What information will the runtime need from each vm_area_struct?
> I'd guess the path, vm_start, and vm_end at a minimum.

And vm_pgoff.
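
For illustration, the four fields can be pulled out of one line of the
listing quoted above roughly as follows.  This is only a sketch over
the text format; the Mapping type and parse_maps_line name are
hypothetical, and the real runtime would read the vm_area_struct
directly rather than parse text.

```python
from typing import NamedTuple


class Mapping(NamedTuple):
    vm_start: int
    vm_end: int
    flags: str
    offset: int  # third column of the listing, parsed as hex
    path: str    # empty string when the mapping has no path


def parse_maps_line(line: str) -> Mapping:
    """Parse one 'start-end flags offset MJ:MN inode path' line."""
    fields = line.split(None, 5)
    addr, flags, offset = fields[0], fields[1], fields[2]
    start_text, end_text = addr.split("-")
    path = fields[5].strip() if len(fields) > 5 else ""
    return Mapping(int(start_text, 16), int(end_text, 16),
                   flags, int(offset, 16), path)


EXAMPLE = parse_maps_line(
    "007c7000-007c9000 r-xp 00153000 fd:00 3083040    /lib/libc-2.7.so")
```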

> Q2: Will the runtime want to know only about new text sections being
> added or all sections?

For now, the text stuff (which, for your purposes, may be those pages
that are mapped in with "x" (execute) privileges).  Before long though
I'd like to give the runtime a map of the programs' *data* also, so
that data pointers can be mapped to data symbols.  That would perhaps
allow us to profile "frequently accessed variables".

> Q3: Will the runtime want to know about any of the vm_area_structs not
> associated with a file?

For now, probably not.  This should not be a hard decision though.
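
Taken together, the answers to Q2 and Q3 amount to a simple filter.  A
rough sketch over the text listing, assuming "associated with a file"
can be approximated by a nonzero inode (the function name is
hypothetical):

```python
def wanted_by_runtime(maps_line: str) -> bool:
    """True for file-backed mappings that carry execute permission."""
    fields = maps_line.split()
    flags = fields[1]
    inode = int(fields[4])
    # Assumption: inode 0 means the mapping is not backed by a file.
    return "x" in flags and inode != 0


LINES = [
    "00110000-00111000 r-xp 00110000 00:00 0          [vdso]",
    "00674000-007c7000 r-xp 00000000 fd:00 3083040    /lib/libc-2.7.so",
    "0804d000-0804e000 rw-p 00004000 fd:00 2621473    /bin/cat",
]
SELECTED = [line for line in LINES if wanted_by_runtime(line)]
```

Under that assumption the [vdso] line is skipped because its inode is
0, and the rw-p /bin/cat line is skipped because it is not executable.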

> When /bin/cat gets exec'ed, the /lib/ld-2.7.so and /bin/cat files are
> already mapped in. [...]

Yup, and we definitely want to know about those.

> So, where/how to track memory map changes?  Here are a few ideas:
> [...]  3) Turn on utrace syscall return tracing for that thread and
> wait for mmap calls to return.  This is probably the easiest route,
> but it forces every syscall for that thread to go through the slow
> path.  [...]

Let's do this for now.

> In all of the above methods the code won't know what was added, just
> that a new vm_area_struct might exist, so I'll have to figure out a
> way to track changes. [...]

Right.
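
One simple way to do that tracking is to keep the previous snapshot of
the map and compare it with a fresh one after each mmap return.  A
minimal sketch, assuming each mapping is summarized as a
(vm_start, vm_end, offset, path) tuple; the names and the choice of key
are hypothetical.

```python
from typing import List, Sequence, Tuple

Entry = Tuple[int, int, int, str]  # (vm_start, vm_end, offset, path)


def diff_maps(old: Sequence[Entry],
              new: Sequence[Entry]) -> Tuple[List[Entry], List[Entry]]:
    """Return (added, removed) entries between two snapshots."""
    old_set, new_set = set(old), set(new)
    added = sorted(new_set - old_set)
    removed = sorted(old_set - new_set)
    return added, removed


# Snapshot at exec time, then after libc has been mapped in.
BEFORE = [(0x08048000, 0x0804D000, 0x0, "/bin/cat")]
AFTER = BEFORE + [(0x00674000, 0x007C7000, 0x0, "/lib/libc-2.7.so")]
ADDED, REMOVED = diff_maps(BEFORE, AFTER)
```

A mapping that is resized or split shows up as one removal plus one or
more additions, which is coarse but enough to notice that something
changed.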


- FChE

