Discussion:
Problems with packages on a netbsd-6 sparc
Chris Ross
2012-05-02 03:23:49 UTC
Permalink
So, I have an old sparc20 I was getting running, and I built a April 29 netbsd-6 for it, and it seemed to boot and run just fine. I had a few watchdog reset problems, but running without mixing processor speeds seems to have fixed that.

Right now I'm running it with a pair of 125Mhz Hypersparcs. And, it's working fine. But, I noticed that one of the early packages I built, sudo, received a bus error as soon as it was run.

Going back to pkgsrc, and building again, I saw it behave the same way. So I built it again after modifying the makefile replacing "-O2" with "-g", and it suddenly works. At least, as root, I was able to "sudo date" and see:

# ./sudo date
sudo: unknown login class: daemon
Tue May 1 12:30:59 EDT 2012
#

So, I built it again, with -O2 and saw what I'd seen before:

# ./sudo date
Bus error
#

Does anyone have any idea what the issue is here? Compiler bug? Thanks... I'll do some more testing with different optimization levels, but the problem clearly exists...

- Chris
Martin Husemann
2012-05-02 08:34:51 UTC
Permalink
Post by Chris Ross
Does anyone have any idea what the issue is here? Compiler bug? Thanks... I'll do some more testing with different optimization levels, but the problem clearly exists...
Most likely a sudo bug, compile it with "-g -O2" and check the core.

Martin
Chris Ross
2012-05-02 14:28:13 UTC
Permalink
Post by Martin Husemann
Most likely a sudo bug, compile it with "-g -O2" and check the core.
Is it really a bug in sudo if the non-optimized code works, but the
optimized code causes a crash?

Looking at it in gdb it certainly looks like a mis-aligned access.
But, I'm not sure what "<optimized out>" means.

Program received signal SIGBUS, Bus error.
0x00024e24 in make_pwitem (pw=0x204a7aa8, name=0x0) at ./pwutil.c:178
178 memcpy(newpw, pw, sizeof(struct passwd));
(gdb) print newpw
$1 = <optimized out>
(gdb) print pw
$2 = (const struct passwd *) 0x204a7aa8
(gdb)


I'll take a look at the code directly, but I still don't understand
how it could be a code bug, rather than an optimizer bug, that only
shows when code is optimized.

- Chris
Greg Troxel
2012-05-02 14:44:14 UTC
Permalink
Is it really a bug in sudo if the non-optimized code works, but the
optimized code causes a crash?

hard to say, but time to read assembly and then we can see.....

cc -g -S will be helpful - that's what I use in times like this.


There is a perhaps-similar issue with pkgsrc/security/cfs. On i386, it
seems to work fine when built with an older compiler (maybe gcc 3.? from
netbsd 4) and fail with gcc 4.1. I haven't figured that out yet.
Mouse
2012-05-02 15:07:48 UTC
Permalink
Post by Chris Ross
Looking at it in gdb it certainly looks like a mis-aligned access.
But, I'm not sure what "<optimized out>" means.
0x00024e24 in make_pwitem (pw=0x204a7aa8, name=0x0) at ./pwutil.c:178
178 memcpy(newpw, pw, sizeof(struct passwd));
(gdb) print newpw
$1 = <optimized out>
(gdb) print pw
$2 = (const struct passwd *) 0x204a7aa8
It means the variable is not stored anywhere at that point, because the
optimizer determined it didn't need to exist. Most likely it has gone
dead (in the data-flow sense), but I can imagine other possibilities.

I'm not sure why you think this means it's misaligned. While I'm not
certain, I think sparc32 has no types requiring stricter than 8-byte
alignment, meaning 204a7aa8 is suitably aligned for anything. Looking
at struct passwd, I don't see any types there which I would expect to
be wider than 8 bytes, though, admittedly, I'm not looking at the -6
<pwd.h>.
Post by Chris Ross
I'll take a look at the code directly, but I still don't understand
how it could be a code bug, rather than an optimizer bug, that only
shows when code is optimized.
Code that depends on things C does not promise will not infrequently
"work" when unoptimized but break when optimized, or vice versa. But,
in such a case, the problem is the code depending on something not
promised, not a bug in the optimizer.

As a simple example of code which is likely to change its behaviour
when optimiized even with a non-buggy optimizer, consider

a = arr[i++] + arr[i++];

This is off in undefined-behaviour weeds (it violates 6.5 #2), and the
machine code it compiles to, if it compiles at all, is (formally)
permitted to behave in any way it pleases. This includes producing
different behaviour depending on the time of day, the geographical
location of the user, or...whether optimization was turned on.

Admittedly, behaviour depending on the user's location is unlikely,
since few implementations have access to that information, but it is
permitted by the language spec. Bringing it back to _likely_ results,
it is reasonably likely that an uoptimized compile might produce the
equivalent of

tmp1 = arr[i++];
tmp2 = arr[i++];
a = tmp1 + tmp2;

while an optimized compile might produce the equivalent of

a = arr[i] + arr[i];
i += 2;

or

i += 2;
a = arr[i-1] + arr[i-1];

or even

a = arr[i] + arr[i];
i ++;

Depending on what the code's author intended, any of these might
constitute "working" in the sense of producing the desired effect, with
others "breaking". Yet this is not a bug in the optimizer; it's a bug
in the code.

/~\ The ASCII Mouse
\ / Ribbon Campaign
X Against HTML ***@rodents-montreal.org
/ \ Email! 7D C8 61 52 5D E7 2D 39 4E F1 31 3E E8 B3 27 4B
Chris Ross
2012-05-02 15:28:46 UTC
Permalink
Post by Chris Ross
0x00024e24 in make_pwitem (pw=0x204a7aa8, name=0x0) at ./pwutil.c:178
178 memcpy(newpw, pw, sizeof(struct passwd));
(gdb) print newpw
$1 =<optimized out>
(gdb) print pw
$2 = (const struct passwd *) 0x204a7aa8
It means the variable is not stored anywhere at that point, because the
optimizer determined it didn't need to exist. Most likely it has gone
dead (in the data-flow sense), but I can imagine other possibilities.
I'm not sure why you think this means it's misaligned. While I'm not
certain, I think sparc32 has no types requiring stricter than 8-byte
alignment, meaning 204a7aa8 is suitably aligned for anything. Looking
at struct passwd, I don't see any types there which I would expect to
be wider than 8 bytes, though, admittedly, I'm not looking at the -6
<pwd.h>.
Oh, I guessed it was an alignment issue just because I've seen that so
many times on sparc's. And, it's not pw I was worried about the alignment
of, it was newpw.

Through some printf debugging, I confirmed that when optimized, newpw
was at a 4-byte alignment. It was a pointer that was offset from a malloc'd
block by sizeof(cache_item), a struct earlier in the same source file.
But interestingly, when not optimized, newpw is also at the same 4-byte
alignment, and the memcpy *doesn't* cause a SIGBUS. That surprised me.

- Chris
Eric Haszlakiewicz
2012-05-02 15:34:12 UTC
Permalink
Post by Chris Ross
Looking at it in gdb it certainly looks like a mis-aligned access.
But, I'm not sure what "<optimized out>" means.
0x00024e24 in make_pwitem (pw=0x204a7aa8, name=0x0) at ./pwutil.c:178
178 memcpy(newpw, pw, sizeof(struct passwd));
(gdb) print newpw
$1 = <optimized out>
(gdb) print pw
$2 = (const struct passwd *) 0x204a7aa8
My guess is that the crash actually happens a few instructions later,
at one of the FIELD_COPY(...) lines that tries to set fields in newpw.

Are you looking at the core dump with gdb, or are you stepping through it?
I'll bet that if you step through it you'll see it get past this memcpy call.

eric
Eduardo Horvath
2012-05-02 15:23:19 UTC
Permalink
Post by Chris Ross
Post by Martin Husemann
Most likely a sudo bug, compile it with "-g -O2" and check the core.
Is it really a bug in sudo if the non-optimized code works, but the
optimized code causes a crash?
Looking at it in gdb it certainly looks like a mis-aligned access. But, I'm
not sure what "<optimized out>" means.
Program received signal SIGBUS, Bus error.
0x00024e24 in make_pwitem (pw=0x204a7aa8, name=0x0) at ./pwutil.c:178
178 memcpy(newpw, pw, sizeof(struct passwd));
(gdb) print newpw
$1 = <optimized out>
(gdb) print pw
$2 = (const struct passwd *) 0x204a7aa8
(gdb)
I'll take a look at the code directly, but I still don't understand how it
could be a code bug, rather than an optimizer bug, that only shows when code
is optimized.
Did it really die in "memcpy()"?

It may be an optimization bug... sortof.

The gcc optimizer will sometimes inline memcpy() instead of calling the
real thing if it thinks it can do a better job. In this case it probably
thinks it knows the size of the copy and makes the assumption that both
the source and destination are properly aligned so it can generate code
that looks sorta like this:

for (i=0; i<sizeof(struct passwd)/sizeof(long); i++)
((long *)&newpw)[i] = ((long *)&pw)[i];

Of course if one of those buffers happens to be unaligned, things go boom.
And if you turn off optimization it calls the real memcpy() which does all
sorts of fancy things like check for unaligned buffers and DTRT.

Most likely the code is actually buggy in the sense that it passes in an
unaligned buffer and assigns it to a "struct passwd *" variable, something
that according to the C standard I believe is undefined. It's also
something that the optimizer can't really know about.

I think all you need to do is give gcc the parameter to tell it to turn
off the inlined memcpy() support, something like "-mno-memcpy".

Eduardo
Martin Husemann
2012-05-02 15:27:54 UTC
Permalink
Post by Chris Ross
Is it really a bug in sudo if the non-optimized code works, but the
optimized code causes a crash?
Yes, without optimization the total data layout may differ.
The code plays tricks with homegrown allocation, packing together various
structs and strings in a single allocation, i.e. a few lines before
the code you cited:

cp = (char *) item + sizeof(struct cache_item);
newpw = (struct passwd *) cp;

(though unless strange optimization options are in place this should not
cause misalignement at first glance)

Martin
Martin Husemann
2012-05-02 16:11:23 UTC
Permalink
I had a look: as Eduardo suggested, the memcpy is inlined as a struct
copy. The code is lying (by using bogus casts), and the compiler is
correct in its optimization.

This patch avoids the lie about the memcpy() args to the compiler,
but I have not tested it further:

--- pwutil.c.orig 2011-10-21 15:18:38.000000000 +0200
+++ pwutil.c 2012-05-02 18:09:37.000000000 +0200
@@ -174,8 +174,8 @@
* Copy in passwd contents and make strings relative to space
* at the end of the buffer.
*/
+ memcpy(cp, pw, sizeof(struct passwd));
newpw = (struct passwd *) cp;
- memcpy(newpw, pw, sizeof(struct passwd));
cp += sizeof(struct passwd);
FIELD_COPY(pw, newpw, pw_name, nsize);
FIELD_COPY(pw, newpw, pw_passwd, psize);

Martin
Martin Husemann
2012-05-03 08:31:47 UTC
Permalink
Post by Martin Husemann
This patch avoids the lie about the memcpy() args to the compiler,
FYI, I have fixed the package.

This was caused by additional alignement required for 64bit time_t.
The proper fix was to let the compiler calculate alignment and offsets
for us.

Martin
Chris Ross
2012-05-03 21:05:05 UTC
Permalink
Post by Martin Husemann
FYI, I have fixed the package.
This was caused by additional alignement required for 64bit time_t.
The proper fix was to let the compiler calculate alignment and offsets
for us.
Okay. Seems like an interesting fix, only as confusing as the code already was, but it works. :-)

Can anyone try to send this upstream to the sudo maintainers? Also, could this be pulled up to pkgsrc-2012Q1, as it's a bug fix?

Thanks…

- Chris
Greg A. Woods
2012-05-18 03:48:00 UTC
Permalink
At Wed, 2 May 2012 18:11:23 +0200, Martin Husemann <***@duskware.de> wrote:
Subject: Re: Problems with packages on a netbsd-6 sparc
Post by Martin Husemann
I had a look: as Eduardo suggested, the memcpy is inlined as a struct
copy. The code is lying (by using bogus casts), and the compiler is
correct in its optimization.
This patch avoids the lie about the memcpy() args to the compiler,
--- pwutil.c.orig 2011-10-21 15:18:38.000000000 +0200
+++ pwutil.c 2012-05-02 18:09:37.000000000 +0200
@@ -174,8 +174,8 @@
* Copy in passwd contents and make strings relative to space
* at the end of the buffer.
*/
+ memcpy(cp, pw, sizeof(struct passwd));
newpw = (struct passwd *) cp;
- memcpy(newpw, pw, sizeof(struct passwd));
cp += sizeof(struct passwd);
FIELD_COPY(pw, newpw, pw_name, nsize);
FIELD_COPY(pw, newpw, pw_passwd, psize);
Why not avoid the non-intuitive memcpy() all together all the time and
just let the compiler do its job????

newpw = pw;

C has allowed structure copies like this since at least C89, and much
earlier in many good compilers (the data pointed to by the fields which
are pointers still needs to be copied of course).

(of course the memory allocation hack there in sudo should be fixed too,
or maybe sudo should just be thrown in the trash and left there :-))
--
Greg A. Woods
Planix, Inc.

<***@planix.com> +1 250 762-7675 http://www.planix.com/
Mouse
2012-05-18 04:33:28 UTC
Permalink
Post by Greg A. Woods
Post by Martin Husemann
+ memcpy(cp, pw, sizeof(struct passwd));
newpw =3D (struct passwd *) cp;
- memcpy(newpw, pw, sizeof(struct passwd));
Why not avoid the non-intuitive memcpy() all together all the time
and just let the compiler do its job????
newpw =3D pw;
Well, *newpw = *pw;, but otherwise, yeah...
Post by Greg A. Woods
(of course the memory allocation hack there in sudo should be fixed
too, or maybe sudo should just be thrown in the trash and left there
:-))
That's what I would do with it. I once looked at setting up sudo for
one of my paid jobs. Here's the report I wrote for the person who
asked me to set it up, edited to remove internal stuff that it isn't my
place to reveal, fix up formatting (the copy of this I'm working from
has been forcibly mangled by a ticket system that insists on shoving
Webpage interfaces around everything), and even fix two typos. But the
basic content is accurate. This is dated 2011-11-01.

First problem:

I went looking to see where sudo came from. Even with my
half-assed Web skillz, I quickly found four different webpages each
of which seemed to think it was the real sudo webpage; given what
sudo is, I was inclined to trust none of them.

After some asking around on an external IRC channel, I settled on
sudo.ws as probably correct.

Second problem:

The thing is friggin' enormous. It's got dynamic loading and
plugins and a whole raft of crap that has no business anywhere near
a central part of a security system. It's drunk the
configure-script koolaid, which is a disaster I've ranted about
elsewhere. But this was for work, which historically doesn't care
about that sort of security issue, so I ignored all that.

Third problem:

I got it built and installed. I set up a rudimentary sudoers file
which you can still find in [target machine]'s /etc/sudoers as of
this writing; the only non-comment lines are

root ALL=(ALL) ALL
Defaults:root env_editor
%infs ALL=(ALL) ALL

visudo is happy with this. But when I run sudo itself, even as
root, I get

sudo: >>> /etc/sudoers: syntax error near line 1 <<<
Segmentation fault

I did some debugging and got nowhere. /etc/sudoers is parsed with
a lex-and-yacc parser; I added debugging at the stdio level and
found it reads the whole file, then at the tokenizer level and
found it errors after getting only one token (which is COMMENT).

At this point I decided to ask the list for help. So I sent mail
to sudo-workers-***@sudo.ws to subscribe.

Fourth problem:

The mail got stuck in [my] mailq saying "Connection reset by
sudo.ws.". A manual run of the mailq reveals that it seems to be
under the impression I'm a spam sender:

sudo-workers-***@sudo.ws... Connecting to sudo.ws. via esmtp...
220 core.courtesan.com ESMTP spamd IP-based SPAM blocker; Tue Nov 1 16:27:35 2011
Post by Greg A. Woods
Post by Martin Husemann
EHLO Sparkle.Rodents-Montreal.ORG
250 Hello, spam sender. Pleased to be wasting your time.
250 You are about to try to deliver spam. Your time will be spent, for nothing.
250 This is hurting you more than it is hurting me.
Post by Greg A. Woods
Post by Martin Husemann
DATA
451 Temporary failure, please try again later.
Post by Greg A. Woods
Post by Martin Husemann
QUIT
sudo-workers-***@sudo.ws... Deferred: Connection reset by sudo.ws.
Closing connection to sudo.ws.

Now, I am - or at least was - listed by SORBS, because I refuse to
consider their abuse attempts anything but, well, abuse attempts.
And there are still a few places left that block based on SORBS.
So I tried from [that workplace's mailserver]:

[(hostname)] 1> telnet 64.85.164.38 smtp
Trying 64.85.164.38...
Connected to courtesan.com.
Escape character is '^]'.
220 core.courtesan.com ESMTP spamd IP-based SPAM blocker; Tue Nov 1 16:33:00 2011
helo (mailserver's FQDN)
250 Hello, spam sender. Pleased to be wasting your time.

I didn't bother going any further, since it seems to be of the
opinion that [that mailserver] is a spam source too [...].

At this point, I am inclined to throw sudo out completely. It has
failed catastrophically at design, at implementation, and at
support.

[...]

(I failed to think of and thus neglected to mention "distribution"
among the things sudo failed at.)

/~\ The ASCII Mouse
\ / Ribbon Campaign
X Against HTML ***@rodents-montreal.org
/ \ Email! 7D C8 61 52 5D E7 2D 39 4E F1 31 3E E8 B3 27 4B
Greg A. Woods
2012-05-18 20:42:06 UTC
Permalink
At Fri, 18 May 2012 00:33:28 -0400 (EDT), Mouse <***@Rodents-Montreal.ORG> wrote:
Subject: Re: Problems with packages on a netbsd-6 sparc
Post by Mouse
Post by Greg A. Woods
or maybe sudo should just be thrown in the trash and left there
:-)
That's what I would do with it. I once looked at setting up sudo for
one of my paid jobs. Here's the report I wrote for the person who
asked me to set it up
Couple all that with the fact there have been at least 18 serious
security bugs in sudo itself (never mind with how naive people can
trivially configure it to be a barn without any doors).....

The very concept of it is extremely flawed IMNSHO. It's a silly hack
created by someone who didn't properly understand the Unix security
model and since then it's been hacked on to add even more anti-security
features than one could ever imagine.

Of course even without it there are far too few people who fully
understand the true implications of just being allowed to run "su".
--
Greg A. Woods
Planix, Inc.

<***@planix.com> +1 250 762-7675 http://www.planix.com/
Eivind Evensen
2012-05-25 10:25:29 UTC
Permalink
Post by Mouse
The mail got stuck in [my] mailq saying "Connection reset by
sudo.ws.". A manual run of the mailq reveals that it seems to be
220 core.courtesan.com ESMTP spamd IP-based SPAM blocker; Tue Nov 1 16:27:35 2011
EHLO Sparkle.Rodents-Montreal.ORG
250 Hello, spam sender. Pleased to be wasting your time.
250 You are about to try to deliver spam. Your time will be spent, for nothing.
250 This is hurting you more than it is hurting me.
DATA
451 Temporary failure, please try again later.
QUIT
Closing connection to sudo.ws.
Now, I am - or at least was - listed by SORBS, because I refuse to
consider their abuse attempts anything but, well, abuse attempts.
And there are still a few places left that block based on SORBS.
[(hostname)] 1> telnet 64.85.164.38 smtp
Trying 64.85.164.38...
Connected to courtesan.com.
Escape character is '^]'.
220 core.courtesan.com ESMTP spamd IP-based SPAM blocker; Tue Nov 1 16:33:00 2011
helo (mailserver's FQDN)
250 Hello, spam sender. Pleased to be wasting your time.
That's spamd:
http://www.openbsd.org/cgi-bin/man.cgi?query=spamd&apropos=0&sektion=0&manpath=OpenBSD+Current&arch=i386&format=html
--
Eivind
David Laight
2012-05-18 06:55:35 UTC
Permalink
Post by Greg A. Woods
Post by Martin Husemann
* Copy in passwd contents and make strings relative to space
* at the end of the buffer.
*/
+ memcpy(cp, pw, sizeof(struct passwd));
newpw = (struct passwd *) cp;
- memcpy(newpw, pw, sizeof(struct passwd));
cp += sizeof(struct passwd);
FIELD_COPY(pw, newpw, pw_name, nsize);
FIELD_COPY(pw, newpw, pw_passwd, psize);
Why not avoid the non-intuitive memcpy() all together all the time and
just let the compiler do its job????
*newpw = *pw;
C has allowed structure copies like this since at least C89, and much
earlier in many good compilers (the data pointed to by the fields which
are pointers still needs to be copied of course).
You need to read the entire thread!
C only allows you to take pointers to objects, cast them to certain other
types (including char * and void *), cast them back to the SAME type
and then dereference them.
This means that the compiler is allowed to assume that 'pw' is a pointer
to an item of the relevent type, and thus has the alignment constraints
of 'struct passwd', and so can be read (in this case) with 64bit accesses.
Since the compiler knows about memcpy() that also applies when it decides
on the code to use for the memcpy() call.
Since the alignment is tracked as a property of the value, no amount
of casting will remove it.

David
--
David Laight: ***@l8s.co.uk
Greg A. Woods
2012-05-18 20:13:11 UTC
Permalink
At Fri, 18 May 2012 07:55:35 +0100, David Laight <***@l8s.co.uk> wrote:
Subject: Re: Problems with packages on a netbsd-6 sparc
Post by David Laight
Post by Greg A. Woods
Post by Martin Husemann
* Copy in passwd contents and make strings relative to space
* at the end of the buffer.
*/
+ memcpy(cp, pw, sizeof(struct passwd));
newpw = (struct passwd *) cp;
- memcpy(newpw, pw, sizeof(struct passwd));
cp += sizeof(struct passwd);
FIELD_COPY(pw, newpw, pw_name, nsize);
FIELD_COPY(pw, newpw, pw_passwd, psize);
Why not avoid the non-intuitive memcpy() all together all the time and
just let the compiler do its job????
*newpw = *pw;
C has allowed structure copies like this since at least C89, and much
earlier in many good compilers (the data pointed to by the fields which
are pointers still needs to be copied of course).
You need to read the entire thread!
C only allows you to take pointers to objects, cast them to certain other
types (including char * and void *), cast them back to the SAME type
and then dereference them.
I think that's kind of irrelevant given that I suggested getting rid of
the memcpy() entirely, but....

C allows pointers of, say, "void *" to be cast to _ANY_ type, alignment
restrictions aside. Also, any function is allowed to return a "void *"
pointer which, given such a claim in that function's definition, the
programmer can then assume it will be aligned correctly for any use.
The standard malloc() function is such a function.

(So far as I can recall there is no requirement in C, and certainly not
old C or C89, that a "void *" pointer must be cast back to the type it
had before it was cast into a "void *" pointer. Pointer aliasing like
that is not necessarily going to work in a _portable_ way if alignment
restrictions have not been dealt with somehow, but it's still allowed.)

If I'm not mistaken the "cp" pointer above, at the line where it is
assigned to "newpw" prior to the patch, is unchanged from the value it
had when it was most recently assigned from the return value of a
malloc() call.

In that case it MUST be properly aligned for any type. Turning it into
a "struct passwd *" is perfectly valid -- the fact that the storage
being pointed at is larger than a "struct passwd" is irrelevant (it's
just an example of premature optimisation).

--
Greg A. Woods
Planix, Inc.

<***@planix.com> +1 250 762-7675 http://www.planix.com/
Chris Ross
2012-05-18 20:28:33 UTC
Permalink
Post by Greg A. Woods
If I'm not mistaken the "cp" pointer above, at the line where it is
assigned to "newpw" prior to the patch, is unchanged from the value it
had when it was most recently assigned from the return value of a
malloc() call.
You are mistaken. cp was the result of malloc() result being incremented by the sizeof a struct.

- Chris
Greg A. Woods
2012-05-19 00:31:31 UTC
Permalink
At Fri, 18 May 2012 16:28:33 -0400, Chris Ross <cross+***@distal.com> wrote:
Subject: Re: Problems with packages on a netbsd-6 sparc
Post by Chris Ross
Post by Greg A. Woods
If I'm not mistaken the "cp" pointer above, at the line where it is
assigned to "newpw" prior to the patch, is unchanged from the value it
had when it was most recently assigned from the return value of a
malloc() call.
You are mistaken. cp was the result of malloc() result being
incremented by the sizeof a struct.
The original code I have says:

if ((cp = malloc(total)) == NULL)
return(NULL);
newpw = (struct passwd *) cp;

I see in the current pkgsrc patches though that this is no longer true.

In other words I guess the code got even more broken since then. :-)
--
Greg A. Woods
Planix, Inc.

<***@planix.com> +1 250 762-7675 http://www.planix.com/
Mouse
2012-05-18 22:09:13 UTC
Permalink
Post by Greg A. Woods
Post by Greg A. Woods
Post by Martin Husemann
+ memcpy(cp, pw, sizeof(struct passwd));
newpw = (struct passwd *) cp;
- memcpy(newpw, pw, sizeof(struct passwd));
Why not avoid the non-intuitive memcpy() all together all the time
and just let the compiler do its job????
*newpw = *pw;
I think that's kind of irrelevant given that I suggested getting rid
of the memcpy() entirely, but....
The problem is that cp did not point to a struct passwd. It pointed
into a mallocked buffer, incremented by the size of some other struct.
"newpw = (struct passwd *) cp" is off in undefined-behaviour weeds
unless cp is suitably aligned (6.3.2.3 #7, and note that indirecting
through the result is not necessary to invoke undefined behaviour).
Because of this, the old memcpy (the one copying into newpw) was
justified in assuming newpw was properly aligned for a struct passwd,
which it was on many historical machines, either because of the way the
compiler treated the two structs in question or because the machines
didn't actually have any alignment restrictions. But in this case that
particular bit of nonportability came back to bite the code. Struct
assignment such as you suggest would be equally undefined and, in
practice, would almost certainly be at least as big a lose as the
memcpy, in exactly the cases where memcpy failed.
Post by Greg A. Woods
(So far as I can recall there is no requirement in C, and certainly
not old C or C89, that a "void *" pointer must be cast back to the
type it had before it was cast into a "void *" pointer. [...])
No, there's no requirement that anything be done with it at all.

However, if it is cast to some other type, it has to be correctly
aligned for that type (which is difficult to ensure in general) or "the
behavior is undefined".

/~\ The ASCII Mouse
\ / Ribbon Campaign
X Against HTML ***@rodents-montreal.org
/ \ Email! 7D C8 61 52 5D E7 2D 39 4E F1 31 3E E8 B3 27 4B
Martin Husemann
2012-05-18 07:16:57 UTC
Permalink
FYI: that this has been fixed differently/slightly more intrusive in a sane
way upstream.

The memcpy() was only the first to hit the misaligned data in the internal
cache structure, any later access might have crashed as well (depending
on the accessed fields and compiler optimization).

Martin
Loading...