The P-Bug
From Andrey
From: Potekhin, Andrey
Tuesday, February 01, 2005 10:00 PM
To: Dev Team
Subject: The P-bug
Dear colleagues,
Once in a while people put their names on things that they discover: Rorschach spots, Alzheimer's disease, the Eiffel Tower. Let me introduce Potekhin's Bug.
Let's put this value into a database field:
130
Let's retrieve it into a char using an ordinary ADO call:
char c = (char)(short)f->Item["MyField"]->Value;
In the above code, we need to cast to short since ADO doesn't know how to cast to char directly. Then we need to cast to char to avoid the compiler's warning of a 'possible loss of data'. Since we are storing values that are less than 256, all of their bits fit into a single byte, so we are sure that no data is lost.
Predictably, the retrieved value, as shown by the debugger, is not 130 but -126.
If you think that this is the bug I'm talking about, it is not. This is just the usual signed/unsigned issue. Since c is declared as a (signed) char, the moment a 1 lands in its upper bit the value becomes negative. This is not a bug; this is how it is supposed to be. The bits are all the same. It is still 130 under the hood. Let's continue.
Retrieve the same value as an integer:
int i = (short)f->Item["MyField"]->Value;
As you probably guessed, this time the debugger shows 130.
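The two retrievals can be reproduced without a database. The sketch below is only an illustration: readField is a hypothetical stand-in for the ADO call, and signed char is spelled out because plain char is signed on some compilers only (it is by default on MSVC).

```cpp
#include <cstdio>

// Hypothetical stand-in for the ADO field read: the stored value, as a short.
constexpr short readField() { return 130; }

// Same casts as in the message. On a two's-complement machine the narrowing
// cast keeps the low 8 bits (10000010), which a signed char reads as -126.
constexpr signed char readAsChar() { return static_cast<signed char>(readField()); }

// Widening short to int keeps the value 130.
constexpr int readAsInt() { return readField(); }

// Prints both retrieved values the way the debugger would show them.
inline void printRetrieved()
{
    std::printf("c = %d, i = %d\n", readAsChar(), readAsInt());
}
```

Calling printRetrieved() from main should print "c = -126, i = 130" on a typical two's-complement platform.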
Now, the question: what do you think will be the result of this code:
if (c == i)
{
    AfxMessageBox("Heaven");
}
else
{
    AfxMessageBox("Hell");
}
On the one hand, this is a case of -126 vs. 130. On the other, we know that:
- Both values have the same bits - 10000010, which are read from the same database field.
- Both values are signed values, so there is no signed/unsigned mismatch.
- When evaluating the expression, both values are implicitly converted to the same type, int. So, when compared, they are of the same size. In other words, the compiler sees it as if ((int)c == i).
So, which message will you see? I'll have to tell you. Despite these assumptions, you'll get the else branch: "Hell". This is what we call Potekhin's bug.
Explanation of the trick
One may say, of course, 130 is not -126, and that's why they don't match. However, this does not explain how we ended up with such results. Here's the explanation. Compare:
Before conversion to int (low byte only):
(int) 130 == 10000010
(char) -126 == 10000010
After conversion to int (all 32 bits):
(int) 130 == 00000000000000000000000010000010
(int) -126 == 11111111111111111111111110000010
When a signed char gets converted to int, its sign bit gets propagated all the way to the left (sign extension), so the resulting int is still -126, not 130.
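The propagation can be made visible by printing the full pattern of each operand after its conversion to int. This is a rough sketch assuming a 32-bit int; patternOf and printPatterns are names invented for the illustration.

```cpp
#include <bitset>
#include <iostream>

// The bit pattern an int holds, returned as an unsigned value so that it
// can be printed bit by bit. Assumes a 32-bit int.
constexpr unsigned int patternOf(int v) { return static_cast<unsigned int>(v); }

// Prints the pattern each operand has once it has been converted to int.
inline void printPatterns()
{
    const signed char c = static_cast<signed char>(130); // holds -126
    const int i = 130;
    // c is promoted to int on the way into patternOf, which is exactly the
    // conversion the comparison performs.
    std::cout << std::bitset<32>(patternOf(c)) << "  (c as int)\n"
              << std::bitset<32>(patternOf(i)) << "  (i)\n";
}
```

Under that assumption, the first line printed starts with twenty-four 1s and the second with twenty-four 0s; only the last eight bits, 10000010, are the same.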
Well, I knew it. Kind of. I knew that it gets propagated. The problem is, I didn't realize that it could lead to scenarios like the one described above.
Conclusions? Same old rule. Never use a signed char to store anything above 127. Or if you do, don't compare it to an integer :)
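If the value is already sitting in a signed char, one possible way to compare it safely is to go through unsigned char first, so that the conversion to int fills the upper bits with zeros rather than copies of the sign bit. The helper below is a sketch of that idea, not something from the original code; sameByte is a made-up name.

```cpp
// Compares a byte held in a signed char with an int by its unsigned value
// (0..255), so the sign bit is not extended during the conversion to int.
constexpr bool sameByte(signed char c, int i)
{
    return static_cast<unsigned char>(c) == i;
}
```

With c holding the bits 10000010 and i equal to 130, sameByte(c, i) is true, while the plain c == i is false. Declaring the variable as unsigned char in the first place avoids the need for a helper.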