Archive for the ‘programming’ Category

C++ Round-Up

Sunday, March 15th, 2009

Here’s a round-up of interesting links I’ve recently come across related to C++:

dcc – A decompiler that outputs C code. It’s very restricted and the derivative project, Boomerang, appears to be abandoned. It’s hard to see how this could replace OllyDbg, but it’s an interesting piece of work.

Boost Memory Mapped Files – Boost has been solving core problems of C++ for years in a cross-platform way. I’m glad to have finally noticed this addition.

Ch – SoftIntegration has a product that is a scripting language with C++ syntax. According to their web page, the scripting engine can read C++ source files. The idea of code re-use for scripting is interesting, but I suspect shortcomings exist. Also, with all the modules for Perl at CPAN, it’s hard to see the value.

Spec – An interesting idea of embedding specifications into C++ in a readable manner to support Behavior-Driven Development (BDD).


How Large Is a Short, Int, Long or Long Long?

Sunday, February 1st, 2009

How many bytes are in an int or a long? Most people think the size of an int varies by platform between 2 and 4 bytes, while a long is 4 bytes regardless. Though that can happen, that’s not how the C standard defines the sizes of the integer types.

The C standard allows the int types to vary as everyone expects, but C only defines a minimum size. The short integer must have at least 16 bits. Type int must also have at least 16 bits and be at least as large as the short integer. The long has a minimum of 32 bits. C99 introduced the long long type which has a minimum of 64 bits. For each type, the C Standard defines no upper boundary. Here’s a quote from the C99 draft standard:

The values given below shall be replaced by constant expressions suitable for use in #if preprocessing directives. [...] Their implementation-defined values shall be equal or greater in magnitude (absolute value) to those shown, with the same sign.

The previous passage is talking specifically about the definitions in limits.h, which, in the standard, have the following definitions:

minimum value for type short int: SHRT_MIN -32767 // -(2^15 - 1)
maximum value for type short int: SHRT_MAX +32767 // 2^15 - 1
maximum value for type unsigned short int: USHRT_MAX 65535 // 2^16 - 1
minimum value for type int: INT_MIN -32767 // -(2^15 - 1)
maximum value for type int: INT_MAX +32767 // 2^15 - 1
maximum value for type unsigned int: UINT_MAX 65535 // 2^16 - 1
minimum value for type long int: LONG_MIN -2147483647 // -(2^31 - 1)
maximum value for type long int: LONG_MAX +2147483647 // 2^31 - 1
maximum value for type unsigned long int: ULONG_MAX 4294967295 // 2^32 - 1
minimum value for type long long int: LLONG_MIN -9223372036854775807 // -(2^63 - 1)
maximum value for type long long int: LLONG_MAX +9223372036854775807 // 2^63 - 1
maximum value for type unsigned long long int: ULLONG_MAX 18446744073709551615 // 2^64 - 1
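
As a rough sketch of how these guarantees look in code, the minimums listed above can be asserted at compile time. This assumes a C11 compiler (for static_assert), and the WIDTH_IN_BITS macro is a made-up helper for illustration, not something the standard provides.

```c
#include <assert.h>   /* static_assert (C11) */
#include <limits.h>   /* CHAR_BIT and the *_MAX macros */

/* The standard only promises minimum ranges, so these checks pass on any
   conforming implementation; the actual values may be larger. */
static_assert(SHRT_MAX >= 32767, "short has at least 16 bits");
static_assert(INT_MAX >= SHRT_MAX, "int is at least as large as short");
static_assert(LONG_MAX >= 2147483647L, "long has at least 32 bits");
static_assert(LLONG_MAX >= 9223372036854775807LL,
              "long long has at least 64 bits");

/* Hypothetical helper: storage size of a type in bits, padding included. */
#define WIDTH_IN_BITS(type) (sizeof(type) * CHAR_BIT)
```

On a typical 64-bit Linux system WIDTH_IN_BITS(long) would evaluate to 64, while on 64-bit Windows it would be 32; both satisfy the minimum.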

The key point is that the C standard defines minimum ranges. Your platform may have different values than what the standard states, but if so, they must be at least as large in magnitude. The safe way to determine the minimum and maximum values for all C types on your platforms is to use limits.h, e.g.:

#include <limits.h>
#include <stdio.h>
int main(void) {
    printf("Minimum value for a signed int: %d\n", INT_MIN);
    printf("Maximum value for a signed int: %d\n", INT_MAX);
}

In C++, rely on limits rather than climits. In C++, limits defines a template class, numeric_limits, specialized for each of the fundamental types (e.g. int, short, float, double, etc.). To determine the range of possible values in C++ using numeric_limits, for an int, you would do the following:

#include <iostream>
#include <limits>
using namespace std;
int main() {
    cout << "Minimum value for a signed int: " << numeric_limits<int>::min() << endl;
    cout << "Maximum value for a signed int: " << numeric_limits<int>::max() << endl;
}

Good references on the topic are Harbison and Steele’s C: A Reference Manual and Steve Summit’s comp.lang.c FAQ.


Correct Use of size_t

Saturday, January 31st, 2009

One misunderstood part of C is size_t. First appearing in ANSI C around 1990, size_t is routinely misused. People either use size_t incorrectly or they don’t use it at all. The idea behind it is simple and I’ll attempt an equally simple explanation. The appropriate use for size_t is to describe the size of objects, not to serve as an alias for unsigned int.

unsigned long min_index=min, max_index=max;
/* problematic: size_t may be narrower than unsigned long */
for (size_t i=min_index; i<=max_index; i++) data_array[i]++;

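The previous code can break. It assumes that size_t can store min_index and max_index, but on some platforms size_t may be narrower than an unsigned long. There, a large min_index is silently truncated when it is assigned to i, and i can never reach a max_index that exceeds what size_t can hold, so the loop touches the wrong elements or never ends. The above code should use unsigned long for the index.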
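
To illustrate the failure mode, the sketch below uses uint16_t as a stand-in for a size_t that is narrower than unsigned long. The type and function names are hypothetical, chosen only for this example. Converting an out-of-range value to an unsigned type does not trap; the value wraps modulo 65536, which is how an index would end up silently wrong.

```c
#include <assert.h>   /* static_assert (C11) */
#include <limits.h>   /* ULONG_MAX */
#include <stdbool.h>
#include <stdint.h>   /* uint16_t, UINT16_MAX */

/* Stand-in for a platform where size_t is only 16 bits wide (an assumption
   made for illustration, so the effect is visible on any machine). */
typedef uint16_t narrow_size_t;

static_assert(ULONG_MAX > UINT16_MAX, "unsigned long is wider than the stand-in");

/* Mimics `size_t i = min_index;` on such a platform: the value is reduced
   modulo 65536 with no diagnostic at run time. */
narrow_size_t to_narrow(unsigned long value)
{
    return (narrow_size_t)value;
}

/* Guard a caller could apply before trusting the conversion. */
bool fits_in_narrow(unsigned long value)
{
    return value <= UINT16_MAX;
}
```

With these definitions, to_narrow(70000UL) yields 4464 (70000 - 65536), and fits_in_narrow(70000UL) reports false.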

Using size_t as an alias for some type of int on your target architectures may seem functionally sound, but your assumptions are likely wrong and the code is semantically wrong. Since the type of size_t will shift from platform to platform, you shouldn’t rely on size_t to do anything other than describe the size of objects and types.

int obj_size = get_obj_size();
struct very_large_struct *ptr_src_vl=src, *ptr_dest_vl;
ptr_dest_vl = malloc(obj_size);
/* sanity checking for ptr_dest_vl validity not shown */
memcpy(ptr_dest_vl, ptr_src_vl, obj_size);

The previous code is not using size_t where it should. The C standard only requires int to have at least 16 bits, so the above code has an implicit limit imposed by the size of the int type. Imagine int is 16 bits: being signed, its maximum value is 32,767, so obj_size cannot describe a larger object even if the platform supports one. The lesson is: the maximum value for an int type has no bearing on the maximum object size for a platform. However, size_t will hold the maximum possible object size for your system (assuming your C implementation is correct). If you go with the C standard on this issue, it will help your code portability and maintainability.
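
A minimal sketch of the same copy written around size_t might look like the following. The function name clone_object is hypothetical, and the size is passed in as a parameter rather than obtained from get_obj_size(), whose return type the original snippet does not show.

```c
#include <assert.h>
#include <stddef.h>   /* size_t */
#include <stdlib.h>   /* malloc, free */
#include <string.h>   /* memcpy */

/* Returns a heap copy of the `size` bytes at `src`, or NULL if allocation
   fails. The size travels as size_t end to end, matching what sizeof
   yields and what malloc and memcpy take. The caller frees the result. */
void *clone_object(const void *src, size_t size)
{
    assert(src != NULL);
    void *dest = malloc(size);
    if (dest == NULL) {
        return NULL;
    }
    memcpy(dest, src, size);
    return dest;
}
```

A call such as clone_object(&obj, sizeof obj) then never squeezes the size through an int on the way to malloc or memcpy.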