The Design of Software (CLOSED)

A public forum for discussing the design of software, from the user interface to the code architecture. Now closed.

The "Design of Software" discussion group has been merged with the main Joel on Software discussion group.

The archives will remain online indefinitely.

Difference between wchar_t and WCHAR

Is there actually any difference between wchar_t and WCHAR? As far as I can tell they are the same thing, but if that's the case, why does wsprintf (the MS version) have special formatting codes for distinguishing between the two?
Stephen Caldwell
Monday, July 25, 2005
Hmm, bizarre.

My answer would have been that MS defines WCHAR as 2 bytes, whereas wchar_t could be different sizes.
But looking at winnt.h, that's not true.

from winnt.h

#ifndef _MAC
typedef wchar_t WCHAR;    // wc,   16-bit UNICODE character
#else
// some Macintosh compilers don't define wchar_t in a convenient location, or define it as a char
typedef unsigned short WCHAR;    // wc,   16-bit UNICODE character
#endif

Looks like no difference.
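
For what it's worth, a quick compile-time check of the "no difference" conclusion might look like this (just a sketch: it assumes a Windows build with <windows.h> on the include path and a compiler with static_assert and <type_traits> available):

#include <windows.h>
#include <type_traits>

// If WCHAR is literally a typedef for wchar_t (as in the winnt.h excerpt
// above), both assertions compile cleanly; if Microsoft ever changed the
// definition, the build would break right here.
static_assert(std::is_same<WCHAR, wchar_t>::value,
              "WCHAR and wchar_t are different types on this compiler");
static_assert(sizeof(WCHAR) == 2,
              "Win32 wide characters are expected to be 16-bit");

int main() { return 0; }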
Monday, July 25, 2005
From what I've been digging up in MSDN, they're the same thing. WCHAR is the Windows-defined datatype, which can (to my knowledge) be used interchangeably with wchar_t.
Tuesday, July 26, 2005
A friend and I discussed this yesterday. It is a tautology:

"needless repetition of an idea, statement, or word"

I call it ridiculous. I spent 2-4 hours this evening scratching my head over why some simple function would not resolve at link time. Well, the tautology was not self-consistent.

This is the reason some consider typedef to be harmful. It's pathetic.


typedef unsigned long int ULONG;

Can I hear a hearty "FUCK ME"? I say, brothers and sisters, puh-leeeze, I neeed a FUCK ME!  Y-y-y-y-yeeessss. That's what ahm sayin'!
hoser
Tuesday, July 26, 2005
wchar_t is defined in the ANSI C Standard:

"7.17 Common definitions <stddef.h>

[...] and wchar_t
which is an integer type whose range of values can represent distinct codes for all members of the largest extended character set specified among the supported locales; the null character shall have the code value zero and each member of the basic character set shall have a code value equal to its value when used as the lone character in an integer character constant."

It is supported by any conforming implementation, regardless of the platform. Its exact range of values is implementation-specific.

WCHAR is Microsoft-only, so it may well happen that wchar_t and WCHAR are identical with a Windows-specific compiler.
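
To see the implementation-specific part for yourself, a small program like the following prints the size and range on whatever compiler builds it (just a sketch; the numbers it prints are exactly the part the Standard leaves open):

#include <cstdio>
#include <cwchar>   // WCHAR_MIN / WCHAR_MAX: the standard macros for
                    // wchar_t's range, nothing to do with Microsoft's WCHAR

int main() {
    std::printf("sizeof(wchar_t) = %u bytes\n",
                static_cast<unsigned>(sizeof(wchar_t)));
    std::printf("range: %ld .. %lu\n",
                static_cast<long>(WCHAR_MIN),
                static_cast<unsigned long>(WCHAR_MAX));
    return 0;
}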
Tuesday, July 26, 2005
And for what reason would it ever differ?

Would an 'unsigned long int' ever be something other than an 'unsigned long int' in any other implementation, and thus the need for ULONG?

And then there is DWORD, which is not a DOUBLE WORD on anything other than the 80286 or earlier. If there ever were a reason for "abstracting" a WORD as a typedef, this should be the anti-pattern that makes people shun the notion wholesale, thinking "there must be a better way" - and there certainly are better ways.

And yet, it propagates its nasty self into truly non-Windows-specific documents like PCI and USB, and by extension even Linux(!). I feel a lawsuit coming on...
hoser
Tuesday, July 26, 2005
"Would an 'unsigned long int' ever be something other than an 'unsigned long int' in any other implementation, and thus the need for ULONG?"

An unsigned long int is guaranteed to have AT LEAST 32 value bits by the Standard. It may just as well have 33, 64, 128, or any other number of bits at or above that, while being fully conforming.

If your application relies on an unsigned long having EXACTLY 32 bits (e.g. on value wraparound), it will break when you port it to a system whose compiler uses 64 bits for it. Intelligent and future-aware programmers need only change one single typedef and everything will work fine. Not to mention optimized memory consumption and the like.

We don't need to discuss the chaotic and inconsistent ways these features are used by Microsoft, of course. ;)
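
A sketch of the kind of breakage meant above (u32 is a made-up project typedef, not anything from Windows): on a compiler where unsigned long is 32 bits this prints 0, on an LP64 system it prints 4294967296, and the fix is the one typedef line.

#include <cstdio>

// Hypothetical project-wide typedef: the single place that encodes the
// "exactly 32 bits" assumption. On a 64-bit unsigned long, only this
// line needs to change (e.g. to unsigned int).
typedef unsigned long u32;

int main() {
    u32 x = 0xFFFFFFFFUL;
    ++x;   // relies on wraparound to 0 -- only true if u32 is exactly 32 bits
    std::printf("0xFFFFFFFF + 1 = %lu\n", static_cast<unsigned long>(x));
    return 0;
}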
Tuesday, July 26, 2005
FWIW, a wchar_t is 2 bytes on MS, and 4 bytes on most versions of Unix and Linux (because they use 4-byte Unicode).
BillT
Tuesday, July 26, 2005
Hmm... Still can't really determine why the Microsoft wsprintf insists on having formatting codes for both wchar_t and WCHAR, but I guess if there "could" be a difference between the two at some point, it could possibly make sense. It would be nice if they could tell us when they are future-proofing or something.
Stephen Caldwell
Tuesday, July 26, 2005
Apparently some people haven't heard of these newfangled 64-bit processors.

It's about time for another round of "oh, the processor has changed, what do you mean an int is no longer exactly 16 bits? I assumed it would never change!  How dare the bastards keep changing everything on me!"  :)

Tuesday, July 26, 2005
That's all well and good if they want to change the size of WCHAR/wchar_t... but it would be nice if they'd choose one and just stick with it instead of jumping back and forth betwixt them, seemingly without any real reason for it.
Stephen Caldwell
Wednesday, July 27, 2005
wchar_t is a C++ type whose size (like the size of an int) is implementation-dependent. With MS compilers it's 2 bytes (Windows uses UTF-16); for most Linux/Unix it's 4 bytes (UTF-32).

WCHAR is MS's non-implementation-dependent definition. MS defines similar types (e.g., DWORD) to get around the fact that C++ types are often implementation-dependent.

FWIW, MS compilers used to target several other platforms (e.g., Mac 68K & PPC, Alpha, MIPS), not just Intel. And they still target other processors for Windows CE.
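
A sketch of what the "non-implementation-dependent" part amounts to in practice (the MY_* names are made up for illustration; this is the general pattern, not the actual winnt.h text): pick whichever built-in type has the right width on each target and give it one fixed name.

#include <cstdio>

#if defined(_WIN32)
typedef wchar_t        MY_WCHAR;   // 2 bytes with Microsoft compilers (UTF-16)
typedef unsigned long  MY_DWORD;   // 4 bytes on both Win32 and Win64
#else
typedef unsigned short MY_WCHAR;   // 2-byte stand-in where wchar_t is 4 bytes
typedef unsigned int   MY_DWORD;   // 4 bytes on typical Unix/Linux targets
#endif

int main() {
    std::printf("sizeof(MY_WCHAR) = %u, sizeof(MY_DWORD) = %u\n",
                static_cast<unsigned>(sizeof(MY_WCHAR)),
                static_cast<unsigned>(sizeof(MY_DWORD)));
    return 0;
}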
BillT
Thursday, July 28, 2005

This topic is archived. No further replies will be accepted.
