author | Reid Spencer <rspencer@reidspencer.com> | 2007-07-11 17:01:13 +0000 |
---|---|---|
committer | Reid Spencer <rspencer@reidspencer.com> | 2007-07-11 17:01:13 +0000 |
commit | 5f016e2cb5d11daeb237544de1c5d59f20fe1a6e (patch) | |
tree | 8b6bfcb8783d16827f896d5facbd4549300e8a1e /NOTES.txt | |
parent | a5f182095bf2065ca94f1c86957ee91f9068964b (diff) |
Stage two of getting CFE top correct.
git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@39734 91177308-0d34-0410-b5e6-96231b3b80d8
Diffstat (limited to 'NOTES.txt')
-rw-r--r-- | NOTES.txt | 218 |
1 files changed, 218 insertions, 0 deletions
diff --git a/NOTES.txt b/NOTES.txt
new file mode 100644
index 0000000000..da8421112b
--- /dev/null
+++ b/NOTES.txt
@@ -0,0 +1,218 @@
+//===---------------------------------------------------------------------===//
+// Random Notes
+//===---------------------------------------------------------------------===//
+
+C90/C99/C++ Comparisons:
+http://david.tribble.com/text/cdiffs.htm
+
+//===---------------------------------------------------------------------===//
+Extensions:
+
+ * "#define_target X Y"
+ This preprocessor directive works exactly the same was as #define, but it
+ notes that 'X' is a target-specific preprocessor directive. When used, a
+ diagnostic is emitted indicating that the translation unit is non-portable.
+
+ If a target-define is #undef'd before use, no diagnostic is emitted. If 'X'
+ were previously a normal #define macro, the macro is tainted. If 'X' is
+ subsequently #defined as a non-target-specific define, the taint bit is
+ cleared.
+
+ * "#define_other_target X"
+ The preprocessor directive takes a single identifier argument. It notes
+ that this identifier is a target-specific #define for some target other than
+ the current one. Use of this identifier will result in a diagnostic.
+
+ If 'X' is later #undef'd or #define'd, the taint bit is cleared. If 'X' is
+ already defined, X is marked as a target-specific define.
+
+//===---------------------------------------------------------------------===//
+
+To time GCC preprocessing speed without output, use:
+ "time gcc -MM file"
+This is similar to -Eonly.
+
+
+//===---------------------------------------------------------------------===//
+
+ C++ Template Instantiation benchmark:
+ http://users.rcn.com/abrahams/instantiation_speed/index.html
+
+//===---------------------------------------------------------------------===//
+
+TODO: File Manager Speedup:
+
+ We currently do a lot of stat'ing for files that don't exist, particularly
+ when lots of -I paths exist (e.g. see the <iostream> example, check for
+ failures in stat in FileManager::getFile). It would be far better to make
+ the following changes:
+ 1. FileEntry contains a sys::Path instead of a std::string for Name.
+ 2. sys::Path contains timestamp and size, lazily computed. Eliminate from
+ FileEntry.
+ 3. File UIDs are created on request, not when files are opened.
+ These changes make it possible to efficiently have FileEntry objects for
+ files that exist on the file system, but have not been used yet.
+
+ Once this is done:
+ 1. DirectoryEntry gets a boolean value "has read entries". When false, not
+ all entries in the directory are in the file mgr, when true, they are.
+ 2. Instead of stat'ing the file in FileManager::getFile, check to see if
+ the dir has been read. If so, fail immediately, if not, read the dir,
+ then retry.
+ 3. Reading the dir uses the getdirentries syscall, creating an FileEntry
+ for all files found.
+
+//===---------------------------------------------------------------------===//
+
+TODO: Fast #Import:
+
+ * Get frameworks that don't use #import to do so, e.g.
+ DirectoryService, AudioToolbox, CoreFoundation, etc. Why not using #import?
+ Because they work in C mode? C has #import.
+ * Have the lexer return a token for #import instead of handling it itself.
+ - Create a new preprocessor object with no external state (no -D/U options
+ from the command line, etc). Alternatively, keep track of exactly which
+ external state is used by a #import: declare it somehow.
+ * When having reading a #import file, keep track of whether we have (and/or
+ which) seen any "configuration" macros. Various cases:
+ - Uses of target args (__POWERPC__, __i386): Header has to be parsed
+ multiple times, per-target. What about #ifndef checks? How do we know?
+ - "Configuration" preprocessor macros not defined: POWERPC, etc. What about
+ things like __STDC__ etc? What is and what isn't allowed.
+ * Special handling for "umbrella" headers, which just contain #import stmts:
+ - Cocoa.h/AppKit.h - Contain pointers to digests instead of entire digests
+ themselves? Foundation.h isn't pure umbrella!
+ * Frameworks digests:
+ - Can put "digest" of a framework-worth of headers into the framework
+ itself. To open AppKit, just mmap
+ /System/Library/Frameworks/AppKit.framework/"digest", which provides a
+ symbol table in a well defined format. Lazily unstream stuff that is
+ needed. Contains declarations, macros, and debug information.
+ - System frameworks ship with digests. How do we handle configuration
+ information? How do we handle stuff like:
+ #if MAC_OS_X_VERSION_MAX_ALLOWED >= MAC_OS_X_VERSION_10_2
+ which guards a bunch of decls? Should there be a couple of default
+ configs, then have the UI fall back to building/caching its own?
+ - GUI automatically builds digests when UI is idle, both of system
+ frameworks if they aren't not available in the right config, and of app
+ frameworks.
+ - GUI builds dependence graph of frameworks/digests based on #imports. If a
+ digest is out date, dependent digests are automatically invalidated.
+
+ * New constraints on #import for objc-v3:
+ - #imported file must not define non-inline function bodies.
+ - Alternatively, they can, and these bodies get compiled/linked *once*
+ per app into a dylib. What about building user dylibs?
+ - Restrictions on ObjC grammar: can't #import the body of a for stmt or fn.
+ - Compiler must detect and reject these cases.
+ - #defines defined within a #import have two behaviors:
+ - By default, they escape the header. These macros *cannot* be #undef'd
+ by other code: this is enforced by the front-end.
+ - Optionally, user can specify what macros escape (whitelist) or can use
+ #undef.
+
+//===---------------------------------------------------------------------===//
+
+TODO: New language feature: Configuration queries:
+ - Instead of #ifdef __POWERPC__, use "if (strcmp(`cpu`, __POWERPC__))", or
+ some other, better, syntax.
+ - Use it to increase the number of "architecture-clean" #import'd files,
+ allowing a single index to be used for all fat slices.
+
+//===---------------------------------------------------------------------===//
+
+The 'portability' model in clang is sufficient to catch translation units (or
+their parts) that are not portable, but it doesn't help if the system headers
+are non-portable and not fixed. An alternative model that would be easy to use
+is a 'tainting' scheme. Consider:
+
+int32_t
+OSHostByteOrder(void) {
+#if defined(__LITTLE_ENDIAN__)
+ return OSLittleEndian;
+#elif defined(__BIG_ENDIAN__)
+ return OSBigEndian;
+#else
+ return OSUnknownByteOrder;
+#endif
+}
+
+It would be trivial to mark 'OSHostByteOrder' as being non-portable (tainted)
+instead of marking the entire translation unit. Then, if OSHostByteOrder is
+never called/used by the current translation unit, the t-u wouldn't be marked
+non-portable. However, there is no good way to handle stuff like:
+
+extern int X, Y;
+
+#ifndef __POWERPC__
+#define X Y
+#endif
+
+int bar() { return X; }
+
+When compiling for powerpc, the #define is skipped, so it doesn't know that bar
+uses a #define that is set on some other target. In practice, limited cases
+could be handled by scanning the skipped region of a #if, but the fully general
+case cannot be implemented efficiently. In this case, for example, the #define
+in the protected region could be turned into either a #define_target or
+#define_other_target as appropriate. The harder case is code like this (from
+OSByteOrder.h):
+
+ #if (defined(__ppc__) || defined(__ppc64__))
+ #include <libkern/ppc/OSByteOrder.h>
+ #elif (defined(__i386__) || defined(__x86_64__))
+ #include <libkern/i386/OSByteOrder.h>
+ #else
+ #include <libkern/machine/OSByteOrder.h>
+ #endif
+
+The realistic way to fix this is by having an initial #ifdef __llvm__ that
+defines its contents in terms of the llvm bswap intrinsics. Other things should
+be handled on a case-by-case basis.
+
+
+We probably have to do something smarter like this in the future. The C++ header
+<limits> contains a lot of code like this:
+
+ static const int digits10 = __LDBL_DIG__;
+ static const int min_exponent = __LDBL_MIN_EXP__;
+ static const int min_exponent10 = __LDBL_MIN_10_EXP__;
+ static const float_denorm_style has_denorm
+ = bool(__LDBL_DENORM_MIN__) ? denorm_present : denorm_absent;
+
+ ... since this isn't being used in an #ifdef, it should be easy enough to taint
+the decl for these ivars.
+
+
+/usr/include/sys/cdefs.h contains stuff like this:
+
+#if defined(__ppc__)
+# if defined(__LDBL_MANT_DIG__) && defined(__DBL_MANT_DIG__) && \
+ __LDBL_MANT_DIG__ > __DBL_MANT_DIG__
+# if __ENVIRONMENT_MAC_OS_X_VERSION_MIN_REQUIRED__-0 < 1040
+# define __DARWIN_LDBL_COMPAT(x) __asm("_" __STRING(x) "$LDBLStub")
+# else
+# define __DARWIN_LDBL_COMPAT(x) __asm("_" __STRING(x) "$LDBL128")
+# endif
+# define __DARWIN_LDBL_COMPAT2(x) __asm("_" __STRING(x) "$LDBL128")
+# define __DARWIN_LONG_DOUBLE_IS_DOUBLE 0
+# else
+# define __DARWIN_LDBL_COMPAT(x) /* nothing */
+# define __DARWIN_LDBL_COMPAT2(x) /* nothing */
+# define __DARWIN_LONG_DOUBLE_IS_DOUBLE 1
+# endif
+#elif defined(__i386__) || defined(__ppc64__) || defined(__x86_64__)
+# define __DARWIN_LDBL_COMPAT(x) /* nothing */
+# define __DARWIN_LDBL_COMPAT2(x) /* nothing */
+# define __DARWIN_LONG_DOUBLE_IS_DOUBLE 0
+#else
+# error Unknown architecture
+#endif
+
+An ideal way to solve this issue is to mark __DARWIN_LDBL_COMPAT /
+__DARWIN_LDBL_COMPAT2 / __DARWIN_LONG_DOUBLE_IS_DOUBLE as being non-portable
+because they depend on non-portable macros. In practice though, this may end
+up being a serious problem: every use of printf will mark the translation unit
+non-portable if targetting ppc32 and something else.
+
+//===---------------------------------------------------------------------===//
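
Note on the "Extensions" entry in NOTES.txt: the taint-bit rules for "#define_target" and "#define_other_target" can be read as a small per-macro state machine. The block below is a minimal sketch of that reading in C++ and is not code from this commit. MacroState, TaintTable and every member name are hypothetical, and recording a tainted but undefined entry when "#define_other_target" names an undefined macro is an assumption about how the notes would be implemented.

```cpp
#include <string>
#include <unordered_map>

// Hypothetical per-macro record; not a clang data structure.
struct MacroState {
  bool defined;         // the name currently has a definition
  bool targetSpecific;  // the "taint bit" from the notes
};

// Hypothetical table that applies the rules listed under "Extensions".
class TaintTable {
  std::unordered_map<std::string, MacroState> macros_;

public:
  // Plain #define: (re)defines the name and clears any taint.
  void define(const std::string &name) {
    macros_[name] = MacroState{true, false};
  }

  // #define_target X Y: defines the name and taints it, including the
  // case where X was previously a normal #define.
  void defineTarget(const std::string &name) {
    macros_[name] = MacroState{true, true};
  }

  // #define_other_target X: taints the name without defining it.
  // If X is already defined it stays defined and becomes target-specific.
  // (operator[] value-initializes a missing entry to {false, false}.)
  void defineOtherTarget(const std::string &name) {
    macros_[name].targetSpecific = true;
  }

  // #undef: drops the definition and the taint together.
  void undef(const std::string &name) { macros_.erase(name); }

  bool isDefined(const std::string &name) const {
    auto it = macros_.find(name);
    return it != macros_.end() && it->second.defined;
  }

  // True when a use of the name should produce the non-portability
  // diagnostic described in the notes.
  bool useIsNonPortable(const std::string &name) const {
    auto it = macros_.find(name);
    return it != macros_.end() && it->second.targetSpecific;
  }
};
```

In this sketch a preprocessor would call useIsNonPortable() each time an identifier is expanded or tested, so a target-define that is #undef'd before any use never triggers the diagnostic. Keeping the taint on the macro entry rather than on the translation unit matches the 'tainting' scheme discussed later in the notes.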