author Reid Spencer <rspencer@reidspencer.com> 2007-07-11 17:01:13 +0000
committer Reid Spencer <rspencer@reidspencer.com> 2007-07-11 17:01:13 +0000
commit 5f016e2cb5d11daeb237544de1c5d59f20fe1a6e
tree 8b6bfcb8783d16827f896d5facbd4549300e8a1e /NOTES.txt
parent a5f182095bf2065ca94f1c86957ee91f9068964b
Stage two of getting CFE top correct.
git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@39734 91177308-0d34-0410-b5e6-96231b3b80d8
Diffstat (limited to 'NOTES.txt')
-rw-r--r-- NOTES.txt 218
1 files changed, 218 insertions, 0 deletions
diff --git a/NOTES.txt b/NOTES.txt
new file mode 100644
index 0000000000..da8421112b
--- /dev/null
+++ b/NOTES.txt
@@ -0,0 +1,218 @@
+//===---------------------------------------------------------------------===//
+// Random Notes
+//===---------------------------------------------------------------------===//
+
+C90/C99/C++ Comparisons:
+http://david.tribble.com/text/cdiffs.htm
+
+//===---------------------------------------------------------------------===//
+Extensions:
+
+ * "#define_target X Y"
+ This preprocessor directive works exactly the same way as #define, but it
+ notes that 'X' is a target-specific preprocessor macro. When 'X' is used, a
+ diagnostic is emitted indicating that the translation unit is non-portable.
+
+ If a target-define is #undef'd before use, no diagnostic is emitted. If 'X'
+ was previously a normal #define macro, the macro is tainted. If 'X' is
+ subsequently #defined as a non-target-specific define, the taint bit is
+ cleared.
+
+ * "#define_other_target X"
+ The preprocessor directive takes a single identifier argument. It notes
+ that this identifier is a target-specific #define for some target other than
+ the current one. Use of this identifier will result in a diagnostic.
+
+ If 'X' is later #undef'd or #define'd, the taint bit is cleared. If 'X' is
+ already defined, X is marked as a target-specific define.
+
+//===---------------------------------------------------------------------===//
+
+To time GCC preprocessing speed without output, use:
+ "time gcc -MM file"
+This is similar to -Eonly.
+
+
+//===---------------------------------------------------------------------===//
+
+ C++ Template Instantiation benchmark:
+ http://users.rcn.com/abrahams/instantiation_speed/index.html
+
+//===---------------------------------------------------------------------===//
+
+TODO: File Manager Speedup:
+
+ We currently do a lot of stat'ing for files that don't exist, particularly
+ when lots of -I paths exist (e.g. see the <iostream> example, check for
+ failures in stat in FileManager::getFile). It would be far better to make
+ the following changes:
+ 1. FileEntry contains a sys::Path instead of a std::string for Name.
+ 2. sys::Path contains timestamp and size, lazily computed. Eliminate from
+ FileEntry.
+ 3. File UIDs are created on request, not when files are opened.
+ These changes make it possible to efficiently have FileEntry objects for
+ files that exist on the file system, but have not been used yet.
+
+ Once this is done:
+ 1. DirectoryEntry gets a boolean value "has read entries". When false, not
+ all entries in the directory are in the file mgr; when true, they are.
+ 2. Instead of stat'ing the file in FileManager::getFile, check to see if
+ the dir has been read. If so, fail immediately; if not, read the dir,
+ then retry.
+ 3. Reading the dir uses the getdirentries syscall, creating a FileEntry
+ for all files found.
+
+//===---------------------------------------------------------------------===//
+
+TODO: Fast #import:
+
+ * Get frameworks that don't use #import to do so, e.g.
+ DirectoryService, AudioToolbox, CoreFoundation, etc. Why aren't they using
+ #import? Because they work in C mode? C has #import.
+ * Have the lexer return a token for #import instead of handling it itself.
+ - Create a new preprocessor object with no external state (no -D/U options
+ from the command line, etc). Alternatively, keep track of exactly which
+ external state is used by a #import: declare it somehow.
+ * When reading a #import file, keep track of whether we have seen any (and/or
+ which) "configuration" macros. Various cases:
+ - Uses of target args (__POWERPC__, __i386): Header has to be parsed
+ multiple times, per-target. What about #ifndef checks? How do we know?
+ - "Configuration" preprocessor macros not defined: POWERPC, etc. What about
+ things like __STDC__ etc? What is and what isn't allowed?
+ * Special handling for "umbrella" headers, which just contain #import stmts:
+ - Cocoa.h/AppKit.h - Contain pointers to digests instead of entire digests
+ themselves? Foundation.h isn't pure umbrella!
+ * Framework digests:
+ - Can put "digest" of a framework-worth of headers into the framework
+ itself. To open AppKit, just mmap
+ /System/Library/Frameworks/AppKit.framework/"digest", which provides a
+ symbol table in a well defined format. Lazily unstream stuff that is
+ needed. Contains declarations, macros, and debug information.
+ - System frameworks ship with digests. How do we handle configuration
+ information? How do we handle stuff like:
+ #if MAC_OS_X_VERSION_MAX_ALLOWED >= MAC_OS_X_VERSION_10_2
+ which guards a bunch of decls? Should there be a couple of default
+ configs, then have the UI fall back to building/caching its own?
+ - GUI automatically builds digests when UI is idle, both of system
+ frameworks if they aren't available in the right config, and of app
+ frameworks.
+ - GUI builds dependence graph of frameworks/digests based on #imports. If a
+ digest is out of date, dependent digests are automatically invalidated.
+
+ * New constraints on #import for objc-v3:
+ - #imported file must not define non-inline function bodies.
+ - Alternatively, they can, and these bodies get compiled/linked *once*
+ per app into a dylib. What about building user dylibs?
+ - Restrictions on ObjC grammar: can't #import the body of a for stmt or fn.
+ - Compiler must detect and reject these cases.
+ - #defines defined within a #import have two behaviors:
+ - By default, they escape the header. These macros *cannot* be #undef'd
+ by other code: this is enforced by the front-end.
+ - Optionally, user can specify what macros escape (whitelist) or can use
+ #undef.
+
+//===---------------------------------------------------------------------===//
+
+TODO: New language feature: Configuration queries:
+ - Instead of #ifdef __POWERPC__, use "if (strcmp(`cpu`, __POWERPC__))", or
+ some other, better, syntax.
+ - Use it to increase the number of "architecture-clean" #import'd files,
+ allowing a single index to be used for all fat slices.
+
+//===---------------------------------------------------------------------===//
+
+The 'portability' model in clang is sufficient to catch translation units (or
+their parts) that are not portable, but it doesn't help if the system headers
+are non-portable and not fixed. An alternative model that would be easy to use
+is a 'tainting' scheme. Consider:
+
+int32_t
+OSHostByteOrder(void) {
+#if defined(__LITTLE_ENDIAN__)
+ return OSLittleEndian;
+#elif defined(__BIG_ENDIAN__)
+ return OSBigEndian;
+#else
+ return OSUnknownByteOrder;
+#endif
+}
+
+It would be trivial to mark 'OSHostByteOrder' as being non-portable (tainted)
+instead of marking the entire translation unit. Then, if OSHostByteOrder is
+never called/used by the current translation unit, the t-u wouldn't be marked
+non-portable. However, there is no good way to handle stuff like:
+
+extern int X, Y;
+
+#ifndef __POWERPC__
+#define X Y
+#endif
+
+int bar() { return X; }
+
+When compiling for powerpc, the #define is skipped, so the compiler doesn't
+know that bar uses a #define that is set on some other target. In practice,
+limited cases could be handled by scanning the skipped region of a #if, but the
+fully general case cannot be implemented efficiently. In this case, for
+example, the #define in the protected region could be turned into either a
+#define_target or #define_other_target as appropriate. The harder case is code
+like this (from OSByteOrder.h):
+
+ #if (defined(__ppc__) || defined(__ppc64__))
+ #include <libkern/ppc/OSByteOrder.h>
+ #elif (defined(__i386__) || defined(__x86_64__))
+ #include <libkern/i386/OSByteOrder.h>
+ #else
+ #include <libkern/machine/OSByteOrder.h>
+ #endif
+
+The realistic way to fix this is by having an initial #ifdef __llvm__ that
+defines its contents in terms of the llvm bswap intrinsics. Other things should
+be handled on a case-by-case basis.
+
+
+We probably have to do something smarter like this in the future. The C++ header
+<limits> contains a lot of code like this:
+
+ static const int digits10 = __LDBL_DIG__;
+ static const int min_exponent = __LDBL_MIN_EXP__;
+ static const int min_exponent10 = __LDBL_MIN_10_EXP__;
+ static const float_denorm_style has_denorm
+ = bool(__LDBL_DENORM_MIN__) ? denorm_present : denorm_absent;
+
+ ... since this isn't being used in an #ifdef, it should be easy enough to taint
+the decl for these static members.
+
+
+/usr/include/sys/cdefs.h contains stuff like this:
+
+#if defined(__ppc__)
+# if defined(__LDBL_MANT_DIG__) && defined(__DBL_MANT_DIG__) && \
+ __LDBL_MANT_DIG__ > __DBL_MANT_DIG__
+# if __ENVIRONMENT_MAC_OS_X_VERSION_MIN_REQUIRED__-0 < 1040
+# define __DARWIN_LDBL_COMPAT(x) __asm("_" __STRING(x) "$LDBLStub")
+# else
+# define __DARWIN_LDBL_COMPAT(x) __asm("_" __STRING(x) "$LDBL128")
+# endif
+# define __DARWIN_LDBL_COMPAT2(x) __asm("_" __STRING(x) "$LDBL128")
+# define __DARWIN_LONG_DOUBLE_IS_DOUBLE 0
+# else
+# define __DARWIN_LDBL_COMPAT(x) /* nothing */
+# define __DARWIN_LDBL_COMPAT2(x) /* nothing */
+# define __DARWIN_LONG_DOUBLE_IS_DOUBLE 1
+# endif
+#elif defined(__i386__) || defined(__ppc64__) || defined(__x86_64__)
+# define __DARWIN_LDBL_COMPAT(x) /* nothing */
+# define __DARWIN_LDBL_COMPAT2(x) /* nothing */
+# define __DARWIN_LONG_DOUBLE_IS_DOUBLE 0
+#else
+# error Unknown architecture
+#endif
+
+An ideal way to solve this issue is to mark __DARWIN_LDBL_COMPAT /
+__DARWIN_LDBL_COMPAT2 / __DARWIN_LONG_DOUBLE_IS_DOUBLE as being non-portable
+because they depend on non-portable macros. In practice though, this may end
+up being a serious problem: every use of printf will mark the translation unit
+non-portable if targeting ppc32 and something else.
+
+//===---------------------------------------------------------------------===//
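
The "File Manager Speedup" TODO in NOTES.txt proposes replacing one stat per lookup with a single directory read guarded by a "has read entries" flag. A minimal sketch of that idea follows. It is not the FileManager API: CachedFileManager, DirLister, and the shape of DirectoryEntry are hypothetical names chosen for illustration, and the directory listing is injected as a callback so that the caching logic can be exercised without touching the file system.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <unordered_map>
#include <unordered_set>
#include <utility>
#include <vector>

// Hypothetical callback that returns the file names found in a directory.
// A real implementation might wrap getdirentries or a directory iterator.
using DirLister = std::function<std::vector<std::string>(const std::string &)>;

// Per-directory state: the "has read entries" flag from the notes, plus the
// set of names recorded when the directory was read.
struct DirectoryEntry {
  bool HasReadEntries = false;
  std::unordered_set<std::string> Files;
};

// Illustrative stand-in for the file manager (an assumption, not clang's class).
class CachedFileManager {
  DirLister ListDir;
  std::unordered_map<std::string, DirectoryEntry> Dirs;

public:
  explicit CachedFileManager(DirLister L) : ListDir(std::move(L)) {
    assert(ListDir && "a directory lister is required");
  }

  // Returns true if Dir contains Name. Each directory is read at most once;
  // later misses fail immediately without consulting the lister again.
  bool getFile(const std::string &Dir, const std::string &Name) {
    DirectoryEntry &DE = Dirs[Dir];
    if (!DE.HasReadEntries) {
      for (const std::string &F : ListDir(Dir))
        DE.Files.insert(F);
      DE.HasReadEntries = true;
    }
    return DE.Files.count(Name) != 0;
  }
};
```

With many -I paths, a header that lives in the last search directory costs one read per directory the first time and nothing afterwards, whereas the stat-based approach pays one failed stat per directory for every header looked up.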