summaryrefslogtreecommitdiffstats
path: root/docs
diff options
context:
space:
mode:
authorHans Wennborg <hans@hanshq.net>2019-01-22 16:59:31 +0000
committerHans Wennborg <hans@hanshq.net>2019-01-22 16:59:31 +0000
commit21bbf1530d694e65d6118faba69748d3b62343ad (patch)
treedc730768f23ed91f80989f7cc9e5bbcc00685baf /docs
parentbce836466f34a8483f97aaf99129e01de5d7d3b5 (diff)
Merging r351580:
------------------------------------------------------------------------ r351580 | kli | 2019-01-18 20:57:37 +0100 (Fri, 18 Jan 2019) | 4 lines [OPENMP][DOCS] Release notes/OpenMP support updates, NFC. Differential Revision: https://reviews.llvm.org/D56733 ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/cfe/branches/release_80@351839 91177308-0d34-0410-b5e6-96231b3b80d8
Diffstat (limited to 'docs')
-rw-r--r--docs/OpenMPSupport.rst76
-rw-r--r--docs/ReleaseNotes.rst22
2 files changed, 50 insertions, 48 deletions
diff --git a/docs/OpenMPSupport.rst b/docs/OpenMPSupport.rst
index 04a9648ca2..7b567c966e 100644
--- a/docs/OpenMPSupport.rst
+++ b/docs/OpenMPSupport.rst
@@ -17,60 +17,50 @@
OpenMP Support
==================
-Clang fully supports OpenMP 4.5. Clang supports offloading to X86_64, AArch64,
-PPC64[LE] and has `basic support for Cuda devices`_.
-
-Standalone directives
-=====================
-
-* #pragma omp [for] simd: :good:`Complete`.
-
-* #pragma omp declare simd: :partial:`Partial`. We support parsing/semantic
- analysis + generation of special attributes for X86 target, but still
- missing the LLVM pass for vectorization.
-
-* #pragma omp taskloop [simd]: :good:`Complete`.
-
-* #pragma omp target [enter|exit] data: :good:`Complete`.
-
-* #pragma omp target update: :good:`Complete`.
-
-* #pragma omp target: :good:`Complete`.
+Clang supports the following OpenMP 5.0 features
-* #pragma omp declare target: :good:`Complete`.
+* The `reduction`-based clauses in the `task` and `target`-based directives.
-* #pragma omp teams: :good:`Complete`.
+* Support relational-op != (not-equal) as one of the canonical forms of random
+ access iterator.
-* #pragma omp distribute [simd]: :good:`Complete`.
+* Support for mapping of the lambdas in target regions.
-* #pragma omp distribute parallel for [simd]: :good:`Complete`.
+* Parsing/sema analysis for the requires directive.
-Combined directives
-===================
+* Nested declare target directives.
-* #pragma omp parallel for simd: :good:`Complete`.
+* Make the `this` pointer implicitly mapped as `map(this[:1])`.
-* #pragma omp target parallel: :good:`Complete`.
+* The `close` *map-type-modifier*.
-* #pragma omp target parallel for [simd]: :good:`Complete`.
-
-* #pragma omp target simd: :good:`Complete`.
-
-* #pragma omp target teams: :good:`Complete`.
-
-* #pragma omp teams distribute [simd]: :good:`Complete`.
-
-* #pragma omp target teams distribute [simd]: :good:`Complete`.
-
-* #pragma omp teams distribute parallel for [simd]: :good:`Complete`.
-
-* #pragma omp target teams distribute parallel for [simd]: :good:`Complete`.
+Clang fully supports OpenMP 4.5. Clang supports offloading to X86_64, AArch64,
+PPC64[LE] and has `basic support for Cuda devices`_.
-Clang does not support any constructs/updates from OpenMP 5.0 except
-for `reduction`-based clauses in the `task` and `target`-based directives.
+* #pragma omp declare simd: :partial:`Partial`. We support parsing/semantic
+ analysis + generation of special attributes for X86 target, but still
+ missing the LLVM pass for vectorization.
In addition, the LLVM OpenMP runtime `libomp` supports the OpenMP Tools
-Interface (OMPT) on x86, x86_64, AArch64, and PPC64 on Linux, Windows, and mac OS.
+Interface (OMPT) on x86, x86_64, AArch64, and PPC64 on Linux, Windows, and macOS.
+
+General improvements
+--------------------
+- New collapse clause scheme to avoid expensive remainder operations.
+ Compute loop index variables after collapsing a loop nest via the
+ collapse clause by replacing the expensive remainder operation with
+ multiplications and additions.
+
+- The default schedules for the `distribute` and `for` constructs in a
+ parallel region and in SPMD mode have changed to ensure coalesced
+ accesses. For the `distribute` construct, a static schedule is used
+ with a chunk size equal to the number of threads per team (default
+ value of threads or as specified by the `thread_limit` clause if
+ present). For the `for` construct, the schedule is static with chunk
+ size of one.
+
+- Simplified SPMD code generation for `distribute parallel for` when
+ the new default schedules are applicable.
.. _basic support for Cuda devices:
diff --git a/docs/ReleaseNotes.rst b/docs/ReleaseNotes.rst
index b6a405dbc7..b4f290e05d 100644
--- a/docs/ReleaseNotes.rst
+++ b/docs/ReleaseNotes.rst
@@ -233,12 +233,15 @@ ABI Changes in Clang
OpenMP Support in Clang
----------------------------------
-- Support relational-op != (not-equal) as one of the canonical forms of random
- access iterator.
+- OpenMP 5.0 features
-- Added support for mapping of the lambdas in target regions.
-
-- Added parsing/sema analysis for OpenMP 5.0 requires directive.
+ - Support relational-op != (not-equal) as one of the canonical forms of random
+ access iterator.
+ - Added support for mapping of the lambdas in target regions.
+ - Added parsing/sema analysis for the requires directive.
+ - Support nested declare target directives.
+ - Make the `this` pointer implicitly mapped as `map(this[:1])`.
+ - Added the `close` *map-type-modifier*.
- Various bugfixes and improvements.
@@ -250,6 +253,15 @@ New features supported for Cuda devices:
- Fixed support for lastprivate/reduction variables in SPMD constructs.
+- New collapse clause scheme to avoid expensive remainder operations.
+
+- New default schedule for distribute and parallel constructs.
+
+- Simplified code generation for distribute and parallel in SPMD mode.
+
+- Flag (``-fopenmp_optimistic_collapse``) for user to limit collapsed
+ loop counter width when safe to do so.
+
- General performance improvement.
CUDA Support in Clang