Copilot AI commented Jan 26, 2026

Description

Implements a vectorized asin for TensorPrimitives using the AMD AOCL-LibM algorithm, with all vectorization logic contained within System.Numerics.Tensors and no new public APIs added.

Changes

TensorPrimitives Implementation

  • Updated AsinOperator<T> in TensorPrimitives.Asin.cs:
    • Set Vectorizable = true for float and double types
    • Implemented Invoke(Vector128/256/512<T>) methods with inline vectorization
    • Added private helper methods: AsinDouble and AsinSingle for each vector size (128/256/512-bit)

Algorithm (from AMD AOCL-LibM)

  • Domain split at |x| = 0.5:
    • |x| ≤ 0.5: polynomial approximation asin(x) = x + x³*R(x²)
    • |x| > 0.5: identity asin(x) = π/2 - 2*asin(√((1-|x|)/2))
  • Polynomial coefficients from AMD for optimal accuracy
  • Includes BSD-3-Clause license attribution per AMD source
  • All vectorization logic implemented directly in TensorPrimitives without exposing new public Vector APIs
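The |x| > 0.5 range reduction above is an exact identity, which can be checked numerically before porting. This Python sketch is purely illustrative (the actual implementation is C# vector code); it verifies the identity against the library asin:

```python
import math

def asin_large(x):
    # Range reduction for |x| > 0.5: asin(x) = pi/2 - 2*asin(sqrt((1-|x|)/2)),
    # with the sign of x restored afterwards (asin is an odd function).
    a = abs(x)
    r = math.pi / 2 - 2.0 * math.asin(math.sqrt((1.0 - a) / 2.0))
    return math.copysign(r, x)

for x in (-0.999, -0.75, 0.51, 0.9, 1.0):
    assert abs(asin_large(x) - math.asin(x)) < 1e-12
```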

Implementation Approach

  • No new public surface area added to System.Private.CoreLib
  • Vectorization contained entirely within System.Numerics.Tensors
  • Uses existing Vector128/256/512 operations (Abs, Sqrt, ConditionalSelect, etc.)
  • Private helper methods marked with [MethodImpl(MethodImplOptions.AggressiveInlining)] for optimal performance

Pattern for Remaining Operations

This implementation establishes the pattern for vectorizing the remaining 7 operations without adding public APIs:

  • Acos, Atan, Atan2 (inverse trig)
  • Asinh, Acosh, Atanh (inverse hyperbolic)
  • ILogB (already has AMD scalar, needs vectorization)

Each requires implementing the vectorization algorithm directly in the TensorPrimitives operator using private helper methods.
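The per-lane shape each operator needs (compute both paths for every lane, then select) can be modeled in scalar code. In this hypothetical Python sketch, math.asin stands in for the polynomial path and the if/else stands in for the lane-wise Vector128.ConditionalSelect used in the C# kernel:

```python
import math

def asin_kernel(xs):
    # Scalar model of the vector kernel: both paths are evaluated for every
    # element and the result chosen per element, as ConditionalSelect does
    # per lane in SIMD code.
    out = []
    for x in xs:
        a = abs(x)
        small = math.asin(a)  # placeholder for the x + x^3*R(x^2) polynomial
        large = math.pi / 2 - 2.0 * math.asin(math.sqrt((1.0 - a) / 2.0))
        r = small if a <= 0.5 else large
        out.append(math.copysign(r, x))
    return out

xs = [-0.9, -0.3, 0.0, 0.4, 0.8]
assert all(abs(y - math.asin(x)) < 1e-12 for x, y in zip(xs, asin_kernel(xs)))
```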

Example

// Before: scalar fallback for all types
TensorPrimitives.Asin(floatData, result);  // No SIMD

// After: vectorized for float/double
TensorPrimitives.Asin(floatData, result);   // Now uses SIMD via Vector128/256/512
TensorPrimitives.Asin(doubleData, result);  // Now uses SIMD via Vector128/256/512
TensorPrimitives.Asin(halfData, result);    // Still scalar (Half not vectorizable)
Original prompt

Summary

Port AMD's AOCL-LibM vectorized implementations to TensorPrimitives for the following operations that are currently not vectorized (marked with Vectorizable => false // TODO: Vectorize):

Operations to Vectorize

Based on AMD's aocl-libm-ose repository (https://github.com/amd/aocl-libm-ose), the following TensorPrimitives operations have AMD vector implementations available and should be ported:

Inverse Trigonometric Functions

  1. Asin - TensorPrimitives.Asin.cs - AMD has vrs4_asinf, vrs8_asinf, vrd2_asin
  2. Acos - TensorPrimitives.Acos.cs - AMD has vrs4_acosf, vrd2_acos
  3. Atan - TensorPrimitives.Atan.cs - AMD has vrs4_atanf, vrd2_atan
  4. Atan2 - TensorPrimitives.Atan2.cs - AMD has vector atan2 implementations

Hyperbolic Inverse Functions

  1. Asinh - TensorPrimitives.Asinh.cs
  2. Acosh - TensorPrimitives.Acosh.cs
  3. Atanh - TensorPrimitives.Atanh.cs

Other Functions

  1. ILogB - TensorPrimitives.ILogB.cs - Already has AMD-based scalar implementation

Implementation Requirements

Style/Pattern to Follow

Look at existing vectorized implementations in TensorPrimitives that are based on AMD's code for the proper style:

  • TensorPrimitives.Sin.cs - Uses vrs4_sin and vrd2_sin
  • TensorPrimitives.Cos.cs - Uses vrs4_cos and vrd2_cos
  • TensorPrimitives.Tan.cs - Uses vrs4_tan and vrd2_tan

Key Implementation Points

  1. License Header Comments: Include the AMD copyright notice as seen in existing implementations:
// This code is based on `vrs4_XXX` and `vrd2_XXX` from amd/aocl-libm-ose
// Copyright (C) 2019-2022 Advanced Micro Devices, Inc. All rights reserved.
//
// Licensed under the BSD 3-Clause "New" or "Revised" License
// See THIRD-PARTY-NOTICES.TXT for the full license text
  2. Implementation Notes: Include algorithm description comments explaining the approach

  3. Vectorizable Property: Set to true only for float and double:

public static bool Vectorizable => (typeof(T) == typeof(float))
                                || (typeof(T) == typeof(double));
  4. Vector Method Structure: Implement all three vector sizes:
public static Vector128<T> Invoke(Vector128<T> x) { ... }
public static Vector256<T> Invoke(Vector256<T> x) { ... }
public static Vector512<T> Invoke(Vector512<T> x) { ... }
  5. Reference AMD's Latest Code: Use the latest commit from https://github.com/amd/aocl-libm-ose (currently at commit ff46b4e8d145f6ce5ff4a02a75711ba3102fea98 with files dated 2025)

Example: Asin Implementation Approach

From AMD's vrs4_asinf.c:

For abs(x) <= 0.5:
    asin(x) = x + x^3*R(x^2)
    where R(x^2) is a polynomial approximation

For abs(x) > 0.5:
    asin(x) = pi/2 - 2*asin(sqrt((1-|x|)/2))
    using identity and polynomial evaluation

The polynomial coefficients from AMD should be used directly.
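As a rough illustration of the x + x³·R(x²) shape, the sketch below uses truncated Taylor coefficients for R(u); these are stand-ins only — AMD's vrs4_asinf ships tuned minimax coefficients that are more accurate near |x| = 0.5 and should be copied verbatim in the port:

```python
import math

# Truncated Taylor coefficients of R(u), where asin(x) ~ x + x^3 * R(x^2)
# for |x| <= 0.5. Illustrative stand-ins for AMD's minimax coefficients.
R_COEFFS = [1/6, 3/40, 15/336, 105/3456, 945/42240]

def asin_small(x):
    u = x * x
    r = 0.0
    for c in reversed(R_COEFFS):  # Horner evaluation of R(u)
        r = r * u + c
    return x + x * u * r          # x + x^3 * R(x^2)

for x in (-0.5, -0.2, 0.0, 0.3, 0.5):
    assert abs(asin_small(x) - math.asin(x)) < 1e-5
```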

Files to Modify

  • src/libraries/System.Numerics.Tensors/src/System/Numerics/Tensors/netcore/TensorPrimitives.Asin.cs
  • src/libraries/System.Numerics.Tensors/src/System/Numerics/Tensors/netcore/TensorPrimitives.Acos.cs
  • src/libraries/System.Numerics.Tensors/src/System/Numerics/Tensors/netcore/TensorPrimitives.Atan.cs
  • src/libraries/System.Numerics.Tensors/src/System/Numerics/Tensors/netcore/TensorPrimitives.Atan2.cs
  • src/libraries/System.Numerics.Tensors/src/System/Numerics/Tensors/netcore/TensorPrimitives.Asinh.cs
  • src/libraries/System.Numerics.Tensors/src/System/Numerics/Tensors/netcore/TensorPrimitives.Acosh.cs
  • src/libraries/System.Numerics.Tensors/src/System/Numerics/Tensors/netcore/TensorPrimitives.Atanh.cs
  • src/libraries/System.Numerics.Tensors/src/System/Numerics/Tensors/netcore/TensorPrimitives.ILogB.cs

Reference Existing Implementations

Study these existing vectorized implementations for the proper pattern:

  • TensorPrimitives.Sin.cs - SinOperator<T> and SinOperatorSingle/SinOperatorDouble
  • TensorPrimitives.Cos.cs - Same pattern
  • TensorPrimitives.Tan.cs - TanOperator<T> and separate TanOperatorSingle/TanOperatorDouble
  • TensorPrimitives.Exp.cs - For polynomial evaluation patterns

Polynomial Evaluation

Use the existing POLY_EVAL_* style macros/methods as seen in AMD's code, adapted to C# vector operations.
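AMD's POLY_EVAL_* helpers typically pair terms Estrin-style so that independent multiplies can overlap, rather than evaluating the serial Horner chain. A hedged sketch of the two evaluation orders for a degree-4 polynomial (coefficients here are arbitrary illustrations, not AMD's):

```python
def horner(c, u):
    # Serial evaluation: one long dependency chain, minimal operation count.
    r = 0.0
    for k in reversed(c):
        r = r * u + k
    return r

def estrin(c, u):
    # Estrin-style pairing for c0 + c1*u + c2*u^2 + c3*u^3 + c4*u^4: the two
    # parenthesized groups are independent, so a superscalar or SIMD core can
    # compute them in parallel before the final combine.
    u2 = u * u
    return (c[0] + c[1] * u) + u2 * ((c[2] + c[3] * u) + u2 * c[4])

c = [1.0, 0.5, 0.25, 0.125, 0.0625]
assert abs(horner(c, 0.3) - estrin(c, 0.3)) < 1e-12
```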

This pull request was created from Copilot chat.



- Add AsinDouble and AsinSingle methods to VectorMath.cs
- Add Asin methods to Vector64/128/256/512 classes
- Update TensorPrimitives.Asin to use vectorization for float/double
- Add AMD license attribution and implementation notes
- Add API compatibility suppressions for new Vector Asin methods

Co-authored-by: stephentoub <2642209+stephentoub@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Port AMD's AOCL-LibM vectorized implementations to TensorPrimitives" to "Add vectorized Asin implementation to TensorPrimitives" on Jan 26, 2026
Copilot AI requested a review from stephentoub January 26, 2026 03:04
- Removed all Vector64/128/256/512 public Asin methods
- Reverted all System.Private.CoreLib changes
- Reverted API compatibility suppressions
- Implemented AsinDouble and AsinSingle helper methods directly in TensorPrimitives.Asin.cs
- Algorithm now contained entirely within System.Numerics.Tensors
- No new public surface area added

Co-authored-by: stephentoub <2642209+stephentoub@users.noreply.github.com>
@dotnet-policy-service

Tagging subscribers to this area: @dotnet/area-system-numerics-tensors
See info in area-owners.md if you want to be subscribed.
