Skip to content

feature : Add support for the tuple datatype#173

Open
tmdeveloper007 wants to merge 1 commit intoarxlang:mainfrom
tmdeveloper007:ISSUE-32
Open

feature : Add support for the tuple datatype#173
tmdeveloper007 wants to merge 1 commit intoarxlang:mainfrom
tmdeveloper007:ISSUE-32

Conversation

@tmdeveloper007
Copy link
Contributor

@tmdeveloper007 tmdeveloper007 commented Feb 27, 2026

Added support for tuples to the irx compiler

  • We can now use tuples in the code.
  • The compiler now knows how to convert these tuples into LLVM structs. It handles them efficiently by using constants when possible or stack memory when needed.

@github-actions
Copy link

OSL ChatGPT Reviewer

NOTE: This is generated by an AI program, so some comments may not make sense.

src/irx/builders/llvmliteir.py

  • Correctness/semantics: Non-constant path pushes an alloca pointer while the constant path pushes a first-class struct value. This pointer/value mix on result_stack is a footgun and likely to break downstream code expecting uniform rvalues. It also leaks mutability/identity for tuples that should be immutable values. Suggest loading the aggregate and pushing the value instead of the pointer:
    (L.1709)
    loaded: ir.Value = self._llvm.ir_builder.load(alloca, name="tuple.val")
    self.result_stack.append(loaded)
    Also update the docstring to reflect value semantics instead of “pushes the pointer.” (L.1651)

  • Performance: The current alloca+store path allocates in the entry block and will frequently escape (since you return the pointer), preventing mem2reg and causing avoidable stack traffic. The above change (load then push the value) makes the alloca trivially promotable (or removable), mitigating this. For an even better approach, consider building the aggregate in SSA without any alloca:
    (near this class, add a helper)
    def _build_tuple_value(self, elem_vals: list[ir.Value]) -> ir.Value:
    """Build tuple aggregate value without alloca."""
    elem_tys = [v.type for v in elem_vals]
    struct_ty = ir.LiteralStructType(elem_tys)
    agg: ir.Value = ir.Constant.undef(struct_ty)
    for idx, v in enumerate(elem_vals):
    agg = self._llvm.ir_builder.insert_value(agg, v, idx)
    return agg

  • Subtle lifetime/identity bug potential: Allocating in the entry block gives a single storage per function invocation; if the pointer identity escapes in loops, different tuple literals across iterations alias the same address. Returning a value (above) avoids this class of bugs.


@github-actions
Copy link

OSL ChatGPT Reviewer

NOTE: This is generated by an AI program, so some comments may not make sense.

src/irx/builders/llvmliteir.py

  • Major correctness: Non-constant path returns a pointer to a stack alloca while constant path returns a first-class struct value. This representation mismatch can cause UB if the pointer escapes (e.g., returned or stored) and will likely break downstream consumers expecting a uniform value. Always return a struct value. Replace the final append with a load of the aggregate (L.1712):
    """Return struct value instead of pointer to avoid escaping stack alloca"""
    val: ir.Value = self._llvm.ir_builder.load(alloca, name="tuple.lit.val")
    self.result_stack.append(val)

  • Docstring mismatch: It currently states that the non-constant path “pushes the pointer.” Update to “pushes the value” to reflect the fix above (L.1651).

  • Performance follow-up (optional but impactful): After fixing the above, consider constructing the tuple via insertvalue instead of alloca+store+load to stay in SSA and avoid stack traffic.


@github-actions
Copy link

OSL ChatGPT Reviewer

NOTE: This is generated by an AI program, so some comments may not make sense.

src/irx/builders/llvmliteir.py

  • High risk: Tuple representation changes based on constness. Constant path returns a value struct, non-constant path returns a pointer (alloca). This makes the IR type of the same AST tuple unstable, will break downstream ops/ABI, and is a latent correctness bug (e.g., nested tuples become ptr-vs-value depending on folding). Make it uniform:

    • Prefer SSA aggregate: build an aggregate value with insert_value for non-constant path instead of alloca+store (L.1683-L.1698), and keep the constant path as-is. Example helper:
      def _build_tuple_aggregate(self, values: list[ir.Value]) -> ir.Value:
      """Build SSA aggregate for tuple"""
      struct_ty = ir.LiteralStructType([v.type for v in values])
      agg = ir.Constant.undef(struct_ty)
      for i, v in enumerate(values):
      agg = self._llvm.ir_builder.insert_value(agg, v, i)
      return agg
    • If you must keep pointer semantics, then also allocate for the all-constant case and store constants, and always push the pointer (L.1677-L.1681), to keep the type stable.
  • UB risk: Returning/passing this tuple literal by value will currently leak a pointer to stack memory (alloca in entry block) if the non-constant path is taken. Either switch to the SSA aggregate approach above, or ensure all escape paths materialize a value aggregate before leaving the function (L.1698-L.1700).

  • Insertion point clobbering: Temporarily moving the shared builder to the entry block and then back to the end of cur_bb can reorder instructions unexpectedly. Use a dedicated IRBuilder for the entry block to emit the alloca without touching the current insertion point (L.1683-L.1689):

    """Alloca at entry without clobbering insertion point"""

    entry_builder = ir.IRBuilder(entry_bb)
    alloca = entry_builder.alloca(struct_ty, name="tuple.lit")


@github-actions
Copy link

OSL ChatGPT Reviewer

NOTE: This is generated by an AI program, so some comments may not make sense.

src/irx/builders/llvmliteir.py

  • Correctness: Mixed representation. Constant path returns a by-value struct, non-constant path returns a pointer. This will cause type mismatches in downstream uses (e.g., PHI, function args) and non-deterministic ABI. Suggest: after filling the alloca, load the struct and push the value to keep a consistent by-value tuple representation.

    • Change (L.1698): replace
      self.result_stack.append(alloca)
      with
      v_loaded = self._llvm.ir_builder.load(alloca, name="tuple.val")
      self.result_stack.append(v_loaded)
    • Update docstring to reflect “pushes the value” instead of pointer (L.1659).
  • Safety of builder position: Temporarily moving the global builder to the entry block and back risks inserting into a terminated block. At minimum, guard and fall back to position_before the terminator.

    • Add before position_at_end(cur_bb) (L.1688):
      if cur_bb.terminator is not None:
      self._llvm.ir_builder.position_before(cur_bb.terminator)

Alternatively, avoid mutating the current builder by allocating with a dedicated entry builder:

  • Change (L.1683):
    entry_builder = ir.IRBuilder(entry_bb)
    alloca = entry_builder.alloca(struct_ty, name="tuple.lit")

@github-actions
Copy link

OSL ChatGPT Reviewer

NOTE: This is generated by an AI program, so some comments may not make sense.

src/irx/builders/llvmliteir.py

  • High risk: Mixed representation. For all-constant tuples you return a first-class struct value, but for non-constant you return an alloca pointer. This will break downstream code that expects a uniform aggregate representation (e.g., GEP+load vs extractvalue). Make the result uniform. Easiest fix: always materialize an alloca in the function entry block and store elements (including the all-constant and empty cases), and push the pointer. Replace the constant fast-path and empty-tuple return with the alloca path. (L.28-L.31, L.38-L.40)
    Suggested change:
    def always_materialize_tuple_pointer(self, struct_ty: ir.Type, llvm_vals: list[ir.Value]) -> ir.AllocaInstr:
    """Always return an alloca pointer for tuple literals."""
    builder = self._llvm.ir_builder
    entry_bb = builder.function.entry_basic_block
    cur_bb = builder.block
    builder.position_at_start(entry_bb)
    alloca = builder.alloca(struct_ty, name="tuple.lit")
    builder.position_at_end(cur_bb)
    i32 = ir.IntType(32)
    for idx, v in enumerate(llvm_vals):
    field_ptr = builder.gep(alloca, [ir.Constant(i32, 0), ir.Constant(i32, idx)], inbounds=True)
    builder.store(v, field_ptr)
    return alloca

    Then:

    • For n == 0: allocate ir.LiteralStructType([]) and push the alloca instead of a constant. (L.28-L.31)
    • Remove the all-constant fast path and use the same alloca+store path. (L.38-L.40)
  • Correctness: Crash if invoked outside a function. Accessing self._llvm.ir_builder.function.entry_basic_block will fail at global/module scope. Add an explicit guard to raise a clear compile-time error. (L.43)
    Suggested change:
    def ensure_function_context(self) -> None:
    """Ensure tuple lowering with non-constant elements occurs inside a function."""
    if self._llvm.ir_builder.function is None:
    raise RuntimeError("LiteralTuple with non-constant elements must be lowered inside a function")

    Call ensure_function_context() before using entry_basic_block. (L.43)


tests/test_literal_tuple.py

  • Add an explicit check that the lowered struct is not packed to avoid silent ABI/layout bugs.

    • Suggest adding a small helper and using it in all tests:
      • Insert after HAS_LITERAL_TUPLE (L.16):
        def _assert_unpacked_literal_struct(const: ir.Constant) -> None:
        """Assert literal struct is unpacked."""
        assert isinstance(const, ir.Constant)
        assert isinstance(const.type, ir.LiteralStructType)
        assert not const.type.packed
      • Then call _assert_unpacked_literal_struct(const) after popping const in each test (L.31, L.59, L.91, L.114).
  • Assert the evaluation stack is empty after consuming the result to catch stray pushes/leaks in the visitor:

    • Insert after each pop:
      def _assert_empty_stack(visitor: LLVMLiteIRVisitor) -> None:
      """Assert translator result stack is empty after evaluation."""
      assert len(visitor.result_stack) == 0
    • Then call _assert_empty_stack(visitor) after the existing assertions (e.g., after L.32, L.65, L.95, L.116).
  • Ensure the heterogeneous case is truly f32 (not accidentally f64). Add:

    • assert const.type.elements[1] == ir.FloatType() (L.95).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant